Core API¶

Bulk insert and Count rows
Smart Single / Bulk Insert
Smart Single / Bulk Update
Smart Single Bulk Upsert
Selecting Shortcuts
Deleteing Short cuts

In this section, we demonstrate the simplified version with sqlalchemy_mate to manipulate data using core API.

First, let’s define a table to get start, everything looks normal so far.

import sqlalchemy as sa

metadata = sa.MetaData()

t_users = sa.Table(
    "users", metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.String, nullable=True)
)

# For syntax testing, you could use sqlite
# But you could see significant performance improvement in main stream
# sql database for bulk inserts
engine = sa.create_engine("sqlite:///:memory:")

metadata.create_all(engine)

Bulk insert and Count rows ¶

We want to insert 3 random user data into the database and do some basic query:

import random

user_data_list = [
    dict(id=random.randint(1, 1000),
    for _ in range(3)
]

With sqlalchemy_mate

import sqlalchemy_mate as sam

# do bulk insert
sam.inserting.smart_insert(engine, t_users, user_data_list)

# returns number of row in a table.
# should returns 3
row_counts = sam.selecting.count_row(engine, t_users)

# return one row by primary key values
# {"id": 1, "name": None}
user = sam.selecting.by_pk(engine, t_users, 1)

Without sqlalchemy_mate

with engine.connect() as connection:
    # do bulk insert
    connection.execute(t_users.insert(), user_data_list)

    # returns number of row in a table.
    # should returns 3
    row_counts = connection.execute(sa.select(sa.func.count()).select_from(t_users)).one()[0]

   # return one row by primary key values
    # {"id": 1, "name": None}
    user = connection.execute(sa.select(t_users).where(t_users.c.id==1)).one()

Now the syntax sugar looks like just so so, let’s move to the next example.

Smart Single / Bulk Insert ¶

Now we already have 3 items in database, let’s try to insert 1,000 users to the table.

user_data_list = [
    dict(id=id_)
    for id_ in range(1, 1000+1)
]

With sqlalchemy_mate

# Core insert logic = 1 line
import time

start_time = time.process_time()
op_count, ins_count = sam.inserting.smart_insert(engine, t_users, user_data_list)
elapsed = time.process_time() - start_time
print(op_count) # 60 bulk INSERT sql command fired to database
print(ins_count) # 997 data inserted
print(elapsed) # 0.019572 in local postgres database
row_counts = sam.selecting.count_row(engine, t_users)
print(row_counts) # now we have 1000 rows

Without sqlalchemy_mate

# Core insert logic = 7 line
import time
from sqlalchemy.exc import IntegrityError

start_time = time.process_time()
with engine.connect() as connection:
    ins = t_users.insert()
    for user_data in user_data_list:
        try:
            connection.execute(ins, user_data)
        except IntegrityError:
            pass
    elapsed = time.process_time() - start_time
    print(elapsed) # 0.181163
    row_counts = connection.execute(sa.select(sa.func.count()).select_from(t_users)).one()[0]
    print(row_counts)

sqlachemy_mate is significantly faster than native sqlalchemy. Because it smartly split big dataset into smaller pack, hence the total number of INSERT sql actually fired to database is greatly reduced. In this test case, sqlclhemy_mate is 10x faster with a Postgres DB on local, in real use case it could save more times because they are remote user.

Smart Single / Bulk Update ¶

A common update use case is to locate a row by primary key, and update non primary key fields.

With sqlalchemy_mate

# update
# before, it is {"id": 1, "name": None}
print(sam.selecting.by_pk(engine, t_users, 1))

# do single update
user_data = dict(id=1, name="Alice")
sam.updating.update_all(engine, t_users, user_data)

# after, it is {"id": 1, "name": None}
print(sam.selecting.by_pk(engine, t_users, 1))

# do multiple update
user_data_list = [
    dict(id=1, name="Alice"),
    dict(id=2, name="Bob"),
    dict(id=3, name="Cathy"),
]
sam.updating.update_all(engine, t_users, user_data_list)

Without sqlalchemy_mate

# do single update
with engine.connect() as connection:
    connection.execute(t_users.update().where(t_users.c.id==1).values(name="Alice"))

# do multiple update
user_data_list = [
    dict(id=1, name="Alice"),
    dict(id=2, name="Bob"),
    dict(id=3, name="Cathy"),
]
with engine.connect() as connection:
    for user in user_data_list:
        connection.execute(t_users.update().where(t_users.c.id==user["id"]).values(**user)

Smart Single Bulk Upsert ¶

Upsert is a database dependent feature that not available in all sql system. upsert_all`() function made upsert generally available to all SQL system and super easy to use. Internally there’s an optimization that collect “to insert” items and bulk insert them fast.

With sqlalchemy_mate

# prepare your data
user_data_list = [
    dict(id=999, name="Alice"),
    dict(id=1000, name="Bob"),
    dict(id=1001, name="Cathy"),
    dict(id=1002, name="David"),
]

# use ``upsert_all`` method
sam.updating.upsert_all(engine, t_users, user_data_list)