Release v0.0.10 (What’s new?).


Welcome to sqlalchemy_mate Documentation

A library that extends the sqlalchemy module and saves you from writing 50% of your database code.

Features

Read Database Credential Safely

Putting your database connection credentials in your source code is always a BAD IDEA.

sqlalchemy_mate provides several options to load credentials easily.

If you want to read db secrets from other sources, such as Bash scripts with lots of export DB_PASSWORD="xxx" statements, AWS Secrets Manager, or AWS Key Management Service (KMS), please take a look at another project of mine, pysecret.

From json file

You can put your credentials in a json file somewhere in your $HOME directory and let sqlalchemy_mate load from it.

You need to specify two things:

  1. the path to the json file.
  2. the field path to the data. If your connection info is nested deeply in the json, you can use a dot notation json path to point to it.

content of json:

{
    "credentials": {
        "db1": {
            "host": "example.com",
            "port": 1234,
            "database": "test",
            "username": "admin",
            "password": "admin",
        },
        "db2": {
            ...
        }
    }
}

code:

from sqlalchemy_mate import EngineCreator

ec = EngineCreator.from_json(
    json_file="path-to-json-file",
    json_path="credentials.db1", # dot notation json path
)
engine = ec.create_postgresql_psycopg2()
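The dot notation lookup used by json_path can be sketched like this (resolve_json_path is a hypothetical helper written for illustration, not part of sqlalchemy_mate's API):

```python
import json

def resolve_json_path(data: dict, json_path: str) -> dict:
    """Walk nested dicts following a dot-separated path like 'credentials.db1'."""
    node = data
    for key in json_path.split("."):
        node = node[key]
    return node

config = json.loads('{"credentials": {"db1": {"host": "example.com", "port": 1234}}}')
resolve_json_path(config, "credentials.db1")["host"]  # "example.com"
```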

Default data fields are host, port, database, username, password.

If your json schema is different, you need to add the key_mapping to specify the field name mapping:

ec = EngineCreator.from_json(
    json_file="...",
    json_path="...",
    key_mapping={
        "host": "your-host-field",
        "port": "your-port-field",
        "database": "your-database-field",
        "username": "your-username-field",
        "password": "your-password-field",
    }
)
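Conceptually, key_mapping translates your field names into the five standard ones before the connection string is built. A toy sketch of that translation (an assumption about the behavior, not sqlalchemy_mate's internals):

```python
# Your json uses non-standard field names:
raw = {"hostname": "example.com", "port_number": 5432, "db": "test",
       "user": "admin", "pwd": "secret"}

# key_mapping: standard name -> your field name
key_mapping = {"host": "hostname", "port": "port_number", "database": "db",
               "username": "user", "password": "pwd"}

# remap to the standard schema
credentials = {std: raw[custom] for std, custom in key_mapping.items()}
# credentials == {"host": "example.com", "port": 5432, "database": "test",
#                 "username": "admin", "password": "secret"}
```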

From $HOME/.db.json

You can put lots of database connection info in a .db.json file in your $HOME directory.

from sqlalchemy_mate import EngineCreator

ec = EngineCreator.from_home_db_json(identifier="db1")
engine = ec.create_postgresql_psycopg2()

$HOME/.db.json assumes a flat json schema, but you can use a dot notation json path as the identifier to adapt to any json schema:

{
    "identifier1": {
        "host": "example.com",
        "port": 1234,
        "database": "test",
        "username": "admin",
        "password": "admin",
    },
    "identifier2": {
        ...
    }
}

From json file on AWS S3

This is similar to from_json, but the json file is stored on AWS S3.

from sqlalchemy_mate import EngineCreator
ec = EngineCreator.from_s3_json(
    bucket_name="my-bucket", key="db.json",
    json_path="identifier1",
    aws_profile="my-profile",
)
engine = ec.create_redshift()

From Environment Variable

You can put your credentials in environment variables. For example:

export DB_DEV_HOST="..."
export DB_DEV_PORT="..."
export DB_DEV_DATABASE="..."
export DB_DEV_USERNAME="..."
export DB_DEV_PASSWORD="..."

from sqlalchemy_mate import EngineCreator
# read from DB_DEV_USERNAME, DB_DEV_PASSWORD, ...
ec = EngineCreator.from_env(prefix="DB_DEV")
engine = ec.create_redshift()
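The prefix convention above can be sketched as follows (read_env_credentials is a hypothetical helper showing the assumed naming scheme, not the library's code):

```python
import os

# simulate the exported variables from the shell snippet above
os.environ.update({
    "DB_DEV_HOST": "example.com",
    "DB_DEV_PORT": "5432",
    "DB_DEV_DATABASE": "test",
    "DB_DEV_USERNAME": "admin",
    "DB_DEV_PASSWORD": "secret",
})

def read_env_credentials(prefix: str) -> dict:
    """Read <PREFIX>_HOST, <PREFIX>_PORT, ... into a credentials dict."""
    fields = ["host", "port", "database", "username", "password"]
    return {f: os.environ[f"{prefix}_{f.upper()}"] for f in fields}

creds = read_env_credentials("DB_DEV")
```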

If you want to read database credentials safely on the cloud, for example on AWS EC2 or AWS Lambda, you can use AWS KMS to decrypt your credentials:

# leave aws_profile=None if you are on cloud
ec = EngineCreator.from_env(prefix="DB_DEV", kms_decrypt=True, aws_profile="xxx")
engine = ec.create_redshift()

Smart Insert

In a bulk insert, if some rows have a primary key conflict, the classic solution is:

for row in data:
    try:
        engine.execute(table.insert(), row)
    except sqlalchemy.exc.IntegrityError:
        pass

This inserts rows one by one, which is super slow.

sqlalchemy_mate's smart_insert strategy retries with smaller bulk inserts, which have a higher probability of succeeding. As a result, the total number of commits is greatly reduced.
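The idea can be illustrated with a toy divide-and-conquer version, using a dict as the "table" (a simplified sketch, not the library's actual implementation):

```python
# Try to insert a whole chunk at once; on a primary key conflict, split the
# chunk in half and retry each half, falling back to skipping single
# conflicting rows at the smallest size.
def smart_insert(store: dict, rows: list):
    try:
        # simulate a bulk insert that fails atomically on any duplicate key
        if any(r["id"] in store for r in rows):
            raise KeyError("duplicate primary key")
        for r in rows:
            store[r["id"]] = r
    except KeyError:
        if len(rows) == 1:
            return  # skip the single conflicting row
        mid = len(rows) // 2
        smart_insert(store, rows[:mid])
        smart_insert(store, rows[mid:])

store = {1: {"id": 1, "name": "Alice"}}  # id 1 already exists
smart_insert(store, [{"id": i, "name": f"user{i}"} for i in range(4)])
# rows 0, 2, 3 are inserted; the conflicting row 1 keeps its original value
```

Instead of one commit per row, conflicts only cost a logarithmic number of extra (smaller) bulk attempts.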

With sql expression:

from sqlalchemy import create_engine, MetaData, Table, Column, Integer
from sqlalchemy_mate import inserting

engine = create_engine(...)
metadata = MetaData()
t_users = Table(
    "users", metadata,
    Column("id", Integer),
    ...
)
# lots of data
data = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, ...]
# the magic function
inserting.smart_insert(engine, t_users, data)

With ORM:

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy_mate import ExtendedBase

Base = declarative_base()

class User(Base, ExtendedBase):  # inherit from ExtendedBase
    ...

# lots of users
data = [User(id=1, name="Alice"), User(id=2, name="Bob"), ...]
# the magic method
User.smart_insert(engine_or_session, data)  # That's it

Smart Update / Upsert

Automatically update or insert rows, matched by primary key.

# in SQL expression
from sqlalchemy_mate import updating

data = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}, ...]
updating.update_all(engine, table, data)
updating.upsert_all(engine, table, data)

# in ORM
data = [User(id=1, name="Alice"), User(id=2, name="Bob"), ...]
User.update_all(engine_or_session, data)
User.upsert_all(engine_or_session, data)
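The difference between the two can be shown with a toy dict-backed sketch (hypothetical illustration, not the library's code): update only touches rows whose primary key already exists, while upsert also inserts the new ones.

```python
# Toy illustration of update vs upsert semantics, with a dict as the "table".
def update_all(store: dict, rows: list):
    for r in rows:
        if r["id"] in store:
            store[r["id"]] = r  # only overwrite existing rows

def upsert_all(store: dict, rows: list):
    for r in rows:
        store[r["id"]] = r  # overwrite existing rows, insert new ones

store = {1: {"id": 1, "name": "Alice"}}
update_all(store, [{"id": 2, "name": "Bob"}])  # id 2 does not exist: no-op
upsert_all(store, [{"id": 2, "name": "Bob"}])  # id 2 is inserted
```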

Install

sqlalchemy_mate is released on PyPI, so all you need is:

$ pip install sqlalchemy_mate

To upgrade to latest version:

$ pip install --upgrade sqlalchemy_mate

About the Author

(\ (\
( -.-)o    I am a lovely Rabbit!
o_(")(")

Sanhe Hu has been a very active Python developer since 2010. Research areas include Machine Learning, Big Data Infrastructure, Blockchain, Business Intelligence, AWS, and Distributed Systems. Loves photography, the outdoors, arts, games, and of course, Python.