Database configuration

To have SQL scripts as tasks, you must configure a database client. There are two available clients: ploomber.clients.SQLAlchemyClient and ploomber.clients.DBAPIClient, we recommend using the sqlalchemy client if your database is supported because it is compatible with more types of SQL tasks (e.g., ploomber.tasks.SQLDump, which dumps data into a local file).

Using SQLAlchemyClient

Ensure that you can connect to the database using sqlalchemy:

from sqlalchemy import create_engine

engine = create_engine('DATABASE_URI')

DATABASE_URI depends on the type of database. sqlalchemy supports a wide range of databases; you can find a list in their documentation, while others come in third-party packages (e.g., Snowflake).

If create_engine is successful, ensure you can query your database:

with engine.connect() as conn:
    results = conn.execute('SELECT * FROM some_table LIMIT 10')

If the query works, you can initialize a SQLAlchemyClient with the same DATABASE_URI:

from ploomber.clients import SQLAlchemyClient

client = SQLAlchemyClient('DATABASE_URI')

Using Snowflake

Here’s some sample code to configure Snowflake:

# install snowflake-sqlalchemy
pip install snowflake-sqlalchemy

Build your URL with the helper function:

from snowflake.sqlalchemy import URL

params = dict(user='user',
              password='pass',
              account='acct',
              warehouse='warehouse',
              database='db',
              schema='schema',
              role='role')

client = SQLAlchemyClient(URL(**params))

If using OAuth instead of user/password authentication, you need to include the token:

import json
import requests # pip install requests
from snowflake.sqlalchemy import URL

def get_snowflake_token(username, password, account):
    headers = {'content-type': 'application/x-www-form-urlencoded'}
    data = {
        'grant_type': 'password',
        'scope': 'SESSION:ROLE-ANY',
        'username: username,
        'password': password,
        'client_id: f'https://{account}.snowflakecomputing.com',
    }
    response = requests.post(oauth_url, data=data, headers=headers,
                             verify=False)

    return str(json.loads(response.text)['access_token']).strip()

token = get_snowflake_token('user', 'password', 'account')

params = dict(user='user',
              account='acct',
              warehouse='warehouse',
              database='db',
              schema='schema',
              role='role',
              authentication='oauth',
              token=token)

client = SQLAlchemyClient(URL(**params))

Using DBAPIClient

DBAPIClient takes a function that returns a DBAPI compatible connection and parameters to initialize such connection.

Here’s an example with SQLite:

from ploomber.clients import DBAPIClient
import sqlite3

client = DBAPIClient(sqlite3.connect, dict(database='my.db'))

Under the hood, Ploomber calls sqlite3.connect(database='my.db').

Another example, this time using Snowflake:

from ploomber.clients import DBAPIClient
import snowflake.connector

params = dict(user='USER', password='PASS', account='ACCOUNT')
client = DBAPIClient(snowflake.connector.connect, params)

Configuring the client in pipeline.yaml

Check out the SQL pipelines to learn how to configure the database client in your pipeline.yaml file.