Python API

This section lists the available classes and functions in the Python API. If you’re writing pipelines with the Spec API (e.g., a pipeline.yaml file), you won’t interact with this API directly. However, you may still want to learn about ploomber.spec.DAGSpec if you need to load your pipeline as a Python object.

For code examples using the Python API, click here.


DAG([name, clients, executor])

A collection of tasks with dependencies


A subclass of ploomber.OnlineDAG to provide a simpler interface for online DAGs whose terminal task calls model.predict.


Execute partial DAGs in-memory.


An object to customize DAG behavior

InMemoryDAG(dag[, return_postprocessor])

Converts a DAG to a DAG-like object that performs all operations in memory (products are not serialized).


Task(product, dag[, name, params])

Abstract class for all Tasks

PythonCallable(source, product, dag[, name, …])

Run a Python callable (e.g., a function).

NotebookRunner(source, product, dag[, name, …])

Run a Jupyter notebook using papermill.

SQLScript(source, product, dag[, name, …])

Execute a script in a SQL database to create a relation or view

SQLDump(source, product, dag[, name, …])

Dumps data from a SQL SELECT statement to a file (or set of files).

SQLTransfer(source, product, dag[, name, …])

Transfers data from one SQL database to another (note: this relies on pandas; only use it for small to medium-sized datasets).

SQLUpload(source, product, dag[, name, …])

Upload data to a SQL database from a Parquet or CSV file.

PostgresCopyFrom(source, product, dag[, …])

Efficiently copy data to a PostgreSQL database using COPY FROM (a faster alternative to SQLUpload for PostgreSQL).

ShellScript(source, product, dag[, name, …])

Execute a shell script in a shell

DownloadFromURL(source, product, dag[, …])

Download a file from a URL (uses urllib.request.urlretrieve)

Link(product, dag, name)

A dummy Task used to “plug” an external Product into a pipeline; this task is always considered up-to-date.

Input(product, dag, name)

A dummy task used to represent input provided by the user; it is always considered outdated.



Abstract class for all Products

File(identifier[, client])

A file (or directory) in the local filesystem


A product that represents a SQL relation but has no metadata

PostgresRelation(identifier[, client])

A PostgreSQL relation

SQLiteRelation(identifier[, client])

A SQLite relation

GenericSQLRelation(identifier[, client])

A GenericProduct whose identifier is a SQL relation, uses SQLite as metadata backend

GenericProduct(identifier[, client])

GenericProduct is used when there is no specific Product implementation.



Abstract class for all clients

DBAPIClient(connect_fn, connect_kwargs[, …])

A client for PEP 249-compliant database libraries.

SQLAlchemyClient(uri[, split_source, …])

Client for connecting to any SQLAlchemy-supported database.

ShellClient([run_template, …])

Client to run commands in the local shell.


DAGSpec(data[, env, lazy_import, reload, …])

A DAG spec is a dictionary with a specific structure that can be converted to a DAG using DAGSpec.to_dag().



A function decorated with @with_env that starts an environment during the execution of a function.


A function decorated with @load_env will be called with the current environment in an env keyword argument


Return the current environment


serializer([extension_mapping, fallback, …])

Decorator for serializing functions

serializer_pickle(obj, product)

A serializer that pickles everything

unserializer([extension_mapping, fallback, …])

Decorator for unserializing functions


An unserializer that unpickles everything


SourceLoader([path, module])

Load source files using a jinja2.Environment