Serialization

Note

This is a quick reference. For an in-depth tutorial, click here.

By default, tasks receive a product argument and must serialize their outputs to that location themselves. With serialization, a task instead returns its output and delegates saving it to a dedicated serializer function.

For example, your task may look like this:

def my_function():
    # no need to serialize here, simply return the output
    return [1, 2, 3]
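To make the delegation concrete, here is a minimal plain-Python sketch of the idea. It does not use the library and is not its implementation: `pickle_serializer` and `run_task` are hypothetical names introduced only for illustration, and pickle is an arbitrary choice of format.

```python
import pickle
import tempfile
from pathlib import Path


def my_function():
    # the task only computes and returns its output
    return [1, 2, 3]


def pickle_serializer(obj, product):
    # hypothetical serializer: stores obj at the product path using pickle
    Path(product).write_bytes(pickle.dumps(obj))


def run_task(task, serializer, product):
    # hypothetical runner: call the task, then delegate saving to the serializer
    serializer(task(), product)


# write the task's return value to a temporary product and read it back
product = Path(tempfile.mkdtemp()) / "output.pickle"
run_task(my_function, pickle_serializer, product)
restored = pickle.loads(product.read_bytes())
```

The task never touches the file system; swapping the storage format only requires passing a different serializer.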

Important

Serialization only works on function tasks.

And your serializer may look like this:

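from ploomber.io import serializer

# '.csv' and '.txt' products use the built-in default serializers;
# products with any other extension fall back to joblib
@serializer(fallback='joblib', defaults=['.csv', '.txt'])
def my_serializer(obj, product):
    pass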
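The decorator arguments above suggest a dispatch on the product's file extension, with a fallback for everything else. The following is a rough standard-library approximation of that behavior, not the library's actual code: `HANDLERS`, `serialize_by_extension`, and the `write_*` helpers are hypothetical, pickle stands in for joblib, and the real default serializers may rely on other libraries.

```python
import csv
import pickle
import tempfile
from pathlib import Path


def write_txt(obj, product):
    # store the string representation of obj as plain text
    Path(product).write_text(str(obj))


def write_csv(rows, product):
    # assumes obj is an iterable of rows (each row an iterable of values)
    with open(product, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def write_pickle(obj, product):
    # fallback used when no extension-specific handler matches
    Path(product).write_bytes(pickle.dumps(obj))


# hypothetical mapping from file extension to handler
HANDLERS = {".txt": write_txt, ".csv": write_csv}


def serialize_by_extension(obj, product):
    # pick a handler by suffix, falling back to pickle
    handler = HANDLERS.get(Path(product).suffix, write_pickle)
    handler(obj, product)


out = Path(tempfile.mkdtemp())
serialize_by_extension("hello", out / "note.txt")
serialize_by_extension([[1, 2], [3, 4]], out / "table.csv")
serialize_by_extension({"a": 1}, out / "model.bin")
```

Keeping the mapping in one place means a task can change its output format just by changing the product's extension.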

Resources

  1. A complete example.

  2. An example showing tasks with a variable number of output files.

  3. Serialization User Guide (explains the API step-by-step).