Machine Learning
Ploomber has many features specifically tailored to accelerate Machine Learning workflows.
Tip
Check out our sklearn-evaluation library. It contains a large collection of Machine Learning evaluation plots, an experiment tracker, and many other features!
Data cleaning and feature engineering
Data cleaning and feature engineering are highly iterative processes. Ploomber accelerates them via incremental builds, which let you introduce changes to your pipeline and bring results up to date without re-computing everything from scratch.
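To make the idea of an incremental build concrete, here is a minimal conceptual sketch in plain Python. It is not Ploomber's implementation or API: it assumes change detection works by hashing each step's source together with the fingerprints of its upstream steps, and the step names and source strings are hypothetical.

```python
import hashlib
import json


def fingerprint(source: str, upstream: list) -> str:
    """Hash a step's source code together with its upstream fingerprints."""
    payload = json.dumps({"source": source, "upstream": upstream})
    return hashlib.sha256(payload.encode()).hexdigest()


def build(steps: dict, cache: dict) -> list:
    """Run steps in order, skipping those whose fingerprint is unchanged.

    steps maps name -> (source, upstream names) and must be listed in
    dependency order; cache maps name -> fingerprint from the last build.
    Returns the names of the steps that were executed.
    """
    executed = []
    current = {}
    for name, (source, upstream) in steps.items():
        current[name] = fingerprint(source, [current[u] for u in upstream])
        if cache.get(name) != current[name]:
            executed.append(name)  # a real pipeline would run the task here
            cache[name] = current[name]
    return executed


# Hypothetical three-step pipeline: clean -> features -> train
steps = {
    "clean": ("drop_nulls(raw)", []),
    "features": ("add_ratios(clean)", ["clean"]),
    "train": ("fit(features)", ["features"]),
}
cache = {}

first = build(steps, cache)   # everything runs on the first build
second = build(steps, cache)  # nothing changed, so nothing runs

# Editing the feature step invalidates it and everything downstream of it
steps["features"] = ("add_ratios_and_logs(clean)", ["clean"])
third = build(steps, cache)

print(first, second, third)
```

In this sketch, changing the feature step leaves the cleaning step untouched, which is the behavior that makes iterating on a single stage cheap.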
Experiment tracking
Ploomber also plays nicely with experiment trackers, allowing you to train hundreds of models and track the results.
Example: Integration with MLflow
pip install ploomber
ploomber examples -n templates/mlflow -o ploomber-mlflow
Parallel experiments
To help you find the best-performing model, Ploomber allows you to parallelize Machine Learning experiments.
Example: Running a grid of experiments in parallel
pip install ploomber
ploomber examples -n cookbook/grid -o grid
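As a rough illustration of what running a grid of experiments in parallel involves, the sketch below uses only the Python standard library rather than Ploomber itself. The `run_experiment` function, its toy scoring formula, and the parameter names are all hypothetical stand-ins for real model training.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_experiment(params: dict) -> dict:
    """Hypothetical experiment: score one parameter combination.

    A real experiment would train and validate a model here; this toy
    score simply peaks at learning_rate=0.1 and depth=4.
    """
    score = (
        1.0
        - abs(params["learning_rate"] - 0.1)
        - 0.05 * abs(params["depth"] - 4)
    )
    return {**params, "score": score}


# Expand the grid into one dictionary per parameter combination
grid = {"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4, 8]}
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

# Threads keep the sketch portable; CPU-bound training would typically
# use separate processes or machines instead
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_experiment, combinations))

best = max(results, key=lambda r: r["score"])
print(len(results), best)
```

Each combination is independent of the others, which is why a grid is a natural fit for parallel execution.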
Example: Model selection with nested cross-validation
pip install ploomber
ploomber examples -n cookbook/nested-cv -o nested-cv
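For readers unfamiliar with nested cross-validation, the following standard-library sketch shows the general technique: an inner loop selects a hyperparameter using only the outer training data, and an outer loop scores that choice on data never used for selection. It is not the code from the cookbook example. The k-nearest-neighbors model, the synthetic dataset, and all function names are assumptions made for illustration.

```python
from statistics import mean


def make_folds(indices: list, n_folds: int) -> list:
    """Split indices into n_folds interleaved folds."""
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train: list, x: float, k: int) -> float:
    """Predict y as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return mean(y for _, y in nearest)


def mse(train: list, test: list, k: int) -> float:
    """Mean squared error of a k-NN model fit on train, scored on test."""
    return mean((knn_predict(train, x, k) - y) ** 2 for x, y in test)


def split(data: list, held_out: list) -> tuple:
    """Return (training rows, held-out rows) for one fold."""
    held = set(held_out)
    train = [data[i] for i in range(len(data)) if i not in held]
    return train, [data[i] for i in held_out]


def cross_val_mse(data: list, k: int, n_folds: int) -> float:
    """Average validation error of a k-NN model across folds of data."""
    folds = make_folds(list(range(len(data))), n_folds)
    return mean(mse(*split(data, fold), k) for fold in folds)


def nested_cv(data: list, candidate_ks: list, n_outer: int = 4, n_inner: int = 3) -> list:
    """Return one (selected k, outer test error) pair per outer fold."""
    outer_scores = []
    for held_out in make_folds(list(range(len(data))), n_outer):
        outer_train, outer_test = split(data, held_out)
        # Inner loop: select k using only the outer training data
        best_k = min(
            candidate_ks,
            key=lambda k: cross_val_mse(outer_train, k, n_inner),
        )
        # Outer loop: score the selected k on data never used for selection
        outer_scores.append((best_k, mse(outer_train, outer_test, best_k)))
    return outer_scores


# Synthetic, noise-free dataset: y = x^2 on 40 evenly spaced points
data = [(x / 10, (x / 10) ** 2) for x in range(40)]
scores = nested_cv(data, candidate_ks=[1, 3, 5])
print(scores)
```

Because the outer test folds play no part in choosing `k`, the outer errors estimate how well the whole selection procedure generalizes, not just one fitted model.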
Large-scale model training
If one machine isn’t enough, you can parallelize training jobs in a cluster by exporting your pipeline to any of our supported platforms (Kubernetes, Airflow, and AWS Batch).
Deployment
Once you find the best-performing model, you can deploy it for batch processing or as an online API.