MLflow: A Platform for the Complete Machine Learning Lifecycle
Corey Zumar
March 27th, 2019
Outline
• Overview of ML development challenges
• How MLflow tackles these challenges
• MLflow components
• Demo
• How to get started
Machine Learning Development is Complex

ML Lifecycle (diagram): Raw Data (e.g. in Delta) → Data Prep → Training & Tuning → Model Exchange → Deploy, with Scale and Governance concerns at every stage.
Custom ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX
• Standardize the data prep / training / deploy loop: if you work with the platform, you get these benefits
• But they are limited to a few algorithms or frameworks
• And tied to one company's infrastructure
Can we provide similar benefits in an open manner?
Introducing MLflow
An open machine learning platform:
• Works with any ML library & language
• Runs the same way anywhere (e.g. any cloud)
• Designed to be useful for 1 or 1000+ person orgs
MLflow Components
• Tracking: record and query experiments: code, configs, results, etc.
• Projects: packaging format for reproducible runs on any platform (a folder of code + data files with an "MLproject" description file)
• Models: general model format that supports diverse deployment tools
Key Concepts in Tracking
• Parameters: key-value inputs to your code
• Metrics: numeric values (can update over time)
• Artifacts: arbitrary files, including data and models
• Source: the training code that ran
• Version: the version of the training code
• Tags and notes: any additional information
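A minimal sketch (not taken from the talk) of how these concepts map onto the Python tracking API; the parameter names, values, and file name are illustrative:

    import mlflow

    # A small local file to log as an artifact (illustrative only).
    with open("notes.txt", "w") as f:
        f.write("baseline experiment")

    with mlflow.start_run():
        mlflow.log_param("alpha", 0.5)          # parameter: key-value input
        mlflow.log_metric("mse", 2.1)           # metric: numeric value...
        mlflow.log_metric("mse", 1.7)           # ...that can be updated over time
        mlflow.log_artifact("notes.txt")        # artifact: arbitrary file
        mlflow.set_tag("purpose", "baseline")   # tag/note: additional information
        # Source and code version (e.g. the git commit) are recorded automatically.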
MLflow Tracking (architecture diagram): tracking APIs (REST, Python, Java, R) and the web UI talk to a tracking server through its API.
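For example, a client can be pointed at a shared tracking server; the host, port, and experiment name below are assumptions, not from the talk:

    import mlflow

    # Point the client (and its REST calls) at a running tracking server.
    mlflow.set_tracking_uri("http://localhost:5000")
    mlflow.set_experiment("digit-classification")   # hypothetical experiment name

    with mlflow.start_run():
        mlflow.log_param("layers", 4)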
MLflow Tracking
Record and query experiments: code, configs, results, etc.

    import mlflow

    with mlflow.start_run():
        mlflow.log_param("layers", layers)
        mlflow.log_param("alpha", alpha)
        # train model
        mlflow.log_metric("mse", model.mse())
        mlflow.log_artifact("plot", model.plot(test_df))
        mlflow.tensorflow.log_model(model)
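The "query" half of tracking can also be scripted. A sketch assuming a newer MLflow release than the one discussed here, where the fluent search API returns runs as a pandas DataFrame; the filter assumes the "mse" metric and "alpha" parameter logged above:

    import mlflow

    # Find runs whose logged "mse" metric is below a threshold.
    runs = mlflow.search_runs(filter_string="metrics.mse < 2.5")
    print(runs[["run_id", "params.alpha", "metrics.mse"]])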
MLflow backend stores
• Entity store
  • FileStore (local filesystem)
  • SQLStore (via SQLAlchemy)
  • REST store
• Artifact repository
  • S3-backed store
  • Azure Blob Storage
  • Google Cloud Storage
  • DBFS artifact repo
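The entity store is selected through the tracking URI; a sketch with assumed URIs (adjust paths, databases, and buckets to your environment):

    import mlflow

    mlflow.set_tracking_uri("./mlruns")                      # FileStore on the local filesystem
    # mlflow.set_tracking_uri("sqlite:///mlflow.db")         # SQLStore via SQLAlchemy
    # mlflow.set_tracking_uri("http://tracking-host:5000")   # REST store (remote tracking server)

    # Artifact repositories are configured when the server is launched, e.g. an S3 bucket:
    #   mlflow server --backend-store-uri sqlite:///mlflow.db \
    #                 --default-artifact-root s3://my-bucket/mlflow-artifacts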
Demo
Goal: classify hand-drawn digits
• Instrument Keras training code with MLflow tracking APIs
• Run training code as an MLflow Project
• Deploy an MLflow Model for real-time serving
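A hypothetical sketch of the first demo step, instrumenting Keras digit-classification training with the tracking APIs. This is not the presenter's demo code; the architecture and parameter names are placeholders, and it uses the mlflow.keras integration current at the time of the talk (newer releases route Keras models through mlflow.tensorflow.log_model):

    import mlflow
    import mlflow.keras
    from tensorflow import keras

    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
    x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

    model = keras.Sequential([
        keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    with mlflow.start_run():
        mlflow.log_param("hidden_units", 128)
        model.fit(x_train, y_train, epochs=2, batch_size=128)
        test_loss, test_acc = model.evaluate(x_test, y_test)
        mlflow.log_metric("test_accuracy", float(test_acc))
        mlflow.keras.log_model(model, "model")   # save the trained model as an MLflow Model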
MLflow Projects: Motivation (diagram): a diverse set of training tools meets a diverse set of environments; the result is that ML code is difficult to productionize.
MLflow Projects (diagram): a project spec (code, config, dependencies, data) can be executed either locally or remotely.
MLflow Projects
Packaging format for reproducible ML runs
• Any code folder or GitHub repository
• Optional MLproject file with project configuration
Defines dependencies for reproducibility
• Conda (plus R, Docker, …) dependencies can be specified in MLproject
• Reproducible in (almost) any environment
Execution API for running projects
• CLI / Python / R / Java
• Supports local and remote execution
Example MLflow Project

    my_project/
    ├── MLproject
    ├── conda.yaml
    ├── main.py
    └── model.py

MLproject file:

    conda_env: conda.yaml
    entry_points:
      main:
        parameters:
          training_data: path
          lambda: {type: float, default: 0.1}
        command: python main.py {training_data} {lambda}

Run it with:

    $ mlflow run git://<my_project>
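The same project could also be launched through the Python execution API; a sketch in which the repository URL, data path, and parameter values are illustrative:

    import mlflow

    submitted = mlflow.projects.run(
        uri="https://github.com/<org>/my_project",   # or a local path such as "./my_project"
        entry_point="main",
        parameters={"training_data": "data/train.csv", "lambda": 0.05},
    )
    print(submitted.run_id)   # the MLflow run created for this project execution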
Demo
Goal: classify hand-drawn digits
• Instrument Keras training code with MLflow tracking APIs
• Run training code as an MLflow Project
• Deploy an MLflow Model for real-time serving
MLflow Models: Motivation (diagram): each ML framework must separately target every deployment path: inference code, batch & stream scoring, and serving tools.
MLflow Models (diagram): a standard model format with multiple "flavors" sits between the ML frameworks and the deployment targets (inference code, batch & stream scoring, serving tools).
MLflow Models
Packaging format for ML models
• Any directory with an MLmodel file
Defines dependencies for reproducibility
• Conda environment can be specified in the MLmodel configuration
Model creation utilities
• Save models from any framework in MLflow format
Deployment APIs
• CLI / Python / R / Java
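A sketch of that save → load → deploy path using an assumed scikit-learn model; the paths and port are illustrative, and older MLflow releases name the pyfunc loader load_pyfunc, as in the flavor example later in the deck:

    import mlflow.pyfunc
    import mlflow.sklearn
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression

    X, y = load_digits(return_X_y=True)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Save the model in MLflow format: a directory with an MLmodel file.
    mlflow.sklearn.save_model(model, "digits_model")

    # Load it back through the framework-agnostic python_function flavor.
    loaded = mlflow.pyfunc.load_model("digits_model")
    print(loaded.predict(X[:5]))

    # Real-time REST serving from the CLI (exact command varies by MLflow version):
    #   mlflow models serve -m digits_model -p 5001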
Example MLflow Model

Created with mlflow.tensorflow.log_model(...):

    my_model/
    ├── MLmodel
    └── estimator/
        ├── saved_model.pb
        └── variables/
            ...

MLmodel file:

    run_id: 769915006efd4c4bbd662461
    time_created: 2018-06-28T12:34
    flavors:
      tensorflow:                      # usable with TensorFlow tools / APIs
        saved_model_dir: estimator
        signature_def_key: predict
      python_function:                 # usable with any Python tool
        loader_module: mlflow.tensorflow
Model Flavors Example (PyTorch)

Train a model, then log it:

    mlflow.pytorch.log_model()

Flavor 1: pyfunc

    predict = mlflow.pyfunc.load_pyfunc(…)
    predict(input_dataframe)

Flavor 2: PyTorch

    model = mlflow.pytorch.load_model(…)
    with torch.no_grad():
        model(input_tensor)
Demo
Goal: classify hand-drawn digits
• Instrument Keras training code with MLflow tracking APIs
• Run training code as an MLflow Project
• Deploy an MLflow Model for real-time serving
Get started with MLflow
• pip install mlflow to get started
• Find docs & examples at mlflow.org
• Join the community Slack: tinyurl.com/mlflow-slack
0.9.0 Release
MLflow 0.9.0 was released this week! Major features:
• Tracking server supports SQL backends via SQLAlchemy
• Pluggable tracking server backends
• Docker environments for Projects
• Custom Python models
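For example, the "custom Python models" feature lets arbitrary Python logic be packaged as a pyfunc model; a minimal sketch in which the class name and save path are illustrative:

    import mlflow.pyfunc

    class AddN(mlflow.pyfunc.PythonModel):
        """A trivial custom model that adds `n` to every value of its input."""

        def __init__(self, n):
            self.n = n

        def predict(self, context, model_input):
            # model_input is typically a pandas DataFrame
            return model_input.apply(lambda column: column + self.n)

    mlflow.pyfunc.save_model(path="add_n_model", python_model=AddN(n=5))
    loaded = mlflow.pyfunc.load_model("add_n_model")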
Ongoing MLflow Roadmap
• UI scalability improvements (1.0)
• X-coordinate logging for metrics & batched logging (1.0)
• Fluent API for Java and Scala (1.0)
• Packaging projects with build steps (1.0+)
• Better environment isolation when loading models (1.0)
• Improved model schemas (1.0)