Authors
Kevin T. Chu <[email protected]>
MLflow Tracking is the only component of MLflow that is needed for general research projects. The other components of MLflow may be useful for projects involving data science and machine learning models.
MLflow Tracking facilitates support for recording experiment configuration parameters and results. Below is a short set of instructions for setting up MLflow experiment tracking within a Jupyter notebook.
-
Near the beginning of the Jupyter notebook, include a cell to set up MLflow Tracking.
# --- Set up MLflow Tracking # Set experiment mlflow.set_experiment(experiment_name) # Ensure that previous run (possibly failed) has been terminated by MLflow. if mlflow.active_run(): mlflow.end_run() # Initialize dictionary for experiment results mlflow_results = {}
Note. For situations where it is useful to group experiments by date or time, the
utils
Python module provides theget_experiment_name()
function to faciliate consistent generation of date and time stamped experiment names. -
Before running the experiment, include a cell to record all of the experiment parameters.
# --- Record experiment parameters mlflow.log_param("some-parameter", some_parameter) mlflow.log_param("another-parameter", another_parameter)
Note. MLflow Tracking automatically includes a timestamp for each run of an experiment to facilitate comparison of different runs of an experiment using the same set of configuration parameters.
-
Throughout the Jupyter notebook, add results to
mlflow_results
and/or record individual results (saved as MLflow "metrics").# Add a result to `mlflow_results`. This result will be saved at the end of # the Jupyter notebook mlflow_results["some-result"] = some_result # Record an individual result (as an MLflow "metric") mlflow.log_metric("another-result") = another_result
-
After the experiment is completed, include a cell to record the results.
# --- Record experiment results mlflow.log_dict(mlflow_results, "results.json")
-
At the end of the Juypter notebook, include a cell to end the MLflow run.
# --- End current MLflow run mlflow.end_run()
MLflow Tracking provides support for reviewing and comparing experiments. It is particularly useful when comparing results across multiple runs of the same experiment with different parameter settings. For basic research projects, the following short set of steps should be sufficient to viewing MLflow Tracking results.
-
Change to the directory containing the
mlruns
directory (usually the project root directory).$ cd PROJECT_DIR
-
Start a local MLflow Tracking UI server.
$ mlflow ui
Note. The MLflow tracking server can also be used to serve experiment tracking data.
$ mlflow server
-
View results in a web browser by opening the URL provided displayed to the console. By default, the Mlflow tracking server listens at
localhost:5000
.