
Commit 03ad805

Merge pull request #45 from JuliaAI/default-logger

Update the readme examples to include tuning and setting global logger

2 parents 76352d5 + 4f92cc2

2 files changed: +93 −14 lines


.github/workflows/CI.yml (+5 −2)

````diff
@@ -60,7 +60,10 @@ jobs:
         JULIA_NUM_THREADS: '2'
         MLFLOW_TRACKING_URI: "http://localhost:5000/api"
       - uses: julia-actions/julia-processcoverage@v1
-      - uses: codecov/codecov-action@v3
+      - uses: codecov/codecov-action@v4
         with:
-          files: lcov.info
+          token: ${{ secrets.CODECOV_TOKEN }}
+          fail_ci_if_error: false
+          verbose: true
+
 
````
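For context, after this change the coverage-reporting steps of `CI.yml` read roughly as follows. This is a reconstruction from the hunk above; the exact indentation and surrounding keys are assumptions:

```yaml
# Reconstructed coverage steps (indentation assumed):
- uses: julia-actions/julia-processcoverage@v1
- uses: codecov/codecov-action@v4
  with:
    # v4 of the action generally requires an upload token,
    # unlike v3, which accepted a `files:` input without one.
    token: ${{ secrets.CODECOV_TOKEN }}
    fail_ci_if_error: false
    verbose: true
```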

README.md (+88 −12)
````diff
@@ -6,8 +6,8 @@
 
 [ci-dev]: https://github.com/pebeto/MLJFlow.jl/actions/workflows/CI.yml
 [ci-dev-img]: https://github.com/pebeto/MLJFlow.jl/actions/workflows/CI.yml/badge.svg?branch=dev "Continuous Integration (CPU)"
-[codecov-dev]: https://codecov.io/github/JuliaAI/MLJFlow.jl?branch=dev
-[codecov-dev-img]: https://codecov.io/gh/JuliaAI/MLJFlow.jl/branch/dev/graphs/badge.svg?branch=dev "Code Coverage"
+[codecov-dev]: https://codecov.io/github/JuliaAI/MLJFlow.jl
+[codecov-dev-img]: https://codecov.io/github/JuliaAI/MLJFlow.jl/graph/badge.svg?token=TBCMJOK1WR "Code Coverage"
 
 [MLJ](https://github.com/alan-turing-institute/MLJ.jl) is a Julia framework for
 combining and tuning machine learning models. MLJFlow is a package that extends
````
````diff
@@ -22,7 +22,7 @@ metrics, log parameters, log artifacts, etc.).
 This project is part of the GSoC 2023 program. The proposal description can be
 found [here](https://summerofcode.withgoogle.com/programs/2023/projects/iRxuzeGJ).
 The entire workload is divided into three different repositories:
-[MLJ.jl](https://github.com/alan-turing-institute/MLJ.jl),
+[MLJ.jl](https://github.com/alan-turing-institute/MLJ.jl),
 [MLFlowClient.jl](https://github.com/JuliaAI/MLFlowClient.jl) and this one.
 
 ## Features
````
````diff
@@ -33,14 +33,14 @@ The entire workload is divided into three different repositories:
 - [x] Provides a wrapper `Logger` for MLFlowClient.jl clients and associated
 metadata; instances of this type are valid "loggers", which can be passed to MLJ
 functions supporting the `logger` keyword argument.
-
+
 - [x] Provides MLflow integration with MLJ's `evaluate!`/`evaluate` method (model
 **performance evaluation**)
 
 - [x] Extends MLJ's `MLJ.save` method, to save trained machines as retrievable MLflow
 client artifacts
 
-- [ ] Provides MLflow integration with MLJ's `TunedModel` wrapper (to log **hyper-parameter
+- [x] Provides MLflow integration with MLJ's `TunedModel` wrapper (to log **hyper-parameter
 tuning** workflows)
 
 - [ ] Provides MLflow integration with MLJ's `IteratedModel` wrapper (to log **controlled
````
````diff
@@ -60,8 +60,8 @@ shell/console, run `mlflow server` to launch an mlflow service on a local server
 Refer to the [MLflow documentation](https://www.mlflow.org/docs/latest/index.html) for
 necessary background.
 
-We assume MLJDecisionTreeClassifier is in the user's active Julia package
-environment.
+**Important.** For the examples that follow, we assume `MLJ`, `MLJDecisionTreeClassifier`
+and `MLFlowClient` are in the user's active Julia package environment.
 
 ```julia
 using MLJ # Requires MLJ.jl version 0.19.3 or higher
````
````diff
@@ -73,7 +73,7 @@ instance. The experiment name and artifact location are optional.
 ```julia
 logger = MLJFlow.Logger(
     "http://127.0.0.1:5000/api";
-    experiment_name="MLJFlow test",
+    experiment_name="test",
     artifact_location="./mlj-test"
 )
 ```
````
````diff
@@ -89,25 +89,54 @@ model = DecisionTreeClassifier(max_depth=4)
 Now we call `evaluate` as usual but provide the `logger` as a keyword argument:
 
 ```julia
-evaluate(model, X, y, resampling=CV(nfolds=5), measures=[LogLoss(), Accuracy()], logger=logger)
+evaluate(
+    model,
+    X,
+    y,
+    resampling=CV(nfolds=5),
+    measures=[LogLoss(), Accuracy()],
+    logger=logger,
+)
 ```
 
 Navigate to "http://127.0.0.1:5000" on your browser and select the "Experiment" matching
 the name above ("MLJFlow test"). Select the single run displayed to see the logged results
 of the performance evaluation.
 
 
+### Logging outcomes of model tuning
+
+Continuing with the previous example:
+
+```julia
+r = range(model, :max_depth, lower=1, upper=5)
+tmodel = TunedModel(
+    model,
+    tuning=Grid(),
+    range=r;
+    resampling=CV(nfolds=9),
+    measures=[LogLoss(), Accuracy()],
+    logger=logger,
+)
+
+mach = machine(tmodel, X, y) |> fit!
+```
+
+Return to the browser page (refreshing if necessary) and you will find five more
+performance evaluations logged, one for each value of `max_depth` evaluated in tuning.
+
+
 ### Saving and retrieving trained machines as MLflow artifacts
 
 Let's train the model on all data and save the trained machine as an MLflow artifact:
 
 ```julia
 mach = machine(model, X, y) |> fit!
-run = MLJBase.save(logger, mach)
+run = MLJ.save(logger, mach)
 ```
 
-Notice that in this case `MLJBase.save` returns a run (and instance of `MLFlowRun` from
-MLFlowClient.jl).
+Notice that in this case `MLJ.save` returns a run (an instance of `MLFlowRun` from
+MLFlowClient.jl).
 
 To retrieve an artifact we need to use the MLFlowClient.jl API, and for that we need to
 know the MLflow service that our `logger` wraps:
````
````diff
@@ -129,3 +158,50 @@ We can predict using the deserialized machine:
 ```julia
 predict(mach2, X)
 ```
+
+### Setting a global logger
+
+Set `logger` as the global logging target by running `default_logger(logger)`. Then,
+unless explicitly overridden, all loggable workflows will log to `logger`. In particular,
+to *suppress* logging, you will need to specify `logger=nothing` in your calls.
+
+So, for example, if we run the following setup:
+
+```julia
+using MLJ
+
+# using a new experiment name here:
+logger = MLJFlow.Logger(
+    "http://127.0.0.1:5000/api";
+    experiment_name="test global logging",
+    artifact_location="./mlj-test"
+)
+
+default_logger(logger)
+
+X, y = make_moons(100) # a table and a vector with 100 rows
+DecisionTreeClassifier = @load DecisionTreeClassifier pkg=DecisionTree
+model = DecisionTreeClassifier()
+```
+
+Then the following is automatically logged:
+
+```julia
+evaluate(model, X, y)
+```
+
+But the following is *not* logged:
+
+```julia
+evaluate(model, X, y; logger=nothing)
+```
+
+To save a machine when a default logger is set, one can use the following syntax:
+
+```julia
+mach = machine(model, X, y) |> fit!
+MLJ.save(mach)
+```
+
+Retrieve the saved machine as described earlier.
````
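The README excerpt repeatedly points to MLFlowClient.jl for retrieving saved machines, but the retrieval code itself falls outside the hunks shown. The following is a minimal sketch of the round trip, assuming the `logger`, `model`, `X`, `y` from the examples above and a running local MLflow server; the `MLFlow` constructor, `listartifacts` call, and `filepath` field are assumptions based on MLFlowClient.jl's API, not part of this commit:

```julia
using MLJ, MLFlowClient

# Train and log the machine; `run` is an MLFlowRun (see the hunk above).
mach = machine(model, X, y) |> fit!
run = MLJ.save(logger, mach)

# Retrieval goes through MLFlowClient.jl (names below are assumptions):
service = MLFlowClient.MLFlow("http://127.0.0.1:5000/api")
artifacts = MLFlowClient.listartifacts(service, run)  # files logged to the run
mach2 = machine(artifacts[1].filepath)                # deserialize the machine
predict(mach2, X)                                     # use it as usual
```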
