MLContext to create them all #1098

Zruty0 · 2018-09-28T23:14:09Z

During one of the in-person API reviews we agreed that it would be a good idea to have a single object MLContext that would serve as a 'factory of everything' (similar to the HTTP context / DB context in the .NET world).

MLContext will explicitly implement IHostEnvironment, so you can create all the existing estimators by giving the context as the first argument.
MLContext will have properties BinaryClassification, Regression, Clustering etc. for canonical ML tasks (the ones that are currently classes in themselves), complete with Evaluate and all corresponding trainers.
It will have extension methods for non-canonical tasks like recommendation or anomaly detection etc.
It will have properties Transformation, Filtering, Loading to instantiate all known transform estimators, filters and data readers (again via extension methods).
It will have a pair of methods SaveModel and LoadModel that handle model serialization.
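For illustration, here is a minimal sketch of the shape described in the list above. It is not the actual ML.NET code: `IHostEnvironment` is redeclared as a one-member stand-in, and `Report`, the `*Context` classes, `TaskName`, `RecommendationExtensions`, and `ExistingEstimator` are all hypothetical names chosen for this example.

```csharp
using System;

// Hypothetical stand-in for ML.NET's IHostEnvironment; the real interface differs.
public interface IHostEnvironment
{
    void Report(string message);
}

// Hypothetical per-task objects. In the proposal each would carry the
// trainers and Evaluate for its task; here they only expose a name.
public sealed class BinaryClassificationContext
{
    public BinaryClassificationContext(IHostEnvironment env)
    {
        if (env == null) throw new ArgumentNullException(nameof(env));
    }
    public string TaskName => "BinaryClassification";
}

public sealed class RegressionContext
{
    public RegressionContext(IHostEnvironment env)
    {
        if (env == null) throw new ArgumentNullException(nameof(env));
    }
    public string TaskName => "Regression";
}

public sealed class RecommendationContext
{
    public RecommendationContext(IHostEnvironment env)
    {
        if (env == null) throw new ArgumentNullException(nameof(env));
    }
    public string TaskName => "Recommendation";
}

public sealed class MLContext : IHostEnvironment
{
    // Canonical tasks are plain properties on the context.
    public BinaryClassificationContext BinaryClassification { get; }
    public RegressionContext Regression { get; }

    public MLContext()
    {
        BinaryClassification = new BinaryClassificationContext(this);
        Regression = new RegressionContext(this);
    }

    // Explicit implementation: callable only through an IHostEnvironment reference.
    void IHostEnvironment.Report(string message) => Console.WriteLine(message);
}

// Non-canonical tasks are attached through extension methods.
public static class RecommendationExtensions
{
    public static RecommendationContext Recommendation(this MLContext ctx)
        => new RecommendationContext(ctx);
}

// An existing-style estimator that takes the environment as its first argument.
public sealed class ExistingEstimator
{
    public string Column { get; }

    public ExistingEstimator(IHostEnvironment env, string column)
    {
        env.Report("Created estimator for column " + column);
        Column = column;
    }
}

public static class Program
{
    public static void Main()
    {
        var ml = new MLContext();
        // The context is accepted wherever an IHostEnvironment is expected.
        var estimator = new ExistingEstimator(ml, "Features");
        Console.WriteLine(ml.BinaryClassification.TaskName);
        Console.WriteLine(ml.Recommendation().TaskName);
        Console.WriteLine(estimator.Column);
    }
}
```

Under these assumptions, existing estimator constructors need no change: they keep their `IHostEnvironment` parameter and the context is simply passed in.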

/cc @KrzysztofCwalina @TomFinley @eerhardt @markusweimer @asthana86


Ivanidzo4ka · 2018-09-29T00:03:56Z

> MLContext will explicitly implement IHostEnvironment, so you can create all the existing estimators by giving the context as the first argument.

We don't like passing IHostEnvironment objects around because they are bloated, so let's create an even more bloated object.

Can we have something like this instead:

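    // MLContext wraps an environment instead of implementing IHostEnvironment itself.
    public sealed class MLContext
    {
        public readonly IHostEnvironment Env;

        public MLContext(IHostEnvironment env = null)
        {
            // Fall back to a console environment when the caller supplies none.
            Env = env ?? new ConsoleEnvironment();
        }
    }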

and just modify constructors in Estimators to accept MLContext?

TomFinley · 2018-09-29T20:10:39Z

> We don't like passing IHostEnvironment objects around because they are bloated, so let's create an even more bloated object.

It's not quite so bad as all that. One detail in @Zruty0's proposal, originally suggested by @eerhardt, is that this hypothetical object will implement the interface explicitly. Further, the numerous extensions to IHostEnvironment that are useful only to component authors (most significantly, those in Contracts.cs) would be in a namespace generally used only by component authors, or otherwise rendered invisible to users. Conversely, the user-facing properties, being on the class and not part of the interface, would not be visible to component authors.

So despite being, more or less, a single object, it has a "dual nature" that reflects the dual usage of ML.NET: on the one hand a tool for users to exploit ML, on the other a tool into which one can plug one's own components. All this without needing parallel "user context" and "component context" objects, which seems like an elegant solution.

Of course, the idea of these property objects raises the specter that we're creating a "factory of everything" object which concerns me somewhat, but I think the pattern works here.
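The "dual nature" described above can be sketched roughly as follows. This is an illustration under stated assumptions, not ML.NET's actual layout: the `Sketch.*` namespaces, the one-member `IHostEnvironment` stand-in, `Register`, `HostChecks.CheckNonEmptyName` (a hypothetical helper in the spirit of Contracts.cs), and `MyComponent` are all invented for the example.

```csharp
using System;

namespace Sketch.Core
{
    // Hypothetical stand-in for IHostEnvironment; the real interface differs.
    public interface IHostEnvironment
    {
        string Register(string name);
    }

    public sealed class RegressionContext
    {
        public string TaskName => "Regression";
    }

    public sealed class MLContext : IHostEnvironment
    {
        // User-facing: lives on the class, not on the interface.
        public RegressionContext Regression { get; } = new RegressionContext();

        // Component-facing: explicit implementation, reachable only via the interface.
        string IHostEnvironment.Register(string name) => "host:" + name;
    }
}

namespace Sketch.ComponentAuthoring
{
    using Sketch.Core;

    // Hypothetical helper, visible only to code that imports this namespace.
    public static class HostChecks
    {
        public static string CheckNonEmptyName(this IHostEnvironment env, string name)
        {
            if (string.IsNullOrEmpty(name))
                throw new ArgumentException("Name must be non-empty.", nameof(name));
            return name;
        }
    }
}

namespace Sketch.ComponentCode
{
    using Sketch.Core;
    using Sketch.ComponentAuthoring;

    public sealed class MyComponent
    {
        public string HostName { get; }

        public MyComponent(IHostEnvironment env, string name)
        {
            // env.Regression would not compile here: the interface has no such member.
            HostName = env.Register(env.CheckNonEmptyName(name));
        }
    }
}

namespace Sketch.UserCode
{
    using Sketch.Core;
    using Sketch.ComponentCode;

    public static class Program
    {
        public static void Main()
        {
            var ml = new MLContext();
            Console.WriteLine(ml.Regression.TaskName);

            // ml.Register("x") and ml.CheckNonEmptyName("x") would not compile here:
            // the first is an explicit interface member, and the second requires
            // `using Sketch.ComponentAuthoring;`.
            var component = new MyComponent(ml, "my-transform");
            Console.WriteLine(component.HostName);
        }
    }
}
```

The same object is passed to both sides; which members each side sees depends only on the static type it holds and the namespaces it imports.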

TomFinley · 2018-09-30T06:42:32Z

"MLContext to create them all,"
One ML.Context to find them.
One ML.Context to tool them all,
and in intellisense bind them.

asthana86 · 2018-09-30T18:37:23Z

We touched on this a little during our conversation. Would the logger be a property as well, in addition to BinaryClassification, Regression, etc.? In EF, the DbContext.Database.Log property can be set to a delegate for any method that takes a string, e.g.:

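    using (DbContext ctx = new DbContext(myconnectionstring))
    {
        // Pick one of the following; each assignment replaces the previous delegate.

        // Regular console app
        ctx.Database.Log = Console.Write;

        // ASP.NET environment
        ctx.Database.Log = message => Trace.WriteLine(message);

        // Write to a log file (AppendAllText opens, writes, and closes the file on each call)
        ctx.Database.Log = message => File.AppendAllText("C:\\mylog.txt", message + Environment.NewLine);
    }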

This then lets the user choose whether to write to the console, to Trace (in the case of ASP.NET), or to a file logger. It would also avoid any confusion about what LocalEnvironment and ConsoleEnvironment mean for our users.

Besides being consistent with EF and ASP.NET, which users already know, this would line up with the API names we now have with MLContext and make the whole thing a bit more .NETTY.

Just my thoughts.

Zruty0 · 2018-10-01T18:29:11Z

It's a good point, @asthana86. We have AddListener/RemoveListener for this, so it might be a matter of minor massaging of the API.
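As a rough sketch of what that massaging could look like, a settable `Log` delegate can be layered on top of an add/remove listener pair. Everything here is an assumption made for illustration: `SketchContext`, the string-only listener signature, and `Publish` are hypothetical and do not reflect the actual AddListener/RemoveListener signatures in ML.NET.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical context with a listener pair and an EF-style Log property on top.
public sealed class SketchContext
{
    private readonly List<Action<string>> _listeners = new List<Action<string>>();
    private Action<string> _log;

    public void AddListener(Action<string> listener) => _listeners.Add(listener);
    public void RemoveListener(Action<string> listener) => _listeners.Remove(listener);

    // Setting Log swaps the single "user" listener; null turns logging off.
    public Action<string> Log
    {
        get => _log;
        set
        {
            if (_log != null) RemoveListener(_log);
            _log = value;
            if (_log != null) AddListener(_log);
        }
    }

    // Stand-in for whatever the components would call to emit a message.
    public void Publish(string message)
    {
        // Copy first so a listener may add or remove listeners while being invoked.
        foreach (var listener in _listeners.ToArray())
            listener(message);
    }
}

public static class Program
{
    public static void Main()
    {
        var ctx = new SketchContext();
        var captured = new List<string>();

        ctx.Log = captured.Add;            // collect messages in memory
        ctx.Publish("training started");

        ctx.Log = Console.WriteLine;       // switch to the console
        ctx.Publish("training finished");

        Console.WriteLine(captured.Count); // only the first message was captured
    }
}
```

The property keeps the existing listener mechanism intact, so component code that already registers listeners would be unaffected.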

shauheen added this to the 1018 milestone Oct 5, 2018

shauheen added the enhancement New feature or request label Oct 5, 2018

shauheen assigned Zruty0 Oct 5, 2018

shauheen added the API Issues pertaining the friendly API label Oct 5, 2018

shauheen mentioned this issue Oct 5, 2018

Creating both environment and context is annoying #1045

Closed

Zruty0 mentioned this issue Oct 13, 2018

ML Context to create them all #1252

Merged

Zruty0 closed this as completed in #1252 Oct 18, 2018

TomFinley mentioned this issue Nov 7, 2018

Saving a DataView to a file should be simpler and not through to the LocalEnvironment class #1553

Closed

TomFinley mentioned this issue Nov 30, 2018

Rename types inside MLContext as Catalogs #1796

Closed

TomFinley mentioned this issue Jan 10, 2019

Refactoring of Constructors #2100

Open

abgoswam mentioned this issue Jan 31, 2019

Creation of components through MLContext and cleanup (KeyToValue, ValueToKey, OneHotEncoding) #2340

Merged

TomFinley mentioned this issue Feb 13, 2019

Registering subhosts when creating new catalog entries advances the pseudo random number generator #2523

Closed

This was referenced Feb 22, 2019

Internalization of TensorFlowUtils.cs and refactored TensorFlowCatalog. #2672

Merged

GetColumn method is extension for IDataView but not mlContext. #2473

Closed

singlis mentioned this issue Mar 22, 2019

Getting started with ML .NET with in-memory data is *painful*. #3037

Closed

TomFinley mentioned this issue Mar 26, 2019

Added an extension method for saving statically typed model (#1286) #2924

Closed

ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
