Skip to content

Removed Read methods taking string parameters from TextLoader (#1797) #2323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 4, 2019

Conversation

PaulTFreedman
Copy link
Contributor

@PaulTFreedman PaulTFreedman commented Jan 30, 2019

Unsure about the naming of LocalPathReader but I kept it the same due to the requirement that it be namespaced to Microsoft.ML - DataViewExtensions already exists and is in the Microsoft.ML.Data namespace.

Fixes #1797.

@dnfclas
Copy link

dnfclas commented Jan 30, 2019

CLA assistant check
All CLA requirements met.

@codecov
Copy link

codecov bot commented Jan 30, 2019

Codecov Report

Merging #2323 into master will increase coverage by 0.1%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##           master    #2323     +/-   ##
=========================================
+ Coverage   71.15%   71.25%   +0.1%     
=========================================
  Files         779      785      +6     
  Lines      140311   140785    +474     
  Branches    16047    16088     +41     
=========================================
+ Hits        99838   100319    +481     
+ Misses      36021    36012      -9     
- Partials     4452     4454      +2
Flag Coverage Δ
#Debug 71.25% <100%> (+0.1%) ⬆️
#production 67.61% <100%> (+0.03%) ⬆️
#test 85.3% <ø> (+0.21%) ⬆️

@@ -11,6 +11,7 @@
<ProjectReference Include="..\..\src\Microsoft.ML.LightGBM\Microsoft.ML.LightGBM.csproj" />
<ProjectReference Include="..\..\src\Microsoft.ML.PCA\Microsoft.ML.PCA.csproj" />
<ProjectReference Include="..\..\src\Microsoft.ML.StandardLearners\Microsoft.ML.StandardLearners.csproj" />
<ProjectReference Include="..\..\src\Microsoft.ML.StaticPipe\Microsoft.ML.StaticPipe.csproj" />
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jan 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StaticPipe [](start = 78, length = 10)

This is not a good sign.
In current moment we prefer to separate our StaticPipe project from rest of the code.

I would suggest to split LocalPathReader into two classes. One should stay next to TextLoader.cs and have just Read method without <TShape>.
Second should remain in StaticPipe project, and have Read method with <TShape>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear on your meaning, when you say with/without "." you're talking about calling something like reader.Read(...) vs just Read(...)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for confusion, I forgot what web interface doesn't gracefully handles tags, and you need to escape them all the time.
I was talking about public static DataView<TShape> Read<TShape> that thing should remain in StaticPipe project.

If I correctly understand you code and intention you removing public IDataView Read(string path) from TextLoader but since LocalPathReader contains extension method on IDataReader<IMultiStreamSource> reader all our code continue to work since we fall back to that extension.

This is great, it just that extension shouldn't be part of StaticPipe project. I would prefer it to remain in Microsoft.ML.Data project.
In same time second extension method which works on top of DataReader<IMultiStreamSource, TShape> should remain to be inside StaticPipe project.

Does my explanation make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, the earlier message makes a lot more sense now you've added in the tags! I've updated the pull request, does it look better now?

@@ -1312,10 +1312,6 @@ public void Save(ModelSaveContext ctx)

public IDataView Read(IMultiStreamSource source) => new BoundLoader(this, source);

public IDataView Read(string path) => Read(new MultiFileSource(path));

public IDataView Read(params string[] path) => Read(new MultiFileSource(path));
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jan 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we prefer to have two methods because params is a bit slow one, and for performance we prefer to have one method for single file case (which is like 99%) and one for case with multiple file paths. #Closed

@Ivanidzo4ka
Copy link
Contributor

@PaulTFreedman thank you for your contribution! I left few comments, can you take care of them?

@Ivanidzo4ka
Copy link
Contributor

using Microsoft.Data.DataView;

please add our standard header to the file.

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

Refers to: src/Microsoft.ML.Data/DataLoadSave/DataReaderExtensions.cs:1 in 0a4cb1f. [](commit_id = 0a4cb1f, deletion_comment = False)

using Microsoft.ML.Data;
using Microsoft.ML.StaticPipe;

namespace Microsoft.ML
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ML [](start = 20, length = 2)

I would leave it in ML.StaticPipe namespace.

@Ivanidzo4ka
Copy link
Contributor

using Microsoft.ML.Data;

please add header (see same comments for details)


Refers to: src/Microsoft.ML.StaticPipe/LocalPathReader.cs:1 in 0a4cb1f. [](commit_id = 0a4cb1f, deletion_comment = False)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Few small things left.

@PaulTFreedman
Copy link
Contributor Author

I'm not sure why it fails the code coverage part, it happens locally too but the output doesn't give me any useful error message. I can try to debug when I have time but if someone else has any ideas that would be very helpful.

@Ivanidzo4ka
Copy link
Contributor

@PaulTFreedman it looks fine now. At least I can see updated post from coverage service, and all builds in passed state. I just need to find second person to review this change. I'll make sure someone will take a look during Monday.

Copy link
Contributor

@artidoro artidoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@artidoro
Copy link
Contributor

artidoro commented Feb 4, 2019

Thank you!

@artidoro artidoro merged commit c544d51 into dotnet:master Feb 4, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 25, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants