-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Removed Read methods taking string parameters from TextLoader (#1797) #2323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2323 +/- ##
=========================================
+ Coverage 71.15% 71.25% +0.1%
=========================================
Files 779 785 +6
Lines 140311 140785 +474
Branches 16047 16088 +41
=========================================
+ Hits 99838 100319 +481
+ Misses 36021 36012 -9
- Partials 4452 4454 +2
|
@@ -11,6 +11,7 @@ | |||
<ProjectReference Include="..\..\src\Microsoft.ML.LightGBM\Microsoft.ML.LightGBM.csproj" /> | |||
<ProjectReference Include="..\..\src\Microsoft.ML.PCA\Microsoft.ML.PCA.csproj" /> | |||
<ProjectReference Include="..\..\src\Microsoft.ML.StandardLearners\Microsoft.ML.StandardLearners.csproj" /> | |||
<ProjectReference Include="..\..\src\Microsoft.ML.StaticPipe\Microsoft.ML.StaticPipe.csproj" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StaticPipe [](start = 78, length = 10)
This is not a good sign.
In current moment we prefer to separate our StaticPipe project from rest of the code.
I would suggest to split LocalPathReader into two classes. One should stay next to TextLoader.cs and have just Read method without <TShape>
.
Second should remain in StaticPipe project, and have Read method with <TShape>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear on your meaning, when you say with/without "." you're talking about calling something like reader.Read(...)
vs just Read(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for confusion, I forgot what web interface doesn't gracefully handles tags, and you need to escape them all the time.
I was talking about public static DataView<TShape> Read<TShape>
that thing should remain in StaticPipe project.
If I correctly understand you code and intention you removing public IDataView Read(string path)
from TextLoader but since LocalPathReader
contains extension method on IDataReader<IMultiStreamSource> reader
all our code continue to work since we fall back to that extension.
This is great, it just that extension shouldn't be part of StaticPipe project. I would prefer it to remain in Microsoft.ML.Data project.
In same time second extension method which works on top of DataReader<IMultiStreamSource, TShape>
should remain to be inside StaticPipe project.
Does my explanation make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, the earlier message makes a lot more sense now you've added in the tags! I've updated the pull request, does it look better now?
@@ -1312,10 +1312,6 @@ public void Save(ModelSaveContext ctx) | |||
|
|||
public IDataView Read(IMultiStreamSource source) => new BoundLoader(this, source); | |||
|
|||
public IDataView Read(string path) => Read(new MultiFileSource(path)); | |||
|
|||
public IDataView Read(params string[] path) => Read(new MultiFileSource(path)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we prefer to have two methods because params is a bit slow one, and for performance we prefer to have one method for single file case (which is like 99%) and one for case with multiple file paths. #Closed
@PaulTFreedman thank you for your contribution! I left few comments, can you take care of them? |
please add our standard header to the file.
Refers to: src/Microsoft.ML.Data/DataLoadSave/DataReaderExtensions.cs:1 in 0a4cb1f. [](commit_id = 0a4cb1f, deletion_comment = False) |
using Microsoft.ML.Data; | ||
using Microsoft.ML.StaticPipe; | ||
|
||
namespace Microsoft.ML |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ML [](start = 20, length = 2)
I would leave it in ML.StaticPipe namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Few small things left.
…ataReader extensions (dotnet#1797)
I'm not sure why it fails the code coverage part, it happens locally too but the output doesn't give me any useful error message. I can try to debug when I have time but if someone else has any ideas that would be very helpful. |
@PaulTFreedman it looks fine now. At least I can see updated post from coverage service, and all builds in passed state. I just need to find second person to review this change. I'll make sure someone will take a look during Monday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! |
Unsure about the naming of
LocalPathReader
but I kept it the same due to the requirement that it be namespaced toMicrosoft.ML
-DataViewExtensions
already exists and is in theMicrosoft.ML.Data
namespace.Fixes #1797.