Skip to content

Null Reference Exception when Concatenating with a single value #3061

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
singlis opened this issue Mar 22, 2019 · 2 comments · Fixed by #3809
Closed

Null Reference Exception when Concatenating with a single value #3061

singlis opened this issue Mar 22, 2019 · 2 comments · Fixed by #3809
Assignees
Labels
bug Something isn't working P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.

Comments

@singlis
Copy link
Member

singlis commented Mar 22, 2019

Issue

Discovered from #3037, a user can call Concatenate and specify a single string. When this happens, a NullReference exception is thrown. Here is the code sample:

        EstimatorChain()
            .Append(mlContext.Transforms.Conversion.ConvertType("Features", "Price", DataKind.Double))
            .Append(mlContext.Transforms.Conversion.ConvertType("Label", "Area", DataKind.Double))
            .Append(mlContext.Transforms.Concatenate("Features"))  // This causes the error, should be ("Features", "Features")
            .AppendCacheCheckpoint(mlContext)
            .Append(mlContext.Regression.Trainers.Sdca("Label", "Features"))
            , mlContext

Here is the callstack:

>	Microsoft.ML.Core.dll!Microsoft.ML.SchemaShape.Column.GetTypeString() Line 111	C#
 	Microsoft.ML.Data.dll!Microsoft.ML.Trainers.TrainerEstimatorBase<Microsoft.ML.Data.RegressionPredictionTransformer<Microsoft.ML.Trainers.LinearRegressionModelParameters>, Microsoft.ML.Trainers.LinearRegressionModelParameters>.CheckInputSchema(Microsoft.ML.SchemaShape inputSchema) Line 111	C#
 	Microsoft.ML.Data.dll!Microsoft.ML.Trainers.TrainerEstimatorBase<Microsoft.ML.Data.RegressionPredictionTransformer<Microsoft.ML.Trainers.LinearRegressionModelParameters>, Microsoft.ML.Trainers.LinearRegressionModelParameters>.GetOutputSchema(Microsoft.ML.SchemaShape inputSchema) Line 83	C#
 	Microsoft.ML.Data.dll!Microsoft.ML.Data.EstimatorChain<Microsoft.ML.Data.RegressionPredictionTransformer<Microsoft.ML.Trainers.LinearRegressionModelParameters>>.GetOutputSchema(Microsoft.ML.SchemaShape inputSchema) Line 83	C#
 	Microsoft.ML.Data.dll!Microsoft.ML.Data.EstimatorChain<Microsoft.ML.Data.RegressionPredictionTransformer<Microsoft.ML.Trainers.LinearRegressionModelParameters>>.Fit(Microsoft.ML.IDataView input) Line 60	C#
 	ConsoleApp32.dll!Program.main(string[] argv) Line 33	F#

The problem is that a NullReference exception looks like a bug and its not obvious to the user on what is the cause of the problem.

Expected

We should instead notify the user that:

  1. A bad argument was passed in
  2. That its the Concatenate transform that has the bad argument

Solution A

We simply check the length of the name array that is passed to Concatenate and throw the correct exception.

Solution B

Another possible solution is to change the behavior so that when one column is specified for Concatenate, the name is treated as the source and destination -- so this:

            .Append(mlContext.Transforms.Concatenate("Features"))

would be the same as this:

            .Append(mlContext.Transforms.Concatenate("Features", "Features"))

cc @glebuk for additional feedback

@singlis singlis added the bug Something isn't working label Mar 22, 2019
@TomFinley
Copy link
Contributor

Great, thanks for finding this @singlis. Solution A is correct and solution B is incorrect.

@abgoswam abgoswam self-assigned this Mar 26, 2019
@DevLob-zz
Copy link

i think also this is relevant

var textFeaturesProcessPipeline = mlContext.Transforms.Text.FeaturizeText(DefaultColumnNames.Features, algorithmData.TextFeatures, textFeaturizingOption);

when algorithmData.TextFeatures is empty list it through null exception with me

it should not throw this exception and internally check if it null or zero length

@ganik ganik added the P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away. label May 23, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants