-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Value_Type.Null
, fix edge cases for read_many
#11737
Conversation
c57057b
to
22a2371
Compare
Changes affect Snowflake so I'm running the Extra suite:
I expect it to fail because it is failing on develop. Yet I want to see if there are no more failures than what is expected. |
@@ -628,6 +626,9 @@ make_custom_cast column target_value_type type_mapping = | |||
result = Ref.new Nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this logic hard to follow. The return value would be either Nothing
or the return value of result.put expr
-- but isn't the the return value of Ref.put
the old value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value of put
is discarded.
The function returns ref.get
at the very end.
I agree it is not ideal. This is a hack to get multiline if without adding a ton of indentation to the else
branches. There's just too many branches here (and while sometimes too many branches is bad, here it is just intrinsic to this function) to make it readable.
Once we get multiline if
(#6408) I will be more than happy to replace this with just:
if cond1 then
...
else if cond2 then
...
else
Nothing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think I should rewrite it with increasing indentation?
I'd rather keep it as is as IMO it is readable once the reasons are explained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, it makes more sense now.
distribution/lib/Standard/Table/0.0.0-dev/src/Internal/Value_Type_Helpers.enso
Show resolved
Hide resolved
@@ -628,6 +626,9 @@ make_custom_cast column target_value_type type_mapping = | |||
result = Ref.new Nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, it makes more sense now.
if bucket_1.is_nothing || bucket_2.is_nothing then False else | ||
bucket_1 == bucket_2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if bucket_1.is_nothing || bucket_2.is_nothing then False else | |
bucket_1 == bucket_2 | |
bucket_1 == bucket_2 |
Since we know either_type_is_not_ordered
is false here.
@@ -29,6 +31,10 @@ public ColumnStorage apply( | |||
return boolStorage.makeNegated(); | |||
} | |||
|
|||
if (storage.getType() instanceof NullType) { | |||
return new NullStorage(Math.toIntExact(storage.getSize())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason getSize() can return long
but NullStorage
can only accept int
? Is an exception here possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getSize
is the Enso-facing one so it returns long
because that is the default storage for integers in Enso.
All of our storages take int
size because most of them are backed by arrays (Null is inded an exception) so they cannot hold more than int
can index.
Thanks for pointing this out, making me rethink this - long getSize()
is just wrapper for int size()
that 'translates' to the more directly Enso-friendly long
value, but I should have just used size()
here and avoided the conversion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm after further checking - this function gets ColumnStorage
which only has the long
returning one in its interface...
I guess I'll just keep the toIntExact
conversion. It shouldn't happen as long as our storages are int
based. Once they aren't - the other code will need to be updated anyway. Does that seem ok?
# Conflicts: # std-bits/table/src/main/java/org/enso/table/aggregations/Sum.java # std-bits/table/src/main/java/org/enso/table/data/column/builder/Builder.java # std-bits/table/src/main/java/org/enso/table/data/column/builder/TypedBuilderImpl.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToBigDecimalConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToBigIntegerConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToBooleanStorageConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToDateStorageConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToDateTimeStorageConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToIntegerStorageConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToTextStorageConverter.java # std-bits/table/src/main/java/org/enso/table/data/column/operation/cast/ToTimeOfDayStorageConverter.java # std-bits/table/src/main/java/org/enso/table/data/table/Column.java
# Conflicts: # std-bits/table/src/main/java/org/enso/table/data/column/builder/Builder.java # std-bits/table/src/main/java/org/enso/table/data/column/builder/ObjectBuilder.java # std-bits/table/src/main/java/org/enso/table/data/table/Column.java
public static Column fromRepeatedItem( | ||
String name, Value item, int repeat, ProblemAggregator problemAggregator) { | ||
if (repeat < 0) { | ||
throw new IllegalArgumentException("Repeat count must be non-negative."); | ||
} | ||
|
||
Object converted = Polyglot_Utils.convertPolyglotValue(item); | ||
|
||
if (converted == null) { | ||
var builder = Builder.getForType(AnyObjectType.INSTANCE, repeat, problemAggregator); | ||
builder.appendNulls(repeat); | ||
return new Column(name, builder.seal()); | ||
} | ||
|
||
StorageType storageType = StorageType.forBoxedItem(converted); | ||
Builder builder = Builder.getForType(storageType, repeat, problemAggregator); | ||
Context context = Context.getCurrent(); | ||
for (int i = 0; i < repeat; i++) { | ||
builder.append(converted); | ||
context.safepoint(); | ||
} | ||
|
||
return new Column(name, builder.seal()); | ||
return new Column(name, Storage.fromRepeatedItem(item, repeat, problemAggregator)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdunkerley I've allowed myself to overwrite some of your cleanups from the other PR in this particular method with a fully custom Storage::fromRepeatedItem
.
It also relies on Builder.getForType
for the generic case but also has 'specializations' that create a more compact storage directly (NullStorage
and LongConstantStorage
that are constructed in O(1) instead of O(N)).
I think we do want this 'optimization'. In the future we can even add more ConstantStorage
kinds.
The major question is if Storage
class is the right place for this method. I imagine once we start adding more backends it may not necessarily be. But for now I think it is relatively good place. So I'd keep it here and move it when it makes sense.
Does that seem OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fine to do. Will rebase and cope :)
New Extra tests run after the new changes: |
public static Column fromRepeatedItem( | ||
String name, Value item, int repeat, ProblemAggregator problemAggregator) { | ||
if (repeat < 0) { | ||
throw new IllegalArgumentException("Repeat count must be non-negative."); | ||
} | ||
|
||
Object converted = Polyglot_Utils.convertPolyglotValue(item); | ||
|
||
if (converted == null) { | ||
var builder = Builder.getForType(AnyObjectType.INSTANCE, repeat, problemAggregator); | ||
builder.appendNulls(repeat); | ||
return new Column(name, builder.seal()); | ||
} | ||
|
||
StorageType storageType = StorageType.forBoxedItem(converted); | ||
Builder builder = Builder.getForType(storageType, repeat, problemAggregator); | ||
Context context = Context.getCurrent(); | ||
for (int i = 0; i < repeat; i++) { | ||
builder.append(converted); | ||
context.safepoint(); | ||
} | ||
|
||
return new Column(name, builder.seal()); | ||
return new Column(name, Storage.fromRepeatedItem(item, repeat, problemAggregator)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fine to do. Will rebase and cope :)
Pull Request Description
Value_Type.Null
. #6281Important Notes
Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
or the Snowflake database integration, a run of the Extra Tests has been scheduled.