[AVRO] #589: Fix schema not including base class for records with subclasses #593
…ct or interface) into union of types.
Sounds good. Was about to suggest that due to |
I have problem to choose most obvious handling of duplicates in unionSchemas (inside
|
avro/src/main/java/com/fasterxml/jackson/dataformat/avro/schema/RecordVisitor.java
Would it make sense to first use a I may be missing/misunderstanding something, so apologies if above makes no sense. |
…tly in @JsonSubTypes or @union annotations.
I am done. Please have a look at it.
* _typeSchema points to Fruit.class without subtypes record schema
*
* FIXME: When _typeSchema is not null, then _overridden must be true, therefore (_overridden == true) can be replaced with (_typeSchema != null),
* but it might be considered API change cause _overridden has protected access modifier.
We can do that as a follow-up for 2.20 (2.x branch)
@MichalFoksa Thank you! Had a quick look and things look better. Will try to properly review later tonight, get it merged hopefully.
// using hashCode() for equality check).
// Set ensures that each subType schema is once in resulting union.
// IdentityHashMap is used because it is using reference-equality.
final Set<Schema> unionSchemas = Collections.newSetFromMap(new IdentityHashMap<>());
But if we have an identity-based Set, where the only match is pure identity, why not just use ArrayList instead of Set to add/addAll to? Or do we get the literal same Schema instances somehow?
Making that change will not fail any of the added tests at least.
Or do we get literal same Schema instances somehow?
Yes. It happens when a class is referenced multiple times in the hierarchy (maybe by mistake).
In the example below Helium is subtype of Element and also subtype of Gas.
Where Gas schema is union of
- Gas (itself),
- Helium
- Oxygen.
Element schema is union of all types in
- Gas schema (Gas, Helium and Oxygen)
- Helium
When you collect all types together without an identity check, you would get an invalid union of
- Gas
- Helium
- Oxygen
- Helium again
because Helium appears twice.
See use case 4 "Class referenced twice in @JsonSubTypes hierarchy" in the PR description or the PolymorphicTypeAnnotationsTest.class_is_referenced_twice_in_hierarchy_test test.
Change unionSchemas to ArrayList and the test will fail with
Failed to generate AvroSchema for ElementInterface, problem: Duplicate in union:Helium
@JsonSubTypes({
Type( Gas.class )
Type( Helium.class ) }) <---- First Helium occurrence
+---------------------+
| Element |
| ( interface ) |
+---------------------+
▲ ▲ @JsonSubTypes({
| | Type( Helium.class ) <---- Second Helium occurrence
| | Type( Oxygen.class ) })
| +---------------------+
| | Gas |
| +---------------------+
| ▲
| |
| +--------+--------+
| | |
+-----------------+ +-----------------+
| Helium | | Oxygen |
+-----------------+ +-----------------+
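To make the duplicate concrete, here is a minimal sketch of the difference between collecting union members into a list and into a reference-equality set. It uses only the JDK: NamedSchema is a hypothetical stand-in for an Avro Schema, and the helper names are made up for illustration, so this is not the code in RecordVisitor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class IdentityUnionSketch {
    // Hypothetical stand-in for an Avro Schema; only the name matters here.
    static final class NamedSchema {
        final String name;
        NamedSchema(String name) { this.name = name; }
    }

    // Builds the Element example: the Gas union (Gas, Helium, Oxygen)
    // plus Helium listed directly, using the very same Helium instance.
    static List<List<NamedSchema>> elementExample() {
        NamedSchema gas = new NamedSchema("Gas");
        NamedSchema helium = new NamedSchema("Helium");
        NamedSchema oxygen = new NamedSchema("Oxygen");
        List<NamedSchema> gasUnion = Arrays.asList(gas, helium, oxygen);
        List<NamedSchema> direct = Arrays.asList(helium);
        return Arrays.asList(gasUnion, direct);
    }

    // A plain list keeps every occurrence, so Helium ends up in it twice.
    static List<NamedSchema> collectIntoList(List<List<NamedSchema>> groups) {
        List<NamedSchema> out = new ArrayList<>();
        for (List<NamedSchema> group : groups) {
            out.addAll(group);
        }
        return out;
    }

    // A set backed by IdentityHashMap compares by reference (==),
    // so the same instance is stored only once.
    static Set<NamedSchema> collectIntoIdentitySet(List<List<NamedSchema>> groups) {
        Set<NamedSchema> out = Collections.newSetFromMap(new IdentityHashMap<>());
        for (List<NamedSchema> group : groups) {
            out.addAll(group);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<NamedSchema>> parts = elementExample();
        System.out.println("list size: " + collectIntoList(parts).size());
        System.out.println("identity set size: " + collectIntoIdentitySet(parts).size());
    }
}
```

With the example input the list holds four entries and the identity set holds three, which mirrors the "Duplicate in union" failure described in this thread.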
Ok yes, right, but: I think duplicates are actually caught by alreadySeenClasses -- so I don't think unionSchemas prevents any duplicates. So that one could -- I think? -- be a simple ArrayList.
Both duplicate checks are needed; each check works on a different level.
But like I said, changing Set to List did not fail any of the tests. So something is odd.
I think I'll try to do that again just to make sure I did not imagine it.
:( Hmmm, just tested:
final ArrayList<Schema> unionSchemas = new ArrayList<>();
Fails the PolymorphicTypeAnnotationsTest#class_is_referenced_twice_in_hierarchy_test test.
Ok I must have been hallucinating. Apologies for noise.
no problem, it happens.
Merged all the way to 3.x, things working. Phew! Next PR could be for minor changes for 2.20 (2.x), making fields final.
Supported cases:
1. Concrete class annotated with @JsonSubTypes
When a concrete (non-abstract) class is annotated with @JsonSubTypes, the Avro type of the annotated class is part of the resulting union.
Fruit schema:
Annotated class, Fruit, is part of the schema:
This case was not supported before: the annotated class was missing from the union.
2. Abstract class annotated with @JsonSubTypes
When an abstract class is annotated with @JsonSubTypes, the abstract class is not in the union.
AbstractFruit schema:
The annotated abstract base class is not part of the schema:
This case was already supported.
3. Abstract class somewhere in the middle of @JsonSubTypes hierarchy
When an abstract class is somewhere along the @JsonSubTypes hierarchy, it does not end up in the union.
Vehicle schema:
The AbstractWaterVehicle class is not in the union.
This case was not supported before: AbstractWaterVehicle would be part of the union.
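A rough sketch of the rule behind cases 1 to 3: only non-abstract types are kept as union candidates. It uses plain JDK reflection. Vehicle being an interface, the Boat and Car classes, and the unionCandidates helper are assumptions made for illustration, not taken from the PR.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class AbstractSkipSketch {
    // Assumed shape of the hierarchy; Boat and Car are hypothetical.
    interface Vehicle {}
    static abstract class AbstractWaterVehicle implements Vehicle {}
    static class Boat extends AbstractWaterVehicle {}
    static class Car implements Vehicle {}

    // Keeps only concrete classes. Modifier.isAbstract is also true for
    // interfaces, so both Vehicle and AbstractWaterVehicle are dropped.
    static List<Class<?>> unionCandidates(List<Class<?>> hierarchy) {
        List<Class<?>> out = new ArrayList<>();
        for (Class<?> type : hierarchy) {
            if (!Modifier.isAbstract(type.getModifiers())) {
                out.add(type);
            }
        }
        return out;
    }
}
```

Passing Vehicle, AbstractWaterVehicle, Boat and Car through unionCandidates leaves Boat and Car, in that order.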
4. Class referenced twice in @JsonSubTypes hierarchy
When a class is referenced twice in the @JsonSubTypes hierarchy, it occurs only once in the union.
ElementInterface schema:
The Helium class is not duplicated in the union.
5. Base class explicitly in @JsonSubTypes or @Union annotations
When a class is a subclass of itself.
Both cases above would lead to an endless loop and a StackOverflowError.
See PolymorphicTypeAnnotationsTest.class.
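For case 5, a minimal sketch of how a "seen" set can stop the recursion when a type lists itself as its own subtype. Type names are plain strings and the declaredSubTypes map is a hypothetical stand-in for the annotation data; the real visitor works on Jackson types, so treat this only as an illustration of the guard.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SubtypeTraversalSketch {
    // Walks the declared subtypes depth-first, visiting each type name once.
    static List<String> collect(String root, Map<String, List<String>> declaredSubTypes) {
        List<String> out = new ArrayList<>();
        collect(root, declaredSubTypes, new HashSet<>(), out);
        return out;
    }

    private static void collect(String type,
                                Map<String, List<String>> declaredSubTypes,
                                Set<String> alreadySeen,
                                List<String> out) {
        // Set.add returns false when the type was visited before; without
        // this guard a self-reference would recurse until the stack overflows.
        if (!alreadySeen.add(type)) {
            return;
        }
        out.add(type);
        List<String> subTypes = declaredSubTypes.getOrDefault(type, Collections.emptyList());
        for (String subType : subTypes) {
            collect(subType, declaredSubTypes, alreadySeen, out);
        }
    }
}
```

With a map where "Base" declares the subtypes "Base" and "Sub", the traversal returns Base and Sub once each instead of looping.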