When both .withAutoSchemaUpdate(true) and .ignoreUnknownValues() are enabled on BigQueryIO.write(), records containing nested arrays fail to be written to BigQuery. An empty record is appended to the array, which causes schema validation to fail. This behavior does not occur without these two settings enabled.
Expected behavior:
Records should be mapped without additional empty objects being added to arrays.
Actual behavior:
Records in arrays are unexpectedly populated with an empty record, resulting in schema validation failures during write operations.
Steps to reproduce:
Use BigQueryIO.write() with: .withAutoSchemaUpdate(true) and .ignoreUnknownValues() (Beam version is 2.59.0)
Write records containing array data with the schema mentioned above
Observe failures due to an empty record being inserted into arrays.
(...)
"errorMessage": "Field value of id cannot be empty. on field external_ids.",
(...)
"stringifiedData": {
  "external_ids": [
    {
      "id": "fd7da837-13cb-4b67-8ced-220a27130e34",
      "name": "some_custom_id"
    },
    {}
  ],
}
(...)
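The payload above fails because the `external_ids` array ends with an empty `{}` element. As a diagnostic aid, here is a minimal sketch of a check that walks a row and reports the path of every empty record found inside an array. `EmptyRecordCheck` and `findEmptyRecords` are hypothetical names and not part of Apache Beam. The sketch operates on plain `Map<String, Object>` values and assumes the row can be viewed that way (Beam's `TableRow` is map-like, but that integration is not shown or tested here).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Beam API): finds empty records nested inside arrays.
final class EmptyRecordCheck {

    // Returns the paths (for example "external_ids[1]") of every empty map that is a list element.
    static List<String> findEmptyRecords(Map<String, Object> row) {
        List<String> paths = new ArrayList<>();
        collect(row, "", paths);
        return paths;
    }

    private static void collect(Object value, String path, List<String> paths) {
        if (value instanceof Map) {
            // Descend into each field of a record, extending the dotted path.
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                String child = path.isEmpty() ? String.valueOf(e.getKey()) : path + "." + e.getKey();
                collect(e.getValue(), child, paths);
            }
        } else if (value instanceof List) {
            List<?> items = (List<?>) value;
            for (int i = 0; i < items.size(); i++) {
                Object item = items.get(i);
                String child = path + "[" + i + "]";
                if (item instanceof Map && ((Map<?, ?>) item).isEmpty()) {
                    paths.add(child); // an empty record inside an array
                } else {
                    collect(item, child, paths);
                }
            }
        }
    }

    public static void main(String[] args) {
        // Rebuild the shape of the failing payload from the error report.
        Map<String, Object> id = new LinkedHashMap<>();
        id.put("id", "fd7da837-13cb-4b67-8ced-220a27130e34");
        id.put("name", "some_custom_id");

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("external_ids", List.of(id, new LinkedHashMap<String, Object>()));

        System.out.println(findEmptyRecords(row)); // prints [external_ids[1]]
    }
}
```

Running such a check on the input rows before the write step would help confirm that the empty element is introduced during conversion and is not already present in the source data.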
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam YAML
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Infrastructure
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner
The issue may stem from line 181, where unknownFields are assigned to an empty TableRow and passed to a function which converts TableRow to Proto Message.
Example BigQuery nested record field:
This is what the response would look like: