This repository has been archived by the owner on Nov 25, 2023. It is now read-only.

Validation error - unexpected #8

Closed
ian-lewis-d opened this issue Jun 28, 2023 · 4 comments

Comments

@ian-lewis-d

ian-lewis-d commented Jun 28, 2023

Hi,

I am extracting data from MySQL (tap-mysql) and using your version of target-s3-parquet to load my data into an S3 bucket.

A validation step is choking on the data and I'm not clear why. Can you explain what causes this unexpected error and how it can be avoided?

2023-06-28T11:11:58.334221Z [info     ]     raise error                cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.334345Z [info     ] jsonschema.exceptions.ValidationError: 0.1 is not a multiple of 1e-06 cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.334466Z [info     ]                                cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.335246Z [info     ] Failed validating 'multipleOf' in schema['properties']['interest_rate_percentage']: cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.335369Z [info     ]     {'inclusion': 'available', cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.335489Z [info     ]      'multipleOf': 1e-06,      cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.336377Z [info     ]      'type': ['null', 'number']} cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.336533Z [info     ]                                cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
2023-06-28T11:11:58.336663Z [info     ] On instance['interest_rate_percentage']: cmd_type=elb consumer=True name=target-s3-parquet producer=False stdio=stderr string_id=target-s3-parquet
@ian-lewis-d
Author

I took a look around and it seems the jsonschema library's `multipleOf` validation is unreliable for floats, and the way to go is to use Decimal. Discussion here:
https://gitlab.com/meltano/target-csv/-/issues/3
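For anyone landing here later, the effect can be reproduced without Meltano at all. The sketch below is only an illustration: `is_multiple_float` is a hypothetical helper that mimics a division-based check, not jsonschema's actual code, and `is_multiple_decimal` shows the same test done with `Decimal`. The values 0.1 and 1e-06 are the ones from the log above.

```python
from decimal import Decimal


def is_multiple_float(value: float, step: float) -> bool:
    """Hypothetical float-based check: divide and see if the quotient is whole.

    Neither 0.1 nor 1e-06 has an exact binary representation, so the
    quotient can carry a rounding error and fail this test.
    """
    quotient = value / step
    return quotient == int(quotient)


def is_multiple_decimal(value: str, step: str) -> bool:
    """Same check using Decimal, built from strings so no binary rounding occurs."""
    return Decimal(value) % Decimal(step) == 0


if __name__ == "__main__":
    # Values taken from the validation error in the log.
    print("float quotient:", 0.1 / 1e-06)
    print("float check:   ", is_multiple_float(0.1, 1e-06))
    print("decimal check: ", is_multiple_decimal("0.1", "0.000001"))
```

The `Decimal` version treats 0.1 as a multiple of 0.000001, which matches what the schema intends.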

@jkausti
Owner

jkausti commented Jun 28, 2023

Hi @ian-lewis-d, thanks for raising this issue. The validation step is part of the code generated by the Meltano SDK. However, it looks like it can be fixed in the target's code, as was done in this PR for target-postgres.

I'll have a look at it and see what I can do!

@jkausti
Owner

jkausti commented Jul 1, 2023

Hi @ian-lewis-d, I will close this issue now, as it does not seem possible to fix this in the target's code; it should be fixed in the meltano-sdk code instead. An alternative is to use a mapper and cast the floats to a decimal or some other data type that is not affected by floating-point inaccuracy.
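As a rough illustration of the "cast to decimal" idea, here is a minimal sketch in plain Python. It is not Meltano mapper configuration and not SDK code: `parse_record` and `fits_step` are hypothetical helpers, and the sample record is made up around the `interest_rate_percentage` field from the log. It relies only on the standard library's `json.loads(..., parse_float=Decimal)`, which turns JSON floats into `Decimal` before any binary rounding happens.

```python
import json
from decimal import Decimal

# Step size from the schema in the log ('multipleOf': 1e-06), written exactly.
STEP = Decimal("0.000001")


def parse_record(line: str) -> dict:
    """Parse one JSON record, reading every float literal as a Decimal."""
    return json.loads(line, parse_float=Decimal)


def fits_step(value, step: Decimal = STEP) -> bool:
    """Return True if value is null or an exact multiple of step.

    Expects value to be None, an int, or a Decimal (as produced by parse_record).
    """
    if value is None:
        return True
    return Decimal(value) % step == 0


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    record = parse_record('{"interest_rate_percentage": 0.1}')
    print(type(record["interest_rate_percentage"]).__name__)
    print(fits_step(record["interest_rate_percentage"]))
```

Whether the same cast can be expressed in a given mapper depends on that mapper's features, so treat this as a picture of the idea rather than a drop-in fix.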

@jkausti jkausti closed this as completed Jul 1, 2023
@ian-lewis-d
Author

Thanks @jkausti

I will investigate the meltano-sdk and will also look at using a mapper to 'fix' the data.
