Describe the bug
I have set up the "Datahub integration" to pull data from one environment into the other.
It fails right at the beginning:
[2025-02-11 14:27:07,266] INFO {datahub.cli.ingest_cli:150} - DataHub CLI version: 1!0.15.0+docker
[2025-02-11 14:27:07,738] INFO {datahub.ingestion.run.pipeline:272} - Sink configured successfully. DataHubRestEmitter: configured to talk to https://XXXX/api/gms with token: XXX
[2025-02-11 14:27:10,042] INFO {datahub.ingestion.run.pipeline:297} - Source configured successfully.
[2025-02-11 14:27:10,043] INFO {datahub.cli.ingest_cli:131} - Starting metadata ingestion
[2025-02-11 14:27:10,044] INFO {datahub.ingestion.source.datahub.datahub_source:64} - Ingesting DataHub metadata up until 2025-02-11 14:27:10.044630+00:00
[2025-02-11 14:27:10,331] INFO {datahub.ingestion.source.datahub.datahub_source:108} - Fetching database aspects starting from 1970-01-01 00:00:00+00:00
aspect,
version
) as t
WHERE 1=1
AND (removed = false or removed is NULL)
ORDER BY
createdon,
urn,
aspect,
version
]
[parameters: {'exclude_aspects': ['globalSettingsInfo', 'testResults', 'dataHubIngestionSourceKey', 'dataHubIngestionSourceInfo', 'dataHubSecretKey', 'datahubIngestionCheckpoint', 'datahubIngestionRunSummary', 'globalSettingsKey', 'dataHubSecretValue'], 'since_createdon': '1970-01-01 00:00:00.000000'}]
(Background on this error at: https://sqlalche.me/e/14/f405)
Traceback (most recent call last):
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
self.dialect.do_execute(
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.SyntaxError: syntax error at or near "ARRAY"
LINE 23: AND mav.aspect NOT IN ARRAY['globalSettingsI...
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/metadata-ingestion/src/datahub/ingestion/run/pipeline.py", line 465, in run
for wu in itertools.islice(
File "/metadata-ingestion/src/datahub/ingestion/api/source_helpers.py", line 148, in auto_workunit_reporter
for wu in stream:
File "/metadata-ingestion/src/datahub/ingestion/source/datahub/datahub_source.py", line 76, in get_workunits_internal
yield from self._get_database_workunits(
File "/metadata-ingestion/src/datahub/ingestion/source/datahub/datahub_source.py", line 111, in _get_database_workunits
for i, (mcp, createdon) in enumerate(mcps):
File "/metadata-ingestion/src/datahub/ingestion/source/datahub/datahub_database_reader.py", line 198, in get_aspects
for row in orderer(rows):
File "/metadata-ingestion/src/datahub/ingestion/source/datahub/datahub_database_reader.py", line 40, in __call__
for row in rows:
File "/metadata-ingestion/src/datahub/ingestion/source/datahub/datahub_database_reader.py", line 189, in _get_rows
yield from self.execute_server_cursor(self.query, params)
File "/metadata-ingestion/src/datahub/ingestion/source/datahub/datahub_database_reader.py", line 160, in execute_server_cursor
result = conn.execute(query, params)
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1365, in execute
return self._exec_driver_sql(
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1669, in _exec_driver_sql
ret = self._execute_context(
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1943, in _execute_context
self._handle_dbapi_exception(
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2124, in _handle_dbapi_exception
util.raise_(
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 210, in raise_
raise exception
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1900, in _execute_context
self.dialect.do_execute(
File "/datahub-ingestion/.venv/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.SyntaxError) syntax error at or near "ARRAY"
LINE 23: AND mav.aspect NOT IN ARRAY['globalSettingsI...
^
[SQL:
SELECT *
FROM (
SELECT
mav.urn,
mav.aspect,
mav.metadata,
mav.systemmetadata,
mav.createdon,
mav.version,
removed
FROM metadata_aspect_v2 as mav
LEFT JOIN (
SELECT
*,
JSON_EXTRACT(metadata, '$.removed') as removed
FROM metadata_aspect_v2
WHERE aspect = 'status'
AND version = 0
) as sd ON sd.urn = mav.urn
WHERE 1 = 1
AND mav.version = 0
AND mav.aspect NOT IN %(exclude_aspects)s
AND mav.createdon >= %(since_createdon)s
ORDER BY
createdon,
urn,
aspect,
version
) as t
WHERE 1=1
AND (removed = false or removed is NULL)
ORDER BY
createdon,
urn,
aspect,
version
]
[parameters: {'exclude_aspects': ['globalSettingsInfo', 'testResults', 'dataHubIngestionSourceKey', 'dataHubIngestionSourceInfo', 'dataHubSecretKey', 'datahubIngestionCheckpoint', 'datahubIngestionRunSummary', 'globalSettingsKey', 'dataHubSecretValue'], 'since_createdon': '1970-01-01 00:00:00.000000'}]
(Background on this error at: https://sqlalche.me/e/14/f405)
[2025-02-11 14:27:10,579] INFO {datahub.cli.ingest_cli:144} - Finished metadata ingestion
Pipeline finished with at least 4 failures; produced 0 events in 0.41 seconds.
System details (please complete the following information):
DataHub version tag v1.0-rc1 on both target and source, using the latest CLI.
Both databases are Aurora Serverless v2 with the PostgreSQL flavor.
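
A note on the likely cause, offered as a sketch rather than a confirmed fix: the parameters show `exclude_aspects` passed as a Python list, and psycopg2 adapts lists to `ARRAY[...]` while it adapts tuples to the parenthesised form that `IN` expects. That matches the rendered `NOT IN ARRAY['globalSettingsI...` in the error. The helper below (`lists_to_tuples` is a hypothetical name, not part of DataHub or SQLAlchemy) shows one way the parameters could be converted before the query is executed.

```python
from typing import Any, Dict


def lists_to_tuples(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with every list value converted to a tuple.

    Hypothetical helper: psycopg2 renders a tuple as ('a', 'b'), which is
    valid after IN / NOT IN, whereas a list is rendered as ARRAY['a', 'b'].
    """
    return {
        key: tuple(value) if isinstance(value, list) else value
        for key, value in params.items()
    }


# Parameters shaped like the ones in the log above (list shortened).
params = {
    "exclude_aspects": ["globalSettingsInfo", "testResults"],
    "since_createdon": "1970-01-01 00:00:00.000000",
}

safe_params = lists_to_tuples(params)
# safe_params would then be passed to conn.execute(query, safe_params)
# in place of the original params.
```

Two caveats. An empty tuple would still produce invalid SQL, so the exclusion list has to be non-empty. Alternatively, the list could stay as it is and the query could use `<> ALL(%(exclude_aspects)s)` instead of `NOT IN`. Separately, `JSON_EXTRACT` in the subquery is a MySQL function, so the query may need a PostgreSQL equivalent once the `ARRAY` error is out of the way.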