
feat(messaging): add pgQueue as alternative messaging transport to Kafka #17446

Merged
david-leifker merged 1 commit into master from feat/pgqueue-messaging on Jun 1, 2026

Conversation

@david-leifker
Collaborator

@david-leifker david-leifker commented May 14, 2026

Introduce a PostgreSQL-based queue (pgQueue) as an alternative messaging transport, allowing DataHub to operate without Kafka/Elasticsearch dependencies for smaller deployments.

Key changes:

  • pgQueue store with partitioned message tables, consumer offsets, and retention via pg_partman
  • Messaging transport abstraction layer (MessagingTransport conditions, KafkaMessagingEnabled/Disabled) for seamless switching between Kafka and pgQueue
  • Consumer/processor refactoring: split Kafka listeners from processors to enable pgQueue poll-based consumption
  • SqlSetup framework for managing PostgreSQL schema upgrades
  • Python pgQueue client, ingestion sink, and DataHub Actions event source
  • Docker postgres image with pg_cron and pg_partman extensions
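
To make the poll-based consumption model with consumer offsets concrete, here is a minimal, hypothetical sketch. It is not the schema or client from this PR: the table names (`messages`, `consumer_offsets`), the function names, and the use of SQLite's in-memory database as a stand-in for PostgreSQL are all assumptions, chosen so the example runs without a database server. Partitioning and pg_partman retention are left out.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical tables: an append-only message log plus one offset row
    # per (consumer group, topic). Not the PR's actual schema.
    conn.executescript(
        """
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            topic TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        CREATE TABLE consumer_offsets (
            consumer_group TEXT NOT NULL,
            topic TEXT NOT NULL,
            last_id INTEGER NOT NULL,
            PRIMARY KEY (consumer_group, topic)
        );
        """
    )


def publish(conn: sqlite3.Connection, topic: str, payload: str) -> int:
    # Append a message and return its monotonically increasing id.
    cur = conn.execute(
        "INSERT INTO messages (topic, payload) VALUES (?, ?)", (topic, payload)
    )
    conn.commit()
    return cur.lastrowid


def poll(conn: sqlite3.Connection, group: str, topic: str, max_messages: int = 10):
    # Read the group's committed offset (0 if it has never committed),
    # then fetch the next batch of messages after that offset, in id order.
    row = conn.execute(
        "SELECT last_id FROM consumer_offsets WHERE consumer_group = ? AND topic = ?",
        (group, topic),
    ).fetchone()
    last_id = row[0] if row else 0
    return conn.execute(
        "SELECT id, payload FROM messages WHERE topic = ? AND id > ? "
        "ORDER BY id LIMIT ?",
        (topic, last_id, max_messages),
    ).fetchall()


def commit_offset(conn: sqlite3.Connection, group: str, topic: str, last_id: int) -> None:
    # Record the id of the last processed message for this group and topic.
    conn.execute(
        "INSERT OR REPLACE INTO consumer_offsets (consumer_group, topic, last_id) "
        "VALUES (?, ?, ?)",
        (group, topic, last_id),
    )
    conn.commit()
```

In this sketch a consumer calls `poll`, processes the batch, then calls `commit_offset` with the id of the last message it handled. Committing only after processing gives at-least-once delivery: a consumer that crashes before committing sees the same messages again on its next poll. Each consumer group tracks its own offset, so groups read the same topic independently.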

Design Documentation

pgQueue Design Documentation

Check List

  • CI Smoke Tests (Regression w/Kafka) Run
  • CI Smoke Tests (No Kafka) [PR label smoke:quickstartPgDebug enabled] Run

Related PRs:

@david-leifker david-leifker marked this pull request as draft May 14, 2026 16:25
@github-actions github-actions Bot added the ingestion, docs, product, devops, and smoke_test labels May 14, 2026
@github-actions
Contributor

Linear: PFP-3906

Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py
@alwaysmeticulous

alwaysmeticulous Bot commented May 14, 2026

✅ Meticulous spotted 0 visual differences across 1629 screens tested: view results.

Meticulous evaluated ~10 hours of user flows against your PR.

Expected differences? Click here. Last updated for commit 01f5bb4 feat(messaging): add pgQueue as alternative messaging transport to Kafka. This comment will update as new commits are pushed.

@datahub-connector-tests

datahub-connector-tests Bot commented May 14, 2026

Connector Tests Results

Connector tests failed for commit c9e0345

View full test logs →

To skip connector tests, add the skip-connector-tests label (org members only).

Autogenerated by the connector-tests CI pipeline.

@github-actions
Contributor

Linear: PFP-3932

@github-actions
Contributor

Your PR has been assigned to @david-leifker (david.leifker) for review (PFP-3906).

@david-leifker
Collaborator Author

david-leifker commented May 27, 2026

  • Use a database schema migration tool, or version the database schema
  • Revisit tags/labels and implementation-agnostic metrics
  • Is there a Kafka API implementation that relies on a separate datastore?
  • Compare and contrast with pgmq
  • Evaluate https://github.com/tansu-io/tansu

Comment thread datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/sqlsetup/SqlSetup.java Outdated
Comment thread metadata-io/src/main/java/com/linkedin/metadata/entity/ebean/EbeanAspectDao.java Outdated
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py Outdated
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py Outdated
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py Outdated
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py
Comment thread metadata-ingestion/src/datahub/pgqueue/repository.py
Comment thread datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/sqlsetup/SqlSetup.java Outdated
Align Playwright CI with smoke-test compose profile and image build task
so pg/smoke-labeled PRs do not pull missing consumer images.

Co-authored-by: Cursor <cursoragent@cursor.com>

Labels

  • devops: PR or Issue related to DataHub backend & deployment
  • docs: Issues and Improvements to docs
  • ingestion: PR or Issue related to the ingestion of metadata
  • pending-submitter-merge
  • product: PR or Issue related to the DataHub UI/UX
  • smoke:quickstartPg
  • smoke_test: Contains changes related to smoke tests


3 participants