
feat(ingestion/montecarlo): add Monte Carlo connector#17685

Open
alokr-dhub wants to merge 9 commits into master from feat/montecarlo-connector

@alokr-dhub
Contributor

Summary

  • Adds a new DataHub ingestion source for Monte Carlo, a data observability platform
  • Ingests Monte Carlo monitors and custom SQL rules as DataHub Assertions (CUSTOM type), with native Monte Carlo type/resource/dimension preserved in customProperties
  • Ingests Monte Carlo alerts and incidents as AssertionRunEvent failures, making observability status visible on the dataset Validation tab
  • MCON → dataset URN resolution via getTable + user-supplied connection_to_platform_map, with per-MCON caching
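The MCON resolution step in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code: the `TableInfo` fields, the `MconResolver` class, the shape of `connection_to_platform_map` (connection ID to DataHub platform name), and the sample MCON string are all assumptions made for the example. Only the dataset URN layout follows DataHub's standard format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class TableInfo:
    """Hypothetical subset of what a getTable lookup might return."""
    connection_id: str
    full_table_id: str  # e.g. "analytics.public.orders"


@dataclass
class MconResolver:
    """Resolves an MCON to a dataset URN, caching one result per MCON."""
    get_table: Callable[[str], Optional[TableInfo]]  # stand-in for the API call
    connection_to_platform_map: Dict[str, str]       # assumed: connection ID -> platform
    env: str = "PROD"
    _cache: Dict[str, Optional[str]] = field(default_factory=dict)

    def resolve(self, mcon: str) -> Optional[str]:
        # Cache hits (including cached misses) never reach the API.
        if mcon in self._cache:
            return self._cache[mcon]
        urn: Optional[str] = None
        table = self.get_table(mcon)
        if table is not None:
            platform = self.connection_to_platform_map.get(table.connection_id)
            if platform is not None:
                urn = (
                    f"urn:li:dataset:(urn:li:dataPlatform:{platform},"
                    f"{table.full_table_id},{self.env})"
                )
        self._cache[mcon] = urn
        return urn


# Usage with a fake lookup that counts how often it is called.
calls = {"n": 0}

def fake_get_table(mcon: str) -> Optional[TableInfo]:
    calls["n"] += 1
    return TableInfo(connection_id="conn-1", full_table_id="analytics.public.orders")

resolver = MconResolver(fake_get_table, {"conn-1": "snowflake"})
first = resolver.resolve("MCON++example")   # made-up MCON value
second = resolver.resolve("MCON++example")  # served from the cache
```

Caching unresolved MCONs as `None` is a design choice in this sketch: it avoids repeating a lookup that already failed, at the cost of not retrying within the same run.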

Changes

| Area | Files |
| --- | --- |
| Source | src/datahub/ingestion/source/montecarlo/ |
| Integration tests | tests/integration/montecarlo/ |
| Unit tests | tests/unit/montecarlo/ |
| Docs | docs/sources/montecarlo/ |
| Platform bootstrap | bootstrap_mcps/data-platforms.yaml, bootstrap_mcps.yaml |
| Logo | datahub-web-react/src/images/montecarlologo.png |
| Registry | autogenerated/connector_registry/datahub.json |
| Dependencies | setup.py, constraints.txt, pyproject.toml, uv.lock |

Test plan

  • Unit tests: MCON resolution, assertion building, run-event construction
  • Integration tests: golden-file comparison for OSS and cloud-entity paths
  • Docs: ### Overview, ### Prerequisites, ### Capabilities, ### Limitations, ### Troubleshooting all present
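For readers unfamiliar with golden-file comparison, the idea behind the integration tests can be sketched as below. This is a generic illustration under assumptions: `normalize`, `check_golden`, the file name, and the sample records are hypothetical, and DataHub's own test helpers are not reproduced here.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, List


def normalize(records: List[Any]) -> str:
    """Serialize records deterministically so key order cannot cause diffs."""
    return json.dumps(records, indent=2, sort_keys=True)


def check_golden(records: List[Any], golden_path: Path, update: bool = False) -> None:
    """Compare records with the golden file, or rewrite the file when update=True."""
    actual = normalize(records)
    if update:
        golden_path.write_text(actual)
        return
    # A missing golden file raises FileNotFoundError instead of passing silently.
    expected = normalize(json.loads(golden_path.read_text()))
    assert actual == expected, f"output differs from {golden_path}"


# Usage with made-up records; a real test would pass the connector's emitted output.
with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "montecarlo_golden.json"
    records = [{"entityType": "assertion", "aspectName": "assertionInfo"}]
    check_golden(records, golden, update=True)  # first run records the golden file
    check_golden(records, golden)               # later runs must match it
    golden_roundtrip_ok = True
```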

🤖 Generated with Claude Code

Ingests Monte Carlo monitors, custom SQL rules, and alert/incident
run events as DataHub Assertions so the Validation tab on a dataset
reflects Monte Carlo's observability coverage and incident history.

Key implementation details:
- Monitors and custom SQL rules → CUSTOM AssertionInfo
- Alerts / incidents → AssertionRunEvent (FAILURE)
- MCON resolved to DataHub dataset URN via getTable + connection_to_platform_map
- Results cached per MCON to minimise API calls
- Results are tagged with native Monte Carlo type, resource ID and
  data-quality dimension via customProperties
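The mapping described in these bullets might look roughly like the sketch below. It uses plain dictionaries loosely shaped like DataHub's assertion aspects rather than the real SDK classes. The `Monitor` dataclass, both builder functions, the `montecarlo_*` property keys, and the URN built from the monitor UUID are assumptions for illustration; the connector's actual names may differ.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Monitor:
    """Hypothetical, simplified view of a Monte Carlo monitor or custom SQL rule."""
    uuid: str
    monitor_type: str
    resource_id: str
    description: str
    dimension: Optional[str] = None


def build_custom_assertion(monitor: Monitor, dataset_urn: str) -> Dict[str, Any]:
    """Build a CUSTOM assertion payload, keeping native fields as properties."""
    props = {
        "montecarlo_type": monitor.monitor_type,        # assumed key name
        "montecarlo_resource_id": monitor.resource_id,  # assumed key name
    }
    if monitor.dimension is not None:
        props["montecarlo_dimension"] = monitor.dimension
    return {
        "type": "CUSTOM",
        "entity": dataset_urn,
        "description": monitor.description,
        "customProperties": props,
    }


def build_failure_run_event(
    assertion_urn: str, dataset_urn: str, timestamp_millis: int
) -> Dict[str, Any]:
    """Represent one alert or incident as a completed run with a FAILURE result."""
    return {
        "assertionUrn": assertion_urn,
        "asserteeUrn": dataset_urn,
        "timestampMillis": timestamp_millis,
        "status": "COMPLETE",
        "result": {"type": "FAILURE"},
    }


# Usage with made-up values.
dataset = "urn:li:dataset:(urn:li:dataPlatform:snowflake,analytics.public.orders,PROD)"
monitor = Monitor("abc-123", "CUSTOM_SQL", "warehouse-1", "Orders must have a customer")
assertion = build_custom_assertion(monitor, dataset)
run_event = build_failure_run_event(f"urn:li:assertion:{monitor.uuid}", dataset, 1780000000000)
```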

Adds:
- Source: src/datahub/ingestion/source/montecarlo/
- Integration tests with golden-file fixtures
- Unit tests for MCON resolution, assertion building, run events
- Docs: docs/sources/montecarlo/ (Overview, Prerequisites,
  Capabilities, Limitations, Troubleshooting)
- Platform bootstrap entry and logo
- Connector registered in datahub.json registry

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the ingestion (PR or Issue related to the ingestion of metadata), product (PR or Issue related to the DataHub UI/UX), and devops (PR or Issue related to DataHub backend & deployment) labels Jun 2, 2026
@codecov

codecov Bot commented Jun 2, 2026

Codecov Report

❌ Patch coverage is 89.71292% with 43 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| .../src/datahub/ingestion/source/montecarlo/client.py | 86.48% | 15 Missing ⚠️ |
| .../src/datahub/ingestion/source/montecarlo/source.py | 78.26% | 15 Missing ⚠️ |
| ...c/datahub/ingestion/source/montecarlo/assertion.py | 91.42% | 9 Missing ⚠️ |
| ...tahub/ingestion/source/montecarlo/mcon_resolver.py | 94.11% | 4 Missing ⚠️ |
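As a quick consistency check on the figures above, the per-file missing counts can be summed and compared with the headline number. The sketch assumes patch coverage is simply covered lines divided by total changed lines; under that assumption the report implies a patch of roughly 418 measured lines.

```python
# Missing-line counts copied from the report, keyed by file basename.
missing_by_file = {
    "client.py": 15,
    "source.py": 15,
    "assertion.py": 9,
    "mcon_resolver.py": 4,
}
patch_coverage = 0.8971292  # 89.71292% from the report

total_missing = sum(missing_by_file.values())
# Assumed model: coverage = (total - missing) / total, solved for total.
implied_patch_lines = round(total_missing / (1 - patch_coverage))
implied_covered = implied_patch_lines - total_missing

print(total_missing, implied_patch_lines, implied_covered)
```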

@codecov

codecov Bot commented Jun 2, 2026

Bundle Report

Changes will increase total bundle size by 48.2kB (0.21%) ⬆️. This is within the configured threshold ✅

Detailed changes
| Bundle name | Size | Change |
| --- | --- | --- |
| datahub-react-web-esm | 23.55MB | 48.2kB (0.21%) ⬆️ |

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

| Asset Name | Size Change | Total Size | Change (%) |
| --- | --- | --- | --- |
| assets/index-*.js | 2.16kB | 8.85MB | 0.02% |
| assets/montecarlologo-*.png (New) | 46.04kB | 46.04kB | 100.0% 🚀 |

Files in assets/index-*.js:

  • ./src/app/ingestV2/source/builder/constants.ts → Total Size: 8.91kB

  • ./src/app/entityV2/shared/tabs/Dataset/Validations/assertion/profile/summary/utils.tsx → Total Size: 14.79kB

  • ./src/images/montecarlologo.png → Total Size: 50 bytes

  • ./src/app/entityV2/shared/tabs/Dataset/Validations/CustomAssertionDescription.tsx → Total Size: 1.15kB

  • ./src/app/entityV2/shared/tabs/Dataset/Validations/DatasetAssertionLogicModal.tsx → Total Size: 447 bytes

  • ./src/app/ingest/source/builder/sources.json → Total Size: 43.21kB

  • ./src/app/ingestV2/source/builder/sources.json → Total Size: 48.22kB

@maggiehays maggiehays added the needs-review label (for PRs that need review from a maintainer) Jun 2, 2026
@alokr-dhub alokr-dhub marked this pull request as draft June 3, 2026 05:43
Add Monte Carlo to the frontend connector registry (sources.json for both
ingestV2 and the legacy ingest builder) so it appears in the "Create Source"
gallery, and map its platform URN to the bundled logo in PLATFORM_URN_TO_LOGO
so the icon renders in the gallery and source views.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@alokr-dhub alokr-dhub marked this pull request as ready for review June 4, 2026 07:59
@alokr-dhub alokr-dhub marked this pull request as draft June 4, 2026 07:59
…onnector

# Conflicts:
#	metadata-ingestion/uv.lock