feat(powerbi): Report to Dashboard lineage #12451

Merged
18 commits
c2e0dac
fix(model): fixes DashboardContainsDashboard relationship in Dashboar…
sgomezvillamor Jan 22, 2025
cad0b37
Update updating-datahub.md
sgomezvillamor Jan 23, 2025
7c92a0e
Merge branch 'master' into feature/cus-3571-fix-model-DashboardInfo-D…
sgomezvillamor Jan 23, 2025
0fa4114
Merge branch 'master' into feature/cus-3571-fix-model-DashboardInfo-D…
sgomezvillamor Jan 23, 2025
a0327ac
handle report reference for a given tile
sgomezvillamor Jan 23, 2025
6479072
Merge branch 'master' into feature/cus-3571-fix-model-DashboardInfo-D…
sgomezvillamor Jan 24, 2025
43072d5
fix chardEdges destination type validation in DashboardPatchBuilder +…
sgomezvillamor Jan 24, 2025
18a539e
Merge branch 'feature/cus-3571-fix-model-DashboardInfo-DashboardConta…
sgomezvillamor Jan 27, 2025
3c33063
fix missing "dashboards" field in DashboardInfoTemplate
sgomezvillamor Jan 27, 2025
00adc98
Merge branch 'master' into feature/cus-3571-fix-model-DashboardInfo-D…
sgomezvillamor Jan 27, 2025
11bc937
Merge branch 'feature/cus-3571-fix-model-DashboardInfo-DashboardConta…
sgomezvillamor Jan 27, 2025
97d91d9
add child Dashboards to PowerBI reports, if any
sgomezvillamor Jan 27, 2025
6f0f29b
provide official docs in the code comment
sgomezvillamor Feb 3, 2025
6db9ef6
Merge branch 'master' into feature/cus-3571/feat-powerbi-report-to-da…
sgomezvillamor Feb 3, 2025
843133e
Merge branch 'master' into feature/cus-3571/feat-powerbi-report-to-da…
sgomezvillamor Feb 4, 2025
1d131e3
code comment
sgomezvillamor Feb 5, 2025
a1f77b8
Merge branch 'master' into feature/cus-3571/feat-powerbi-report-to-da…
sgomezvillamor Feb 5, 2025
bf21f7d
lint fix
sgomezvillamor Feb 5, 2025
@@ -132,6 +132,7 @@ class Constant:
ACTIVE = "Active"
SQL_PARSING_FAILURE = "SQL Parsing Failure"
M_QUERY_NULL = '"null"'
REPORT_WEB_URL = "reportWebUrl"


@dataclass
31 changes: 28 additions & 3 deletions metadata-ingestion/src/datahub/ingestion/source/powerbi/powerbi.py
@@ -582,8 +582,11 @@
if tile.dataset is not None and tile.dataset.webUrl is not None:
custom_properties[Constant.DATASET_WEB_URL] = tile.dataset.webUrl

- if tile.report is not None and tile.report.id is not None:
-     custom_properties[Constant.REPORT_ID] = tile.report.id
+ if tile.report_id is not None:
+     custom_properties[Constant.REPORT_ID] = tile.report_id

Codecov / codecov/patch: added lines #L585-L586 were not covered by tests.

if tile.report is not None and tile.report.webUrl is not None:
custom_properties[Constant.REPORT_WEB_URL] = tile.report.webUrl

Codecov / codecov/patch: added lines #L588-L589 were not covered by tests.

return custom_properties

@@ -1053,6 +1056,7 @@
report: powerbi_data_classes.Report,
chart_mcps: List[MetadataChangeProposalWrapper],
user_mcps: List[MetadataChangeProposalWrapper],
dashboard_edges: List[EdgeClass],
) -> List[MetadataChangeProposalWrapper]:
"""
Map PowerBi report to Datahub dashboard
@@ -1074,6 +1078,7 @@
charts=chart_urn_list,
lastModified=ChangeAuditStamps(),
dashboardUrl=report.webUrl,
dashboards=dashboard_edges,
)

info_mcp = self.new_mcp(
@@ -1167,8 +1172,28 @@
ds_mcps = self.to_datahub_dataset(report.dataset, workspace)
chart_mcps = self.pages_to_chart(report.pages, workspace, ds_mcps)

# find all dashboards with a Tile referencing this report
downstream_dashboards_edges = []
for d in workspace.dashboards.values():
if any(t.report_id == report.id for t in d.tiles):
dashboard_urn = builder.make_dashboard_urn(

Codecov / codecov/patch: added lines #L1176-L1179 were not covered by tests.
Collaborator

seems worrying that these lines weren't covered by the tests? as per the codecov report

also, to make sure I understand - the lineage is tile -> dashboard -> report?

Contributor Author

> seems worrying that these lines weren't covered by the tests? as per the codecov report

This may be a false positive: my update to default_mock_response.json was specifically tailored to force this code path.

> also, to make sure I understand - the lineage is tile -> dashboard -> report?

Lineage is represented with a contains relationship. Any PowerBI Dashboard that has a PowerBI Tile referencing a parent PowerBI Report is modelled as follows:

PowerBI Report (DataHub Dashboard) -- contains --> PowerBI Dashboard (DataHub Dashboard) // this is the new addition
PowerBI Dashboard (DataHub Dashboard) -- contains --> PowerBI Tile (DataHub Chart) // this one was already supported

This was confirmed with users to match their expectations; it is how the relationship is modelled in PowerBI Lineage.
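
The report-to-dashboard "contains" edges described in this thread can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the PR: `Tile`, `Dashboard`, `Edge`, `make_dashboard_urn`, the `dashboards.<id>` name format, and the simplified URN string are hypothetical stand-ins for the DataHub classes and builders used in `powerbi.py`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Tile:
    id: str
    report_id: Optional[str] = None


@dataclass
class Dashboard:
    id: str
    tiles: List[Tile] = field(default_factory=list)


@dataclass
class Edge:
    # Hypothetical stand-in for an edge object; only the destination is modelled.
    destination_urn: str


def make_dashboard_urn(platform: str, name: str) -> str:
    # Hypothetical, simplified URN format used only for this sketch.
    return f"urn:li:dashboard:({platform},{name})"


def dashboards_referencing(report_id: str, dashboards: Dict[str, Dashboard]) -> List[Edge]:
    """Return one 'contains' edge per dashboard that has a tile pointing at report_id."""
    return [
        Edge(make_dashboard_urn("powerbi", f"dashboards.{d.id}"))
        for d in dashboards.values()
        if any(t.report_id == report_id for t in d.tiles)
    ]


dashboards = {
    "d1": Dashboard("d1", [Tile("t1", report_id="r1"), Tile("t2")]),
    "d2": Dashboard("d2", [Tile("t3", report_id="r2")]),
}
edges = dashboards_referencing("r1", dashboards)
print([e.destination_urn for e in edges])
```

Only dashboards with at least one tile whose `report_id` matches produce an edge, so a report that no tile references ends up with an empty edge list.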

platform=self.__config.platform_name,
platform_instance=self.__config.platform_instance,
name=d.get_urn_part(),
)
edge = EdgeClass(

Codecov / codecov/patch: added line #L1184 was not covered by tests.
destinationUrn=dashboard_urn,
sourceUrn=None,
created=None,
lastModified=None,
properties=None,
)
downstream_dashboards_edges.append(edge)

Codecov / codecov/patch: added line #L1191 was not covered by tests.

# Let's convert report to datahub dashboard
- report_mcps = self.report_to_dashboard(workspace, report, chart_mcps, user_mcps)
+ report_mcps = self.report_to_dashboard(
+     workspace, report, chart_mcps, user_mcps, downstream_dashboards_edges
+ )

Codecov / codecov/patch: added line #L1194 was not covered by tests.

# Now add MCPs in sequence
mcps.extend(ds_mcps)
@@ -286,11 +286,13 @@ class CreatedFrom(Enum):
id: str
title: str
embedUrl: str
- dataset: Optional["PowerBIDataset"]
  dataset_id: Optional[str]
- report: Optional[Report]
+ report_id: Optional[str]
  createdFrom: CreatedFrom

+ dataset: Optional["PowerBIDataset"]
+ report: Optional[Report]
Collaborator

are report_id and report set at the same time? if not, it might be worth putting a comment that says the latter is filled in at a later time

Contributor Author

IDs are set on a first traversal and the objects on a later one.
I will add a comment.
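
The two-traversal pattern mentioned here (IDs first, objects later) can be illustrated with a small sketch. It assumes simplified `Report` and `Tile` dataclasses and a hypothetical `resolve_tile_reports` helper; none of these are the actual classes or functions from the PR.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Report:
    id: str
    name: str


@dataclass
class Tile:
    title: str
    report_id: Optional[str]         # set on the first traversal, from the API payload
    report: Optional[Report] = None  # filled in on a later traversal


def resolve_tile_reports(tiles: List[Tile], reports: Dict[str, Report]) -> List[str]:
    """Second pass: attach Report objects by id and return the ids left unresolved."""
    unresolved: List[str] = []
    for tile in tiles:
        if not tile.report_id:
            continue  # tile was not created from a report
        tile.report = reports.get(tile.report_id)
        if tile.report is None:
            unresolved.append(tile.report_id)
    return unresolved


tiles = [Tile("sales", "r1"), Tile("ops", "r9"), Tile("adhoc", None)]
reports = {"r1": Report("r1", "Sales report")}
missing = resolve_tile_reports(tiles, reports)
print(missing)
```

Keeping `report_id` separate from `report` means a tile can still record which report it points to even when the report object itself could not be loaded.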


def get_urn_part(self):
return f"charts.{self.id}"

@@ -337,41 +337,6 @@ def get_tiles(self, workspace: Workspace, dashboard: Dashboard) -> List[Tile]:
-tiles), there is no information available on pagination

"""

- def new_dataset_or_report(tile_instance: Any) -> dict:
-     """
-     Find out which is the data source for tile. It is either REPORT or DATASET
-     """
-     report_fields = {
-         Constant.REPORT: (
-             self.get_report(
-                 workspace=workspace,
-                 report_id=tile_instance.get(Constant.REPORT_ID),
-             )
-             if tile_instance.get(Constant.REPORT_ID) is not None
-             else None
-         ),
-         Constant.CREATED_FROM: Tile.CreatedFrom.UNKNOWN,
-     }
-
-     # reportId and datasetId are exclusive in tile_instance
-     # if datasetId is present that means tile is created from dataset
-     # if reportId is present that means tile is created from report
-     # if both i.e. reportId and datasetId are not present then tile is created from some visualization
-     if tile_instance.get(Constant.REPORT_ID) is not None:
-         report_fields[Constant.CREATED_FROM] = Tile.CreatedFrom.REPORT
-     elif tile_instance.get(Constant.DATASET_ID) is not None:
-         report_fields[Constant.CREATED_FROM] = Tile.CreatedFrom.DATASET
-     else:
-         report_fields[Constant.CREATED_FROM] = Tile.CreatedFrom.VISUALIZATION
-
-     title: Optional[str] = tile_instance.get(Constant.TITLE)
-     _id: Optional[str] = tile_instance.get(Constant.ID)
-     created_from: Any = report_fields[Constant.CREATED_FROM]
-     logger.info(f"Tile {title}({_id}) is created from {created_from}")
-
-     return report_fields

tile_list_endpoint: str = self.get_tiles_endpoint(
workspace, dashboard_id=dashboard.id
)
@@ -393,8 +358,18 @@ def new_dataset_or_report(tile_instance: Any) -> dict:
title=instance.get(Constant.TITLE),
embedUrl=instance.get(Constant.EMBED_URL),
dataset_id=instance.get(Constant.DATASET_ID),
+ report_id=instance.get(Constant.REPORT_ID),
  dataset=None,
- **new_dataset_or_report(instance),
Collaborator

thanks for cleaning this up

are there any entries in Constant that can be removed now?

+ report=None,
+ createdFrom=(
+     # In the past we considered that only one of the two report_id or dataset_id would be present
+     # but we have seen cases where both are present. If both are present, we prioritize the report.
+     Tile.CreatedFrom.REPORT
+     if instance.get(Constant.REPORT_ID)
+     else Tile.CreatedFrom.DATASET
+     if instance.get(Constant.DATASET_ID)
+     else Tile.CreatedFrom.VISUALIZATION
+ ),
)
for instance in tile_dict
if instance is not None
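
The precedence rule in the `createdFrom` expression above (report wins over dataset when both IDs are present) can be written out as a standalone sketch. The `CreatedFrom` enum values, the `"reportId"`/`"datasetId"` key names, and the `created_from` helper are assumptions made for illustration; they mirror the names visible in this diff but are not taken from the PR's code.

```python
from enum import Enum
from typing import Any, Dict


class CreatedFrom(Enum):
    # Values are assumed for this sketch.
    REPORT = "Report"
    DATASET = "Dataset"
    VISUALIZATION = "Visualization"


def created_from(instance: Dict[str, Any]) -> CreatedFrom:
    """Classify a tile payload; a report id takes priority over a dataset id."""
    if instance.get("reportId"):
        return CreatedFrom.REPORT
    if instance.get("datasetId"):
        return CreatedFrom.DATASET
    return CreatedFrom.VISUALIZATION


both = created_from({"reportId": "r1", "datasetId": "d1"})
dataset_only = created_from({"datasetId": "d1"})
neither = created_from({"title": "test_tile"})
print(both, dataset_only, neither)
```

Writing the chained conditional as early returns makes the priority order explicit: the dataset branch is only reached when no report id is present.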
@@ -625,13 +625,26 @@
dashboard.tiles = self._get_resolver().get_tiles(
workspace, dashboard=dashboard
)
- # set the dataset for tiles
+ # set the dataset and the report for tiles
for tile in dashboard.tiles:
# In Power BI, dashboards, reports, and datasets are tightly scoped to the workspace they belong to.
# https://learn.microsoft.com/en-us/power-bi/collaborate-share/service-new-workspaces
if tile.report_id:
tile.report = workspace.reports.get(tile.report_id)
if tile.report is None:
self.reporter.info(

Codecov / codecov/patch: added lines #L632-L635 were not covered by tests.
title="Missing Report Lineage For Tile",
message="A Report reference that failed to be resolved. Please ensure that 'extract_reports' is set to True in the configuration.",
context=f"workspace-name: {workspace.name}, tile-name: {tile.title}, report-id: {tile.report_id}",
)
# However, semantic models (aka datasets) can be shared across workspaces
# https://learn.microsoft.com/en-us/fabric/admin/portal-workspace#use-semantic-models-across-workspaces
# That's why the global 'dataset_registry' is required
if tile.dataset_id:
tile.dataset = self.dataset_registry.get(tile.dataset_id)
if tile.dataset is None:
self.reporter.info(
- title="Missing Lineage For Tile",
+ title="Missing Dataset Lineage For Tile",
message="A cross-workspace reference that failed to be resolved. Please ensure that no global workspace is being filtered out due to the workspace_id_pattern.",
context=f"workspace-name: {workspace.name}, tile-name: {tile.title}, dataset-id: {tile.dataset_id}",
)
@@ -653,10 +666,10 @@
for dashboard in workspace.dashboards.values():
dashboard.tags = workspace.dashboard_endorsements.get(dashboard.id, [])

+ # fill reports first since some dashboard may reference a report
+ fill_reports()
  if self.__config.extract_dashboards:
      fill_dashboards()

- fill_reports()
fill_dashboard_tags()
self._fill_independent_datasets(workspace=workspace)

@@ -39,8 +39,9 @@
"aspect": {
"json": {
"customProperties": {
- "createdFrom": "Dataset",
- "datasetId": "05169CD2-E713-41E6-9600-1D8066D95445"
+ "createdFrom": "Report",
+ "datasetId": "05169CD2-E713-41E6-9600-1D8066D95445",
+ "reportId": "5b218778-e7a5-4d73-8187-f10824047715"
},
"title": "test_tile",
"description": "test_tile",
@@ -81,9 +81,10 @@
"aspect": {
"json": {
"customProperties": {
- "createdFrom": "Dataset",
+ "createdFrom": "Report",
  "datasetId": "05169CD2-E713-41E6-9600-1D8066D95445",
- "datasetWebUrl": "http://localhost/groups/64ED5CAD-7C10-4684-8180-826122881108/datasets/05169CD2-E713-41E6-9600-1D8066D95445/details"
+ "datasetWebUrl": "http://localhost/groups/64ED5CAD-7C10-4684-8180-826122881108/datasets/05169CD2-E713-41E6-9600-1D8066D95445/details",
+ "reportId": "5b218778-e7a5-4d73-8187-f10824047715"
},
"title": "test_tile",
"description": "test_tile",