PropertyComparisonProcessor: use deduplicated values for determining deviations and omissions to avoid duplicated reporting
jmkeil committed Dec 19, 2024
1 parent 0addb4c commit c8e5155
Showing 2 changed files with 9 additions and 6 deletions.
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -6,7 +6,10 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
## [Unreleased]

### Fixed
-* fix `PopulationComparisonProcessor`: fix measurement of **Absolute Coveredness** to not increase in case of a duplicated resource in one dataset without a corresponding resource in any other dataset
+* fix `PopulationComparisonProcessor`: fix measurement of **Absolute Coveredness** to not increase in case of a duplicated resource in one dataset without a corresponding resource in any other dataset
+
+### Changed
+* changed `PropertyComparisonProcessor`: use deduplicated values for determining deviations and omissions to avoid duplicated reporting

## [3.1.1] - 2024-11-13

@@ -380,16 +380,16 @@ protected void reportDeviationsAndOmissions() {

     protected void reportDeviationsAndOmissionsForDatasetPair(ResourcePair datasetPair) {
         for (String variable : variables) {
-            Map<Resource, Map<RDFNode, Set<Resource>>> resourcesByNonDistinctValueByDataset = resourcesByNonDistinctValueByDatasetByVariable.get(variable);
+            Map<Resource, Map<RDFNode, Set<Resource>>> resourcesByDistinctValueByDataset = resourcesByDistinctValueByDatasetByVariable.get(variable);
             if (theAspect.variableCoveredByDatasets(variable, datasetPair.first, datasetPair.second)) {
-                Map<RDFNode, Set<Resource>> resourceByNonDistinctValuesOfFirstDataset = resourcesByNonDistinctValueByDataset.get(datasetPair.first);
-                Map<RDFNode, Set<Resource>> resourceByNonDistinctValuesOfSecondDataset = resourcesByNonDistinctValueByDataset.get(datasetPair.second);
+                Map<RDFNode, Set<Resource>> resourceByDistinctValuesOfFirstDataset = resourcesByDistinctValueByDataset.get(datasetPair.first);
+                Map<RDFNode, Set<Resource>> resourceByDistinctValuesOfSecondDataset = resourcesByDistinctValueByDataset.get(datasetPair.second);
                 for (Resource firstResource : correspondingResourcesByDataset.get(datasetPair.first)) {
                     for (Resource secondResource : correspondingResourcesByDataset.get(datasetPair.second)) {
                         Set<RDFNode> uncoveredValuesOfFirstResource =
-                                getUncoveredValuesOfResource(firstResource, secondResource, resourceByNonDistinctValuesOfFirstDataset, resourceByNonDistinctValuesOfSecondDataset);
+                                getUncoveredValuesOfResource(firstResource, secondResource, resourceByDistinctValuesOfFirstDataset, resourceByDistinctValuesOfSecondDataset);
                         Set<RDFNode> uncoveredValuesOfSecondResource =
-                                getUncoveredValuesOfResource(secondResource, firstResource, resourceByNonDistinctValuesOfSecondDataset, resourceByNonDistinctValuesOfFirstDataset);
+                                getUncoveredValuesOfResource(secondResource, firstResource, resourceByDistinctValuesOfSecondDataset, resourceByDistinctValuesOfFirstDataset);

                         // deviation: a pair of resources with each having a value not present in the
                         // other resource
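
The deviation rule stated in the comment above can be illustrated with a minimal, self-contained Java sketch. It is not the project's implementation: the class `DeviationSketch`, its two methods, and the use of plain strings in place of `RDFNode` values are assumptions made for illustration only. Collecting values into a `Set` mirrors the intent of the change, since a value that occurs more than once then appears only once among the uncovered values.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; names and types are not taken from the project.
class DeviationSketch {

    /** Returns the values of {@code own} that do not occur in {@code other}, each at most once. */
    static Set<String> uncoveredValues(List<String> own, List<String> other) {
        Set<String> uncovered = new HashSet<>(own); // repeated values collapse here
        uncovered.removeAll(new HashSet<>(other));
        return uncovered;
    }

    /** A deviation: each resource has at least one value not present in the other resource. */
    static boolean isDeviation(List<String> first, List<String> second) {
        return !uncoveredValues(first, second).isEmpty()
                && !uncoveredValues(second, first).isEmpty();
    }

    public static void main(String[] args) {
        // "b" occurs twice in the first resource but is reported once.
        System.out.println(uncoveredValues(List.of("b", "b", "a"), List.of("a"))); // prints [b]
        // Each side has a value the other lacks.
        System.out.println(isDeviation(List.of("a", "b"), List.of("a", "c"))); // prints true
        // Only the first side has an uncovered value.
        System.out.println(isDeviation(List.of("a", "b"), List.of("a"))); // prints false
    }
}
```

How the one-sided case is reported as an omission is not visible in the truncated hunk, so the sketch leaves it out.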