Skip to content

sits_merge() should be commutative for regular raster cubes #1493

@rolfsimoes

Description

@rolfsimoes

Describe the bug

For two regular cubes data1 and data2, the following calls should return the same result:

sits_merge(data1, data2)
sits_merge(data2, data1)

This is not guaranteed by the current .merge.regular_case() logic.

In the regular_case, the function first requires both cubes to have the same tiles. After that, there are three relevant paths:

  1. When both cubes have the same bands, .merge_cube_densify() is used. This path combine the timelines of both cubes. This can be commutative only if duplicate records for the same tile, band, and date are considered equivalent.

  2. When the cubes have different bands and the same timeline, data2 is filtered with: setdiff(.cube_bands(data2), .cube_bands(data1)). This makes data1 take precedence when there are common bands. Therefore, if the two cubes have partially overlapping bands, the result may depend on the order of the arguments unless common bands are considered equivalent.

  3. When the cubes have different bands and (slightly) different timelines, .merge_strategy_intersects(data1, data2) is used. This path is not commutative by construction. It uses the timeline of data1 as the reference timeline, filters both cubes according to overlaps, and rewrites the dates of data2 to match dates from data1. Reversing the arguments produce a different result.

This affects the intended use case described in the documentation for merging regularized Sentinel-1 and Sentinel-2 cubes, where different bands and compatible but different timelines are expected.

Additional context

For regular cubes, sits_merge() could enforce commutativity by making the merge rules symmetric as follows:

  1. For the "same bands" path, duplicated records for the same tile, band, and date should be required to be equivalent. If they are not equivalent, the function should fail instead of silently giving precedence to the first argument.

  2. For the "different bands/same timeline" path, common bands should also be required to be equivalent. If common bands are not equivalent, the function should fail instead of letting data1 take precedence.

  3. For the "different bands/different timelines" path, the zipper assumption should be explicit and symmetric. The timelines should be (1) strictly interleaved and (2) have the same length, so that a pairwise rule can be applied. A commutative reference timeline could then be built with a symmetric rule such as the pairwise minimum date (i.e. pmin(timeline1, timeline2)). This avoids choosing either data1 or data2 as the reference timeline.

Metadata

Metadata

Assignees

Labels

Type

No fields configured for Bug.

Projects

Status

No status

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions