Describe the bug
For two regular cubes data1 and data2, the following calls should return the same result:
sits_merge(data1, data2)
sits_merge(data2, data1)
This is not guaranteed by the current .merge.regular_case() logic.
In the regular_case, the function first requires both cubes to have the same tiles. After that, there are three relevant paths:
-
When both cubes have the same bands, .merge_cube_densify() is used. This path combine the timelines of both cubes. This can be commutative only if duplicate records for the same tile, band, and date are considered equivalent.
-
When the cubes have different bands and the same timeline, data2 is filtered with: setdiff(.cube_bands(data2), .cube_bands(data1)). This makes data1 take precedence when there are common bands. Therefore, if the two cubes have partially overlapping bands, the result may depend on the order of the arguments unless common bands are considered equivalent.
-
When the cubes have different bands and (slightly) different timelines, .merge_strategy_intersects(data1, data2) is used. This path is not commutative by construction. It uses the timeline of data1 as the reference timeline, filters both cubes according to overlaps, and rewrites the dates of data2 to match dates from data1. Reversing the arguments produce a different result.
This affects the intended use case described in the documentation for merging regularized Sentinel-1 and Sentinel-2 cubes, where different bands and compatible but different timelines are expected.
Additional context
For regular cubes, sits_merge() could enforce commutativity by making the merge rules symmetric as follows:
-
For the "same bands" path, duplicated records for the same tile, band, and date should be required to be equivalent. If they are not equivalent, the function should fail instead of silently giving precedence to the first argument.
-
For the "different bands/same timeline" path, common bands should also be required to be equivalent. If common bands are not equivalent, the function should fail instead of letting data1 take precedence.
-
For the "different bands/different timelines" path, the zipper assumption should be explicit and symmetric. The timelines should be (1) strictly interleaved and (2) have the same length, so that a pairwise rule can be applied. A commutative reference timeline could then be built with a symmetric rule such as the pairwise minimum date (i.e. pmin(timeline1, timeline2)). This avoids choosing either data1 or data2 as the reference timeline.
Describe the bug
For two regular cubes
data1anddata2, the following calls should return the same result:This is not guaranteed by the current
.merge.regular_case()logic.In the
regular_case, the function first requires both cubes to have the same tiles. After that, there are three relevant paths:When both cubes have the same bands,
.merge_cube_densify()is used. This path combine the timelines of both cubes. This can be commutative only if duplicate records for the same tile, band, and date are considered equivalent.When the cubes have different bands and the same timeline,
data2is filtered with:setdiff(.cube_bands(data2), .cube_bands(data1)). This makesdata1take precedence when there are common bands. Therefore, if the two cubes have partially overlapping bands, the result may depend on the order of the arguments unless common bands are considered equivalent.When the cubes have different bands and (slightly) different timelines,
.merge_strategy_intersects(data1, data2)is used. This path is not commutative by construction. It uses the timeline ofdata1as the reference timeline, filters both cubes according to overlaps, and rewrites the dates ofdata2to match dates fromdata1. Reversing the arguments produce a different result.This affects the intended use case described in the documentation for merging regularized Sentinel-1 and Sentinel-2 cubes, where different bands and compatible but different timelines are expected.
Additional context
For regular cubes,
sits_merge()could enforce commutativity by making the merge rules symmetric as follows:For the "same bands" path, duplicated records for the same tile, band, and date should be required to be equivalent. If they are not equivalent, the function should fail instead of silently giving precedence to the first argument.
For the "different bands/same timeline" path, common bands should also be required to be equivalent. If common bands are not equivalent, the function should fail instead of letting
data1take precedence.For the "different bands/different timelines" path, the zipper assumption should be explicit and symmetric. The timelines should be (1) strictly interleaved and (2) have the same length, so that a pairwise rule can be applied. A commutative reference timeline could then be built with a symmetric rule such as the pairwise minimum date (i.e.
pmin(timeline1, timeline2)). This avoids choosing eitherdata1ordata2as the reference timeline.