Merge Data merges columns I don't want merged

This is the first nasty thing of the type I feared when removing `Variable.make`.

File -> kMeans -> Merge Data. Connect another File -> kMeans to the same Merge Data. In Merge Data, set the key to "Row Index". On the output of Merge Data I want the heart_disease data with two additional columns of cluster labels, so that I can compare the clusterings.

- Before #4076, I added an Edit Domain to rename the column in one of the tables to "Clusters 2".
- After #4076, I don't use Edit Domain and I expected to get columns "Clusters (1)" and "Clusters (2)". This does not happen, however, because attributes are now matched by name and type, so Merge Data believes that both tables have the same attribute "Clusters" and doesn't duplicate it. It takes the column from the first table and ignores the one from the second.

If we decide that, no, it should keep two columns, it would also duplicate all other columns (age, max HR...).

Options:

1. ~~Revert removal of `Variable.make`.~~
2. Do nothing. The user has to rename the columns with duplicated names.
3. Check whether the columns with the same names have the same data. If so, keep a single column (and show info?). If they differ, use both columns with renaming as introduced in #4076.
4. Check whether the columns with the same names have the same data. If so, keep a single one. If not, show an error and let the user do the renaming.

My 4 cents:

1. No. This problem is small in comparison with those caused by `Variable.make`. Something much worse must happen to reintroduce it.
2. No. The user needs to know why (s)he doesn't have two columns.
3. Yes.
4. Probably no. I see no good reason for it. Let us not annoy the user if the widget can do the job reasonably well. The user can still rename the columns if (s)he wants more informative names. Besides, option 3 is already almost implemented in #4076; we just need to add the check that compares the columns. If we go for 4, we'd discard #4076, which would be a shame.
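
To make option 3 concrete, here is a rough sketch of the rule in plain Python. It is not Orange code: `merge_columns` is a hypothetical helper, tables are modelled as dicts mapping column names to lists of values, and rows are assumed to be already aligned (as with the "Row Index" key). The "(1)"/"(2)" suffixes follow the names mentioned above; the actual renaming in #4076 may differ.

```python
def merge_columns(left, right):
    """Merge two row-aligned tables given as {column name: list of values}.

    Hypothetical illustration of option 3, not Orange's API:
    - same name, same data      -> keep a single column
    - same name, different data -> keep both, renamed "name (1)" / "name (2)"
    Collisions with already existing "name (1)" columns are not handled,
    and missing values stored as NaN would compare as different.
    """
    # Columns present in both tables whose values differ.
    conflicting = {
        name for name in left
        if name in right and left[name] != right[name]
    }

    merged = {}
    for name, values in left.items():
        new_name = f"{name} (1)" if name in conflicting else name
        merged[new_name] = values
    for name, values in right.items():
        if name in conflicting:
            merged[f"{name} (2)"] = values
        elif name not in left:
            merged[name] = values
        # Otherwise: same name and same data, already taken from `left`.
    return merged


# Two clusterings of the same data: "age" is identical, "Clusters" differs.
table_a = {"age": [63, 67, 41], "Clusters": ["C1", "C2", "C1"]}
table_b = {"age": [63, 67, 41], "Clusters": ["C2", "C2", "C1"]}
merged = merge_columns(table_a, table_b)
print(list(merged))
```

With this rule, identical columns such as age are not duplicated, while the two differing cluster columns both survive under distinct names.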

This problem was not caused by #4076. #4076 just didn't (and couldn't) fix it.

Merge Data merges columns I don't want merged #4077
