Merge Data merges columns I don't want merged #4077

@janezd

This is the first nasty thing of the type I feared when removing Variable.make.

To reproduce: File -> kMeans -> Merge Data. Connect another File -> kMeans to the same Merge Data. In Merge Data, set the key to "Row Index". On the output of Merge Data, I want the heart_disease data with two additional columns of cluster labels, so I can compare the clusterings.

If we decide that it should instead keep both columns, it would also duplicate all the other columns (age, max HR...).

Options:

  1. Revert removal of Variable.make.
  2. Do nothing. The user has to rename the columns with duplicated names.
  3. Check whether the columns with the same names have the same data. If so, keep a single column (and show info?). If they differ, keep both columns, renamed as introduced in [FIX] Merge data: Rename variables with duplicated names #4076.
  4. Check whether the columns with the same names have the same data. If so, keep a single one. If not, show an error and let the user do the renaming.

My 4 cents:

  1. No. This problem is small in comparison with those caused by Variable.make. Something much worse must happen before we reintroduce it.
  2. No. The user needs to know why (s)he doesn't have two columns.
  3. Yes.
  4. Probably no. I see no good reason for it. Let us not annoy the user if the widget can do the job reasonably well. The user can still rename the columns if (s)he wants more informative names. Besides, option 3 is already almost implemented in [FIX] Merge data: Rename variables with duplicated names #4076; we just need to add the check on the columns. If we go for 4, we'd discard [FIX] Merge data: Rename variables with duplicated names #4076, which would be a shame.
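
To make option 3 concrete, here is a minimal sketch of the rule in plain Python. It is not Orange code: `merge_columns` is a hypothetical helper, columns are modelled as simple name -> list mappings with rows already aligned, and the " (1)" / " (2)" suffixes are only an assumed renaming scheme, not necessarily what #4076 does. It also ignores missing values and possible clashes with names that already carry such a suffix.

```python
def merge_columns(left, right):
    """Merge two {name: values} mappings whose rows are already aligned.

    Same name, same data      -> keep a single column.
    Same name, different data -> keep both, renamed "name (1)" / "name (2)".
    (Hypothetical helper; the suffix scheme is an assumption.)
    """
    merged = {}
    # Columns from the first table; rename only on a real conflict.
    for name, values in left.items():
        if name in right and list(values) != list(right[name]):
            merged[f"{name} (1)"] = list(values)
        else:
            merged[name] = list(values)
    # Columns from the second table; skip exact duplicates.
    for name, values in right.items():
        if name not in left:
            merged[name] = list(values)
        elif list(left[name]) != list(values):
            merged[f"{name} (2)"] = list(values)
    return merged


# Two clusterings of the same rows: "age" is identical, "Cluster" differs.
first = {"age": [63, 67, 41], "Cluster": ["C1", "C2", "C1"]}
second = {"age": [63, 67, 41], "Cluster": ["C2", "C2", "C1"]}
result = merge_columns(first, second)
print(list(result))  # ['age', 'Cluster (1)', 'Cluster (2)']
```

With this rule, age, max HR and the like stay single, while the two cluster columns both survive under distinct names.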

This problem was not caused by #4076. #4076 just didn't fix it (and couldn't have).
