MountTag hashing using ":" instead of "\0" separator#5081
Conversation
Using \0 as a separator makes the hash hard to reproduce in environments that do not allow NUL bytes in strings. This affects, for example, NixOS configurations that generate filesystem mount definitions directly rather than using cloud-init. Signed-off-by: Alain Lehmann <alain.lehmann@gmail.com>
| // Both paths are hashed to handle the same location mounted to multiple mount points. | ||
| func MountTag(location, mountPoint string) string { | ||
| sha := sha256.Sum256([]byte(location + "\x00" + mountPoint)) | ||
| sha := sha256.Sum256([]byte(location + ":" + mountPoint)) |
There was a problem hiding this comment.
The hash is expected to be stable across versions, so I don't think we can accept this change.
Why does https://github.com/ciderale/nixos-lima need to compute the hash by itself?
If it is really necessary it may invoke another scripting language such as bash or Python
There was a problem hiding this comment.
Thank you for the quick reply. In NixOS, the aim is to declaratively define the system configuration using the nix language and its module system. My nixos-lima setup can essentially avoid configuration via cloud-init (except for writing the "lima-boot-done" files). Technically, there is a way to call external programs and read them back into the nix configuration, but that should generally be avoided and causes other issues in this MacOS/Linux setup.
My impression was, that the mountTag is an opaque implementation detail as cloud-init gets the tag as configuration and mounts them accordingly. I was more worried, that my implementation now depends on this hashing -- I have a locally patched lima version. Is there a possibility to change that in some (not too far) future version?
Another idea was: Would it make sense to allow to provide a mountTag in the configuration explicitly? The current hash computation would be used if no mountTag was specified. That would be a bit more of a change, but that way, there is no coupling to this hashing.
There was a problem hiding this comment.
I don't quite understand the issue: if you don't use the tag generated by Lima, then it shouldn't matter what kind of tag you calculate in Nix, as long as it is stable. Why do they have to match?
There was a problem hiding this comment.
Lima-vm is "registering" the tags that eventually are "mountable" in the guest -- the tag is kind of the interface.
The NixOS configuration only does the latter. With my lima-nixos configuration, I also generate the lima configuration yaml. However, since the mountTag is currently not configurable in the lima configuration, I have to match the hashing scheme.
With the idea of making the mountTag configurable (in lima.yaml), my nix configuration could define both sides. In that case, there would not be a need to match the hashing.
Making the mountTag (optionally) configurable would be backward compatible, I suppose. The hashing would be used if the configuration has no explicit mountTag. But it is slightly more complex than changing the separator and one still had to use some hashing to configure the mountTag.
May I ask, in which scenario would a change of the hashing (as in this PR) have a noticeable impact? I currently don't see it, but I guess you have a concrete example where it matters.
There was a problem hiding this comment.
The whole point of switching from an index to a hash (c30cc31) was to have a stable mapping across reboots, so fstab entries would remain correct early during the boot, even before cloud-init is being processed. Otherwise you would need 2 reboot cycles when the mounts changed.
Changing the hashing function would break existing instances when you upgrade Lima.
I still don't understand why Nix can't extract the tags from user-data, which should get written even if you don't use cloud-init. You should never need to recompute hash values in the guest; they should be opaque tags. The fact that they are hashes is an implementation detail.
There was a problem hiding this comment.
Let me close this PR, feel free to open a new one for defining the tag in the yaml
There was a problem hiding this comment.
Given that changing the hashing algorithm is not going to break existing instance (they will all remount because all the tags have changed, so there should be no risk of mounting the wrong device), isn't it better not to expose this implementation detail to the end user (well, template author)?
Because now a Nix user would have to manually specify a tag for each mount point.
So unless I misunderstand that changing the hash doesn't break anything, I would prefer the fix from the current PR.
There was a problem hiding this comment.
So unless I misunderstand that changing the hash doesn't break anything
Was it confirmed that it does not leave old entries in fstab?
There was a problem hiding this comment.
In short, my nixos-lima has a unified configuration.nix that generates the lima.yaml configuration and also the NixOS guest configuration.
Sorry, I totally misunderstood that this calculation happens again on the host and not inside the guest.
There was a problem hiding this comment.
Was it confirmed that it does not leave old entries in fstab?
It was not confirmed yet, and I had no chance to try, but it sounded convincing:
- When the hash function changes, all tags change.
- On reboot none of the old tags exist anymore, so all the entries will be removed from
fstab. - During cloud-init all mounts will be re-created using the new tags.
This was a problem before because the tags were re-used and would point to the wrong location. The mounts were not removed during reboot because the tag, e.g. mount1 still existed, it just now pointed somewhere else.
But with a changed hash function, none of the old tags should exist anymore, so fstab should be clean by the time cloud-init starts.
It is like deleting all mounts and adding them all back in a single boot cycle.
Using \0 as a separator makes the hash hard to reproduce in environments that do not allow NUL bytes in strings. This affects, for example, NixOS configurations that generate filesystem mount definitions directly rather than using cloud-init (https://github.com/ciderale/nixos-lima/blob/general-update/modules/lima_mounts.nix#L12). While ":" is a valid file path character, I don't think that this change may cause a problem in practice for the hashed mount tag.