Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support non-UTF8 key comments? #331

Open
Eugeny opened this issue Jan 23, 2025 · 3 comments
Open

Support non-UTF8 key comments? #331

Eugeny opened this issue Jan 23, 2025 · 3 comments

Comments

@Eugeny
Copy link
Contributor

Eugeny commented Jan 23, 2025

Technically the OpenSSH key comment does not have to be in UTF-8. Currently, ssh-key will always fail to parse such keys with no way around it. Would you consider changing the API to use Vec<u8> for the comment instead?

I've already hadmultiple in-the-wild keys reported by Tabby users so far.

Example:

-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAAAMwAAAAtzc2gtZW
QyNTUxOQAAACD2TvUAi4NJBWXBvDkzbLiIjgjeMzhTz9cUleJn2zRWKAAAAJi1GedGtRnn
RgAAAAtzc2gtZWQyNTUxOQAAACD2TvUAi4NJBWXBvDkzbLiIjgjeMzhTz9cUleJn2zRWKA
AAAEDBi49uVXZzDhN8JohiYkBdezFWbCAw6iCS2JRA4J0ujfZO9QCLg0kFZcG8OTNsuIiO
CN4zOFPP1xSV4mfbNFYoAAAAFHN0YXJfQLLcyPHI8cjxtcS158TUAQ==
-----END OPENSSH PRIVATE KEY-----
@tarcieri
Copy link
Member

The relevant protocol spec is here: https://github.com/openssh/openssh-portable/blob/2d2c068/PROTOCOL.key#L44

It's a string field, so really this is a question of how the ssh-encoding crate handles all string fields everywhere.

The encoding of string fields is specified in RFC 4251 § 5: Data Type Representations Used in the SSH Protocols:

 Arbitrary length binary string.  Strings are allowed to contain
 arbitrary binary data, including null characters and 8-bit
 characters.  They are stored as a uint32 containing its length
 (number of bytes that follow) and zero (= empty string) or more
 bytes that are the value of the string.  Terminating null
 characters are not used.

 Strings are also used to store text.  In that case, US-ASCII is
 used for internal names, and ISO-10646 UTF-8 for text that might
 be displayed to the user.  The terminating null character SHOULD
 NOT normally be stored in the string.  For example: the US-ASCII
 string "testing" is represented as 00 00 00 07 t e s t i n g.  The
 UTF-8 mapping does not alter the encoding of US-ASCII characters.

It does specify they can hold arbitrary binary data, so to fix this properly we would need to stop using String and str entirely and switch to a newtype wrapper (e.g. SshString) which internally uses something like Vec<u8> as you suggest and make all conversions from that type to String/str fallible.

I'll just say for now that isn't something I have time to work on, and there is already quite a backlog of unreviewed PRs on this repo. If it's a change you are interested in seeing through, open a PR but please note I probably won't have time to review it for awhile.

@Eugeny
Copy link
Contributor Author

Eugeny commented Jan 23, 2025

Thanks and no worries - I just wanted to know if you also see this as an acceptable API change.

@tarcieri
Copy link
Member

tarcieri commented Jan 23, 2025

It's an unfortunate/annoying API change but probably for the best to honor how strings are described in RFC4251.

You can perhaps look to types like std::fs::Path(Buf) for API ides, notably something like display as a way to make the strings relatively easily printable as UTF-8 strings (which is also what is described in RFC4251).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants