Clarify UTF encoding between C# strings and Godot Strings #10920
skyace65 merged 3 commits into godotengine:master
…coding Clarified that C# System.String uses UTF-16 encoding while Godot String uses UTF-32.
…dot-String-UTF-encoding Update c_sharp_differences.rst with C# string and Godot String UTF encoding
Revising for grammatical fixes in changes. Co-authored-by: A Thousand Ships <96648715+AThousandShips@users.noreply.github.com>
raulsntos left a comment
Thanks for contributing to the C# documentation. I think we had in mind something more extensive that clarifies why this can be a problem.
Normally, the encoding is not a problem since we convert between C# strings and Godot strings automatically, so ideally users wouldn't even need to think about it. So mentioning the encoding difference would not be important if that's all there is to say.
We wanted to add a note about this in the documentation because of the problems that it may cause in some APIs. The example given in the discussion from #7612 was TextServer::string_get_word_breaks. This API breaks the text into words and returns an array of character indices, but these indices will be wrong for C# strings in some cases.
For example, for the string "ℌ𝔢𝔩𝔩𝔬 𝔚𝔬𝔯𝔩𝔡" the returned array would be [0, 5, 6, 11]. But those indices don't correspond to positions in the C# string, because a character may take more than a single UTF-16 code unit (here, every Fraktur letter except "ℌ" is encoded as a surrogate pair). In C# the indices should be [0, 9, 10, 20], or you should use System.Text.Rune instead.
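To make the index mismatch concrete, here is a minimal sketch of how code point indices like the ones quoted above could be mapped to UTF-16 indices in C#. The helper name `CodePointToUtf16Indices` is hypothetical and not part of Godot's API, and the input array is hard-coded from the comment rather than obtained from `TextServer`. It assumes a runtime where `string.EnumerateRunes()` and `System.Text.Rune` are available (.NET Core 3.0 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not a Godot API): converts code point indices,
// as a UTF-32 string would report them, into UTF-16 code unit indices
// that can be used to index or slice a C# string.
static int[] CodePointToUtf16Indices(string text, IReadOnlyList<int> codePointIndices)
{
    // offsets[i] is the UTF-16 index where the i-th code point starts.
    // The final entry equals text.Length so end-exclusive indices map too.
    var offsets = new List<int>();
    int utf16Index = 0;
    foreach (Rune rune in text.EnumerateRunes())
    {
        offsets.Add(utf16Index);
        utf16Index += rune.Utf16SequenceLength; // 1 for BMP, 2 for surrogate pairs
    }
    offsets.Add(utf16Index);

    var result = new int[codePointIndices.Count];
    for (int i = 0; i < result.Length; i++)
    {
        result[i] = offsets[codePointIndices[i]];
    }
    return result;
}

string text = "ℌ𝔢𝔩𝔩𝔬 𝔚𝔬𝔯𝔩𝔡";
int[] godotIndices = { 0, 5, 6, 11 }; // values quoted in the comment above
int[] csharpIndices = CodePointToUtf16Indices(text, godotIndices);

Console.WriteLine(string.Join(", ", csharpIndices)); // 0, 9, 10, 20
Console.WriteLine(text.Substring(csharpIndices[0], csharpIndices[1] - csharpIndices[0])); // ℌ𝔢𝔩𝔩𝔬
```

Building the offset table once keeps each lookup constant-time, which matters if an API returns many indices for a long string.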
Thanks! And congrats on your first merged PR!
Fix #7682
Updated c_sharp_differences.rst to include the difference between the UTF encoding for C# strings and Godot Strings.