We similarly propose a `UTF8Span.CharacterIterator` type that can do grapheme-breaking forwards and backwards.
The `CharacterIterator` assumes that the start and end of the `UTF8Span` is the start and end of content.

- Any scalar-aligned position is a valid place to start or reset the grapheme-breaking algorithm to, though you could get different `Character` output if if resetting to a position that isn't `Character`-aligned relative to the start of the `UTF8Span` (e.g. in the middle of a series of regional indicators).
+ Any scalar-aligned position is a valid place to start or reset the grapheme-breaking algorithm to, though you could get different `Character` output if resetting to a position that isn't `Character`-aligned relative to the start of the `UTF8Span` (e.g. in the middle of a series of regional indicators).
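
To make the regional-indicator caveat concrete, here is a minimal sketch that uses only standard `String` APIs rather than the proposed `CharacterIterator`. The helper `characters(startingAt:)` is a hypothetical name introduced for illustration; it models "resetting" by running grapheme breaking over the scalars from a chosen starting point, which is an assumption about how the two situations compare, not the proposal's implementation.

```swift
// Four regional-indicator scalars: U+1F1FA U+1F1F8 (the "US" flag pair)
// followed by U+1F1E8 U+1F1E6 (the "CA" flag pair).
let scalars: [Unicode.Scalar] = ["\u{1F1FA}", "\u{1F1F8}", "\u{1F1E8}", "\u{1F1E6}"]

/// Hypothetical helper: builds a String from the given scalars and splits it
/// into Characters, i.e. runs grapheme breaking from the first scalar given.
func characters<S: Sequence>(startingAt scalars: S) -> [Character]
  where S.Element == Unicode.Scalar {
  var view = String.UnicodeScalarView()
  view.append(contentsOf: scalars)
  return Array(String(view))
}

// Grapheme breaking from the true start of content pairs the indicators
// as (1F1FA 1F1F8) (1F1E8 1F1E6).
let fromStart = characters(startingAt: scalars)

// Starting one scalar later is still scalar-aligned, but the pairing shifts
// to (1F1F8 1F1E8) (1F1E6).
let fromSecondScalar = characters(startingAt: scalars.dropFirst())

for character in fromStart + fromSecondScalar {
  // Print each Character as its list of scalar values in hex.
  print(character.unicodeScalars.map { String($0.value, radix: 16) })
}
```

Both starting points are scalar-aligned, yet they group the same scalars into different `Character`s, which is why a position that is not `Character`-aligned relative to the start of the span can change the output.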
```swift
@@ -357,8 +333,9 @@ extension UTF8Span {
  public struct CharacterIterator: ~Escapable {
    public let codeUnits: UTF8Span

-   /// The byte offset of the start of the next `Character`. This is
-   /// always scalar-aligned and `Character`-aligned.
+   /// The byte offset of the start of the next `Character`. This is always
+   /// scalar-aligned. It is always `Character`-aligned relative to the last
+   /// call to `reset` (or the start of the span if not called).
    public var currentCodeUnitOffset: Int { get private(set) }

    public init(_ span: UTF8Span)
@@ -827,23 +804,5 @@ Finally, in the future there will likely be some kind of `Container` protocol fo
Karoy Lorentey, Karl, Geordie_J, and fclout, contributed to this proposal with their clarifying questions and discussions.

- <!--
-
- Pending questions:
-
- 1) How should we talk about `_countAndFlags` and the frozenness of `UTF8Span` and its stored properties?
-
- We want to be able to communicate to SE what the type is and how it could evolve.
-
- Basically, I want to say that this is a trivial 2-word struct whose lifetime is statically managed. Trivial 2-word comes from `@frozen` and listing its stored members in the proposal and statically managed comes from mentioning the `:~Escapable`. This is similar to how the `Span` proposal specified both `@frozen` and the stored members (it did omit `@usableFromInline`).
-
- If we are going to talk about the layout in the proposal, then the next question is whether it makes some sense to talk about the custom hand-coded bit interpretation for some of that layout. It is very much ABI and it shows potential evolution directions and constraints. I could see arguments either way.
-
- 2) Should we have a public unsafe unchecked initializer that skips UTF-8 validation?
-
- We'd want the developer to be very sure that it is in fact valid UTF-8. For example, Rust has `from_utf8_unchecked()`.