Swift lovers ‐ maybe it's time we gave up array subscripting
People who are new to Swift and don’t come from a C/C++ background, once they have got their heads around the idea of a “safe by design” high-performance language, often ask about array subscripting. After all, it is one of the first places where it’s still easy to make a program crash.
let myArray = [1,2,3]
print(myArray[3]) // BOOM !
And this isn’t abstract. Any experienced Swift programmer has done this in some way or another.
Why doesn’t the “indexing” style subscript we are all used to return an optional? If it had done so from day one, everyone would be used to it and a lot of crashes would have been avoided. Sure, there would have been a learning curve for people coming from C, but people from other languages may well have found it natural. There would have been some performance downsides too, but I’m sure the geniuses who built this compiler and standard library would have minimised them (and I genuinely mean geniuses, having talked to a few of them).
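To make the idea concrete, here is a minimal sketch of an optional-returning subscript, written as an extension in standard ("big platform") Swift. The `safe:` argument label is a name chosen purely for illustration; it is not part of the standard library, and this is not how the built-in subscript works.

```swift
extension Collection {
    /// Hypothetical optional-returning subscript (not standard library API).
    /// Returns nil instead of trapping when `index` is out of bounds.
    subscript(safe index: Index) -> Element? {
        return indices.contains(index) ? self[index] : nil
    }
}

let numbers = [1, 2, 3]
let inRange = numbers[safe: 1]     // an optional wrapping the element at index 1
let outOfRange = numbers[safe: 3]  // nil, where numbers[3] would trap
```

The cost is visible at every call site: each access now has to be unwrapped, which is part of the trade-off discussed in the rest of this post.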
An argumentative soul might even go as far as to say “how can you call this language safe, when it has such a glaring safety hole in it that makes it so easy to accidentally crash programs??”
And I don’t think we should as a community ignore such questions or shy away from them.
I can’t speak for the Swift core people; they have undoubtedly written far more elegant and deeper pieces on this subject elsewhere (links and references to content by the core team and major contributors would be gratefully received, so please add them in the comments).
So it’s just my 2c…
Firstly, there’s a distinction between superficial safety and true safety, and I think it’s important.
Secondly, and somewhat controversially, I’d argue that array indexing is essentially a legacy feature, ideally to be avoided. That’s where the second part of this blog comes in, and where I’ll get into some fun code examples. :)
What do I mean regarding safety? I think this is philosophically crucial to understand when you’re learning Swift more deeply: why it is how it is, what certain trade-offs are about and (most importantly) how you, as a budding senior Swift engineer and community contributor (library writer, blogger, etc.), can add to that wall of safety in your own small ways.
The key thing, I think, is predictability and, in a way, consent. You know what you’re trying to achieve and you express it in the Swift language. If Swift decides it’s “not sure what you mean”, or that it “can’t give you what you asked for”, it treats crashing the program as safer than continuing in an undefined way with its “best guess”. This is really crucial to understand, and to compare with C-like languages. There are many far better blog pieces (and videos) about the dangers of undefined behaviour in C. Suffice to say that, along with "bad pointer stuff", undefined behaviour is probably the single biggest cause of security holes and difficult-to-diagnose issues in programs written in the C language family. Crucially, it affects even simple programs. Everyone knows how dangerous multithreaded races can be, but many programmers simply avoid threading except where essential. You cannot avoid undefined behaviour that way in many of these languages. It’s always there, lurking.
So I think the designers in the end scratched their heads and said “we have compatibility and performance requirements that make optional return values non-ideal… we feel we have no choice but to make array indexing return non-optional values, so the most important thing we can do to keep safety is to assert on array index out of bounds”. It’s one of those questions that will go around forever.
I’d argue, in a way, the question for them was “given that legacy constraints mean we still have to support array indexing/subscripting in the language, keep it performant and avoid optional returns, how do we give the best experience across all these competing design goals?”
So they settled on a bounds check for all array indexing, and program termination / trap on array index out of bounds.
Also note: performance purists will quietly think, "hang on, you are checking array bounds every time I subscript an array, and I can't opt out of that? That sounds like a performance issue. Do I sometimes need to disable it or work around it?" Luckily the super clever Swift folks (see above) thought of that. The compiler elides the bounds check wherever it can prove the index is in range. In many cases no actual check runs in the generated code, but the compiler guarantees there will be one whenever it is needed, so in normal debug and release builds every array subscript is conceptually bounds checked and you can rely on it (the -Ounchecked build mode is the deliberate exception). The best of both worlds!
Of course, if the check fails, the program crashes and there's no opportunity for the developer to "save the situation": no error thrown, no last chances. So you have crashes that frustrate us all and bounds checks that some performance enthusiasts would rather avoid, all due to supporting the (arguably) legacy language feature of array indexing/subscripting...
Which brings me neatly onto my next point…
Yes, I’m serious. Hear me out. 😄
As ever, I think the smart thing to do is to turn it on its head and say “what do you actually need array indexing for?” Or equivalently, “if it was removed from the language tomorrow, what would be impossible or hard to do?”
Because really, if the answer was “uh… nothing I guess”, then why not just remove it, or deprecate it, or add some great big warnings saying “don’t do this in your programs”? We’ve done that in a lot of places for a lot of things over the years. There should be no sacred cows. Many people have butted heads with definite initialisation and (after possible occasional quiet cussing) learned to love it for what it gives you.
And I don’t think valid answers are “because people just expect it” or “because everyone does it don’t they?”
To answer the “why”, I tried to think of a few examples where you’d naturally reach for array indexing and where it might not be immediately obvious to a newer Swift developer what the alternative would be.
I came up with three scenarios. I’m very interested to hear if there are more (feel free to comment and link again!)
Note: this is a Swift for Arduino program rather than a regular Swift playground, so there may be some bits that people new to our platform don’t get. Also, for people who landed on this blog and aren’t interested in writing Swift on embedded microcontrollers, it would be cool if someone converted this to a regular big platform/“full fat” version one day, to get wider feedback on the ideas.
import AVR
// data
// explicit UInt8 element type so elements can be passed to activateFeature(feature:)
let myArray: [UInt8] = [0,19,34,112,2,1,0]
let display = ["red","blue","red","red","blue","blue","blue"]
// helper functions
func activateFeature(feature: UInt8) {
    // some action here, probably in a switch statement
}
// crude debounced pin read: calls `action` once per button press
// and stops as soon as `action` returns false
func debouncedPinActions(pin: Pin, action: () -> Bool) {
    while true {
        // read the pin that was passed in, rather than a hard-coded D4
        if digitalRead(pin: pin) {
            // wait until button released
            while digitalRead(pin: pin) {}
            let success = action()
            if !success { break }
            // crude debounce
            delay(us: 300)
        }
    }
}
// *** CASE 1 ***
// not preferred (uses subscripting):
for i in 0..<myArray.count {
    let element = myArray[i]
    print("element is: \(element)")
}
// preferred (no subscripting):
for element in myArray {
    print("element is: \(element)")
}
// *** CASE 2 ***
// not preferred (uses subscripting):
for i in 0..<myArray.count {
    let element = myArray[i]
    let displayColor = display[i]
    print("element: \(element) has color: \(displayColor)")
}
// preferred (no subscripting):
for (element, displayColor) in zip(myArray,display) {
    print("element: \(element) has color: \(displayColor)")
}
// *** CASE 3 ***
// not preferred (uses subscripting):
var currentFeature = 0
debouncedPinActions(pin: D4) {
    // currentFeature is already an Int, so no conversion is needed
    let element = myArray[currentFeature]
    activateFeature(feature: element)
    currentFeature += 1
    return currentFeature < myArray.count
}
// preferred (no subscripting):
var iterator = myArray.makeIterator()
debouncedPinActions(pin: D4) {
    if let element = iterator.next() {
        activateFeature(feature: element)
        return true
    } else {
        // we have exhausted all the features: the user had already pressed
        // the button 7 times, and this is the press after that
        return false
    }
}
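For readers on the big platforms, here is a rough sketch of how the iterator approach from Case 3 might look in standard Swift with no hardware attached. `simulatePresses` is a hypothetical stand-in for the debounced button loop (it just calls the action once per pretend press), and the `activated` array stands in for whatever activating a feature would really do.

```swift
// Hypothetical stand-in for the hardware loop: calls `action` once per
// simulated button press and stops as soon as `action` returns false.
// Returns how many presses were consumed.
func simulatePresses(_ presses: Int, action: () -> Bool) -> Int {
    var used = 0
    for _ in 0..<presses {
        used += 1
        if !action() { break }
    }
    return used
}

let features: [UInt8] = [0, 19, 34, 112, 2, 1, 0]
var activated: [UInt8] = []  // records what would have been activated

// The iterator remembers the position, so no index variable or subscript is needed.
var iterator = features.makeIterator()
let pressesUsed = simulatePresses(10) {
    guard let feature = iterator.next() else {
        return false  // nothing left to activate
    }
    activated.append(feature)
    return true
}
```

One design point worth noticing: the iterator version only discovers that it is finished on the press after the last element, whereas an index comparison can stop immediately after the last one. Whether that matters depends on the program.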
I hope y’all find this discussion at least interesting and helpful. I came across this issue, of course, as a Swift for Arduino platform team member. As with some other things, what is a nuisance on “the big platforms” becomes a downright existential crisis (no compiler pun intended) on our embedded platforms. You really don’t want your servo motor controller program to randomly crash or hang.
I tried adding optional indexing to our “Microswift standard library” but in the end it was just too big a change and didn’t work properly with the compiler. (Our micro stdlib actually takes a subtly different compromise: out-of-bounds indexing on arrays is clamped to the upper/lower bounds. On our platform the code snippet at the start of this blog will not crash but return 3… and yes, it’s controversial!)
So this set of ideas started as a pragmatic question: “how do we get away from this problem altogether, forever?” Or maybe as “what might Doug or Dave or Chris do??” ;)
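As an illustration of what clamping means, here is a small sketch in standard Swift. It is not the actual micro stdlib implementation, which changes the behaviour of the built-in subscript itself; `clamped(at:)` is a made-up method name, and returning nil for an empty array is an assumption made here because there is nothing to clamp to.

```swift
extension Array {
    /// Hypothetical clamping accessor, for illustration only.
    /// Out-of-range indices are pulled back to the nearest valid index.
    func clamped(at index: Int) -> Element? {
        guard let lastIndex = indices.last else { return nil }  // empty array
        // Swift.min/Swift.max are spelled out to avoid clashing with
        // Array's own min()/max() members inside this extension.
        let clampedIndex = Swift.min(Swift.max(index, 0), lastIndex)
        return self[clampedIndex]
    }
}

let sample = [1, 2, 3]
let pastTheEnd = sample.clamped(at: 3)       // clamps to the last element
let beforeTheStart = sample.clamped(at: -5)  // clamps to the first element
```

The controversial part is easy to see here: the caller gets a perfectly valid-looking value back, with no signal that the index was wrong.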