-
-
Notifications
You must be signed in to change notification settings - Fork 10
/
Copy path04_concurrency.slide
2008 lines (1034 loc) · 79 KB
/
04_concurrency.slide
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
Introduction to Go
4 - Concurrency: goroutines, channels, select, mutexes
Julien Cretel
https://jub0bs.com
https://bsky.app/profile/jub0bs.com
* Namecheck project: the need for concurrency
In a realistic Namecheck project, you'd add support for many other platforms (X, Stack Overflow, etc.), but that takes time. Instead, let's simulate support for many platforms.
1. In `main.go`, judiciously populate your `[]Checker` slice with twenty pointers to the same `github.GitHub` variable.
2. Similarly, populate your slice of checkers with twenty pointers to the same `bluesky.Bluesky` instance.
3. Run your executable. Is it fast enough?
Since we only care about receiving all the responses, not the order in which we receive them, we could treat platforms, not sequentially, but concurrently!
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/superhero/gotham.svg 150 _
* What is concurrency?
Concurrency is the art of composing a program in terms of independent computations.
Go has native support (i.e. in the language and runtime) for concurrency.
Go's concurrency model arguably is one of the language's most attractive features.
.image https://go.dev/talks/2012/waza/gophersimple2.jpg 200 _
* A word of warning about concurrency
In the remainder of this course, you'll realize that concurrent programs are harder to reason about than sequential ones.
Concurrent programs also have many pitfalls, including (but not limited to) *synchronization*bugs*, *goroutine*leaks*, *deadlocks*, and *unbounded*parallelism*.
People claim that Go is easy to master, but what's deceptively easy is to write subtly broken concurrent programs that only have a vague air of correctness.
Don't rush into concurrency. Instead, invest time upfront to gain a deep understanding of it; this initial investment will pay dividends over and over.
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 200 _
: starting a goroutine is deceptively simple: three keystrokes; https://www.youtube.com/watch?v=rFejpH_tAHM&t=880s
* Concurrency metaphors
Disclaimer: many cooking metaphors ahead!
.image https://homemaking.com/wp-content/uploads/2020/04/cooking.jpg 400 _
* Metaphor: a cheeseburger recipe
You can describe a recipe in a sequential manner or in a _concurrent_ manner.
.image img/burger.svg _ 700
If you've got help in the kitchen, you can _parallelize_ and finish the dish faster.
* Concurrency is not parallelism
*Concurrency*
- the art of structuring a program in terms of *independent* computations
- a property of the program's *code*
- fundamentally about program design
*Parallelism*
- the *simultaneous* execution of multiple computations on multiple cores
- a property of a program's *execution*
- motivated by performance needs
.image https://go.dev/blog/store/gophers.jpg 150 _
* Concurrency is valuable regardless of parallelism
Concurrency enables scalability through parallelism, but [[https://www.youtube.com/watch?v=jgVhBThJdXc&t=1925s][parallelism isn't the goal]].
[[https://www.youtube.com/watch?v=jgVhBThJdXc&t=1278s][Writing concurrent code is a valuable act of design]] regardless of whether the program will eventually be run in parallel.
And if well-designed concurrent code _does_ run in parallel, we can reasonably expect performance to scale gracefully as the number of cores increase.
Here are some good conference talks that delve into the useful distinction between concurrency and parallelism:
- [[https://www.youtube.com/watch?v=jgVhBThJdXc&t=22m][Rob Pike & Russ Cox - Go Programming (Google I/O 2010)]]
- [[https://www.youtube.com/watch?v=oV9rvDllKEg][Rob Pike - Concurrency is not Parallelism (2012)]]
- [[https://www.youtube.com/watch?v=PAAkCSZUG1c&t=3m42s][Rob Pike - Go proverbs (Gopherfest 2015)]]
- [[https://research.swtch.com/pcdata][Russ Cox- Storing Data in Control Flow]]
: metaphor: cooking recipe typically has a concurrent component, but whether you can prepare it in parallel depends on the number of cooks/hobs you have!
* Amdahl's law: the limit of parallelism
[[https://en.wikipedia.org/wiki/Amdahl%27s_law][Amdahl's law]] tells us that the potential performance gains enabled by parallelism are bounded by the necessarily sequential portion of the work.
For instance, if at most 90% of the work can be parallelized, parallelism will allow you to carry out the work at most 10x as quickly.
.image img/amdahls_law.png 200 _
Parallelism can only get you so far; but there's more to this story...
: https://www.wolframalpha.com/input?i=plot+1%2F%28%281-0.9%29%2B0.9%2Fx%29%2C+x%3D0..80
* Gunther's Law: the diminishing returns of parallelism
The [[https://en.wikipedia.org/wiki/Neil_J._Gunther#Universal_Law_of_Computational_Scalability][universal law of computational scalability]] (also simply known as Gunther's law) refines Amdahl's law.
It tells us that, due to various forms of overhead, there is an inflexion point past which more parallelism actually leads to diminishing returns and _harms_ performance:
.image img/gunthers_law.png 200 _
Keep Gunther's law in mind. It will help you decide whether throwing more parallelism at the problem is a promising avenue for improving performance.
: overhead linked to communication between goroutines / context switching / coherency / interprocess communication
: https://www.wolframalpha.com/input?i=plot+x%2F%281%2B0.047*%28x-1%29%2B+0.021*x*%28x-1%29%29%2C+x%3D0..50
: values taken from https://speakerdeck.com/drqz/applying-the-universal-scalability-law-to-distributed-systems?slide=59
* Go's concurrency model
Go's approach to concurrent programming is largely inspired by [[https://dl.acm.org/doi/10.1145/359576.359585][Tony Hoare's _Communicating_Sequential_Processes_ (1978)]], a seminal paper ahead of its time.
In CSP, concurrent programs are structured as independent processes that execute sequentially and communicate by passing messages.
.image https://go.dev/talks/2012/waza/gophersimple2.jpg 150 _
[[https://www.youtube.com/watch?v=rFejpH_tAHM&t=880s][The three fundamental mechanisms for concurrency in Go]] are
- goroutines: "lightweight threads" managed by the Go runtime
- channels: bidirectional typed pipes used to for communication between goroutines
- the `select` statement: a construct for waiting on multiple channel communications
: the concept of channel is only present in the book version of CSP, not in the original article
: select stems from Dijkstra's guarded command: https://en.wikipedia.org/wiki/Guarded_Command_Language
: Rob Pike likens select to a railway dispatcher: https://www.youtube.com/watch?v=iTrP_EmGNmw&t=38m55s
* Goroutines
A goroutine is [[https://cacm.acm.org/magazines/2022/5/260357-the-go-programming-language-and-environment/fulltext#body-6][a concurrent unit of work]].
In contrast to traditional threads, goroutines are lightweight: Go programs that have 1000s of goroutines (or more) are common.
The `main` function is itself a goroutine that is automatically created and started by the runtime when the program is executed.
All goroutines execute in the same address space.
Each goroutine gets its own stack, which starts small and grows/shrinks as required.
.image https://go.dev/blog/store/gophers.jpg 200 _
: goroutines != futures/promises/async
: doesn't return anything that will eventually contain the results
* The Go scheduler
The _Go_scheduler_ is a part of the Go runtime that automatically [[https://cacm.acm.org/magazines/2022/5/260357-the-go-programming-language-and-environment/fulltext#body-6][multiplexes goroutines onto operating-system threads]] and manages their execution for you.
It continually observes the run-time behavior of goroutines and decides when to suspend (e.g. as they wait for a network response) and when to resume them.
No need to worry about the intricacies of the Go scheduler yet. For now, simply picture the Go scheduler as a kitchen chef, OS threads as kitchen assistants, and goroutines as food orders.
.image https://rrsg.s3.amazonaws.com/wp-content/uploads/2020/03/25180622/90517519_1171371749860657_3506201698723816143_n.jpg 250 _
: goroutines: user-space / green threads
: goroutines are non-preemptive: they cannot be interrupted
: switching goroutines is much cheaper than switching OS threads
: also OS threads typically take 1Mo of memory compared to 2kb
: OS threads execute on logical processors
* The go keyword
The `go` keyword spawns a new goroutine:
go grindCoffeeBeans(nbBeans)
go frothMilk()
It causes the function call to which it applies to get executed "in the background".
Some important remarks:
- The function's arguments (if any) are evaluated _before_ the goroutine starts.
- `go` statements do not block; they _immediately_ (after all arguments have been evaluated) yield control to the next statement.
- The function's results (if any) are lost, but channels can be used to communicate values between goroutines.
* Starting an anonymous function as a goroutine
Launching an anonymous function as a goroutine is sometimes convenient for grouping statements that should be executed in the background:
go func() {
fmt.Println("Don't mind me!")
fmt.Println("I'm just running in the background...")
}()
Just don't forget to invoke it!
go func() {
fmt.Println("Don't mind me!")
fmt.Println("I'm just running in the background...")
} // missing parentheses
: goroutine terminology: spawning/spawned, parent/child
* Goroutines vs. Unix processes
It's useful to compare goroutines to Unix processes...
The `go` keyword is [[https://www.youtube.com/watch?v=oV9rvDllKEg&t=14m45s][conceptually similar]] to the ampersand operator (`&`) in Unix shells.
#!/usr/bin/env sh
grindCoffeeBeans &
frothMilk &
However, unlike a Unix process, *a*goroutine*has*no*ID*. This limitation results from [[https://go.dev/talks/2014/go4gophers.slide#38][a conscious design decision by the Go team]], who wanted to preclude [[https://en.wikipedia.org/wiki/Thread-local_storage][_thread_locality_]].
Therefore, because a goroutine lacks an ID, it cannot readily be _killed_.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/sage.svg 150 _
: thread-local storage: data tied to a specific thread
: wouldn't play well with Go's concurrency model
: also mentioned in gopl, but where again?
* Use the go keyword judiciously
Before deciding to spawn a new goroutine, [[https://dave.cheney.net/2016/12/22/never-start-a-goroutine-without-knowing-how-it-will-stop][you should understand exactly under what conditions it will terminate]]. See [[https://go.dev/wiki/CodeReviewComments#goroutine-lifetimes][CodeReviewComments - Goroutine lifetimes]].
Goroutines may be cheap, but they're not free. If your program carelessly spawns goroutines, it is at risk of _leaking_goroutines_, i.e. accumulating "zombie" goroutines that never terminate!
Symptoms of a goroutine leak include memory leaks, performance degradation, and even program crashes! 😱
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 200 _
* Namecheck project: make room for another main.go
We're about to experiment with concurrency inside another `main.go` and within the comfort of our IDE. Later, we'll write a server in that file. Let's set things up!
1. Create a folder named "cli" within the `cmd` folder. Move your `main.go` to `cmd/cli/`.
2. Create a folder named "server" within the `cmd` folder. Create a new `main.go` there.
namecheck
├── bluesky
│ └── bluesky.go
├── cmd
│ ├── cli
│ │ └── main.go
│ └── server
│ └── main.go
├── github
│ ├── github.go
│ └── github_test.go
├── go.mod
├── namecheck.go
└── stub
└── stub.go
* Exercise: concurrent coffee making
Write two functions, `grindCoffeeBeans` and `frothMilk`, that
- have no parameters and return no results,
- simply write an informative message to stdout.
Call them concurrently in the `main` function.
.image https://i.pinimg.com/originals/41/8b/9c/418b9cb037389f7c5b6783ced8a7156b.jpg 300 _
: Q: Which message will be printed first? We can't tell. in an indeterminate order. Up to the scheduler.
* Exercise: concurrent coffee making (first attempt)
.play -edit src/goroutines_without_waitgroup.go /^//START/,/^//END/
Does the program behave as you expected? Why or why not?
* main does not wait!
In general, nothing gets written to stdout because
- the two `go` statements _immediately_ transfer control back to `main`,
- `main` terminates _before_ your two goroutines can terminate normally,
- and when `main` terminates, all goroutines are killed and the program ends.
We must tell `main` to "wait" until the two goroutines terminate normally. Any ideas?
.image https://go.dev/blog/store/gophers.jpg 300 _
* Exercise: concurrent coffee making (with a little sleep at the end)
.play -edit src/goroutines_with_sleep.go /^//START/,/^//END/
There, sleeping a bit at the end of `main` "fixed" it! Incidentally, note that the relative order of execution between the two functions is unpredictable.
However, sleeping a bit at the end of `main` simply isn't good enough. Why not?
: just because goroutine A was started before goroutine B doesn't mean that A will be executed before B
: inefficient at best, nondeterministic at worst
* Sprinkling sleeps on your code is no way to coordinate goroutines
How long should the sleep be, exactly? Impossible to know in advance.
- Wait too little: `main` terminates too quickly for all the other goroutines to terminate.
- Wait too long: `main` ends up needlessly spinning and wasting time.
Sleeping is simply not a reliable and efficient way of coordinating goroutines!
To properly tell the `main` function to wait, we need another mechanism...
.image img/sleeping_guys.jpg 275 _
: there are legitimate use cases for `time.Sleep` in a Go program, but...
* Wait groups
The [[https://pkg.go.dev/sync][`sync` package]] provides a useful mechanism: _wait_groups_.
A wait group is a [[https://en.wikipedia.org/wiki/Barrier_(computer_science)][_barrier_]] mechanism: it allows you to wait for a group of goroutines to terminate before moving on.
Wait groups are particularly useful when you don't known in advance how many goroutines you're going to start.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/arts/upright.svg 200 _
: Java equivalent: CountDownLatch: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/CountDownLatch.html
: e.g. if you're spawning a goroutine for each line of a file that you're reading, or for each element received from a channel
: wait groups not needed in fan-out/fan-in if number of tasks known in advance
: compare https://go.dev/play/p/cnXXJFTUPO1 to https://go.dev/play/p/MHjuGwsP_hJ
* Declaring a wait group
The zero value of type [[https://pkg.go.dev/sync#WaitGroup][`sync.WaitGroup`]] is readily usable:
var wg sync.WaitGroup
A `sync.WaitGroup` is typically declared within a function. Declaring one at the top level is rarely (if ever) useful.
var wg sync.WaitGroup // bad
func main() {
// ...
var wg sync.WaitGroup // good
// ...
}
* How to use a wait group
A `sync.WaitGroup` is little more than a counter safely usable from multiple goroutines.
It has three methods (all of which use a pointer receiver):
func (wg *WaitGroup) Add(delta int)
func (wg *WaitGroup) Done()
func (wg *WaitGroup) Wait()
- `Add` increments the counter by `delta`. Call it _just_before_ spawning a goroutine.
- `Done` decrements the counter by 1. Call it _at_the_very_end_ of each spawned goroutine, typically via a `defer` statement.
- `Wait` blocks until the counter goes back to 0. Call it at the point of the program where you need to wait for the spawned goroutines to terminate.
: now, how not to misuse a wait group
* Exercise: concurrent coffee making (using a wait group)
Use a wait group to make `main` wait until both `grindCoffeeBeans` and `frothMilk` have terminated before `main` itself can terminate.
Tip: you can apply the `go` keyword to an anonymous function.
.image https://i.pinimg.com/originals/41/8b/9c/418b9cb037389f7c5b6783ced8a7156b.jpg 300 _
: if you don't control the functions you want to spawn as goroutines, wrap them in anonymous funcs that capture the waitgroup
* Exercise: concurrent coffee making (using a wait group)
(The implementations of `grindCoffeeBeans` and `frothMilk` remain unchanged.)
.play -edit src/goroutines_with_waitgroup.go /^//START/,/^//END/
If changing the signature of the function is impossible (because you don't own it) or undesirable (because you want to retain the freedom to call it synchronously), you can always wrap the call in an anonymous function and launch the latter as a goroutine.
: quiz: What happens if I remove a wg.Done()? deadlock
: quiz: What happens if I remove a wg.Add(1)? only one of the two goroutine will likely have time execute
* Coalesce Add calls if you want, but beware...
If, during each iteration of a loop of fixed size `n`, you unconditionally start a goroutine, you may replace the multiple `Add` calls within the loop by a single one before the loop:
const n = 16
var wg sync.WaitGroup
wg.Add(n)
for range n {
go func() { defer wg.Done(); fmt.Println("Hi!") }() // unconditional
}
wg.Wait()
Though correct, this approach creates a *refactoring*hazard*. If the goroutines are later started only conditionally, the program will no longer work as expected:
wg.Add(n)
for range n {
if heads() { // function head returns true with probability 0.5
go func() { defer wg.Done(); fmt.Println("Hi!") }() // conditional
}
}
wg.Wait() // possible deadlock here
: refactoring hazard: source of bugs when refactoring
* Do not copy a wait group after first use
A `sync.WaitGroup` value must not be copied after first use.
Therefore, if a function needs to operate on a wait group that is declared outside the function, pass a *pointer* to that wait group (`*sync.WaitGroup`) to the function.
.play -edit src/waitgroup_wrong2.go /^//START/,/^//END/
The `vet` subcommand [[https://go.dev/play/p/2KzpEuzFO-j][alerts you about such programming mistakes]].
* Call Add before starting the corresponding goroutine
Calls to `Add` with a positive delta that occur when the counter is zero must happen [[https://go.dev/ref/mem#model][*before*]] any call to `Wait`. If you fail to follow this rule, your program will contain a *synchronization*bug* (more about that soon).
In practice, this rule typically implies that calls to `Add` must take place in the *spawning* goroutine (i.e. the `main` function in the example below), not within the spawned goroutine.
.play -edit src/waitgroup_wrong.go /^//START/,/^//END/
The `go`vet` subcommand will [[https://github.com/golang/go/issues/18022][soon]] alert you about such programming mistakes.
: add a call to runtime.Gosched before the wait to reveal the presence of a sync bug
: Gosched not to be used for coordinating goroutines
* There must be as many increments as decrements
In the toy example below, a wait group is incremented once but never decremented:
.play -edit src/wg_deadlock.go /^//START/,/^//END/
This programming mistake leads to a *deadlock* (more about that soon).
Here, the Go runtime is able to detect the deadlock and crashes the program.
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 200 _
* Namecheck project: check each user name concurrently (fan out)
1. Extract the body of the for-range loop that iterates on your `[]Checker` slice as a function in your `main` package:
func check(checker Checker, username string)
2. Add a parameter of type `*sync.WaitGroup` to your `check` function.
3. Within the for-range loop, apply the `go` keyword to your `check` function.
4. Use a wait group in order to wait, at the end of the `main` function, for all the goroutines that you've spawned to terminate.
5. Time the program's execution. Does user experience improve?
What if, instead of printing the results, we wanted to somehow aggregate them?
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/superhero/gotham.svg 100 _
: use a wg to synchronize the close operation with the multiple send operations
: Remember: the result(s) of the function call to which the `go` keyword is applied are lost.
: We need a mechanism to communicate between goroutines: channels.
* Deadlock
A Go program is said to be in a [[https://en.wikipedia.org/wiki/Deadlock][deadlock]] when two or more goroutines are waiting on one another to unblock them so they can resume execution.
.image img/deadlocked_carrots.jpg 250 _
In the most simple cases, the Go runtime can detect the deadlock and then deliberately emits a fatal error, which puts a brutal end to execution.
In more complex cases, though, you'll get no such welcome help from the runtime; the execution of your program may "hang", seemingly making no progress.
: French: interblocage, étreinte fatale
* Synchronization bugs
A [[https://medium.com/@val_deleplace/does-the-race-detector-catch-all-data-races-1afed51d57fb][_synchronization_bug_]] is a bug that, due to insufficient synchronization between goroutines (and unpredictable timing factor), causes the program to display an undesirably erratic behavior.
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 200 _
What is or isn't a synchronization bug may depend on the semantics of your program.
For instance, in our recent coffee-making example, if we had required `grindCoffeeBeans` to systematically execute before `frothMilk`, our concurrent solution would have contained a synchronization bug.
* Race conditions
At run time, a synchronization bug may (or may not) manifest itself as a [[https://en.wikipedia.org/wiki/Race_condition#In_software][_race_condition_]].
Because race conditions can be fiendishly difficult to reproduce and troubleshoot, you should remain vigilant about not leaving any synchronization bug in your programs.
Remember this useful distinction (similar to that between concurrency and parallelism):
- a synchronization bug is a property of a program's *code*, whereas
- a race condition is a property of a program's *execution*.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/sage.svg 200 _
: eg: data race reproducible only under heavy load
* Data races: a subtype of race conditions that you cannot ignore
[[https://cacm.acm.org/magazines/2022/5/260357-the-go-programming-language-and-environment/fulltext#body-6][Goroutines share the same memory address space]]: multiple goroutines can access some shared memory without triggering an access violation at the OS level.
However, [[https://go.dev/doc/effective_go.html#goroutines][the onus is on you to be diligent at the application level]]! Go indeed doesn't provide [[https://doc.rust-lang.org/book/ch16-00-concurrency.html][the guarantees about concurrency that Rust does]].
A [[https://en.wikipedia.org/wiki/Race_condition#Data_race][_data_race_]] is a category of race condition that occurs [[https://go.dev/doc/articles/race_detector#Introduction][when two or more goroutines access the same variable concurrently and at least one of those accesses is a write]]. A data race is always symptomatic of a severe synchronization bug!
To be free of such synchronization bugs, your programs [[https://go.dev/ref/mem#advice][*must* serialize such access]]; otherwise, their correctness and security cannot be guaranteed. 😬
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 150 _
: not everyone agrees about the distinction between data races and race conditions
* A synchronization bug and a rare data race
The following (naive) program purports to print either "0" or nothing to the screen:
.play -edit src/data_race.go /^//START/,/^//END/
However, this program suffers from a synchronization bug.
Because the spawned goroutine reads variable `count` while the `main` function updates it without any synchronization, a data race may occur, and "1" may be printed to the screen instead of "0". Can you explain how?
: to smoke out the sync bug, insert a call to runtime.Gosched at the top of the if statement
* A synchronization bug and a rare data race (cont'd)
The `main` function involves two reads of variable `count`, and the spawned goroutine involves one read followed by one write.
But because neither function carries out its work in an [[https://en.wikipedia.org/wiki/Linearizability][_atomic_]] fashion, the `main` function's accesses may interleave with the goroutine's accesses.
As the following diagram (on which time flows downwards) illustrates, if the goroutine's write gets sandwiched between `main`'s two reads, "1" will be printed to the screen.
(main) (func) | count or... (main) (func) | count
| |
read | 0 read | 0
read | 0 read | 0
write | 1 write | 1
read | 1 read | 1
You may only very rarely observe such a data race, but don't be fooled: the program really is "racy". [[https://go.dev/play/p/g3XMqFXVsjr][A more extreme example]] may be sufficient to convince you.
* Go's race detector
Go provides a [[https://go.dev/blog/race-detector][race detector]], which can be activated via the `-race` flag:
$ go run -race main.go
0
==================
WARNING: DATA RACE
Write at 0x00c0000120d8 by goroutine 6:
main.main.func1()
~/Desktop/go-course-beginner/main.go:8 +0x44
Previous read at 0x00c0000120d8 by main goroutine:
main.main()
~/Desktop/go-course-beginner/main.go:10 +0xb0
Goroutine 6 (running) created at:
main.main()
~/Desktop/go-course-beginner/main.go:7 +0xa6
==================
Found 1 data race(s)
exit status 66
Note that, although the race dectector reports any data race that occurs during execution, [[https://go.dev/doc/articles/race_detector][it's not guaranteed to uncover all synchronization bugs!]]
* Exercise: activate the race detector
1. Overwrite your `server/main.go` file with the previous program ([[https://go.dev/play/p/kiSW39zX-o9][playground]]).
2. Run the program with the race detector activated.
3. Explain the resulting data-race report.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/superhero/gotham.svg 200 _
* Don't leave the race detector on in production
Because the race detector [[https://www.youtube.com/watch?v=5erqWdlhQLA&t=1200s][involves a lot of bookkeeping]], the overhead associated with it [[https://go.dev/doc/articles/race_detector#Runtime_Overheads][can be prohibitive]]:
- 5-10x as much memory usage (memory usage may in fact grow without bounds),
- 2-20x as long execution time.
Therefore, [[https://www.youtube.com/watch?v=5erqWdlhQLA&t=2044s][you should only activate the race detector in pre-prod/staging environments]].
Incidentally, if you frequently feel the need to fix synchronization bugs in production environments, you should probably pause to reassess your software-development practices. 😅
For a deep dive into the mechanics of Go's race detector, watch [[https://www.youtube.com/watch?v=5erqWdlhQLA][Kavya Joshi - "go test -race" under the hood (Strange Loop 2016)]].
* Concurrency safety
_Concurrency_safety_ is a property of an operation (usually a function/method call).
An operation is _concurrency-safe_ if carrying it out concurrently (from multiple goroutines) without any extra synchronization does not cause it to behave incorrectly (i.e. does not introduce any synchronization bugs).
[[https://tip.golang.org/doc/comment#func][By convention]], top-level functions should be designed as concurrency-safe.
However, unless a method is documented as concurrency-safe, do not assume that it is!
In fact, some methods even go as far as explicitly documenting their lack of concurrency safety. See, for instance, [[https://pkg.go.dev/regexp#Regexp.Longest][the documentation of method `(*regexp.Regexp).Longest`]].
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/knight.svg 150 _
* Concurrency-safe types
When all the methods on a given type are safe for concurrent use by multiple goroutines, the type itself is said to be _concurrency-safe_. Here are some examples:
- [[https://pkg.go.dev/sync#WaitGroup][Struct type `sync.WaitGroup`]] obviously can safely be used concurrently.
- [[https://pkg.go.dev/log#Logger][Struct type `log.Logger`]] guarantees to serialize access to the underlying `io.Writer`.
- [[https://pkg.go.dev/net/http#Client][Struct type `http.Client`]] allows you to safely send HTTP requests using the same client from multiple goroutines.
- [[https://pkg.go.dev/net/http#CookieJar][Interface `http.CookieJar`]] requires its implementations to be concurrency-safe.
Concurrency-safe types are always documented as such.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/knight.svg 150 _
* Concurrency-unsafe types
Be careful: most types are _not_ concurrency-safe!
You should not assume a type is concurrency-safe [[https://tip.golang.org/doc/comment#type][unless it's documented as such]].
In fact, some types even go as far as explicitly documenting their lack of concurrency safety. See, for instance, [[https://pkg.go.dev/math/rand/v2#Source][the documentation of `rand.Source`]] (which represents a source of randomness).
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 200 _
* Shared-state concurrency vs. message-passing concurrency
To safely communicate data between goroutines, a concurrent Go program may follow two distinct approaches:
- [[https://wiki.c2.com/?SharedStateConcurrency][_shared-state_concurrency_]], whereby access to data by multiple goroutines is mediated via traditional mechanisms like locks and mutexes; and
- [[https://wiki.c2.com/?MessagePassingConcurrency][_message-passing_concurrency_]], whereby [[https://cacm.acm.org/magazines/2022/5/260357-the-go-programming-language-and-environment/fulltext#body-6][ownership of data is transferred between goroutines via channels]].
Although both approaches are possible (even within a single program), [[https://go.dev/doc/effective_go#sharing][Go's culture favors message passing]].
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/sage.svg 150 _
* Share by communicating
The Go community [[https://go.dev/doc/effective_go#sharing][summarizes its preference for message-passing concurrency]] with the following pithy mantra:
[[https://www.youtube.com/watch?v=PAAkCSZUG1c&t=2m42s][_Do_not_communicate_by_sharing_memory;_instead,_share_memory_by_communicating._]]
Compared to alternatives, channels are indeed [[https://blogtitle.github.io/go-advanced-concurrency-patterns-part-3-channels/][more expressive]], more composable, and less error-prone.
As Rob Pike puts it, [[https://www.youtube.com/watch?v=PAAkCSZUG1c&t=4m20s][channels orchestrate, whereas mutexes merely serialize]]. Check out his demonstration of channels' superiority in [[https://www.youtube.com/watch?v=oV9rvDllKEg&t=18m13s][his elegant load-balancer implementation]].
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/friends/stovepipe-hat-front.svg 200 _
: "transfer of ownership" requires discipline in Go (not enforced at compile time like in Rust); in theory, you could send a value down to a channel and still mutate it on the sending side.
* Channels are typed
Channels are typed conduits that allow goroutines to synchronize and transfer data.
A channel whose element type is `T` has type `chan`T`.
You can create channels of arbitrary element types (even `chan`chan`Man`).
However, only values compatible with `T` can be sent to a `chan`T`. For instance, you cannot send an `int` to a `chan`string`.
Channels only exist within the confines of your program; they cannot be used to communicate between different processes, be they all written in Go.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/friends/liberty.svg 200 _
* Naming conventions for channels
Popular names for channels include "ch", either alone or as a suffix:
ch := make(chan string)
taskCh := make(chan Task)
Naming a channel after the plural form of its element type is also common:
tasks := make(chan Task)
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/witch-learning.svg 200 _
* Channels are first-class values
You can pass a channel to a function:
func Aggregate(results chan Result) Result {
// ...
}
You can return a channel from a function:
func Generate(upTo int) chan int {
// ...
}
You can store a channel in a struct type:
type Request struct {
fn func() int // The operation to perform.
c chan bool // The channel to return the result.
}
Channels are comparable.
* Initializing a channel
The zero value of channel types is `nil`. You cannot do much with an uninitialized channel (although [[https://www.youtube.com/watch?v=t9bEg2A4jsw][assigning `nil` to a channel variable is sometimes useful]]).
The built-in function `make` initializes a channel of the specified type:
ch := make(chan string)
`make` also has an optional second parameter (`0` if unspecified), a nonnegative integer that corresponds to the _capacity_ of the resulting channel.
ch := make(chan string, 16)
The capacity of an existing channel cannot be changed.
: if specified capacity < 0, otherwise, you'll get a compilation error or, worse, a panic.
* Channel capacity
A channel's capacity corresponds to the size of its _buffer_, i.e. the maximum number of elements that the channel can contain at any give time.
A channel of zero capacity is said to be _unbuffered_ (or _synchronous_); conversely, a channel of positive capacity is said to be _buffered_.
ints := make(chan int) // unbuffered channel of ints
bools := make(chan bool, 0) // unbuffered channel of bools
strings := make(chan string, 16) // buffered channel of strings
Contrary to what you may believe, unbuffered channels are very useful!
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/fairy-tale/sage.svg 200 _
: only mention len and cap in passing
: The built-in function `len` can be used to query the number of elements in the channel at a given time; however, that information rarely is useful, simply because it quickly becomes stale in most programs.
* An imperfect but useful metaphor for a channel: a serving hatch
Think of goroutines as cooks and waiters on either side of a serving hatch,
which represents a channel.
The channel's capacity corresponds to the number of plates that can fit
on the serving hatch's platform at any one time.
.image https://josephdmcnamee.files.wordpress.com/2012/01/serving-hatch1.jpg 350 _
: what does a channel of capacity 0 correspond to?
: An unbuffered channel has a capacity of zero and so it’s already full before any writes
: Far from being useless, an unbuffered channel can be used to create a synchronization point between two goroutines.
: the analogy is imperfect because channels are FIFO and waiters don't necessarily pick up plates in the order in which they arrived
* Channels are FIFO
Channels [[https://go.dev/ref/spec#Channel_types][act as first-in-first-out queues]].
.image https://i.imgflip.com/68zdq9.jpg 300 _
Of course, if multiple goroutines send values to a channel, the order in which values are received is only partial.
See [[https://go.dev/play/p/6kpQq0hhlun][this playground]] (although you may be able to fully understand the code only later).
: the values from the two goroutines may be interleaved
* Choose channel capacity judiciously
[[https://www.youtube.com/watch?v=iTrP_EmGNmw&t=1513s][Favor unbuffered channels unless you can justify the need for a buffered channel.]]
Whatever capacity you opt for, be deliberate about it. An inappropriate capacity can detrimentally affect the correctness and/or performance of your program.
In some cases, an inappropriately buffered channel may lead to a _deadlock_, i.e. goroutines blocking one another and preventing any further progress.
In other cases, an inappropriately buffered channel may lead some goroutines to _leak_ (i.e. turn into zombies), which may ultimately cause a crash.
.image https://www.globalnerdy.com/wp-content/uploads/2017/08/gopher-this-is-fine.jpg 200 _
: increasing buffer size may help reduce pressure on the Go scheduler
* Operations on a channel
Channels support _send_ an _receive_ operations, which are collectively known as _channel_communications_.
You can also _close_ a channel to inform receivers that no more items will be sent to it.
In addition, you can _range_ over a channel, a more derivative operation that consists in continually receiving values from it until the channel is closed and drained.
All those operations are *concurrency-safe*.
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/adventure/hiking.svg 250 _
* Sending to a channel
The syntax of a _channel_send_ consists in the name of the channel in question
followed by a left-pointing arrow (`<-`) followed by the value to send.
ch := make(chan string)
// ...
ch <- "superimportant message"
The `gofmt` tool will adds a space on either side of the arrow; don't fight it.
A channel send is a statement, not an expression.
: the <- syntax (and select statement) comes from Newsqueak
: see https://www.youtube.com/watch?v=3DtUzH3zoFo
* Semantics of a send operation on a channel
.image img/send_semantics.svg _ 900
Notes:
- When you send to a channel, the channel better not already be closed!
- Buffered channels accept a limited number of values without a corresponding receive for those values.
- Each send on an _unbuffered_ channel must have a corresponding receive!
: or goroutines will leak
* Receiving from a channel
The syntax of a _channel_receive_ consists in a left-pointing arrow (`<-`) followed by the channel in question.
ch := make(chan string)
// ...
<-ch
The `gofmt` tool removes any whitespace between the arrow the channel; don't fight it.
A channel receive is an expression: you can store the received value in a variable...
v := <-ch
\... or even pass it directly to a function:
fmt.Println(<-ch)
A channel is _not_ a [[https://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern][pub/sub]] topic: a value received from a channel by a goroutine gets removed from the channel; no other goroutines can receive the same channel element.
* Semantics of a receive operation on a channel
.image img/receive_semantics.svg 500 _
* Dispelling the ambiguity on receive
If we receive the zero value of the channel's element type, how can we tell whether the value we're receiving is from a closed and empty channel or whether someone really sent that value to the channel?
To lift that ambiguity, there is a special form of receive operation that yields an additional boolean value:
v, ok := <-ch
The value of variable `ok` is
- `true` if the value received was actually sent to the channel, or
- `false` if the channel is both closed and fully drained (empty).
: similar comma-ok idiom for maps and type assertions
* Exercise: resolving a deadlock
Deadlocks can result from an incautious use of channels. Remember that, depending on the state of a channel, a communication on that channel may block.
1. Run the following program and note that it results in a deadlock:
.play -edit src/deadlock_example.go /^//START/,/^//END/
2. Can you explain why? How would you resolve the issue?
3. Can you think of naive and incorrect attempts at resolving the issue?
.image https://raw.githubusercontent.com/egonelbre/gophers/master/vector/superhero/gotham.svg 150 _
: ask them what happens if we write go fmt.Println(<-ch)
: an incorrect approach: go func() { fmt.Println(<-ch) }() then ch <- 42 ...
: ... because <-ch can execute and main terminate before the Println does
: correct: go func() { ch <- 42}() and in main: fmt.Println(<-ch)
: https://twitter.com/jub0bs/status/1400387164358197254