Commit e3b7172

committed
Tutorial 3 improvements
1 parent 1e862c8 commit e3b7172

2 files changed

+126
-2
lines changed

Diff for: documentation.md

+1-1
@@ -373,7 +373,7 @@ The [distributed-process-platform][18] library implements parts of the
373373
in the original paper and implemented by the [remote][14] package. In particular,
374374
we diverge from the original design and defer to many of the principles
375375
defined by Erlang's [Open Telecom Platform][13], taking in some well established
376-
Haskell concurrency design patterns alongside.
376+
Haskell concurrency design patterns along the way.
377377

378378
In fact, [distributed-process-platform][18] does not really consider the
379379
*task layer* in great detail. We provide an API comparable to remote's

Diff for: tutorials/ch-tutorial3.md

+125-1
@@ -1,10 +1,27 @@
11
---
22
layout: tutorial
33
categories: tutorial
4-
sections: ['Message Ordering', 'Selective Receive', 'Advanced Mailbox Processing']
4+
sections: ['The Thing About Nodes', 'Message Ordering', 'Selective Receive', 'Advanced Mailbox Processing', 'Process Lifetime', 'Monitoring And Linking', 'Getting Process Info']
55
title: Getting to know Processes
66
---
77

8+
### The Thing About Nodes
9+
10+
Before we can really get to know _processes_, we need to consider the role of
11+
the _Node Controller_ in Cloud Haskell. As per the [_semantics_][4], Cloud
12+
Haskell makes the role of _Node Controller_ (occasionally referred to by the original
13+
"Unified Semantics for Future Erlang" paper on which our semantics are modelled
14+
as the "ether") explicit.
15+
16+
Architecturally, Cloud Haskell's _Node Controller_ consists of a pair of message
17+
bus processes, one of which listens for network-transport level events whilst the
18+
other is busy processing _signal events_ (most of which pertain to either message
19+
delivery or process lifecycle notification). Both of these _event loops_ run sequentially
20+
in the system at all times.
21+
22+
With this in mind, let's consider Cloud Haskell's lightweight processes in a bit
23+
more detail...
24+
825
### Message Ordering
926

1027
We have already met the `send` primitive, which is used to deliver
@@ -86,6 +103,19 @@ subject was covered briefly in the first tutorial. Matching on messages allows
86103
us to separate the type(s) of messages we can handle from the type that the
87104
whole `receive` expression evaluates to.
88105

106+
Consider the following snippet:
107+
108+
{% highlight haskell %}
109+
usingReceive = do
  -- a do block must end with an expression, so the result is not bound here
  receiveWait [
      match (\(s :: String) -> say s)
    , match (\(i :: Int)    -> say $ show i)
    ]
114+
{% endhighlight %}
115+
116+
Note that each of the matches in the list must evaluate to the same type,
117+
as the type signature indicates: `receiveWait :: [Match b] -> Process b`.
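
To make the shared result type concrete, here is a minimal sketch in which every match evaluates to `Process String`, so the whole expression has type `Process String`. The name `describeNext` is hypothetical and used only for illustration; running it requires a local node, which is not shown.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Distributed.Process

-- Hypothetical helper: both handlers return a String, so 'b' in
-- 'receiveWait :: [Match b] -> Process b' is instantiated to String.
describeNext :: Process String
describeNext = receiveWait
  [ match (\(s :: String) -> return ("string: " ++ s))
  , match (\(i :: Int)    -> return ("int: " ++ show i))
  ]
```

Changing either handler to return a different type, such as `Int`, should be rejected by the type checker.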
118+
89119
The behaviour of `receiveWait` differs from `receiveTimeout` in that it
90120
blocks forever (until a match is found in the process' mailbox), whereas the
91121
variant taking a timeout will return `Nothing` unless a match is found within
@@ -183,7 +213,101 @@ distributed-process can be utilised to develop highly generic message processing
183213
All the richness of the distributed-process-platform APIs (such as `ManagedProcess`) which
184214
will be discussed in later tutorials are, in fact, built upon these families of primitives.
185215

216+
### Process Lifetime
217+
218+
A process will continue executing until it has evaluated to some value, or is abruptly
219+
terminated either by crashing (with an un-handled exception) or being instructed to
220+
stop executing. Instructions to stop take one of two forms: a `ProcessExitException`
221+
or `ProcessKillException`. As the names suggest, these _signals_ are delivered in the form
222+
of asynchronous exceptions; however, you should not rely on that fact! After all,
223+
we cannot throw an exception to a thread that is executing in some other operating
224+
system process or on a remote host! Instead, you should use the [`exit`][5] and [`kill`][6]
225+
primitives from distributed-process, which not only ensure that remote target processes
226+
are handled seamlessly, but also maintain a guarantee that if you send a message and
227+
*then* an exit signal, the message will be delivered to the destination process (via its
228+
local node controller) before the exception is thrown - note that this does not guarantee
229+
that the destination process will have time to _do anything_ with the message before it
230+
is terminated.
231+
232+
The `ProcessExitException` signal is sent from one process to another, indicating that the
233+
receiver is being asked to terminate. A process can choose to tell itself to exit, and since
234+
this is a useful way for processes to terminate _abnormally_, distributed-process provides
235+
the [`die`][7] primitive to simplify doing so. In fact, [`die`][7] has slightly different
236+
semantics from [`exit`][5], since the latter involves sending an internal signal to the
237+
local node controller. A direct consequence of this is that the _exit signal_ may not
238+
arrive immediately, since the _Node Controller_ could be busy processing other events.
239+
On the other hand, the [`die`][7] primitive throws a `ProcessExitException` directly
240+
in the calling thread, thus terminating it without delay.
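
The difference between the two primitives can be sketched as follows. The function names `stopWorker` and `failFast` are hypothetical, and the `String` reasons are an arbitrary choice; any serialisable value should work as a reason.

```haskell
import Control.Distributed.Process

-- Hypothetical helper: ask another process to terminate. The signal is
-- routed via the node controller, so it may not take effect immediately.
stopWorker :: ProcessId -> Process ()
stopWorker worker = exit worker "shutdown requested"

-- Hypothetical helper: terminate the calling process. 'die' throws the
-- exit exception in the calling thread, so nothing after it runs.
failFast :: Process ()
failFast = do
  say "giving up"
  die "unrecoverable state"
```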
241+
242+
The `ProcessExitException` type holds a _reason_ field, which is serialised as a raw `Message`.
243+
This exception type is exported, so it is possible to catch these _exit signals_ and decide how
244+
to respond to them. Catching _exit signals_ is done via a set of primitives in
245+
distributed-process, and the use of them forms a key component of the various fault tolerance
246+
strategies provided by distributed-process-platform. For example, most of the utility
247+
code found in distributed-process-platform relies on processes terminating with a
248+
`ProcessKillException` or `ProcessExitException` where the _reason_ has the type
249+
`ExitReason` - processes which fail with other exception types are routinely converted to
250+
`ProcessExitException $ ExitOther reason {- reason :: String -}` automatically. This pattern
251+
is most prominently found in supervisors and supervised _managed processes_, which will be
252+
covered in subsequent tutorials.
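
As a rough illustration of trapping an _exit signal_, the sketch below uses the `catchExit` primitive from distributed-process. The name `guarded` and the `String` reason are assumptions made for the example; the handler is only applied when the reason decodes to the type it expects.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Distributed.Process

-- Hypothetical example: run an action that exits abnormally and handle
-- the resulting exit signal, provided its reason is a String.
guarded :: Process ()
guarded =
  die "simulated failure" `catchExit` \from (reason :: String) ->
    say $ "exit signal from " ++ show from ++ ": " ++ reason
```

If the reason had some other type, the handler would not match and the exit signal would propagate as usual.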
253+
254+
A `ProcessKillException` is intended to be an _untrappable_ exit signal, so its type is
255+
not exported and therefore you can __only__ handle it by catching all exceptions, which
256+
as we all know is very bad practice. The [`kill`][6] primitive is intended to be a
257+
_brutal_ means for terminating a process - e.g., it is used to terminate supervised child
258+
processes that haven't shut down on request, or to terminate processes that don't require
259+
any special cleanup code to run when exiting - although it does behave like [`exit`][5]
260+
insofar as it is dispatched (to the target process) via the _Node Controller_.
261+
262+
### Monitoring and Linking
263+
264+
Processes can be linked to other processes (or nodes or channels). A link, which is
265+
unidirectional, guarantees that once any object we have linked to *dies*, we will also
266+
be terminated. A simple way to test this is to spawn a child process, link to it and then
267+
terminate it, noting that we will subsequently die ourselves. Here's a simple example,
268+
in which we link to a child process and then cause it to terminate (by sending it a message
269+
of the type it is waiting for). Even though the child terminates "normally", our process
270+
is also terminated since `link` will _link the lifetime of two processes together_ regardless
271+
of exit reasons.
272+
273+
{% highlight haskell %}
274+
demo = do
  -- the child waits for a () message, then terminates normally
  pid <- spawnLocal expect
  link pid
  send pid ()
  -- we block here until the link signal terminates us
  () <- expect
  return ()
279+
{% endhighlight %}
280+
281+
The medium that link failures use to signal exit conditions is the same as exit and kill
282+
signals - asynchronous exceptions. Once again, it is a bad idea to rely on this (not least
283+
because it might fail in some future release) and the exception type (`ProcessLinkException`)
284+
is not exported so as to prevent developers from abusing exception handling code in this
285+
special case.
286+
287+
Whilst the built-in `link` primitive terminates the link-ee regardless of exit reason,
288+
distributed-process-platform provides an alternate function `linkOnFailure`, which only
289+
dispatches the `ProcessLinkException` if the link-ed process dies abnormally (i.e., with
290+
some `DiedReason` other than `DiedNormal`).
291+
292+
Monitors, on the other hand, do not cause the *listening* process to exit at all, instead
293+
putting a `ProcessMonitorNotification` into the process' mailbox. This signal and its
294+
constituent fields can be introspected in order to decide what action (if any) the receiver
295+
can/should take in response to the monitored process' death.
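
A minimal sketch of monitoring, assuming a child that exits once it receives a `()` message. The name `watchChild` is hypothetical. The notification is matched against the `MonitorRef` returned by `monitor`, so unrelated notifications are left in the mailbox.

```haskell
import Control.Distributed.Process

-- Hypothetical example: monitor a child, let it finish, then inspect
-- the notification that arrives in our own mailbox.
watchChild :: Process ()
watchChild = do
  child <- spawnLocal (expect :: Process ())
  ref   <- monitor child
  send child ()
  receiveWait
    [ matchIf (\(ProcessMonitorNotification r _ _) -> r == ref)
              (\(ProcessMonitorNotification _ pid reason) ->
                 say $ show pid ++ " died: " ++ show reason)
    ]
```

Unlike the linking example, the monitoring process carries on after the child terminates and is free to decide what to do with the reported `DiedReason`.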
296+
297+
Linking and monitoring are foundational tools for *supervising* processes, where a top level
298+
process manages a set of children, starting, stopping and restarting them as necessary.
299+
300+
### Getting Process Info
301+
302+
The `getProcessInfo` function provides a means for us to obtain information about a running
303+
process. The `ProcessInfo` type it returns contains the local node id and a list of
304+
registered names, monitors and links for the process. The call returns `Nothing` if the
305+
process in question is not alive.
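
The sketch below shows one way the result might be inspected. The name `describeProcess` is hypothetical, and the field accessors are those of the `ProcessInfo` record in distributed-process as I understand it, so check them against the version you are using.

```haskell
import Control.Distributed.Process

-- Hypothetical helper: log a short summary of a process, or note that
-- it is no longer alive.
describeProcess :: ProcessId -> Process ()
describeProcess pid = do
  info <- getProcessInfo pid
  case info of
    Nothing -> say $ show pid ++ " is not alive"
    Just i  -> say $ unwords
      [ "node:",     show (infoNode i)
      , "names:",    show (infoRegisteredNames i)
      , "links:",    show (infoLinks i)
      , "monitors:", show (length (infoMonitors i))
      ]
```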
306+
186307
[1]: hackage.haskell.org/package/distributed-process/docs/Control-Distributed-Process.html#v:receiveWait
187308
[2]: hackage.haskell.org/package/distributed-process/docs/Control-Distributed-Process.html#v:expect
188309
[3]: http://hackage.haskell.org/package/distributed-process-0.4.2/docs/Control-Distributed-Process.html#v:match
189310
[4]: /static/semantics.pdf
311+
[5]: http://hackage.haskell.org/package/distributed-process-0.4.2/docs/Control-Distributed-Process.html#v:exit
312+
[6]: http://hackage.haskell.org/package/distributed-process-0.4.2/docs/Control-Distributed-Process.html#v:kill
313+
[7]: http://hackage.haskell.org/package/distributed-process-0.4.2/docs/Control-Distributed-Process.html#v:die
