---
layout: tutorial
categories: tutorial
sections: ['Getting Started', 'Create a node', 'Sending messages', 'Spawning Remote Processes']
title: Getting Started
---

### Getting Started

-----
| 11 | + |
| 12 | +In order to go through this tutorial, you will need a Haskell development |
| 13 | +environment and we recommend installing the latest version of the |
| 14 | +[Haskell Platform](http://www.haskell.org/platform/) if you've not done |
| 15 | +so already. |
| 16 | + |
| 17 | +Once you're up and running, you'll want to get hold of the distributed-process |
| 18 | +library and a choice of network transport backend. This guide will use |
| 19 | +the network-transport-tcp backend, but other backends may be available |
| 20 | +on github. |
| 21 | + |
### Installing from source

If you're installing from source, the simplest method is to check out the
[Umbrella Project](https://github.com/haskell-distributed/cloud-haskell) and
run `make` to obtain the complete set of source repositories for building
Cloud Haskell. The additional makefiles bundled with the umbrella assume
that you have a recent version of cabal-dev installed.

### Create a node

Cloud Haskell's *lightweight processes* reside on a "node", which must
be initialised with a network transport implementation and a remote table.
The latter is required so that physically separate nodes can identify known
objects in the system (such as types and functions) when receiving messages
from other nodes. We will look at inter-node communication later; for now
it will suffice to pass the default remote table, which defines the built-in
types that Cloud Haskell needs at a minimum in order to run.

We start with our imports:

{% highlight haskell %}
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Control.Distributed.Process
import Control.Distributed.Process.Node
{% endhighlight %}

Our TCP network transport backend needs an IP address and port to get started
with:

{% highlight haskell %}
main :: IO ()
main = do
  Right t <- createTransport "127.0.0.1" "10501" defaultTCPParameters
  node <- newLocalNode t initRemoteTable
  ....
{% endhighlight %}

And now we have a running node.
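
To sanity-check the node, we can run a trivial action on it with `runProcess`,
which blocks the caller until the given `Process` action completes. A minimal
sketch:

{% highlight haskell %}
import Network.Transport.TCP (createTransport, defaultTCPParameters)
import Control.Distributed.Process
import Control.Distributed.Process.Node

main :: IO ()
main = do
  Right t <- createTransport "127.0.0.1" "10501" defaultTCPParameters
  node <- newLocalNode t initRemoteTable
  -- runProcess blocks until the Process action has finished
  runProcess node $ liftIO (putStrLn "node is up")
{% endhighlight %}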

### Sending messages

We start a new process by evaluating `forkProcess`, which takes a node,
a `Process` action - because our concurrent code will run in the `Process`
monad - and returns an address for the process in the form of a `ProcessId`.
The process id can be used to send messages to the running process - here we
will send one to ourselves!

{% highlight haskell %}
-- in main
  _ <- forkProcess node $ do
    -- get our own process id
    self <- getSelfPid
    send self "hello"
    hello <- expect :: Process String
    liftIO $ putStrLn hello
  return ()
{% endhighlight %}

Lightweight processes are implemented as `forkIO` threads. In general we will
try to forget about this implementation detail, but let us note that we
haven't deadlocked our own thread by sending to and receiving from its mailbox
in this fashion. Sending messages is a completely asynchronous operation - even
if the recipient doesn't exist, no error will be raised and evaluating `send`
will not block the caller, not even if the caller is sending messages to itself!

Receiving works quite the other way around, blocking the caller until a message
matching the expected type arrives in our (conceptual) mailbox. If multiple
messages of that type are in the queue, they will be returned in FIFO
order; otherwise the caller will be blocked until a message arrives that can be
decoded to the correct type.

Let's spawn two processes on the same node and have them talk to each other.

{% highlight haskell %}
import Control.Concurrent (threadDelay)
import Control.Monad (forever)
import Control.Distributed.Process
import Control.Distributed.Process.Node
import Network.Transport.TCP (createTransport, defaultTCPParameters)

replyBack :: (ProcessId, String) -> Process ()
replyBack (sender, msg) = send sender msg

logMessage :: String -> Process ()
logMessage msg = say $ "handling " ++ msg

main :: IO ()
main = do
  Right t <- createTransport "127.0.0.1" "10501" defaultTCPParameters
  node <- newLocalNode t initRemoteTable
  _ <- forkProcess node $ do
    -- Spawn another worker on the local node
    echoPid <- spawnLocal $ forever $ do
      -- Test our matches in order against each message in the queue
      receiveWait [match logMessage, match replyBack]

    -- The `say` function sends a message to a process registered as "logger".
    -- By default, this process simply loops through its mailbox and sends
    -- any received log message strings it finds to stderr.

    say "send some messages!"
    send echoPid "hello"
    self <- getSelfPid
    send echoPid (self, "hello")

    -- `expectTimeout` waits for a message or times out after "delay"
    m <- expectTimeout 1000000
    case m of
      -- Die immediately - throws a ProcessExitException with the given reason.
      Nothing  -> die "nothing came back!"
      Just s   -> say $ "got " ++ s ++ " back!"
    return ()

  -- A 1 second wait. Otherwise the main thread can terminate before
  -- our messages reach the logging process or get flushed to stdio
  threadDelay (1*1000000)
  return ()
{% endhighlight %}

Note that we've used the `receive` class of functions this time around.
These can be used with the [`Match`][5] data type to provide a range of
advanced message processing capabilities. The `match` primitive allows you
to construct a "potential message handler" and have it evaluated
against received (or incoming) messages. As with `expect`, if the mailbox does
not contain a message that can be matched, the evaluating process will be
blocked until a message arrives which _can_ be matched.

In the _echo server_ above, our first match prints out whatever string it
receives. If the first message in our mailbox is not a `String`, then our second
match is evaluated. This, given a tuple `t :: (ProcessId, String)`, will send
the `String` component back to the sender's `ProcessId`. If neither match
succeeds, the echo server process blocks until another message arrives and
tries again.
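
Beyond plain `match`, the `Match` family also offers `matchIf`, which only
fires when a predicate holds for the candidate message. A small sketch,
reusing `logMessage` from the example above (the `"important: "` prefix is a
hypothetical convention, not part of the library):

{% highlight haskell %}
-- handle specially-prefixed strings with priority; log everything else
receiveWait
  [ matchIf (\msg -> take 11 msg == "important: ")
            (\msg -> say $ "priority " ++ msg)
  , match logMessage
  ]
{% endhighlight %}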

### Serializable Data

Processes may send any datum whose type implements the `Serializable` typeclass,
which is done indirectly by implementing `Binary` and deriving `Typeable`.
Implementations are already provided for most of Cloud Haskell's primitives
and the most commonly used data structures.
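
Defining a serializable message type of your own is straightforward with GHC
generics. A minimal sketch (the `Ping` type is hypothetical):

{% highlight haskell %}
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE DeriveGeneric      #-}

import Data.Binary (Binary)
import Data.Typeable (Typeable)
import GHC.Generics (Generic)
import Control.Distributed.Process

-- deriving Generic gives us the Binary instance below for free
data Ping = Ping ProcessId
  deriving (Typeable, Generic)

instance Binary Ping
{% endhighlight %}

With these instances in place, `Ping` values can be passed to `send` and
received with `expect` like any built-in message type.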

### Spawning Remote Processes

In order to spawn processes on a remote node without additional compiler
infrastructure, we make use of "static values": values that are known at
compile time. Closures in functional programming arise when we partially
apply a function. In Cloud Haskell, a closure is a code pointer, together
with requisite runtime data structures representing the value of any free
variables of the function. A remote spawn, therefore, takes a closure around
an action running in the `Process` monad: `Closure (Process ())`.

In distributed-process, if `f :: T1 -> T2` then

{% highlight haskell %}
  $(mkClosure 'f) :: T1 -> Closure T2
{% endhighlight %}

That is, the first argument to the function we pass to `mkClosure` will act
as the closure environment for that process. If you want multiple values
in the closure environment, you must "tuple them up".

We need to configure our remote table (see the documentation for more details)
and the easiest way to do this is to let the library generate the relevant
code for us. For example (taken from the distributed-process-platform test suites):

{% highlight haskell %}
sampleTask :: (TimeInterval, String) -> Process String
sampleTask (t, s) = sleep t >> return s

$(remotable ['sampleTask])
{% endhighlight %}

We can now create a closure environment for `sampleTask` like so:

{% highlight haskell %}
($(mkClosure 'sampleTask) (seconds 2, "foobar"))
{% endhighlight %}

The call to `remotable` generates a remote table and a definition
`__remoteTable :: RemoteTable -> RemoteTable` in our module for us.
We compose this with other remote tables in order to come up with a
final, merged remote table for use in our program:

{% highlight haskell %}
myRemoteTable :: RemoteTable
myRemoteTable = Main.__remoteTable initRemoteTable

main :: IO ()
main = do
  localNode <- newLocalNode transport myRemoteTable
  -- etc
{% endhighlight %}
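
With the merged remote table in place, we can run `sampleTask` on a remote
peer. Since `sampleTask` returns a `String`, we use `call` (which waits for
the result) rather than `spawn` (which expects a `Closure (Process ())`).
A sketch, assuming `nid :: NodeId` identifies a reachable peer:

{% highlight haskell %}
-- in some Process action; `nid` is the NodeId of a reachable peer.
-- `remotable` also generated the typed dictionary used by functionTDict.
result <- call $(functionTDict 'sampleTask) nid
               ($(mkClosure 'sampleTask) (seconds 2, "foobar"))
say $ "remote task returned: " ++ result
{% endhighlight %}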

------

[1]: /static/doc/distributed-process/Control-Distributed-Process.html#v:Message
[2]: http://hackage.haskell.org/package/distributed-process
[3]: /static/doc/distributed-process-platform/Control-Distributed-Process-Platform-Async.html
[4]: /static/doc/distributed-process-platform/Control-Distributed-Process-Platform-ManagedProcess.html#v:callAsync
[5]: http://hackage.haskell.org/packages/archive/distributed-process/latest/doc/html/Control-Distributed-Process-Internal-Primitives.html#t:Match