@@ -109,6 +109,107 @@ nginx example (in addition to SSL setup):
109
109
}
110
110
```
111
111
112
+
112
113
# Cluster management
113
114
114
- TODO
115
+ ## Prerequisites
116
+
117
+ Install autossh using your package manager, and "dwq" using pip3.
118
+
119
+
120
+ ## Setup
121
+
122
+ 1. Set up SSH authentication to ci.riot-os.org.
123
+
124
+ E.g., add this to `~/.ssh/config`:
125
+
126
+ ```
127
+ Host murdock
128
+ HostName ci.riot-os.org
129
+ User murdock-slave
130
+ Port 22
131
+ IdentityFile ~/.ssh/id_rsa_murdock-slave
132
+ IdentitiesOnly yes
133
+ LocalForward 7711 127.0.0.1:7711
134
+ LocalForward 6379 127.0.0.1:6379
135
+ ServerAliveInterval 60
136
+ ServerAliveCountMax 2
137
+ ```
138
+
139
+ Make sure the key `~/.ssh/id_rsa_murdock-slave` can log in as user `murdock-slave` on `ci.riot-os.org`.
140
+
141
+ 2. Keep an SSH connection open that forwards ports 7711 and 6379.
142
+
143
+ E.g., use this alias and "autossh":
144
+
145
+ $ alias dwq_connect='autossh -M0 -N -C -f murdock'
146
+
147
+ Then start up autossh with `dwq_connect` (automate this or repeat for each session).
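
Before running any dwq command, it can help to confirm that the tunnel is actually up. The following is a minimal sketch, not part of dwq or murdock: the function names `port_open` and `check_tunnel` are hypothetical, and the check relies on bash's `/dev/tcp` redirection, so it assumes bash as the shell.

```shell
# Sketch: report whether local ports accept TCP connections.
# Assumes bash (uses the /dev/tcp pseudo-device); function names are illustrative.
port_open() {
    local host=$1 port=$2
    # Open the connection in a subshell so the descriptor is closed on exit.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

check_tunnel() {
    local port status=0
    for port in "$@"; do
        if port_open 127.0.0.1 "$port"; then
            echo "port $port: open"
        else
            echo "port $port: closed"
            status=1
        fi
    done
    # Non-zero exit status if any port was unreachable.
    return "$status"
}
```

With the forwards from the SSH config above, `check_tunnel 7711 6379` should report both ports as open once `dwq_connect` has been started.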
148
+
149
+
150
+ ## dwqm (dwq management utility)
151
+
152
+ Try "dwqm --help".
153
+
154
+ Useful things:
155
+
156
+ - list all queues in the disque instance:
157
+
158
+ $ dwqm queue --list
159
+
160
+ This is a raw queue listing and includes queues used internally by dwq. Those
161
+ are named "control::*" and "status::*".
162
+
163
+ - list all connected workers:
164
+
165
+ $ dwqm control --list
166
+
167
+ - pause worker(s); a paused worker will not run any jobs until resumed or restarted:
168
+
169
+ $ dwqm control --pause worker1 [worker2] ...
170
+
171
+ - resume worker(s):
172
+
173
+ $ dwqm control --resume worker1 [worker2] ...
174
+
175
+ - shut down worker(s) (with our current murdock scripts, this will shut down the
176
+ worker, pull the newest build container, then __restart__ the worker):
177
+
178
+ $ dwqm control --shutdown worker1 [worker2] ...
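
Since the raw listing from `dwqm queue --list` includes the internal "control::" and "status::" queues, a small filter can hide them. This is a sketch under one unverified assumption: that the listing prints one queue name per line. The function name `filter_user_queues` is hypothetical.

```shell
# Sketch: drop dwq-internal queues from a queue listing read on stdin.
# Assumes one queue name per line (not verified against dwqm's actual output).
filter_user_queues() {
    grep -v -e '^control::' -e '^status::'
}
```

Usage would be `dwqm queue --list | filter_user_queues`. Note that `grep -v` exits non-zero when nothing remains after filtering.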
179
+
180
+ ## dwqc (dwq client, runs jobs on queue)
181
+
182
+ In our setup, every build worker listens on the "default" queue. Those workers
183
+ run inside the build container.
184
+
185
+ Every test worker listens on a queue named after the board it is connected to,
186
+ e.g., "samr21-xpro", "nrf52dk" or "esp32-wroom-32".
187
+
188
+ __Every__ worker also listens on a queue named after its hostname.
189
+
190
+ For example, in our setup, "riotbuild" listens on the queues "default" and
190
+ "riotbuild", while "pi-36f90aef" listens on "pi-36f90aef" and "nrf52dk".
192
+
193
+ `dwqc` needs a git repo and commit, either as parameters or via the environment.
194
+ Either manually set "DWQ_REPO" and "DWQ_COMMIT", or use an alias:
195
+
196
+ $ alias dwqset='export DWQ_REPO=https://github.com/RIOT-OS/RIOT DWQ_COMMIT=$(git rev-parse HEAD)'
197
+ $ cd src/riot
198
+ $ dwqset # following dwqc jobs will now be executed in the specified checkout
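
One caveat with the alias: when run outside a git checkout, `git rev-parse HEAD` fails but the `export` still succeeds, leaving "DWQ_COMMIT" empty. A function variant can refuse to export in that case. This is only a sketch; the optional repo-URL argument is an addition for illustration, and the default URL is the one from the alias above.

```shell
# Sketch: function variant of the dwqset alias that fails loudly
# instead of exporting an empty DWQ_COMMIT.
dwqset() {
    local commit
    commit=$(git rev-parse HEAD 2>/dev/null) || {
        echo "dwqset: not inside a git checkout with at least one commit" >&2
        return 1
    }
    # Optional first argument overrides the repo URL (illustrative addition).
    export DWQ_REPO="${1:-https://github.com/RIOT-OS/RIOT}"
    export DWQ_COMMIT="$commit"
}
```

A function and an alias of the same name should not be defined together; pick one.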
199
+
200
+
201
+ Run a single job on the queue named "default":
202
+
203
+ $ dwqc "echo hello world!"
204
+
205
+ Run a single job on a specific queue:
206
+
207
+ $ dwqc -q riotbuild "ccache -s"
208
+
209
+ Run multiple jobs on a single queue:
210
+
211
+ $ for i in $(seq 10); do echo "echo $i"; done | dwqc -q queue_name
212
+
213
+ Create command from stdin plus base command:
214
+
215
+ $ echo "first second third" | dwqc -s "echo \${1}" # will create job "echo first"