-
Notifications
You must be signed in to change notification settings - Fork 18
/
Copy pathoptimal-connectivity.xml
537 lines (458 loc) · 25.9 KB
/
optimal-connectivity.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="optimal-connectivity">
<title>Optimizing Connectivity</title>
<para>People who have tried many of the free RTC softphones have observed
that they don't always work through firewalls or NAT networks. Sometimes
these problems give visual feedback, in the form of error messages
advising that the call can't proceed. In other cases, the call appears
to be connected but audio only works in one direction or stops after
some brief period of time.</para>
<para>Metcalfe's law tells us that the value of a telecommunications
network is proportional to the square of the number of connected users
of the system (n<superscript>2</superscript>), demonstrated in
<xref linkend="metcalfe-quadratic"/>.</para>
<figure float="none" xml:id="metcalfe-quadratic">
<title>Metcalfe's law</title>
<mediaobject>
<imageobject role="print">
<imagedata fileref="figs/gnuplot/metcalfe_quadratic.svg" width="5in"
format="SVG" />
</imageobject>
<imageobject role="web">
<imagedata fileref="figs/gnuplot/metcalfe_quadratic.svg"
format="SVG" />
</imageobject>
</mediaobject>
</figure>
<para>Therefore, the benefit of making the solution work for all those
users who may suffer in certain NAT environments does not just have a
gradual or linear impact, the benefit is quadratic.</para>
<para>Today's RTC technology gives us the tools to deal with these
problems in the vast majority of cases. This chapter gives an overview
of the main concerns.</para>
<sect1 xml:id="optimal-connectivity-codecs">
<title>Codec selection</title>
<para>Codec is a portmanteau of coder-decoder. A codec is an algorithm
for encoding an audio or video signal for storage or transmission over
a digital communications network. Codecs are responsible for
compression of the data stream and may also take some responsibility
for error correction, packet loss concealment and silence
suppression.</para>
<para>Each device or softphone typically has one or more codec
algorithms included. Some software, such as the Asterisk PBX, can
support additional codecs with the help of modules or plugins.</para>
<para>The list of codecs supported by a product depends on the age of
the product, patents and the type of products it is intended to
interact with. Open source solutions generally avoid patented codecs,
although unofficial implementations of them can be found online,
such as the popular G.729 codec for Asterisk.</para>
<table xml:id="codecs-list">
<title>Common codecs</title>
<tgroup cols="5">
<colspec colname="c1" colnum="1" colwidth="1*" />
<colspec colname="c2" colnum="2" colwidth="1*" />
<colspec colname="c3" colnum="3" colwidth="1*" />
<colspec colname="c4" colnum="4" colwidth="1*" />
<colspec colname="c5" colnum="5" colwidth="3*" />
<thead>
<row>
<entry align="center">Name</entry>
<entry align="center">Type</entry>
<entry align="center">Bitrate (kbps)</entry>
<entry align="center">Patented</entry>
<entry align="center">Comments</entry>
</row>
</thead>
<tbody>
<row>
<entry>G.711 (alaw, ulaw)</entry>
<entry>audio</entry>
<entry>64</entry>
<entry>N</entry>
<entry>Widely supported in phones, WebRTC browsers, ISDN gateways
and virtually everywhere else. Quality of traditional phone
calls.</entry>
</row>
<row>
<entry>G.722</entry>
<entry>audio</entry>
<entry>64</entry>
<entry>N</entry>
<entry>Supported in most modern software and high quality desk
phones. Transmits higher quality wideband audio in the same
64kbps bandwidth used by G.711</entry>
</row>
<row>
<entry>Opus</entry>
<entry>audio</entry>
<entry>6 - 510</entry>
<entry>N</entry>
<entry>Support in more modern software, WebRTC browsers and some
very recent desk phones.</entry>
</row>
<row>
<entry>G.729</entry>
<entry>audio</entry>
<entry>8</entry>
<entry>Y</entry>
<entry>A low bitrate codec supported in a lot of older VoIP phones
and related hardware. Voice quality less than a standard
telephone call.</entry>
</row>
<row>
<entry>G.723.1</entry>
<entry>audio</entry>
<entry>5.3, 6.3</entry>
<entry>Y</entry>
<entry>An ultra-low bitrate codec support in some older VoIP
phones and related hardware. Voices sound very bad.</entry>
</row>
<row>
<entry>GSM full rate</entry>
<entry>audio</entry>
<entry>13</entry>
<entry>N</entry>
<entry>The original codec for GSM mobile telephony. Supported in
many older open source products and some VoIP hardware.</entry>
</row>
<row>
<entry>iLBC</entry>
<entry>audio</entry>
<entry>15</entry>
<entry>N</entry>
<entry>A codec developed for Internet use before Opus. Supported
in many older open source products but not widely supported in
VoIP hardware.</entry>
</row>
<row>
<entry>speex</entry>
<entry>audio</entry>
<entry>2-44</entry>
<entry>N</entry>
<entry>A codec developed for Internet use before Opus. Supported
in many older open source products but not widely supported in
VoIP hardware.</entry>
</row>
<row>
<entry>VP8</entry>
<entry>video</entry>
<entry>depends</entry>
<entry>N</entry>
<entry>Mandatory part of WebRTC, supported in browsers. Bitrate
depends on resolution and frame rate.</entry>
</row>
<row>
<entry>H.264</entry>
<entry>video</entry>
<entry>depends</entry>
<entry>Y</entry>
<entry>Mandatory part of WebRTC, supported in browsers and in
many existing video conferencing hardware products.</entry>
</row>
</tbody>
</tgroup>
</table>
<para>The person configuring the software or device can typically
select which codecs are permitted and also specify which codecs are
preferred over others.</para>
<para>When a call is made, the endpoints negotiate to select a codec
that is supported at both ends. If both endpoints have no codecs in
common, the call is not possible and the user may see a message
telling them the call failed. If both endpoints have more than one
codec in common, the exact codec selected in this negotiation algorithm
depends on the relative priorities specified by the administrators
of each endpoint.</para>
<para>Some software has the ability to dynamically change the list
of permitted codecs. For example, some mobile apps will only enable
high-bandwidth codecs when they detect the mobile device is using wifi
or a 4G/LTE network.</para>
<para>More modern codecs support a variable bit rate that can be
changed automatically during a call to adapt to poor network conditions.
The Opus codec used for WebRTC has this capability.</para>
<sect2 xml:id="optimal-connectivity-codecs-recommendations">
<title>Recommendations</title>
<para>Enable as many codecs as possible to maximize the chance of
connection. This reduces the number of calls where the endpoints
fail to find a common set of codecs.</para>
<para>Disable those codecs that won't possibly work given the available
bandwidth. For example, in a remote location with a 128kbit DSL
broadband connection it is usually necessary to disable 64kbit codecs
like G.711 and G.722 and use codecs that have lower audio quality
such as Opus, GSM and G.729.</para>
<para>However, in a location with good bandwidth, don't disable the
low-bit rate/low-quality codecs. They will still be needed when
calling a user who doesn't have great bandwidth.</para>
<para>Order the remaining codecs based on the quality, with the
best quality codec first. G.711 is typically present for
compatibility but other codecs like G.722 offer better quality for
the same amount of bandwidth, so rate G.722 ahead of G.711 and
G.722 will be used whenever possible.</para>
<para>Example 1: a mobile app: enable, in order of priority
starting with the most preferred: Opus, Silk and GSM codecs.</para>
<para>Example 2: a soft PBX in an office accessed by local users
and mobile users: enable, in order of priority starting with the
most preferred: Opus, G.722, G.711, Silk, iLBC, GSM, G.729.</para>
</sect2>
</sect1>
<sect1 xml:id="optimal-connectivity-rtp-encryption">
<title>Media stream encryption compatibility</title>
<para>RTC voice and video (media) streams are almost always
transmitted using Real Time Protocol (RTP).</para>
<para>Encrypting RTP streams is a popular requirement. Some organizations
are subject to regulations requiring encryption. In other cases,
there is a commercial imperative to use encryption. Just as HTTP
sessions can be encrypted with HTTPS, RTP streams can be encrypted
using SRTP. SRTP is optimized for the real-time nature of the media
stream and it permits packet loss. However, thanks to the way that
SRTP has involved, just looking for the encryption setting and turning
it on may lead to a big loss in compatability with other users
if the products you are using don't support the right combination
of encryption protocols. We consider the issues here and then
provide some recommendations.</para>
<para>While SRTP itself is relatively standard, there are several
different ways to exchange keys for an SRTP session. If the two
endpoints trying to setup a call are not trying to use the same
method for key exchange, or if one of them hasn't enabled encryption
at all, <emphasis>the call will not connect</emphasis>.
To meet our goal of maximizing connectivity, these mismatches must
be avoided.</para>
<para>The original method of key exchange for SRTP involves exchanging
SDES keys through the signalling channel, which may be a SIP or XMPP
connection. Most of the original products supporting this standard
take an all-or-nothing approach: if encryption is disabled,
they will only talk to other endpoints without encryption and if
encryption is enabled, they will only talk to other endpoints
capable of the same key exchange protocol.</para>
<para>One disadvantage of sending the keys through the signalling
channel is that the operator of the SIP proxy can easily observe
the keys and use them to decrypt the RTP streams. Another risk
is that the end-user has no way to know if a man-in-the-middle
has swapped the keys, decrypting the streams, recording or modifying
them and then sending them on to the other endpoint using a different
key. Two alternative solutions to these problems have emerged.</para>
<para>Phil Zimmerman, the legendary creator of PGP encryption,
created the ZRTP key exchange protocol. When the endpoints support
ZRTP, they typically send signalling messages (SDP) specifying that
regular, unencrypted RTP will be used for the call. When the
call is answered, each endpoint tries to send some special ZRTP
"hello" packets to the peer on the port normally used for the RTP.
At this stage, the phones will indicate to the users that they are
in a call without encryption. If the endpoints both support ZRTP
then they recognize the "hello" packets from the peer and they
perform a key exchange using the Diffie-Hellman algorithm. Once
the key exchange is completed, the interface on the phone changes
somehow to advise the user that the call is now encrypted.
Due to the nature of the Diffie-Hellman algorithm and the verification
of the algorithm by a short authentication string (SAS) that the users
read to each other, the keys can not be observed or substituted by
any man-in-the-middle.</para>
<para>Nonetheless, the use of the SAS may appear slightly geeky and
it is only valid if you know the other person personally and can
<emphasis>recognize their voice</emphasis> when they are reading the
SAS to you. If you are calling an organization, such as the call
center at the bank or a Government department, the person you are
speaking to is likely to be a complete stranger, you will not have
be familiar with the sound of their voice when they read the SAS
and so you will not be able to rely on this algorithm.</para>
<para>The DTLS-SRTP standard provides another alternative,
although on its own, it does not provide certainty that there is
no man-in-the-middle. DTLS-SRTP provides a way to use DTLS key
agreement before starting SRTP media streams. To provide security
against a man-in-the-middle, DTLS-SRTP can be combined with another
mechanism for the exchange of key fingerprints. For example, RFC 5763
specifies a mechanism for exchanging tamper-proof key fingerprints
in SDP using SIP Identity (RFC 4474).</para>
<para>DTLS-SRTP is the mechanism that has been chosen for the WebRTC
standard and it is widely implemented in web browsers. It is also
supported by the Asterisk and FreeSWITCH projects and some softphones
including Jitsi.</para>
<para>ZRTP is also supported in a number of products, including
Asterisk, FreeSWITCH and the Jitsi softphone. ZRTP is not currently
supported in WebRTC browsers although it is preferred by products
that focus on the privacy of personal communication between
people who know each other, such as the Lumicall app.</para>
<sect2 xml:id="optimal-connectivity-rtp-encryption-multi">
<title>Supporting multiple schemes</title>
<para>For encryption to be effective, users must know when it is
working. For this reason, many products started out with a
simple all-or-nothing approach. If the encryption setting is
enabled, the product will only permit calls to and from peers using
them same encryption settings. Users could then conclude that any
call that connects successfully is encrypted correctly.</para>
<para>Furthermore, the way that an SDP offer/answer exchange is
designed, a media descriptor either has crypto attributes or it
doesn't. There is no simple way for an endpoint to insert
the attributes in the SDP and hint that they are optional.
If they are present, the peer must act as if they are mandatory
and reject the call if it is not capable of using encryption.
</para>
<para>Some phones have an option to send two media descriptors in SDP,
one of them with crypto attributes and the other without. However,
some other phones don't understand this type of SDP and reject the
call completely, so it is not a reliable strategy.</para>
<para>Some phones have an option to try the call with encryption
enabled and if it is rejected, automatically try again without
encryption. This strategy is not glamorous but it has wider
compatibility.</para>
</sect2>
<sect2 xml:id="optimal-connectivity-rtp-encryption-recommendations">
<title>Recommendations for maximizing connectivity</title>
<para>Generally, SDES SRTP should be avoided and should not enabled
except for very specific cases. Do not enable SDES SRTP for arbitrary
calls across the Internet as a means to improve compatability: the
risks outweigh the benefits. The cases where you may use SDES SRTP
include situations where you have IP phones that don't support any
other form of encryption and connections to SIP trunking providers
who don't support any other form of encryption. In these special
cases, ensure that the SDES SRTP calls are only possible for
the specific IP addresses or SIP accounts that you designate.
Note that when you configure a connection profile in the PBX to use
this encryption mode with certain peers, it may not be able to use
the same connection profile for any other peers who don't use
SDES SRTP. This is the all-or-nothing scenario.</para>
<para>When making calls, do not try to use the approach that involves
sending multiple media descriptors, one with crypto attributes
and the other without. For all other calls, use software that tries
to make the call using DTLS-SRTP and if that fails retries the call
using ZRTP. If encryption is not essential for you, allow calls
to proceed even when ZRTP does not secure a connection.</para>
<para>When receiving calls, be willing to accept calls using either
DTLS-SRTP or ZRTP and possibly without any encryption at all, depending
upon your requirements. A SIP proxy can inspect the SDP to determine
which type of encryption is attempted and route the call appropriately.
For example, depending upon which attributes are present, the SIP
proxy could route the call to an Asterisk <literal>sip.conf</literal>
profile with DTLS-SRTP support or to another profile with ZRTP support.
Alternatively, you may use a softphone that is capable of recognizing
and accepting either type of encryption.</para>
<para>Throughout this guide, we present further details about how
to achieve all of the above with the products described herein.</para>
</sect2>
<sect2 xml:id="optimal-connectivity-rtp-encryption-recommendations-sec">
<title>Recommendations for security</title>
<para>The recommendations in the previous section will help optimize
connectivity, by enabling more calls to connect successfully.
Additional steps are required to ensure you fully benefit from the
security that encryption can provide, these are described here.</para>
<para>If you are enabling and relying on DTLS-SRTP with SIP, make sure
you also use SIP Identity (RFC 4474) and use SIP Identity to
authenticate the key fingerprints (RFC 5763).</para>
<para>If you pass calls through intermediate network components
such as Session Border Controllers or a PBX where the media streams
are decrypted and re-encrypted, you need to think carefully about
the impact on authentication. For example, you may configure the PBX
to enforce identity verification and then tell users that all calls
that have come through the PBX can be trusted.</para>
<para>If using phones or softphones that are configured to work
with or without encryption (this is referred to as
<emphasis>opportunistic encryption</emphasis>), it is very desirable
for them to give the user an indication about whether each
call is encrypted. Otherwise, the user should be told to assume
that calls are not encrypted.</para>
</sect2>
</sect1>
<sect1 xml:id="optimal-connectivity-turn">
<title>Use ICE and a TURN server</title>
<para>A TURN server provides a standard way to relay media on behalf
of users who are stuck on a NAT network. The Interactive Connectivity
Establishment (ICE) protocol uses the TURN server to help explore
network topology and give immediate feedback if the call is not possible,
eliminating the menace of ghost calls.</para>
<para>Several TURN servers are now available in convenient Linux packages,
see <xref linkend="turn"/> for details about selecting and installing
one.</para>
</sect1>
<sect1 xml:id="optimal-connectivity-tls">
<title>Use the TLS transport for SIP signalling</title>
<para>When SIP messages are sent over UDP, there are several things
that go wrong. The first problem is that large SIP messages can be
fragmented by the IP stack and some fragments are not delivered.</para>
<para>When ICE is used, the SIP message contains a larger SDP body to
encapsulate the ICE candidates. When a softphone attempts a video call,
the combination of the video and audio descriptors further enlarges
the SDP. More and more frequently in modern RTC deployments, SIP
messages sent over UDP exceed the maximum transmission unit and are
subject to fragmentation. IP packet fragments are not always routed
correctly by other intermediate network components. This was not a
problem in the early days of SIP when the vast majority of devices only
supported a limited number of audio codecs and overall packet sizes
were well under one kilobyte.</para>
<para>A more obscure issue is the presence of routers in homes and
small offices that claim to have <emphasis>SIP helper</emphasis>
capabilities. These routers try to modify the SIP messages to help
them through NAT. In reality, the modifications made by the router
can clash with the ICE protocol or other NAT discovery techniques
used by the phone or the server.</para>
<para>Sending all the SIP messages over a TLS connection eliminates all
of these problems. While there is slightly more effort involved to
create a certificate for the server, it saves an enormous amount of
ongoing support effort.</para>
<para>See <xref linkend="tls"/> for details about creating the TLS
certificates for SIP and XMPP.</para>
</sect1>
<sect1 xml:id="optimal-connectivity-firewalls">
<title>Getting through firewalls</title>
<para>When a user is in a developed country, at their home or using a
mobile Internet connection, they may be behind a NAT router but the
firewall on these devices is usually very permissive for connections
initiated by the user. In the vast majority of cases, the user will be
able to initiate outbound TCP connections to any destination (such as
a SIP, XMPP or WebSocket server on any arbitrary port number) and will
be able to send and receive UDP.</para>
<para>For some NAT routers, the UDP flows will only work reliably when
the peer is using a public IP address. This problem is automatically
detected by ICE connectivity checks and it is resolved by sending the
UDP packets through the TURN server.</para>
<para>For users with more repressive Internet providers, in some less
sophisticated wifi hotspots and in some corporate networks there are
more aggressive firewall policies. With a little care, the RTC
deployment can be designed to work reliably in many of these
environments too.</para>
<para>The first issue is the signalling connection, whether it is SIP,
XMPP or WebSockets. More restricted corporate networks block
outgoing TCP connections, except for those on port 80 or 443, which
they redirect to a transparent proxy.</para>
<para>As a consequence of this TCP blocking, it should be anticipated
that the user's softphone may need to use the HTTP proxy for all RTC
traffic, including media streaming and signalling.</para>
<para>HTTP Proxy servers have a number of issues. Older proxy servers
do not understand the WebSocket protocol and newer proxy servers are
not always configured to allow the WebSocket protocol by default.
To avoid these problems, it is recommended that the WebSocket
connection should always use TLS (WebRTC clients uses a
<code>wss://</code> URL instead of a <code>ws://</code> URL) and the
port 443 only. Many HTTP proxy servers are correctly configured to allow
the web browser to use the HTTP <code>CONNECT</code> method to
initiate pass-through connections to <code>https://</code> URLs using
the default port, 443. Sometimes, however, the HTTP proxy does not
allow any port other than 443. Thanks to the encryption provided by
TLS, the proxy server can not observe whether the HTTP
<code>CONNECT</code> method is being used to reach a web server, a
SIP server, a WebSocket server or even something else such as an
<code>ssh</code> server. Therefore, all services, including SIP over
TLS, XMPP client connectivity (<code>c2s</code>), TURN over TLS and
WebSockets, should listen on port 443 so that any of them can be
reached by a user stuck behind a HTTP proxy.</para>
<para>The next issue is the transmission of UDP. In some networks,
users simply can not exchange any UDP packets with external hosts.
The only resolution to this issue is to tunnel the UDP packets through
TCP. Fortunately, the TURN server can also assist, as the TURN
specification includes support for tunneling packets through a TLS
connection. To maximize the chance of success, it is recommended that
the TURN server is also configured to listen on port 443 so that the
connection to the TURN server will be able to pass through HTTP proxy
servers using the HTTP <code>CONNECT</code> method.</para>
<para>This strategy often requires several different processes (the
standard SIP server, the XMPP <code>c2s</code> service, the webserver
hosting a WebRTC phone, the WebSocket server and the TURN server) to
all listen on the same port, 443. For multiple processes to use the
same port, it is necessary to either have a different public IP
address for each process or to use a solution for port multiplexing,
such as the <emphasis>sslh</emphasis> daemon.</para>
<para>With these strategies, connectivity will be possible for the
vast majority of Internet users, whether at home, at the office or
on the road.</para>
</sect1>
</chapter>