This directory contains examples of HTTP servers and clients that send and receive a multipart response (Content-Type: multipart/mixed) containing JSON data (Content-Type: application/json), an Arrow IPC stream (Content-Type: application/vnd.apache.arrow.stream), and, optionally, plain text data (Content-Type: text/plain).
The multipart/mixed response format uses a boundary string to separate the parts. According to RFC 1341 [1], this string must not appear in the content of any part.
We do not recommend checking for the boundary string in the content of the parts, because that would prevent streaming them, which in turn would increase the server's memory usage and waste CPU time.
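To illustrate why skipping the check keeps the response streamable, here is a minimal sketch of a generator that frames parts without ever inspecting or buffering their content. The function name `iter_multipart`, the fixed boundary `example-boundary`, and the placeholder Arrow bytes are assumptions made for this illustration; they are not taken from the examples in this directory.

```python
import json
from typing import Iterable, Iterator, Tuple

CRLF = b"\r\n"


def iter_multipart(
    boundary: str, parts: Iterable[Tuple[str, Iterable[bytes]]]
) -> Iterator[bytes]:
    """Yield the bytes of a multipart/mixed body one chunk at a time."""
    delimiter = b"--" + boundary.encode("ascii")
    for content_type, chunks in parts:
        # Each part starts with the delimiter line and its own headers.
        yield delimiter + CRLF
        yield b"Content-Type: " + content_type.encode("ascii") + CRLF + CRLF
        # Chunks are forwarded as they arrive; they are never scanned
        # for the boundary string and never accumulated in memory.
        for chunk in chunks:
            yield chunk
        yield CRLF
    # The closing delimiter marks the end of the last part.
    yield delimiter + b"--" + CRLF


# Hypothetical parts; the Arrow bytes are a placeholder, not a real IPC stream.
parts = [
    ("application/json", [json.dumps({"rows": 3}).encode("utf-8")]),
    ("application/vnd.apache.arrow.stream", [b"<arrow ", b"ipc bytes>"]),
    ("text/plain", [b"done"]),
]
body = b"".join(iter_multipart("example-boundary", parts))
```

A real server would write each yielded chunk to the socket instead of joining them, and would use a freshly generated random boundary rather than a fixed one.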
For every multipart/mixed response produced by the server:
- Using a CSPRNG [2], generate a byte string with enough entropy to make the probability of collision [3] negligible (at least 160 bits = 20 bytes) [4].
- Encode the byte string in a way that is safe to use in HTTP headers. We recommend the base64url encoding described in RFC 4648 [5]. base64url is a variant of base64 that uses - and _ instead of + and / respectively. The trailing padding characters (=) are also omitted, which RFC 4648 permits.
This algorithm can be implemented in Python using the secrets.token_urlsafe() function.
If you generate a boundary string with a generous 224 bits of entropy (i.e. 28 bytes), the base64url encoding produces a 38-character string, which is well below the 70-character limit defined by RFC 1341.
>>> import secrets
>>> boundary = secrets.token_urlsafe(28)
>>> len(boundary)
38
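For readers who want to see the two steps spelled out, the sketch below does by hand roughly what secrets.token_urlsafe() does internally in CPython, and then places the result in a Content-Type header. The helper name `make_boundary` is hypothetical, and quoting the boundary parameter is a conservative choice assumed here, not a requirement stated in this document.

```python
import base64
import secrets


def make_boundary(nbytes: int = 28) -> str:
    """Return an unpadded base64url string built from nbytes random bytes."""
    # Step 1: draw nbytes from the operating system's CSPRNG.
    raw = secrets.token_bytes(nbytes)
    # Step 2: base64url-encode and drop the trailing '=' padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


boundary = make_boundary()
# The boundary only contains letters, digits, '-' and '_', so it is safe
# inside a quoted parameter value.
content_type = f'multipart/mixed; boundary="{boundary}"'
```

The unpadded length is ceil(4 * nbytes / 3) characters, so 28 bytes give 38 characters and the minimum of 20 bytes gives 27 characters.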