HTTP GET Arrow Data: Multipart Examples

This directory contains examples of HTTP servers and clients that send and receive a multipart response (Content-Type: multipart/mixed) containing JSON data (Content-Type: application/json), an Arrow IPC stream (Content-Type: application/vnd.apache.arrow.stream), and, optionally, plain text data (Content-Type: text/plain).

Picking a Boundary

The multipart/mixed response format uses a boundary string to separate the parts. According to RFC 1341 [1], this string must not appear in the content of any part.

We do not recommend checking the content of the parts for the boundary string: doing so would prevent streaming them, which would increase the server's memory usage and waste CPU time.

Recommended Algorithm

For every multipart/mixed response produced by the server:

  1. Using a CSPRNG [2], generate a byte string with enough entropy to make the probability of collision [3] negligible (at least 160 bits = 20 bytes) [4].
  2. Encode the byte string in a way that is safe to use in HTTP headers. We recommend the base64url encoding described in RFC 4648 [5].
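
To give a feel for what "negligible" means in step 1, the following is a rough sketch of the usual birthday-bound approximation. It assumes boundaries are uniformly random values of the given size; the function name collision_probability and the example workload of 10**12 responses are illustrative assumptions, not part of the examples in this directory.

```python
import math


def collision_probability(num_boundaries: int, entropy_bits: int) -> float:
    """Approximate the chance that any two of `num_boundaries` random
    values of `entropy_bits` bits are equal (birthday bound):

        p ~= 1 - exp(-n * (n - 1) / (2 * 2**bits))
    """
    space = 2 ** entropy_bits
    exponent = num_boundaries * (num_boundaries - 1) / (2 * space)
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-exponent)


# Hypothetical workload: one trillion responses, 160-bit boundaries.
p = collision_probability(10**12, 160)
print(f"{p:.3e}")
```

Under these assumptions the result is roughly 3e-25, and each additional bit of entropy halves it.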

base64url is a variant of base64 that uses - and _ instead of + and /, respectively. The trailing padding characters (=) can be omitted.
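
To make the two steps concrete, here is a minimal sketch that spells them out with the Python standard library. The helper name generate_boundary and its default of 20 bytes are assumptions chosen for illustration; they are not taken from the servers in this directory.

```python
import base64
import secrets


def generate_boundary(num_bytes: int = 20) -> str:
    """Return a random multipart boundary (hypothetical helper)."""
    # Step 1: draw random bytes from the operating system's CSPRNG.
    raw = secrets.token_bytes(num_bytes)
    # Step 2: base64url-encode them and drop the "=" padding.
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=")
    return encoded.decode("ascii")


boundary = generate_boundary()
print(len(boundary))  # 27 characters for 20 random bytes
```

The resulting string contains only letters, digits, - and _.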

This algorithm can be implemented in Python using the secrets.token_urlsafe() function.

If you generate a boundary string with a generous 224 bits of entropy (i.e., 28 bytes), the base64url encoding produces a 38-character string, which is well below the 70-character limit defined by RFC 1341.

>>> import secrets
>>> boundary = secrets.token_urlsafe(28)
>>> len(boundary)
38
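
Once a boundary is chosen, it goes into the Content-Type header and frames each part of the body. The sketch below shows one possible way to do that while keeping the parts streamable. The generator multipart_chunks and the sample JSON and text payloads are hypothetical; a real server would stream the Arrow IPC part in the same way.

```python
import secrets
from typing import Iterable, Iterator, Tuple


def multipart_chunks(
    boundary: str, parts: Iterable[Tuple[str, Iterable[bytes]]]
) -> Iterator[bytes]:
    """Yield the body of a multipart response chunk by chunk.

    Each part is a (content_type, chunks) pair. Chunks are forwarded
    as-is, so no part has to be buffered or scanned for the boundary.
    """
    for content_type, chunks in parts:
        # Delimiter line, part headers, then a blank line.
        head = f"--{boundary}\r\nContent-Type: {content_type}\r\n\r\n"
        yield head.encode("ascii")
        yield from chunks
        yield b"\r\n"
    # Closing delimiter marks the end of the multipart body.
    yield f"--{boundary}--\r\n".encode("ascii")


boundary = secrets.token_urlsafe(28)
content_type_header = f'multipart/mixed; boundary="{boundary}"'
parts = [
    ("application/json", [b'{"rows": 3}']),  # sample payload
    ("text/plain", [b"done"]),  # sample payload
]
body = b"".join(multipart_chunks(boundary, parts))
```

Joining the chunks here is only for demonstration; a server would normally write each chunk to the socket as it is produced.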

Footnotes

  1. RFC 1341 - Section 7.2 The Multipart Content-Type

  2. Cryptographically Secure Pseudo-Random Number Generator

  3. Birthday Problem

  4. Hash Collision Probabilities

  5. RFC 4648 - Section 5 Base 64 Encoding with URL and Filename Safe Alphabet