reiddraper edited this page Nov 16, 2011 · 29 revisions

Large Files

This page will document our ideas and questions for large-file support.

Implementation Pseudocode

  1. Inspect the request's Content-Length header to determine whether the request is worth chunking. If so, move to step 2; otherwise perform a "normal" put.
  2. Create a UUID. This UUID will be used to namespace blocks from concurrent updates. For example, we don't want block 0 of a request/PUT that is in progress to collide with, and trample over, block 0 of existing data.
  3. Create a metadata/manifest object. This object will have several fields, and will be updated as the file is chunked up and written to Riak. The fields are (tentatively):
     • uuid
     • bucket
     • key
     • content-length
     • time created
     • time finished # if complete
     • blocks remaining # the set of blocks still to be written to Riak

to the same {Bucket, Key} that the object would
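Steps 1 to 3 above could be sketched roughly as follows. This is only an illustration in Python, not the actual implementation: the block size, the chunking threshold, the `uuid:block-number` naming scheme, and all function names are assumptions made for the example, and nothing is actually written to Riak.

```python
import time
import uuid

# Hypothetical values: this page leaves the block size as an open question.
BLOCK_SIZE = 1024 * 1024       # assumed 1 MiB blocks
CHUNK_THRESHOLD = BLOCK_SIZE   # assumed: only chunk bodies larger than one block


def should_chunk(content_length):
    """Step 1: decide from Content-Length whether chunking is worthwhile."""
    return content_length > CHUNK_THRESHOLD


def new_manifest(bucket, key, content_length):
    """Steps 2 and 3: create a UUID and the initial manifest."""
    # Integer ceiling division: a partial final block still counts as a block.
    block_count = (content_length + BLOCK_SIZE - 1) // BLOCK_SIZE
    return {
        "uuid": uuid.uuid4().hex,
        "bucket": bucket,
        "key": key,
        "content_length": content_length,
        "time_created": time.time(),
        "time_finished": None,  # filled in once every block is written
        "blocks_remaining": set(range(block_count)),
    }


def block_key(manifest, block_number):
    """Hypothetical block naming: the UUID prefix keeps concurrent PUTs apart."""
    return "{}:{}".format(manifest["uuid"], block_number)


def mark_block_written(manifest, block_number):
    """Update the manifest after one block has been stored."""
    manifest["blocks_remaining"].discard(block_number)
    if not manifest["blocks_remaining"]:
        manifest["time_finished"] = time.time()
```

Because each PUT gets its own UUID, two concurrent uploads to the same bucket and key produce different block keys for block 0, so neither overwrites the other's blocks.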

Questions

  • What should the block size be?

Longer-Term Ideas

  • Rack awareness
  • Larger block size
  • Multiple-disk support