|
| 1 | +--- |
| 2 | +slug: cad3-revolution |
| 3 | +title: The CAD3 Revolution |
| 4 | +authors: [mikera] |
| 5 | +tags: [convex, cad3, lattice] |
| 6 | +--- |
| 7 | + |
| 8 | +A quick note on [CAD3](https://docs.convex.world/docs/cad/encoding) because I think everyone should understand how important it is - it's also probably the last significant piece we NEED to get right before Protonet goes live. |
| 9 | +<!-- truncate --> |
| 10 | + |
| 11 | +What is CAD3? It's the format in which we encode lattice data. For example, the number `13` becomes the 2-byte sequence `0x110d`. If you've used Convex Desktop, you may recognise these encodings from the message encoding utility in the "Hacker Tools". |
| 12 | + |
| 13 | +### Why this is critical |
| 14 | + |
| 15 | +The CAD3 encodings are important to everything we are doing: |
| 16 | +- These encodings describe all the data in Convex and other lattice applications: the global state, DLFS drives, lattice structures for merging, transactions, CVM smart contract code etc. |
| 17 | +- These sequences of bytes (encodings) are what we put through a SHA3-256 cryptographic hash to build Merkle DAGs and verify the integrity of data |
| 18 | +- These are also the raw bytes that get transmitted between peers and binary clients |
| 19 | +- These are also the bytes that get stored to disk in Etch |
| 20 | +- These are also performance critical - a lot of the performance in Convex depends on how fast we can encode, transmit and store data |
| 21 | +- These are also security critical - attackers might attempt to construct malicious encodings to circumvent security or mount a DoS attack |
| 22 | + |
| 23 | +Hopefully this makes it clear: these encodings are pretty essential to Convex and lattice technology as a whole! They are also very hard to change after we go live: changing encodings would mean everyone needs to re-encode all their data in the new format! Hence why we're super focused on getting this right before Protonet launch. |
| 24 | + |
| 25 | +### The Good News |
| 26 | + |
| 27 | +The good news: we are now very close to having a near-perfect encoding format for decentralised data. It has the potential to be a game changer much more broadly, as it solves many of the problems that existing encoding formats have when used for decentralised data. Some juicy features: |
| 28 | +- A **unique canonical encoding** for every value, such that it can be hashed to a stable ID |
| 29 | +- An **efficient binary format** for both storage and transmission |
| 30 | +- A **self describing** format - no additional schema is required to read an encoding |
| 31 | +- Provision of **immutable persistent data structures** for the lattice data values used in Convex |
| 32 | +- Automatic generation of a verifiable **Merkle DAG** via references to other value IDs |
| 33 | +- Support for **rich data types** used in the CVM and lattice data (Maps, Sets, Vectors, Blobs etc.) |
| 34 | +- Data structures of **arbitrary size** may be represented. The lattice is huge. |
| 35 | +- Support for **partial data**: we often need to transmit deltas of large data structures, so we need a way to build these deltas and reconstruct the complete structure when they are received (assuming existing data can fill the gaps) |
| 36 | +- Ability to encode / decode n bytes of data in O(n) time and space to ensure **DoS resistance** |
| 37 | +- Fixed upper bound on the encoding size of any value (excluding referenced children) so that reading and writing can occur in fixed-size buffers - this allows **streaming capabilities** including zero-copy operations. |
| 38 | + |
| 39 | +### What next? |
| 40 | + |
| 41 | +Full CAD3 specifications are outlined in [CAD003](/docs/cad/encoding). For anyone wanting to work on the CAD3 format or its implementation in Convex, please get involved! |
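
To make the `13` to `0x110d` example from the post concrete, here is a minimal Python sketch that encodes a small positive integer and hashes the encoding with SHA3-256 to get a stable value ID. It assumes the first byte is a tag equal to `0x10` plus the payload length, followed by the integer as minimal big-endian two's-complement bytes. That rule is consistent with the `13` example but is an assumption here, not the CAD003 definition, so check the spec for the real rules. `encode_small_int` and `value_id` are hypothetical names, not Convex APIs.

```python
import hashlib


def encode_small_int(n: int) -> bytes:
    """Encode a positive integer below 2**63 (hypothetical helper, not a Convex API).

    ASSUMPTION: tag byte = 0x10 + payload length, then the value as minimal
    big-endian two's-complement bytes. Only the 13 -> 0x110d case is
    confirmed by the post; zero, negatives and big integers are not covered.
    """
    if not 0 < n < 2**63:
        raise ValueError("this sketch only covers 0 < n < 2**63")
    # One extra bit for the sign, rounded up to whole bytes.
    length = (n.bit_length() + 8) // 8
    return bytes([0x10 + length]) + n.to_bytes(length, "big", signed=True)


def value_id(encoding: bytes) -> bytes:
    """Return the SHA3-256 digest of an encoding, used here as the value ID."""
    return hashlib.sha3_256(encoding).digest()


encoding = encode_small_int(13)
print(encoding.hex())            # 110d
print(value_id(encoding).hex())  # 32-byte digest, identical for identical encodings
```

Because every value has exactly one canonical encoding, hashing that encoding always yields the same ID, which is what lets one value reference another by hash and form a Merkle DAG.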