Commit 155ddd4

Merge pull request #47 from Convex-Dev/develop
Publish lattice technology updates
2 parents 370d8e8 + 34da143 commit 155ddd4

5 files changed

+124
-7
lines changed

README.md

+3-2
@@ -6,8 +6,8 @@ This repository is dedicated to Convex Design and Documentation.
66

77
## Key Documents
88

9-
- [Convex Manifesto](papers/manifesto.md)
10-
- [Convex White Paper (Draft)](papers/convex-whitepaper.md)
9+
- [Convex Manifesto](docs/overview/manifesto.md)
10+
- [Convex White Paper](docs/overview/convex-whitepaper.md)
1111

1212
## Current CADs
1313

@@ -57,6 +57,7 @@ The main [Convex repository](https://github.com/Convex-Dev/convex) is the primar
5757
- `convex-core` - the core Convex data structures and algorithms including CPoS
5858
- `convex-peer` - Convex peer implementation and P2P networking
5959
- `convex-gui` - Convex Desktop GUI Application
60+
- `convex-cli` - Convex CLI utilities
6061
- `convex-restapi` - REST API Server implementation
6162

6263
| Name | Description | Status | Lead Dev.

docs/cad/013_metadata/README.md

-2
@@ -16,9 +16,7 @@ Metadata is a map containing any arbitrary set of key-values. It is specified af
1616

1717
```clojure
1818
(def some-symbol
19-
2019
^{:my ["meta" :data]}
21-
2220
42)
2321
```
2422

docs/overview/lattice.md

+118
@@ -0,0 +1,118 @@
1+
---
2+
title: Lattice Technology
3+
authors: [convex, mikera]
4+
sidebar_position: 1
5+
tags: [convex, lattice]
6+
---
7+
8+
# Lattice Technology
9+
10+
Lattice technology is the breakthrough that the decentralised digital world has been waiting for.
11+
12+
Imagine an infinitely scalable, self-repairing, decentralised cloud of data and compute resources accessed by self-sovereign individuals, secured with strong cryptographic technology and backed up by powerful consensus algorithms. Anybody can participate, nobody can control it. This is the promise of the Lattice.
13+
14+
## How the Lattice works
15+
16+
### Algebraic foundation
17+
18+
The Lattice is based on the mathematical / algebraic concept of a [lattice](https://en.wikipedia.org/wiki/Lattice_(order)). **Lattice values** are elements of a set where there is a *merge* function that can combine any two lattice values.
19+
20+
By repeated merges of lattice values, the system is guaranteed to converge to a single lattice value (in the sense of eventual consistency). This enables the Lattice to operate as a [Conflict-free Replicated Data Type (CRDT)](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type).
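As a minimal sketch of the convergence property, assume the simplest possible lattice: values are sets and the merge function is set union (the names `merge` and `updates` are illustrative and not taken from Convex). Whatever order the updates arrive in, repeated merging ends at the same value:

```python
from functools import reduce
from itertools import permutations

def merge(a: frozenset, b: frozenset) -> frozenset:
    """Join of two lattice values: here, plain set union."""
    return a | b

# Three updates produced independently by different participants.
updates = [frozenset({"x"}), frozenset({"y", "z"}), frozenset({"x", "z"})]

# Merge the updates in every possible delivery order and collect the results.
results = {reduce(merge, order, frozenset()) for order in permutations(updates)}
converged = next(iter(results))
```

Because union is commutative, associative and idempotent, `results` holds exactly one value, and merging that value with itself changes nothing.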
21+
22+
### Lattice innovations
23+
24+
Lattice technology augments the idea of the CRDT in several key ways:
25+
- Adding cryptographic security to enable secure decentralised operation (digital signatures and cryptographic hashes).
26+
- Ability to create consensus over an ordering of transactions (essential for transaction security, e.g. the double-spend problem)
27+
- Use of powerful immutable persistent data structures as the lattice values. These can be of arbitrary size and contain arbitrary data, but only the differences need to be transmitted and processed - similar to the "git" version control system.
28+
- Lattice data structures are also Merkle trees, providing strong integrity guarantees and fast identity checking.
29+
- Enforcement of rules regarding which incoming lattice values are "accepted" by a participant: this prevents malicious actors from disrupting the Lattice as a whole. Merging a bad lattice value is generally pointless: in most cases all it means is that a participant wastes resources producing a lattice value that will subsequently be ignored by others, so there is an incentive for all participants to immediately reject such values.
30+
31+
## Parts of the Lattice
32+
33+
Sections of the Lattice are defined by the lattice values they utilise and how these values are merged, which in turn defines the rules by which they operate.
34+
35+
Each section is effectively a sub-lattice of the Lattice as a whole: we exploit the property that a map of keys to lattice values is itself a lattice (with the simple merge function: combine entries of both maps into a single map and merge the lattice values of any keys that collide).
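The map merge described here can be sketched as follows. This is an illustration under stated assumptions rather than the Convex implementation: `merge_maps` is a hypothetical helper, and integer values merged with `max` stand in for arbitrary lattice values.

```python
from typing import Callable, Dict, TypeVar

V = TypeVar("V")

def merge_maps(a: Dict[str, V], b: Dict[str, V],
               merge_value: Callable[[V, V], V]) -> Dict[str, V]:
    """Combine entries of both maps; merge the values of keys that collide."""
    out = dict(a)
    for key, value in b.items():
        out[key] = merge_value(out[key], value) if key in out else value
    return out

left = {"counter": 3, "clock": 10}
right = {"clock": 12, "other": 1}

# max is itself a lattice merge on integers, so the map merge inherits its properties.
merged = merge_maps(left, right, max)  # {"counter": 3, "clock": 12, "other": 1}
```

Swapping the arguments gives the same map, which is what lets sub-lattices nest inside one another without extra coordination.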
36+
37+
Participants enforce these rules on a decentralised basis. Anyone breaking the rules and sharing illegal values is able to do so, but the lattice values they produce will effectively be ignored by other participants: such behaviour cannot harm the integrity of the Lattice as a whole.
38+
39+
The lattice is being initialised with a number of sub-lattices that perform critical functions, outlined below.
40+
41+
### Convex Consensus Lattice
42+
43+
The Convex CPoS consensus algorithm operates a lattice designed to provide a secure, decentralised global state machine.
44+
45+
Lattice values are beliefs, which are shared by peers and merged using the belief merge function, as defined in the [Convex White Paper](convex-whitepaper.md).
46+
47+
The Consensus Lattice performs the functions of a typical L1 blockchain:
48+
- A global state machine, publicly visible but with changes protected by byzantine fault tolerant consensus
49+
- An account based model allowing self-sovereign control of digital assets protected by digital signatures
50+
- The ability to use Turing-complete smart contracts and autonomous actors as "unstoppable code" on the CVM
51+
- Capability to store and manage arbitrary data as roots of trust for other decentralised applications
52+
53+
### Data Lattice
54+
55+
The Data Lattice is a lattice which stores arbitrary content-addressable data on a self-sovereign basis.
56+
57+
Lattice values are arbitrary sets of data (indexed by cryptographic hash) and the merge function simply takes the union of these sets. Nodes may discard values they are not interested in in order to save resources (effectively deleting such values from the )
58+
59+
Three essential functions are supported:
60+
- Participants can **read** data from a lattice node that they can access, acquiring whatever data is associated with a given hash
61+
- Lattice nodes can **acquire** data from other nodes, again indexed by cryptographic hash. This brings a copy of the data into the local storage of the node. Acquisition can be from a specific node, or searched for across the whole data lattice (similar to BitTorrent)
62+
- Controllers of nodes can **pin** data they are interested in so that it is retained by their lattice node. This ensures at least one copy will always be available to the lattice as a whole
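A toy model of the three functions above, assuming an in-memory store keyed by hash. The `Node` class, its method names and the use of SHA3-256 are assumptions made for illustration and do not describe the actual Data Lattice API.

```python
import hashlib

def content_hash(data: bytes) -> str:
    return hashlib.sha3_256(data).hexdigest()

class Node:
    def __init__(self):
        self.store = {}      # hash -> bytes held locally
        self.pinned = set()  # hashes this node has committed to retain

    def put(self, data: bytes) -> str:
        h = content_hash(data)
        self.store[h] = data
        return h

    def read(self, h: str):
        """Return the data associated with a hash, or None if not held."""
        return self.store.get(h)

    def acquire(self, h: str, other: "Node") -> bool:
        """Copy data from another node, verifying it against the requested hash."""
        data = other.read(h)
        if data is None or content_hash(data) != h:
            return False
        self.store[h] = data
        return True

    def pin(self, h: str) -> None:
        self.pinned.add(h)

    def discard_unpinned(self) -> None:
        """Free resources by dropping everything that is not pinned."""
        self.store = {h: d for h, d in self.store.items() if h in self.pinned}

a, b = Node(), Node()
h = a.put(b"hello lattice")
acquired = b.acquire(h, a)
b.pin(h)
b.discard_unpinned()
```

Since the hash is the address, the acquiring node can check what it receives without trusting the node that supplied it.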
63+
64+
The Data Lattice is similar in concept to IPFS / IPLD, but based on higher performance and more efficient lattice technology.
65+
66+
### Data Lattice File System (DLFS)
67+
68+
DLFS is a lattice that builds on top of the data lattice to provide self-sovereign replicated file systems.
69+
70+
Lattice values are file system trees with files and directories, similar to a traditional file system. The merge function operates like file replication: files are updated if they are more recent versions and if the party making the change is authorised to do so.
71+
72+
Because lattice values are immutable persistent data structures, it is also possible to "snapshot" an entire DLFS drive with a single cryptographic hash. This snapshot could, for example, be pinned in the data lattice for audit / backup / analysis purposes. This operation is extremely efficient because of structural sharing: most of the actual storage will be shared with the current DLFS drive and/or other snapshots (you are only really storing the deltas from other versions).
73+
74+
### Execution Lattice
75+
76+
The execution lattice specifies compute tasks to be performed on a decentralised basis.
77+
78+
Lattice values are a map of job IDs to signed and timestamped job records. The merge function again combines these maps, with the most recent correctly signed job status preferred in the event of collisions.
79+
80+
Job records consist of:
81+
- A specification of the compute job to be performed
82+
- Metadata about the job (including authorisation for completing the job)
83+
- A map of inputs (provided by the requestor)
84+
- A map of outputs (filled in by the completer)
85+
86+
Importantly, such job executions are highly extensible. They can utilise any form of compute task including computation in private enclaves, use of encrypted data or harnessing specialised compute infrastructure. Flexible authorisation makes it possible to specify tasks that must be completed by a specific party, or to make it open for anyone to complete the task (perhaps in exchange for some form of tokenised payment).
87+
88+
### P2P Lattice
89+
90+
The P2P lattice is a lattice designed to facilitate P2P communications. It solves the problem of being able to identify and locate participants on a decentralised network, especially with respect to resolving IP addresses for communication. Peers can be lattice nodes (e.g. Convex peers) or clients wishing to set up secure communication channels with other clients.
91+
92+
Lattice values are a map of public keys to signed and timestamped metadata describing a peer. The merge function is simply to combine these maps, and to take the most recent correctly signed metadata if keys collide.
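The merge rule for peer metadata can be sketched as below. Everything here is hypothetical: `Record`, `merge_peers` and `toy_verify` are illustrative names, and the string check stands in for real digital signature verification. Ties on timestamp are broken by comparing the remaining fields, an assumption added so that the merge stays order-independent.

```python
from typing import Callable, Dict, NamedTuple

class Record(NamedTuple):
    timestamp: int
    address: str
    signature: str  # placeholder for a real digital signature

def merge_peers(a: Dict[str, Record], b: Dict[str, Record],
                verify: Callable[[str, Record], bool]) -> Dict[str, Record]:
    """Combine both maps, keeping the most recent correctly signed record per key."""
    out = {key: rec for key, rec in a.items() if verify(key, rec)}
    for key, rec in b.items():
        if not verify(key, rec):
            continue  # badly signed records are ignored
        current = out.get(key)
        # Tuple comparison orders by timestamp first, then by the other fields.
        if current is None or rec > current:
            out[key] = rec
    return out

def toy_verify(key: str, rec: Record) -> bool:
    # Stand-in check only; a real node would verify a signature against the public key.
    return rec.signature == "signed-by-" + key

a = {"pk1": Record(100, "10.0.0.1:18888", "signed-by-pk1")}
b = {"pk1": Record(200, "10.0.0.2:18888", "signed-by-pk1"),
     "pk2": Record(150, "10.0.0.9:18888", "forged")}

merged = merge_peers(a, b, toy_verify)
```

In this run `pk1` resolves to the newer address and the forged `pk2` entry never enters the merged value.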
93+
94+
The P2P lattice operates in a manner similar to [Kademlia](https://en.wikipedia.org/wiki/Kademlia), allowing the location of arbitrary peers on the Internet without depending on any centralised location service. In the Kademlia model, peers only need to store metadata for other peers that they are relatively "near" to in cryptographic space, making this a highly efficient and fault-tolerant decentralised service.
95+
96+
## Efficiency and scalability
97+
98+
How do we build a global, decentralised data structure of unlimited scale? How do we make it fast? Or even feasible?
99+
100+
There are a number of key engineering ideas here. We've been building and stress-testing lattice technology for 5+ years, which has given us some unique implementation advantages and insights:
101+
102+
**Structural sharing** - using immutable [persistent data structures](https://en.wikipedia.org/wiki/Persistent_data_structure) means that when changes to a large lattice value are made, a new lattice value is produced which shares most of its data with the original value. This means that additional storage and processing are only required for the parts that have changed.
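A minimal sketch of structural sharing, using nested Python tuples as a stand-in for persistent data structures (the helper `assoc` is illustrative). Updating one leaf copies only the nodes along the path to it; every other subtree is reused by reference:

```python
def assoc(tree, path, value):
    """Return a new tree with the leaf at `path` replaced.

    Only the nodes along the path are rebuilt; sibling subtrees are shared.
    """
    if not path:
        return value
    i = path[0]
    return tree[:i] + (assoc(tree[i], path[1:], value),) + tree[i + 1:]

old = ((1, 2), (3, 4), (5, 6))
new = assoc(old, (1, 0), 30)   # ((1, 2), (30, 4), (5, 6))

shared_first = new[0] is old[0]   # True: untouched subtree is the same object
shared_last = new[2] is old[2]    # True
```

The original value `old` is unchanged, so both versions remain usable while sharing two of their three subtrees.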
103+
104+
**Selective attention** - nodes may select whichever subsets of the lattice they are interested in handling on a self-sovereign basis. This means that participants can scale their resource usage based on their own needs. For example, a Convex peer operator might elect only to participate in the Convex consensus lattice and a small subset of DLFS drives representing data that the peer operator needs to access and maintain.
105+
106+
**Delta transmission** - building upon structural sharing, it is possible to only transmit the deltas (changes) when a new lattice value is communicated. This assumes that the recipient will have the original data, but this is a good assumption if they are already participating in the same lattice (and if they don't, they can simply acquire it again...). This means that network / communication requirements are only ever (at most) proportional to the number of changes made in regions of the Lattice that a specific node has chosen to participate in.
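One way to picture delta transmission is the sketch below, assuming values are stored as hash-addressed nodes and that a recipient holding a hash also holds everything beneath it. The helpers `put` and `delta`, the JSON encoding and the use of SHA3-256 are illustrative assumptions, not the Convex wire format.

```python
import hashlib
import json

def put(store, children, payload):
    """Store a node whose hash covers its payload and its child hashes."""
    encoding = json.dumps([payload, children]).encode()
    h = hashlib.sha3_256(encoding).hexdigest()
    store[h] = (payload, children)
    return h

def delta(store, root, known):
    """Hashes reachable from `root` that the recipient does not already hold."""
    missing, seen, stack = [], set(), [root]
    while stack:
        h = stack.pop()
        if h in known or h in seen:
            continue  # recipient already has this subtree, so skip it entirely
        seen.add(h)
        missing.append(h)
        stack.extend(store[h][1])
    return missing

store = {}
a = put(store, [], "A")
b = put(store, [], "B")
root1 = put(store, [a, b], "root")
known = set(store)               # recipient is fully synced at root1

b2 = put(store, [], "B2")        # one leaf changes
root2 = put(store, [a, b2], "root")

to_send = delta(store, root2, known)
```

Here `to_send` contains only the new root and the changed leaf; the unchanged leaf `a` is never retransmitted.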
107+
108+
**Merge coalescing** - A node may receive multiple lattice values from different sources in a short amount of time. With a series of repeated merges, it produces a new lattice value incorporating all of these updates. It then only needs to produce and transmit one new lattice value (typically with a much smaller delta than the sum of those received). This coalescing behaviour therefore automatically reduces traffic and scales the load to the amount that nodes can individually handle (typically, network transmission bandwidth will be the primary bottleneck since local lattice value merges are very fast).
109+
110+
**Embedded encodings** - Merkle trees have the disadvantage that they require the computation of a cryptographic hash at every branch of the tree. This can become expensive with a large number of small branches, so the Lattice makes use of a novel efficient encoding scheme (outlined in [CAD003](../cad/003_encoding/README.md)) that compresses multiple embedded values into a single Merkle tree branch (while still maintaining the important property of having a unique encoding with a content-addressable hash). Typical branches might be around 1000 bytes on average (and never less than 141 bytes), which ensures efficiency from a hashing perspective and also keeps overall storage requirements near-optimal.
111+
112+
**Branching factor** - there is a trade-off with branching factors in Merkle trees. Too low, and your tree becomes excessively deep with a lot of extra intermediate hashes to store and compute. Too high, and the encoding of a single branch becomes large, meaning that small changes result in a lot of redundant copying. Lattice values are optimised to provide efficient branching ratios for different use cases (typically ~10). In all cases, the number of branches, encoding size and cost of navigating to a direct child branch are guaranteed to be `O(1)` by design.
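The trade-off can be made concrete with some rough arithmetic. The sketch below assumes a balanced tree over one million leaves and counts, for a single leaf update, how many branches must be rehashed (the depth) and roughly how many child references must be re-encoded (depth times branching factor). The branching factors 2, 16 and 256 are chosen purely for illustration.

```python
def tree_depth(n_leaves: int, branching: int) -> int:
    """Smallest depth at which a balanced tree can hold n_leaves."""
    depth, capacity = 0, 1
    while capacity < n_leaves:
        capacity *= branching
        depth += 1
    return depth

N = 1_000_000

# branching -> (hashes recomputed per update, child references re-encoded per update)
table = {b: (tree_depth(N, b), tree_depth(N, b) * b) for b in (2, 16, 256)}
# {2: (20, 40), 16: (5, 80), 256: (3, 768)}
```

A binary tree needs 20 hashes per update, while a 256-way tree needs only 3 but re-encodes far more data each time; moderate branching factors sit between these extremes.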
113+
114+
**Orthogonal persistence** - Lattice values can exist in memory or on other storage media (typically local disks). From a developer perspective, these are effectively identical: there is no need to treat in-memory and externally stored values differently. However, values are efficiently loaded from storage and cached on demand, so that most of the time the lattice behaves like a very fast in-memory database despite being potentially much larger than local RAM.
115+
116+
**Fast comparison** - Lattice values enable some extremely quick comparison algorithms which lattice technology fully exploits. Most simply, checking the identity of any two values is simply the comparison of two cryptographic hashes, which can be done in `O(1)` time. Perhaps surprisingly, computing the common prefix of two vectors of arbitrary length is also just `O(1)`, which is heavily exploited to compare transaction orderings efficiently in CPoS. More sophisticated comparisons include computing differences between multiple lattice data structures (typically `O(n)` or `O(n log n)` where `n` is the size of differences). It is thanks to these comparison algorithms that we are able to implement extremely fast lattice merge operations.
117+
118+
**Garbage collection** - lattice values work *extremely* well with a model of lazy garbage collection. Technically, you can keep lattice values as long as you like (they are immutable and content-addressable after all, so never go stale). However, sooner or later you are likely to hit storage constraints. In this case, you can simply retain the subset of the lattice(s) you are interested in as identified by current root value(s) and discard all other values. This works for both in-memory caches (e.g. leveraging the JVM GC) and long term storage (e.g. `convex etch gc`).
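The retain-from-roots idea can be sketched as a simple reachability sweep. This is an illustration only and says nothing about how `convex etch gc` is implemented: the `collect` helper is hypothetical, and short strings stand in for cryptographic hashes.

```python
def collect(store, roots):
    """Keep only the cells reachable from the given roots.

    `store` maps each hash to the list of child hashes it references.
    """
    live, stack = set(), list(roots)
    while stack:
        h = stack.pop()
        if h in live or h not in store:
            continue
        live.add(h)
        stack.extend(store[h])
    return {h: children for h, children in store.items() if h in live}

store = {
    "r1": ["a", "b"],   # current root of interest
    "r2": ["b", "c"],   # another root we no longer care about
    "a": [], "b": [], "c": [],
    "old": ["a"],       # stale value from an earlier version
}

kept = collect(store, ["r1"])   # keeps "r1", "a" and "b"
```

Because values are immutable, nothing kept can later be invalidated by what was discarded, and discarded cells can be acquired again if they are ever needed.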

docs/overview/manifesto.md

+2-2
@@ -8,11 +8,11 @@ tags: [convex, community, philosophy]
88

99
Building Open, Decentralized Economies for the 21st Century
1010

11-
The time for change is now. For far too long, our societies have been shackled by inefficient and inequitable economic systems. Centralized institutions wield excessive power, exploiting monopolistic control over much of our economic activity. Transaction costs are exorbitant, hindering progress and imposing unnecessary burdens on individuals. Countless people are unfairly excluded from financial and economic participation. Our current economic models are inflicting irreversible damage on the natural world we all share.
11+
For too long, our societies have been shackled by inefficient and inequitable economic systems. Centralised institutions wield excessive power, exploiting monopolistic control over much of our economic activity. Transaction costs are exorbitant, hindering progress and imposing unnecessary burdens on individuals. Countless people are unfairly excluded from financial and economic participation. Our current economic models are inflicting irreversible damage on the natural world we all share.
1212

1313
Artificial intelligence is rapidly transforming the digital economy, and indeed, the entire world. At this pivotal moment in human history, it is imperative that control over data, computational power, and the economy as a whole is returned to the hands of individuals.
1414

15-
Convex is a public, decentralized system for real-time peer-to-peer exchange of data and value, designed as a foundational layer for the digital economy in the age of AI. As such, it provides the bedrock for the kind of economics we envision – fair, inclusive, efficient and sustainable.
15+
Convex is a public, decentralised system for real-time peer-to-peer exchange of data and value, designed as a foundational layer for the digital economy in the age of AI. As such, it provides the bedrock for the kind of economics we envision – fair, inclusive, efficient and sustainable.
1616

1717
This manifesto outlines our core beliefs and principles.
1818

docs/tutorial/convex-lisp/lisp-cvm.md

+1-1
@@ -312,7 +312,7 @@ Deploying libraries is like deploying an Actor, with a few key differences to no
312312

313313
### Important security note
314314

315-
A key difference between a `call` to an Actor function and running library code is the difference in *security context*:
315+
A key difference between a `call` to an actor function and running library code is the difference in *security context*:
316316

317317
- An actor `call` runs code in the actor's account and environment, with the actor itself as the current `*address*` (and the calling account as `*caller*`)
318318
- Library code runs in the environment of the current account, i.e. `*address*` is unchanged
