|
1 | 1 | Proof Aggregation
|
2 | 2 | -----
|
3 | 3 |
|
4 |
| - |
5 |
| - |
| 4 | + |
| 5 | +<!-- |
6 | 6 | This repo does proof aggregations for zkEVM proofs.
|
7 | 7 |
|
8 | 8 | ## zkEVM circuit
|
@@ -56,4 +56,191 @@ In addition, it attests that, for chunks indexed from `0` to `k-1`,
|
56 | 56 | - chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash) where chunk_data_hash is a public input to the i-th batch snark circuit
|
57 | 57 | - and the related field matches public input
|
58 | 58 |
|
59 |
| -See [public input aggregation](./src/proof_aggregation/public_input_aggregation.rs) for the details of public input aggregation. |
| 59 | +See [public input aggregation](./src/proof_aggregation/public_input_aggregation.rs) for the details of public input aggregation. --> |
| 60 | + |
| 61 | +<!-- # Spec for Dynamic aggregator --> |
| 62 | + |
| 63 | +# Params |
| 64 | +|param|meaning | |
| 65 | +|:---:|:---| |
| 66 | +|k | number of valid chunks| |
| 67 | +|n | max number of chunks per batch| |
| 68 | +|t | number of rounds for the final hash $\lceil32\times n/136\rceil$ | |
| 69 | + |
| 70 | +Currently `n` is hard coded to `10`. |
| 71 | +# Structs |
| 72 | + |
| 73 | +## Chunk |
| 74 | + |
| 75 | +A __chunk__ is a list of continuous blocks. It consists of 4 hashes: |
| 76 | +- state root before this chunk |
| 77 | +- state root after this chunk |
| 78 | +- the withdraw root of this chunk |
| 79 | +- the data hash of this chunk |
| 80 | + |
| 81 | +Those 4 hashes are obtained from the caller. |
| 82 | + |
| 83 | +The chunk's public input hash is |
| 84 | +``` |
| 85 | +chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash) |
| 86 | +``` |
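
As an illustration of the concatenation above, here is a minimal sketch of how the keccak preimage could be assembled. The names `ChunkHashes` and `chunk_pi_preimage` are hypothetical, and the 8-byte big-endian encoding of `chain_id` is an assumption that this document does not state. The keccak call itself is left out so that the snippet needs only the standard library.

```rust
/// The four hashes that describe a chunk (hypothetical struct name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChunkHashes {
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
    pub withdraw_root: [u8; 32],
    pub data_hash: [u8; 32],
}

/// Builds the byte string that would be fed to keccak to obtain `chunk_pi_hash`.
/// Assumption: `chain_id` is serialized as 8 big-endian bytes.
pub fn chunk_pi_preimage(chain_id: u64, chunk: &ChunkHashes) -> Vec<u8> {
    let mut preimage = Vec::with_capacity(8 + 4 * 32);
    preimage.extend_from_slice(&chain_id.to_be_bytes());
    preimage.extend_from_slice(&chunk.prev_state_root);
    preimage.extend_from_slice(&chunk.post_state_root);
    preimage.extend_from_slice(&chunk.withdraw_root);
    preimage.extend_from_slice(&chunk.data_hash);
    preimage
}
```

Under that encoding assumption the preimage is `8 + 4 * 32 = 136` bytes long.
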
| 87 | + |
| 88 | +## Continuous chunks |
| 89 | + |
| 90 | +A list of continuous chunks $c_1, \dots, c_k$ satisfies |
| 91 | +``` |
| 92 | +c_i.post_state_root == c_{i+1}.prev_state_root |
| 93 | +``` |
| 94 | +for $i \in [1, k-1]$. |
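
The continuity condition can be sketched outside the circuit as a check over adjacent pairs. This is only an illustration: `ChunkRoots` and `is_continuous` are hypothetical names, and the in-circuit version enforces the same equality with constraints rather than a boolean return value.

```rust
/// Only the two state roots matter for the continuity check (hypothetical struct).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChunkRoots {
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
}

/// Returns true when every chunk starts from the state root
/// that the previous chunk ended on.
pub fn is_continuous(chunks: &[ChunkRoots]) -> bool {
    chunks
        .windows(2)
        .all(|pair| pair[0].post_state_root == pair[1].prev_state_root)
}
```

A list with zero or one chunk is trivially continuous under this definition, since there are no adjacent pairs to compare.
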
| 95 | + |
| 96 | +## Empty chunk |
| 97 | +An __empty chunk__ is a chunk that does not contain any transactions. It is used for padding. |
| 98 | +If $k < n$, $(n-k)$ empty chunks are appended to the list. An empty chunk has the same data fields as a real chunk, with its parameters set as follows: |
| 99 | +- state root before this chunk: `c_k.post_state_root` |
| 100 | +- state root after this chunk: `c_k.post_state_root` |
| 101 | +- the withdraw root of this chunk: `c_k.withdraw_root` |
| 102 | +- the data hash of this chunk: `keccak("")` |
| 103 | + |
| 104 | +## Batch |
| 105 | + |
| 106 | +A __batch__ consists of `n` continuous chunks. If the number of input chunks `k` is less than `n`, we pad the input with `(n-k)` empty chunks using the above logic. |
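
The padding rule can be sketched as follows. `ChunkHashes` and `pad_batch` are hypothetical names used for illustration, and the data hash of an empty chunk is passed in by the caller (for example `keccak("")`) so that the snippet does not depend on a keccak crate. The sketch also assumes at least one valid chunk, since the empty chunk copies its roots from `c_k`.

```rust
/// The four hashes that describe a chunk (hypothetical struct name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChunkHashes {
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
    pub withdraw_root: [u8; 32],
    pub data_hash: [u8; 32],
}

/// Pads `chunks` (the k valid chunks) to exactly `n` entries with empty chunks.
/// Returns `None` if there is no valid chunk or if k > n.
pub fn pad_batch(
    chunks: &[ChunkHashes],
    n: usize,
    empty_data_hash: [u8; 32],
) -> Option<Vec<ChunkHashes>> {
    let last = *chunks.last()?; // c_k; assumption: k >= 1
    if chunks.len() > n {
        return None;
    }
    // An empty chunk does not change the state and reuses c_k's withdraw root.
    let empty = ChunkHashes {
        prev_state_root: last.post_state_root,
        post_state_root: last.post_state_root,
        withdraw_root: last.withdraw_root,
        data_hash: empty_data_hash,
    };
    let mut batch = chunks.to_vec();
    batch.resize(n, empty);
    Some(batch)
}
```

All padded entries are identical, which is why a single empty chunk description is enough for the whole batch.
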
| 107 | + |
| 108 | +# Circuits |
| 109 | + |
| 110 | +## Chunk circuit |
| 111 | + |
| 112 | +The circuit proving the relationship for a chunk is the zkEVM circuit itself. Its proof goes through 2 layers of compression circuits and becomes a __snark__ struct. We do not list its details here. Abstractly, a snark circuit has the following properties: |
| 113 | +- it takes 44 elements as public inputs |
| 114 | + - 12 from accumulators |
| 115 | + - 32 from public input hash |
| 116 | + |
| 117 | +## Empty chunk circuit |
| 118 | +An empty chunk circuit also takes 44 elements as public inputs. |
| 119 | +In our design it is crucial that __the same circuit__ is used for both the real chunk circuit and the empty chunk circuit. In other words, an empty chunk circuit also goes through the same compression layers before it is aggregated. |
| 120 | + |
| 121 | + |
| 122 | + |
| 123 | + |
| 124 | +## Aggregation Circuit |
| 125 | + |
| 126 | +We want to aggregate `k` snarks, each from a valid chunk. We generate `(n-k)` empty chunks, and obtain a total of `n` snarks. |
| 127 | + |
| 128 | +In the above example, we have `k = 2` valid chunks, and `2` empty chunks. |
| 129 | + |
| 130 | +> Interlude: we only need to generate 1 empty snark; the remaining `n-k-1` are identical within the same batch. We cannot pre-compute it, though, because the witnesses `c_k.post_state_root` and `c_k.withdraw_root` are batch dependent. |
| 131 | +
|
| 132 | +### Configuration |
| 133 | + |
| 134 | +There are three configurations for the aggregation circuit. |
| 135 | +- FpConfig: used for snark aggregation |
| 136 | +- KeccakConfig: used to build keccak table |
| 137 | +- RlcConfig: used to compute RLC of hash inputs |
| 138 | + |
| 139 | +### Public Input |
| 140 | +The public input of the aggregation circuit consists of |
| 141 | +- 12 elements from accumulator |
| 142 | +- 32 elements of `batch_pi_hash` |
| 143 | +- 1 element of `k` |
| 144 | + |
| 145 | +### Statements |
| 146 | +For snarks $s_1,\dots,s_k,\dots, s_n$ the aggregation circuit argues the following statements. |
| 147 | + |
| 148 | +1. batch_data_hash digest is reused for public input hash. __Static__. |
| 149 | + |
| 150 | +2. batch_pi_hash uses the same roots as chunk_pi_hash. __Static__. |
| 151 | +``` |
| 152 | +batch_pi_hash := keccak(chain_id || chunk_1.prev_state_root || chunk_n.post_state_root || chunk_n.withdraw_root || batch_data_hash) |
| 153 | +``` |
| 154 | +and `batch_pi_hash` matches public input. |
| 155 | + |
| 156 | +3. batch_data_hash and chunk[i].pi_hash use the same chunk[i].data_hash when chunk[i] is not padded |
| 157 | + |
| 158 | +``` |
| 159 | +for i in 1 ... __n__ |
| 160 | + chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || chunk_data_hash) |
| 161 | +``` |
| 162 | + |
| 163 | +This is done by computing the RLCs of chunk[i]'s data_hash for `i=0..k`, and then checking that the RLC matches the one from the keccak table. |
| 164 | + |
| 165 | +4. chunks are continuous: they are linked via the state roots. __Static__. |
| 166 | + |
| 167 | +``` |
| 168 | +for i in 1 ... __n-1__ |
| 169 | +    c_i.post_state_root == c_{i+1}.prev_state_root |
| 170 | +``` |
| 171 | + |
| 172 | +5. All the chunks use the same chain id. __Static__. |
| 173 | +``` |
| 174 | +for i in 1 ... __n__ |
| 175 | + batch.chain_id == chunk[i].chain_id |
| 176 | +``` |
| 177 | + |
| 178 | +6. The last `(n-k)` chunk[i]'s prev_state_root == post_state_root when chunk[i] is padded |
| 179 | +``` |
| 180 | +for i in 1 ... n: |
| 181 | + is_padding = (i > k) // k is a public input |
| 182 | + if is_padding: |
| 183 | + chunk_i.prev_state_root == chunk_i.post_state_root |
| 184 | + chunk_i.withdraw_root == chunk_{i-1}.withdraw_root |
| 185 | + chunk_i.data_hash == [0u8; 32] |
| 186 | +``` |
| 187 | +7. chunk[i]'s data_hash len is `0` when chunk[i] is padded |
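
The padding statements above can be mirrored by a plain out-of-circuit check, sketched below. `ChunkHashes` and `padding_is_consistent` are hypothetical names. The expected data hash of a padded chunk is taken as a parameter rather than hard-coded, and the sketch assumes `k >= 1` so that every padded chunk has a predecessor. Indices are 0-based here, so the padded chunks are those at positions `k..n`.

```rust
/// The four hashes that describe a chunk (hypothetical struct name).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ChunkHashes {
    pub prev_state_root: [u8; 32],
    pub post_state_root: [u8; 32],
    pub withdraw_root: [u8; 32],
    pub data_hash: [u8; 32],
}

/// Checks the padding rules for a batch whose first `k` chunks are valid.
pub fn padding_is_consistent(
    chunks: &[ChunkHashes],
    k: usize,
    padded_data_hash: [u8; 32],
) -> bool {
    // Assumption: at least one valid chunk, and k never exceeds the batch size.
    if k == 0 || k > chunks.len() {
        return false;
    }
    (k..chunks.len()).all(|i| {
        let (prev, cur) = (&chunks[i - 1], &chunks[i]);
        cur.prev_state_root == cur.post_state_root      // state is unchanged
            && cur.withdraw_root == prev.withdraw_root  // withdraw root carried over
            && cur.data_hash == padded_data_hash        // fixed data hash for padding
    })
}
```

When `k` equals the batch size there are no padded chunks and the check passes vacuously.
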
| 188 | + |
| 189 | + |
| 190 | +### Handling dynamic inputs |
| 191 | + |
| 192 | + |
| 193 | + |
| 194 | + |
| 195 | + |
| 196 | +Our keccak table uses `2^19` rows. Each keccak round takes `300` rows. When the number of rounds is less than $2^{19}/300$, the cell manager will fill in the rest of the rows with dummy hashes. |
| 197 | + |
| 198 | +The only hash that uses a dynamic number of rounds is the last hash. |
| 199 | +Suppose we target `MAX_AGG_SNARK = 10`. Then, the last hash function will take no more than `ceil(32 * 10 / 136) = 3` rounds. |
| 200 | + |
| 201 | +We also know in the circuit if a chunk is an empty one or not. This is given by a flag `is_padding`. |
| 202 | + |
| 203 | +For the input of the final data hash |
| 204 | +- we extract `32 * MAX_AGG_SNARK` cells (__static__ here) from the last hash. We then compute the RLC of those `32 * MAX_AGG_SNARK` cells, including only the cells whose corresponding `is_padding` is not set. We constrain this RLC to match the `data_rlc` from the keccak table. |
| 205 | + |
| 206 | + |
| 207 | +For the output of the final data hash |
| 208 | +- we extract all three hash digest cells from the last 3 rounds. We then constrain the actual data hash to match one of the three hash digest cells, selected by flags defined as follows. |
| 209 | + |
| 210 | +|#valid snarks | offset of data hash | flags| |
| 211 | +|---| ---| ---| |
| 212 | +|1,2,3,4 | 0 | 1, 0, 0| |
| 213 | +|5,6,7,8 | 32 | 0, 1, 0 | |
| 214 | +|9,10 | 64 | 0, 0, 1| |
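
The table can be reproduced from the keccak rate of 136 bytes per round. The sketch below is illustrative only: `data_hash_rounds` and `digest_flags` are hypothetical names, not functions from this repo. It counts rounds as `floor(32 * k / 136) + 1`, which accounts for keccak padding adding at least one byte; for `k <= 10` this agrees with $\lceil 32 \times k/136 \rceil$ because `32 * k` is never a multiple of 136 in that range.

```rust
/// Keccak-256 absorbs 136 bytes per round.
const RATE_BYTES: usize = 136;
/// Maximum number of snarks per batch, as targeted in this document.
const MAX_AGG_SNARK: usize = 10;

/// Number of keccak rounds needed to hash `k` concatenated 32-byte data hashes.
pub fn data_hash_rounds(k: usize) -> usize {
    // Padding adds at least one byte, hence the `+ 1` after the floor division.
    (32 * k) / RATE_BYTES + 1
}

/// For `k` valid snarks, returns the digest offset and the one-hot flags
/// that select which of the three digest cells holds the data hash.
pub fn digest_flags(k: usize) -> Option<(usize, [bool; 3])> {
    if k == 0 || k > MAX_AGG_SNARK {
        return None;
    }
    let rounds = data_hash_rounds(k); // 1, 2 or 3 for k in 1..=10
    let mut flags = [false; 3];
    flags[rounds - 1] = true;
    Some((32 * (rounds - 1), flags))
}
```

For example, `digest_flags(6)` yields offset `32` with flags `[false, true, false]`, matching the second row of the table.
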
| 215 | + |
| 216 | +Additional checks for dummy chunk |
| 217 | +- if `is_padding` for `i`-th chunk, we constrain `chunk[i].prev_state_root = chunk[i].post_state_root` |
| 218 | +- if `is_padding` for `i`-th chunk, we constrain `chunk[i-1].withdraw_root = chunk[i].withdraw_root` |
| 219 | +- if `is_padding` for `i`-th chunk, we constrain `chunk[i].data_hash.len() == 0` |
| 220 | + |
| 221 | +<!-- |
| 222 | +1. Extract the final `data_rlc` cell from each round. There are at most $t$ of these, denoted by $r_1,\dots r_t$ |
| 223 | + - __caveat__: will need to make sure the circuit is padded as if there are $t$ rounds, if the actual number of rounds is less than $t$. This is done by keccak table already: |
| 224 | + all columns of keccak table are padded to `1<<LOG_DEGREE` by construction (__need to double check this is circuit dependent__) |
| 225 | +2. Extract a challenge and then compute `rlc:= RLC(chunk_1.data_hash || ... || chunk_k.data_hash)` using a __phase 2__ column |
| 226 | +3. assert `rlc` is valid via a lookup argument |
| 227 | + - constrain `rlc` cell is within the "data_rlc" column of keccak table via standard lookup API |
| 228 | + - potential optimization: avoid using lookup API. There is only $t$ elements as $rlc \in \{r_1,\dots r_t\}$ and we may check equality one by one. |
| 229 | + --> |
| 230 | + |
| 231 | +<!-- |
| 232 | +Circuit witnesses: |
| 233 | +- a list of k __real__ CHUNKs, each with 44 elements of public inputs (12 from accumulators and |
| 234 | +32 from public input hash) |
| 235 | + - |
| 236 | + - Those 4 hashes are obtained from the caller. |
| 237 | +  - Its public input hash is |
| 238 | + - chunk_pi_hash := keccak(chain_id || prev_state_root || post_state_root || withdraw_root || |
| 239 | + chunk_data_hash) |
| 240 | +Circuit public inputs: |
| 241 | +- an accumulator of 12 elements |
| 242 | +- a batch public input hash of 32 elements |
| 243 | +- the value k, 1 element |
| 244 | +
|
| 245 | +The aggregation circuit aggregates MAX_AGG_NUM snarks. |
| 246 | +If k < MAX_AGG_NUM, dummy snarks will be padded --> |