Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: proof aggregation #670

Merged
merged 40 commits into from
Jul 28, 2023
Merged

Feat: proof aggregation #670

merged 40 commits into from
Jul 28, 2023

Conversation

zhenfeizhang
Copy link

@zhenfeizhang zhenfeizhang commented Jul 24, 2023

Description

  • replace Proof aggregation circuit #523 and dynamic proof aggregation #564
  • diff against dynamic aggregation
  • update circuit to handle keccak with empty input
  • mock prover test with aggregation circuit
  • real prover test with aggregation circuit
    • log: super:/home/ubuntu/zhenfei/0227/zkevm-circuits/aggregator/agg_pass_0724.log
  • e2e test and integration @ scroll-prover
  • update document and spec w.r.t. static aggregation
  • re-enable skip first pass
  • re-enable feature flags to turn off different modules

Issue Link

[link issue here]

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Contents

  • see aggregator/readme

Rationale

[design decisions and extended information]

How Has This Been Tested?

aggregator/test.sh

Sorry, something went wrong.

This was referenced Jul 24, 2023
@zhenfeizhang zhenfeizhang self-assigned this Jul 24, 2023
@zhenfeizhang zhenfeizhang marked this pull request as ready for review July 24, 2023 19:28
@zhenfeizhang
Copy link
Author

The diff between dynamic aggregation and static aggregation can be found here.

@kunxian-xia
Copy link

When the current integration test that do not skip 1st pass succeed, then we need to test the aggregation circuit with the skip_first_pass feature enabled.

@zhenfeizhang
Copy link
Author

When the current integration test that do not skip 1st pass succeed, then we need to test the aggregation circuit with the skip_first_pass feature enabled.

Yes. This is left as a todo for now. The code does not compile yet when we skip first pass

@zhenfeizhang
Copy link
Author

hmmm some how 47beb9e reintroduced the issue with constraint system after using the new fixed cells method

@lispc lispc changed the title Static proof aggregation Feat: proof aggregation Jul 28, 2023
@lispc
Copy link

lispc commented Jul 28, 2023

Will merge it first. Review comments can be addressed in later prs

@lispc
Copy link

lispc commented Jul 28, 2023

Oh there are 4 pending issues. I have to manually close them now to merge.....

@lispc lispc merged commit 1ed6efd into develop Jul 28, 2023
@lispc lispc deleted the static-proof-aggregation branch July 28, 2023 07:42
@kunxian-xia
Copy link

It will be more managable if we can collect all bug fix PRs into one large pr.

Comment on lines +112 to +118
let (hash_input_cells, hash_output_cells, data_rlc_cells, hash_input_len_cells) =
extract_hash_cells(
&config.keccak_circuit_config,
layouter,
challenges,
preimages,
)?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think that directly using the keccak's input cells and output cells is not a good design.

  1. It's very hard to handle the case in which number of input bytes to get batch data hash is not fixed. If you change the constant MAX_AGG_SNARKS to something larger than 10, then you have more cases to handle as the position of the output of batch data hash cells lie in a larger interval. This is not desireable in my opinion.
  2. The cells in aggregation circuit are too tightly coupled with keccak sub-circuit. It's hard to audit the code.

Therefore, I have the following suggestion:

  1. Allocate 1 advice column for holding the input and output cells for doing keccak;
  2. Allocate 1 advice column for accumulating the length of keccak.
  3. Allocate 1 selector for lookup into the keccak table.

The layout will be looking like (using the chunk_pi_hash as an example)

adv len q_keccak
chain_id[0] 1 0
.. .. 0
chain_id[7] 8 0
prev_state_root[0] 9 0
.. .. 0
chunk_data_hash[31] 136 0
rlc cells for getting rlc(chunk_pi_hash_preimage) .. .. 0
chunk_pi_hash[0] 1 0
chunk_pi_hash[1] 2 0
.. .. 0
chunk_pi_hash[31] 32 0
rlc cells for getting rlc(chunk_pi_hash) .. 0
input_rlc 136 1
output_rlc .. 0

The layout for getting batch_data_hash can be drawn similarly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spend some time to think about this.

The cells in aggregation circuit are too tightly coupled with keccak sub-circuit. It's hard to audit the code.

is there a case where we modify the keccak circuit while not require changes to this pi aggregation circuit? I think not.

So the real complaint here is that the code is hard coded for MAX_AGG_SNARK. This I agree.

OTOH I think the keccak circuit itself is not well designed. I.e., currently, it uses 87 columns and 2^19 rows, i.e., roughly 43 million cells, supporting 1600ish keccak rounds; while we actually only used 25 rounds, or, 87 * 25 * 300 = 652 K cells. If we stick with k=20, then we should be able to fill all those in a single column or two, rather than 87 columns. This will result into 3x in the overall aggregator performance.

(there are also quite a few wasted cells within those 652k cells as per currently layout of the keccak table)

I think a refactor of (semi-static) keccak circuit is of a higher priority than making the above code less hard-coded.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a case where we modify the keccak circuit while not require changes to this pi aggregation circuit? I think not.

But the design goal is this aggregation circuit should not need any change even if we change the layout of keccak circuit to a more thinner one, right?
For now, if we shrink the keccak circuit's layout from 87 cols to like 29 cols (num_rows_per_round: 12 -> 50), then we need at least refactor the get_indices function which is an example of tight coupling that I worried.

Comment on lines +478 to +484
let not_flag3 = rlc_config.is_smaller_than(
&mut region,
&num_of_valid_snarks_cell[0],
&eight,
&mut offset,
)?;
let flag3 = rlc_config.not(&mut region, &not_flag3, &mut offset)?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flag3 is true iff num_of_valid_snarks_cell[0] > 8. The comparison here is buggy because if num_of_valid_snarks_cells[0] = 8, then this not_flag3 = false and flag3 will be true (which should not be the case!).

Therefore, I think we should change the comparison to

let flag3 = rlc_config.is_smaller_than(&mut region, &eight, &num_of_valid_snarks_cell[0], &mut offset)?;
let not_flag3 = rlc_config.not(&mut region, &flag3, &mut offset)?;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in #699

flag3 is true is #snarks > 8. the proposed solution does not work as well.
I changed it to

let not_flag3 = rlc_config.is_smaller_than(
    &mut region,
    &num_of_valid_snarks_cell[0],
    &nine,
    &mut offset,
)?;
let flag3 = rlc_config.not(&mut region, &not_flag3, &mut offset)?;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not_flag3 = (num_snarks < 9) in your proposed solution is equivalent to
flag3 = (num_snarks >= 9) which is equivalent to (8 < num_snarks)? Therefore I think my proposed solution should work.

@zhenfeizhang
Copy link
Author

It will be more managable if we can collect all bug fix PRs into one large pr.

here it is: #698

zhenfeizhang added a commit that referenced this pull request Aug 1, 2023
lispc added a commit that referenced this pull request Aug 4, 2023
…s for the dummy chunk proofs to avoid dummy chunk proofgen time (#712)

* partiall address review comments of #670

* disable disable_proof_agg by default

* fix soundness issues with data hash

* add fixed cells for chunk_is_valid

* fix typo

* well... need morning coffee

* typo in readme

* fix soundness issue in is_smaller_than

* [feat] unify u8 u16 lookup table (#694)

* add range table

* replace Range256Table of rlp_circuit_fsm

* replace u16_table of tx_circuit

* use TableColumn

* add type alias

* use type alias

* annotate lookup column

* opt min_num_rows_block in poseidon_circuit (#700)

prune debug prints and lint

* speedup ci using verify_at_rows (#703)

* speedup ci using verify_at_rows

* use verify_at_rows to speedup super circuit ci tests

* update super circuit row estimation API (#695)

* remove num_of_chunks in aggregation circuit's instance column (#704)

Co-authored-by: kunxian xia <xiakunxian130@gmail.com>

* fix: lost tx_type when doing type conversion (#710)

* fix condition (#708)

* gates for zero checks

* statement 1 is correct

* reenable statements 3,6,7

* reenable statement 4

* everything seems to work again

* update aggregation test accordingly

* update spec

* minor clean up

* Fix fmt.

* Make `ChunkHash` fields public.

* fix decompose function

* update figures

* clean up

* address comments

* add is_final checks

* update readme

* constraint hash input length

* fix clippy

---------

Co-authored-by: Akase Cho <lightsing@users.noreply.github.com>
Co-authored-by: Ho <noel.wei@gmail.com>
Co-authored-by: Zhang Zhuo <mycinbrin@gmail.com>
Co-authored-by: kunxian xia <xiakunxian130@gmail.com>
Co-authored-by: Steven Gu <asongala@163.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants