-
Notifications
You must be signed in to change notification settings - Fork 52
Add Circuit and Witness serialization for GPU prover #299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @slzk, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request lays the foundational groundwork for integrating GPU-accelerated proving by introducing the capability to serialize circuit and witness data into plain-text files. When a specific environment variable is set, the prover will now output this data, along with relevant transcript information, and then terminate, allowing an external GPU-based system to take over the computation. This is a crucial step towards offloading computationally intensive proving tasks to specialized hardware.
Highlights
- New GPU Crate: Introduced a new
gpuRust crate to encapsulate functionality related to GPU support, including circuit and witness serialization. - Circuit and Witness Serialization: Implemented plain-text serialization of the circuit structure (layers, gates, input/output values) and witness data (transcript digest, proof bytes, and challenges) to files. This serialization is activated when the
EXPANDER_GPUenvironment variable is set to '1'. - Conditional Prover Behavior: Modified the main prover logic (
gkr_vanilla.rs) to conditionally serialize data and exit the process ifEXPANDER_GPUis enabled. This allows an external GPU prover to consume the generated files and continue the proving process. - Field Configuration Adaptation: Updated the
bin/src/main.rsto usex1(single-precision/non-SIMD) field configurations for 'keccak' circuits with 'm31ext3' and 'goldilocks' field types when GPU mode is active, aligning with potential GPU processing requirements. - Transcript Data Access: Adjusted the visibility of
proofandhash_start_indexfields intranscript/src/byte_hash_transcript.rsto public, enabling the newgpucrate to directly access and serialize these components of the witness.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new gpu crate to serialize circuits and witnesses to a text format, intended for a GPU prover. The changes enable this serialization via an environment variable.
While the functionality is a good step, the implementation in gpu/src/serdes.rs has several critical issues that must be addressed:
- Unsafe Code: There is extensive use of
unsafeblocks that rely on the memory layout of generic types and perform unsound pointer casts. This can lead to undefined behavior and is a major security and correctness risk. - Fragile Implementation: The serialization logic relies on parsing
Debugoutput, which is brittle and will break if theDebugimplementations change. - Error Handling: The new code panics on errors instead of returning
Results, which is not robust.
I've left detailed comments on these points with suggestions for how to refactor the code to be safe, robust, and maintainable. These critical issues should be resolved before merging.
| let element_bytes = unsafe { | ||
| std::slice::from_raw_parts( | ||
| (element as *const C::CircuitField) as *const u8, | ||
| element_size | ||
| ) | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using unsafe to transmute a generic type &C::CircuitField to a byte slice is extremely dangerous. This code relies on the internal memory representation of C::CircuitField, which is not guaranteed by Rust. This can lead to undefined behavior, memory corruption, or crashes if the struct layout changes, or on different compiler versions or target architectures.
A safe and robust alternative is to define a trait with a serialization method (e.g., to_bytes_le()) and implement it for each field type. This ensures type safety and removes the need for unsafe code.
This same issue is present in serialize_simd_field_element on lines 78-83.
| // For other field types: m31ext3 and goldilocks | ||
| let debug_str = format!("{:?}", element); | ||
|
|
||
| // Try to extract value from "FieldType { v: value }" format | ||
| if let Some(v_pos) = debug_str.find("v: ") { | ||
| let after_v = &debug_str[v_pos + 3..]; | ||
| if let Some(end_pos) = after_v.find(" }") { | ||
| let value_str = &after_v[..end_pos]; | ||
| return value_str.trim().to_string(); | ||
| } else if let Some(end_pos) = after_v.find("}") { | ||
| let value_str = &after_v[..end_pos]; | ||
| return value_str.trim().to_string(); | ||
| } | ||
| } | ||
|
|
||
| // Panic if parsing fails | ||
| panic!("Failed to parse field element value from debug string: {}", debug_str) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relying on the Debug format for serialization is very fragile. The Debug output is not guaranteed to be stable across library versions and is intended for debugging, not for serialization. Any change to the Debug implementation of the field types will break this code.
Furthermore, parsing strings is inefficient compared to a proper serialization method.
Instead of parsing Debug output, you should implement a dedicated serialization function for each field type, for example via a trait. This will be both robust and more performant.
Additionally, this function panic!s on failure. It should return a Result<String, SerializationError> to allow the caller to handle errors gracefully.
| let (digest_bytes, proof_bytes, hash_start_index) = unsafe { | ||
| use transcript::BytesHashTranscript; | ||
| use gkr_hashers::SHA256hasher; | ||
|
|
||
| // Cast to BytesHashTranscript - the hasher type doesn't matter since we only access | ||
| // digest, proof, and hash_start_index fields which have the same layout regardless of H | ||
| let transcript_ptr = transcript as *mut _ as *mut BytesHashTranscript<SHA256hasher>; | ||
| let bytes_transcript = &*transcript_ptr; | ||
|
|
||
| (&bytes_transcript.digest, &bytes_transcript.proof.bytes, bytes_transcript.hash_start_index) | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unsafe block here performs a cast from &mut impl Transcript to a concrete type *mut BytesHashTranscript<SHA256hasher>. This is fundamentally unsound and a major violation of Rust's safety guarantees.
This code makes several dangerous assumptions:
- The
Transcriptis always aBytesHashTranscript. - The hasher is always
SHA256hasher. - The memory layout of
BytesHashTranscriptis known and stable.
If any of these assumptions are false, this will lead to undefined behavior, including memory corruption and crashes.
To fix this, you should add methods to the Transcript trait to safely expose the required data (e.g., fn get_proof_bytes(&self) -> Option<&[u8]>). This is the preferred approach as it maintains abstraction and safety. The change in transcript/src/byte_hash_transcript.rs to make fields public is a symptom of this design issue and should be reverted in favor of safe accessor methods.
gkr/src/prover/gkr_vanilla.rs
Outdated
| if std::env::var("EXPANDER_GPU").map_or(false, |v| v == "1") { | ||
| if let Some(vx) = final_vx_claim { | ||
| gpu::serdes::print_final_claims::<F>(&vx, &final_vy_claim); | ||
| println!("GKR final proof claims as shown above."); | ||
| std::process::exit(0); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling std::process::exit(0) here terminates the process abruptly. This prevents the function from returning its values and may bypass proper resource cleanup. While this is likely intended for a debug/development workflow to get the serialized data and stop, it's a very aggressive approach.
A cleaner design would be to have this function return a special value or an error indicating that the process should stop. The caller (e.g., in bin/src/main.rs) can then decide whether to exit gracefully. This makes the gkr_prove function more reusable and its behavior more predictable.
| let field_type = match std::mem::size_of::<C::CircuitField>() { | ||
| 32 => "bn254", // BN254 field | ||
| 4 => "m31ext3", // m31ext3 field | ||
| 8 => "goldilocksext2", // goldilocks field | ||
| _ => "unknown" // unknown field | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using std::mem::size_of to determine the field type is brittle. If another field type is introduced with the same size (e.g., another 32-byte field), it will be incorrectly identified. This can lead to incorrect serialization and hard-to-debug issues.
A more robust approach is to use the FieldEngine::FIELD_TYPE associated constant, which provides a unique identifier for each field type.
Example:
let field_type = match C::FIELD_TYPE {
gkr_engine::FieldType::BN254 => "bn254",
gkr_engine::FieldType::M31x16 => "m31ext3",
gkr_engine::FieldType::Goldilocksx8 => "goldilocksext2",
// ... other cases
_ => "unknown",
};This same issue exists on lines 347-352.
bin/src/main.rs
Outdated
| if std::env::var("EXPANDER_GPU").map_or(false, |v| v == "1") { | ||
| run_benchmark::<M31x1ConfigSha2RawVanilla>(&args, mpi_config.clone()) | ||
| } else { | ||
| run_benchmark::<M31x16ConfigSha2RawVanilla>(&args, mpi_config.clone()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check for the EXPANDER_GPU environment variable and its value ("1") is duplicated across multiple match arms (e.g., here and on lines 122-126). This introduces redundancy and makes the code less maintainable. Consider extracting this check into a boolean variable at the start of main and using constants for the environment variable name and value. This improves readability and reduces repetition.
Example:
// At the top of main()
const EXPANDER_GPU_ENV: &str = "EXPANDER_GPU";
const EXPANDER_GPU_ENABLED_VALUE: &str = "1";
let is_gpu_enabled = std::env::var(EXPANDER_GPU_ENV).map_or(false, |v| v == EXPANDER_GPU_ENABLED_VALUE);
// ... later in the match statement
if is_gpu_enabled {
run_benchmark::<M31x1ConfigSha2RawVanilla>(&args, mpi_config.clone())
} else {
run_benchmark::<M31x16ConfigSha2RawVanilla>(&args, mpi_config.clone())
}
gkr/src/prover/gkr_vanilla.rs
Outdated
| ); | ||
|
|
||
| // Serialize circuit to file if EXPANDER_GPU environment variable is set to 1 | ||
| if std::env::var("EXPANDER_GPU").map_or(false, |v| v == "1") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gkr/src/prover/gkr_vanilla.rs
Outdated
| // Serialize circuit to file if EXPANDER_GPU environment variable is set to 1 | ||
| if std::env::var("EXPANDER_GPU").map_or(false, |v| v == "1") { | ||
| if let Err(e) = gpu::serdes::serial_circuit_witness_as_plaintext(circuit, transcript, &challenge) { | ||
| println!("Failed to serialize circuit: {}", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR only add functionality that used to print circuit and witness as plain-text in file