parse: Add set_verify_checksums API; improve fuzz coverage #11
Closed
The parse.rs and differential.rs fuzz targets were getting almost zero coverage of deeper parser logic (PAX extensions, GNU long name/link, sparse files, etc.) because random fuzz input almost never has valid tar header checksums. The parser's verify_checksum() call at the top of parse_header() rejects ~100% of random inputs immediately.

For the parse.rs fuzzer, add a Parser::set_verify_checksums(bool) API that allows skipping checksum verification entirely. The fuzzer uses this ~90% of the time (determined by the first byte of input), letting the fuzzer exercise all the parsing code paths beyond the checksum gate.

For the differential fuzzer, since both tar-core and tar-rs must see identical data with valid checksums, use a fixup_checksums() approach that rewrites checksum fields in-place before passing to both parsers.

Assisted-by: OpenCode (Claude claude-opus-4-6)
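To illustrate why random input almost never passes the checksum gate, and what an in-place fixup involves, here is a minimal sketch for a single 512-byte header block. It follows the standard ustar layout (checksum field at bytes 148..156, computed with that field counted as spaces). The function names `compute_checksum`, `stored_checksum`, `verify_checksum`, and `fixup_checksum` are hypothetical and are not taken from tar-core; the real `fixup_checksums()` in this PR may differ, in particular in how it locates header blocks inside a stream.

```rust
/// Size of one tar header block in bytes.
const BLOCK_SIZE: usize = 512;
/// Byte range of the checksum field inside a header block (ustar layout).
const CHKSUM: std::ops::Range<usize> = 148..156;

/// Sum all header bytes, counting the checksum field itself as ASCII spaces.
fn compute_checksum(header: &[u8; BLOCK_SIZE]) -> u32 {
    header
        .iter()
        .enumerate()
        .map(|(i, &b)| if CHKSUM.contains(&i) { u32::from(b' ') } else { u32::from(b) })
        .sum()
}

/// Parse the stored checksum: optional leading spaces, octal digits,
/// then a NUL or space terminator. Returns None if the field is malformed.
fn stored_checksum(header: &[u8; BLOCK_SIZE]) -> Option<u32> {
    let mut value: u32 = 0;
    let mut seen_digit = false;
    for &b in header[CHKSUM].iter().skip_while(|&&b| b == b' ') {
        match b {
            b'0'..=b'7' => {
                value = value * 8 + u32::from(b - b'0');
                seen_digit = true;
            }
            b' ' | 0 => break,
            _ => return None,
        }
    }
    seen_digit.then_some(value)
}

/// The gate: a header is accepted only if stored and computed sums agree.
fn verify_checksum(header: &[u8; BLOCK_SIZE]) -> bool {
    stored_checksum(header) == Some(compute_checksum(header))
}

/// Rewrite the checksum field in place so the header passes verification.
/// Assumption: six octal digits, a NUL, then a space, which is a common encoding.
fn fixup_checksum(header: &mut [u8; BLOCK_SIZE]) {
    let sum = compute_checksum(header);
    let field = format!("{:06o}\0 ", sum);
    header[CHKSUM].copy_from_slice(field.as_bytes());
}
```

For random bytes to pass `verify_checksum`, the eight checksum bytes must happen to spell the octal sum of the other 504 bytes, which is why nearly every unmodified fuzz input is rejected at the gate. The largest possible sum (504 × 255 + 8 × 32 = 128776) fits in six octal digits, so the fixed-width format above never overflows the field.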
Summary
The `parse.rs` and `differential.rs` fuzz targets were getting almost zero coverage of deeper parser logic (PAX extensions, GNU long name/link, sparse files, etc.) because random fuzz input almost never has valid tar header checksums. The parser's `verify_checksum()` call at the top of `parse_header()` rejects ~100% of random inputs immediately.
Changes
- Library (`src/parse.rs`): Add `Parser::set_verify_checksums(bool)` API that controls whether header checksums are verified during parsing. Default is `true` (safe by default). This follows the same pattern as the existing `set_allow_empty_path(bool)` API.
- `parse.rs` fuzzer: Use the new API to skip checksum verification ~90% of the time (determined by the first byte of input), letting the fuzzer exercise all parsing code paths beyond the checksum gate. The remaining 10% still tests checksum validation itself.
- `differential.rs` fuzzer: Since both tar-core and tar-rs must see identical data with valid checksums, use a `fixup_checksums()` approach that rewrites checksum fields in-place before passing to both parsers. Also minor cleanup: extract `compare_entries()` helper, use idiomatic `zip` + `enumerate`.
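The setter and the first-byte selection described in the list above could look roughly like the following sketch. The `Parser` struct here is a hypothetical stand-in that only models the flag; the real `Parser` in `src/parse.rs` has more state, and its exact method signatures are not shown in this PR description. The helper `configure_parser`, the "multiple of 10" rule, and the choice to strip the mode byte from the payload are likewise assumptions chosen to give roughly a 90/10 split.

```rust
/// Hypothetical stand-in for the real parser; only the flag handling is modeled.
struct Parser {
    verify_checksums: bool,
}

impl Default for Parser {
    /// Verification is on by default, so callers must opt out explicitly.
    fn default() -> Self {
        Self { verify_checksums: true }
    }
}

impl Parser {
    /// Enable or disable header checksum verification.
    fn set_verify_checksums(&mut self, verify: bool) {
        self.verify_checksums = verify;
    }

    /// Report whether checksums will be verified.
    fn verify_checksums(&self) -> bool {
        self.verify_checksums
    }
}

/// Use the first input byte to pick the mode and return the configured
/// parser together with the remaining bytes to parse.
/// Returns None for empty input.
fn configure_parser(data: &[u8]) -> Option<(Parser, &[u8])> {
    let (&mode, payload) = data.split_first()?;
    let mut parser = Parser::default();
    // 26 of the 256 byte values are multiples of 10, so about 10% of
    // inputs keep verification on and about 90% skip it.
    parser.set_verify_checksums(mode % 10 == 0);
    Some((parser, payload))
}
```

Deriving the mode from the input itself, rather than from a random number generator, keeps each fuzz case deterministic and reproducible from its corpus file.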