Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache z3 results #464

Open
iirekm opened this issue Feb 20, 2025 · 3 comments
Open

Cache z3 results #464

iirekm opened this issue Feb 20, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@iirekm
Copy link

iirekm commented Feb 20, 2025

z3 can be slow, very slow when codebase grows. On-disk caching would help a lot during development (only new things would be computed).
I tried to do ad-hoc caching with the following wrapper around z3:

#!/bin/bash -e

# Cache directory
CACHE_DIR="$HOME/.z3cache"
mkdir -p "$CACHE_DIR"

# Check if SMT file is provided
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <smt-file>" >&2
    exit 1
fi

SMT_FILE="$1"

# Ensure the file exists
if [ ! -f "$SMT_FILE" ]; then
    echo "Error: File '$SMT_FILE' not found!"  >&2
    exit 1
fi

# Generate a hash of the SMT file for caching
FILE_HASH=$(sha256sum "$SMT_FILE" | awk '{print $1}')
CACHE_FILE="$CACHE_DIR/$FILE_HASH.result"

# Check if result is already cached
if [ -f "$CACHE_FILE" ]; then
    echo "[CACHE HIT] Returning cached result."  >&2
    cat "$CACHE_FILE"
    exit 0
fi

cat "$SMT_FILE" >"$CACHE_DIR/$FILE_HASH.smt"

# Run Z3 and cache the result
echo "[CACHE MISS] Running Z3..." >&2
"$0".orig "$SMT_FILE" | tee "$CACHE_FILE"

It almost works, but the problem is that the variable identifiers change from run to run (e.g. it's p_x_uint256_9fdce11_00 one run, and p_x_uint256_7721c25_00 another). How to make those identifiers deterministic? From what piece of code do they come from?

In long term a better solution could be made, e.g. with popular joblib.Memory library.

@iirekm iirekm added the enhancement New feature or request label Feb 20, 2025
@iirekm
Copy link
Author

iirekm commented Feb 20, 2025

I made an experimental pull request for that. It needs lots of improvements, but in general - it works.

@0xkarmacoma
Copy link
Collaborator

I see, the idea is to build a global cache that works across runs. That's a cool idea! diskcache in particular seems really well suited for that (and I see it has a method to ensure the cache doesn't grow too big)

the random-look parts in identifiers comes from the uuid library, they change across runs because it is based on the current time. We may be able to seed it or switch to a deterministic id generation mechanism, wdyt @daejunpark ?

@iirekm
Copy link
Author

iirekm commented Feb 22, 2025

The random part also comes from .to_smt2() method (replacing with .sexpr() seems to fix it), and also from somewhere else in the code (I couldn't figure out where, but Z3 queries generated from run to run do differ sometimes). The final result is that the current code in PR does improve speed in some examples (e.g. ArithTest), but even worsens in other (due to unresolved all sources of randomness, and probably the fact that Z3 is called even thousands of times per run so the diskcache library becomes performance bottleneck itself).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants