Cache data lifecycles #218

Open
JesseTheRobot opened this issue Feb 15, 2025 · 0 comments

A subtask of Scalable data ingress, this task covers the changes required to the data lifecycles of all caches.

Notable caches are:
CachedDataRoots
CachedChunksIndex
CachedChunks
IngressProofs - ingress proofs already have a basic LRU implementation, which expires them 50 blocks after the last height at which an applicable tx was received
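
The height-based expiry described for IngressProofs can be sketched roughly as follows. This is a minimal illustration, not the existing implementation: the class name `HeightExpiryCache`, its methods, and the choice to expire strictly after the 50-block window (rather than at it) are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# From the issue text: proofs expire 50 blocks after the last applicable tx.
EXPIRY_BLOCKS = 50


@dataclass
class HeightExpiryCache:
    """Hypothetical cache keyed by data root; entries expire by block height."""

    expiry_blocks: int = EXPIRY_BLOCKS
    # data_root -> (value, last height at which an applicable tx was received)
    entries: dict = field(default_factory=dict)

    def touch(self, data_root, value, height):
        """Insert or refresh an entry when an applicable tx arrives at `height`."""
        self.entries[data_root] = (value, height)

    def prune(self, current_height):
        """Drop entries last touched more than `expiry_blocks` blocks ago.

        Assumption: an entry survives at exactly `expiry_blocks` blocks of age
        and is removed one block later.
        """
        expired = [
            root
            for root, (_, last_seen) in self.entries.items()
            if current_height - last_seen > self.expiry_blocks
        ]
        for root in expired:
            del self.entries[root]
        return expired
```

Calling `prune` once per new block keeps the cache bounded by recent activity rather than by entry count, which is why a separate minimum size still has to be chosen.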

Here are the main chunk caching scenarios:

  1. Transaction Data Ingress

    • Uses unpacked chunks
    • Caches chunks assigned to node's partitions
    • Goal: Process assigned transactions efficiently (any chunks from an assigned transaction)
  2. Ingress-Proof Generation

    • Also uses unpacked chunks
    • Attempts to download all chunks for recent transactions
    • Goal: Earn fees by participating in proof generation (download all chunks of a tx to generate proof)
  3. Programmable Data chunk distribution

    • Also uses unpacked chunks
    • Needed to compute PD transaction execution

Transaction Data Ingress and Ingress-Proof Generation have overlapping chunk lifetimes. Both target chunks from recent transactions, which can be pruned once the data migrates to partition (disk) storage. A larger cache lets a node handle larger transactions for ingress-proof generation. Since fees and rewards are a percentage of the uploaded data, there's an incentive to process larger proofs. We should set a minimum cache size, with chunks expiring based on the ingress proof LRU height check. Open to suggestions on the exact value.

Programmable Data chunks, however, have distinct lifetimes and constraints. Irys limits each block of programmable data to a maximum of X chunks (currently 7,000, about 1.71GB), which allows for a fixed cache size. Given the likelihood of repeated access to certain PD chunks, a 2GB LRU cache for PD chunks appears to be a reasonable minimum. PD chunks would expire purely on an LRU basis, without height/confirmations being a factor.
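
To make the sizing concrete, here is a rough sketch of the arithmetic and of a byte-bounded pure LRU cache for PD chunks. The 256 KiB chunk size is an assumption (it is not stated in this issue, but it reproduces the quoted ~1.71 figure when measured in GiB), and `PdChunkLru` with its methods is a hypothetical name used only for illustration.

```python
from collections import OrderedDict

CHUNK_SIZE_BYTES = 256 * 1024      # assumption: not stated in the issue
MAX_PD_CHUNKS_PER_BLOCK = 7_000    # current limit quoted in the issue
GIB = 1024 ** 3

# Worst-case PD data referenced by a single block.
per_block_bytes = MAX_PD_CHUNKS_PER_BLOCK * CHUNK_SIZE_BYTES
per_block_gib = per_block_bytes / GIB  # roughly 1.71 under the assumption above


class PdChunkLru:
    """Hypothetical pure LRU cache bounded by total bytes, not block height."""

    def __init__(self, capacity_bytes=2 * GIB):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._chunks = OrderedDict()  # key -> chunk bytes, oldest first

    def get(self, key):
        """Return the chunk and mark it most recently used, or None if absent."""
        chunk = self._chunks.get(key)
        if chunk is not None:
            self._chunks.move_to_end(key)
        return chunk

    def put(self, key, chunk):
        """Insert a chunk, evicting least recently used chunks if over capacity."""
        old = self._chunks.pop(key, None)
        if old is not None:
            self.used_bytes -= len(old)
        self._chunks[key] = chunk
        self.used_bytes += len(chunk)
        while self.used_bytes > self.capacity_bytes and self._chunks:
            _, evicted = self._chunks.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Under these assumptions a 2 GiB cache holds one full block's worth of PD chunks with a little headroom, so repeated access to hot chunks across blocks is what the remaining capacity would serve.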

The most direct / obvious way to support these scenarios is to have two cache eviction strategies: the existing ingress-proof (data root) LRU cache for recently posted transactions, and a pure LRU chunk cache for recently referenced PD chunks.
