Cache data lifecycles #218

JesseTheRobot · 2025-02-15T19:22:16Z

A subtask of Scalable data ingress, this task covers the changes required to the data lifecycles of all cache types.

Notable caches are:
- CachedDataRoots
- CachedChunksIndex
- CachedChunks
- IngressProofs: ingress proofs already have a basic LRU implementation, which expires them 50 blocks after the last height at which an applicable tx was received
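The height-based expiry described for IngressProofs could look roughly like the sketch below. This is not the existing implementation: `HeightExpiryCache`, `DataRoot`, `touch`, and `prune` are hypothetical names, and keeping an entry at exactly 50 blocks of age (dropping it at 51) is an assumed reading of "after 50 blocks".

```rust
use std::collections::HashMap;

/// Hypothetical 32-byte data root identifier (not taken from the codebase).
type DataRoot = [u8; 32];

/// Number of blocks an entry survives after its last applicable tx (from the issue text).
const EXPIRY_DEPTH: u64 = 50;

/// Tracks, per data root, the last block height at which an applicable tx was received.
#[derive(Default)]
struct HeightExpiryCache {
    last_seen: HashMap<DataRoot, u64>,
}

impl HeightExpiryCache {
    /// Record that a tx referencing `root` was received at `height`.
    /// The stored height never moves backwards.
    fn touch(&mut self, root: DataRoot, height: u64) {
        let entry = self.last_seen.entry(root).or_insert(height);
        *entry = (*entry).max(height);
    }

    /// Drop every entry whose last applicable tx is more than EXPIRY_DEPTH
    /// blocks behind `current_height`. Returns the number of entries removed.
    fn prune(&mut self, current_height: u64) -> usize {
        let before = self.last_seen.len();
        self.last_seen
            .retain(|_, last| current_height.saturating_sub(*last) <= EXPIRY_DEPTH);
        before - self.last_seen.len()
    }

    /// True if `root` is still cached.
    fn contains(&self, root: &DataRoot) -> bool {
        self.last_seen.contains_key(root)
    }
}
```

Calling `prune` once per new block keeps the work proportional to the number of tracked data roots; chunks keyed by an expired data root could be dropped in the same pass.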

Here are the main chunk caching scenarios:

Transaction Data Ingress
- Uses unpacked chunks
- Caches chunks assigned to the node's partitions
- Goal: Process assigned transactions efficiently (any chunk from an assigned transaction)
Ingress-Proof Generation
- Also uses unpacked chunks
- Attempts to download all chunks for recent transactions
- Goal: Earn fees by participating in proof generation (download all chunks of a tx to generate proof)
Programmable Data chunk distribution
- Also uses unpacked chunks
- Needed to compute PD transaction execution

Transaction Data Ingress and Ingress-Proof Generation have overlapping chunk lifetimes. Both target chunks from recent transactions, which can be pruned once the data migrates to partition (disk) storage. A larger cache enables a node to handle larger transactions for ingress-proof generation. Since fees and rewards are a percentage of the uploaded data, there's an incentive to process larger proofs. We should set a minimum cache size, with chunks expiring based on the ingress proof LRU height check. Suggestions on the exact value are welcome.

Programmable Data chunks, however, have distinct lifetimes and constraints. Irys limits each block of programmable data to a maximum of X chunks (currently 7,000, about 1.71GB), allowing for a fixed cache size. Given the likelihood of repeated access to certain PD chunks, a 2GB LRU cache for PD chunks appears to be a reasonable minimum. PD chunks would expire purely on an LRU basis, without height or confirmations being a factor.
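A rough sketch of the sizing arithmetic and a byte-bounded, pure LRU cache for PD chunks follows. The 256 KiB chunk size is an assumption (the issue does not state it), chosen because 7,000 chunks at that size come to 1,835,008,000 bytes, which is about 1.71 GiB and matches the quoted figure. `PdChunkCache`, `ChunkKey`, and the constants are hypothetical names, not existing types.

```rust
use std::collections::{BTreeMap, HashMap};

/// Assumed chunk size (256 KiB); not stated in the issue.
const CHUNK_SIZE: u64 = 256 * 1024;
/// Per-block PD chunk limit quoted in the issue.
const MAX_PD_CHUNKS_PER_BLOCK: u64 = 7_000;
/// Worst-case PD bytes referenced by one block under the assumption above.
const MAX_PD_BYTES_PER_BLOCK: u64 = MAX_PD_CHUNKS_PER_BLOCK * CHUNK_SIZE;
/// Proposed minimum cache size (2 GiB), large enough for one full block.
const MIN_PD_CACHE_BYTES: u64 = 2 * 1024 * 1024 * 1024;

/// Hypothetical chunk identifier.
type ChunkKey = u64;

/// Pure LRU cache bounded by total bytes; block height plays no role.
struct PdChunkCache {
    capacity_bytes: u64,
    used_bytes: u64,
    tick: u64,
    entries: HashMap<ChunkKey, (u64, Vec<u8>)>, // key -> (last-use tick, data)
    order: BTreeMap<u64, ChunkKey>,             // last-use tick -> key
}

impl PdChunkCache {
    fn new(capacity_bytes: u64) -> Self {
        Self {
            capacity_bytes,
            used_bytes: 0,
            tick: 0,
            entries: HashMap::new(),
            order: BTreeMap::new(),
        }
    }

    /// Monotonic counter used as a logical "last used" timestamp.
    fn next_tick(&mut self) -> u64 {
        self.tick += 1;
        self.tick
    }

    /// Look up a chunk and mark it as most recently used.
    fn get(&mut self, key: ChunkKey) -> Option<&[u8]> {
        let tick = self.next_tick();
        let entry = self.entries.get_mut(&key)?;
        self.order.remove(&entry.0);
        entry.0 = tick;
        self.order.insert(tick, key);
        Some(entry.1.as_slice())
    }

    /// Insert a chunk, evicting least recently used chunks until it fits.
    fn insert(&mut self, key: ChunkKey, data: Vec<u8>) {
        let size = data.len() as u64;
        if size > self.capacity_bytes {
            return; // larger than the whole cache: do not cache
        }
        // Replace any existing entry for this key.
        if let Some((old_tick, old)) = self.entries.remove(&key) {
            self.order.remove(&old_tick);
            self.used_bytes -= old.len() as u64;
        }
        while self.used_bytes + size > self.capacity_bytes {
            match self.order.pop_first() {
                Some((_, victim)) => {
                    if let Some((_, evicted)) = self.entries.remove(&victim) {
                        self.used_bytes -= evicted.len() as u64;
                    }
                }
                None => break,
            }
        }
        let tick = self.next_tick();
        self.used_bytes += size;
        self.order.insert(tick, key);
        self.entries.insert(key, (tick, data));
    }
}
```

Bounding by bytes rather than entry count keeps the limit meaningful even if the final chunk of a transaction is shorter than `CHUNK_SIZE`. A production cache would likely use a dedicated LRU structure instead of the `BTreeMap` ordering used here for brevity.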

The most direct way to support these scenarios is to have two cache eviction strategies: the existing ingress-proof (data root) LRU cache for recently posted transactions, and a pure LRU chunk cache for recently referenced PD chunks.

JesseTheRobot commented Feb 15, 2025 • edited by DanMacDonald
