panicked at ledger/src/blockstore.rs:3724:29 #5070

Open
SVS-bigj opened this issue Feb 25, 2025 · 2 comments

Comments

@SVS-bigj

Problem

I have multiple RPC nodes that are frequently crashing due to the following error message.

ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solRpcEl" one=1i message="panicked at ledger/src/blockstore.rs:3724:29

It is followed by this panic message:

Shred with slot: 322872774, index: 0, consumed: 856, completed_indexes: {37, 73, 109, 148, 190, 238, 278, 316, 358, 403, 448, 486, 527, 570, 608, 643, 678, 714, 751, 787, 823, 855} must exist if shred index was included in a range: 0 855" location="ledger/src/blockstore.rs:3724:29" version="2.1.11 (src:00000000; feat:1725507508, client:JitoLabs)"

Image

Version Info:

  • v2.1.11-jito
  • Yellowstone gRPC Geyser plugin v5.0.0+solana.2.1.11

Startup Arguments:

agave-validator \
  --ledger /var/solana/data/ledger \
  --accounts /var/solana/accounts \
  --identity /var/solana/data/config/validator-keypair.json \
  --known-validator 7Np41oeYqPefeNQEHSv1UDhYrehxin3NStELsSKCT4K2 \
  --known-validator GdnSyH3YtwcxFvQrVVJMm1JhTS4QVX7MFsX56uJLUfiZ \
  --known-validator DE1bawNcRJB9rVm3buyMVfr8mBEoyyu73NBovf2oXJsJ \
  --known-validator HyperSPG8w4jgdHgmA8ExrhRL1L1BriRTHD9UFdXJUud \
  --known-validator GdnSyH3YtwcxFvQrVVJMm1JhTS4QVX7MFsX56uJLUfiZ \
  --expected-genesis-hash 5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d \
  --entrypoint entrypoint.mainnet-beta.solana.com:8001 \
  --entrypoint entrypoint2.mainnet-beta.solana.com:8001 \
  --entrypoint entrypoint3.mainnet-beta.solana.com:8001 \
  --entrypoint entrypoint4.mainnet-beta.solana.com:8001 \
  --entrypoint entrypoint5.mainnet-beta.solana.com:8001 \
  --no-voting \
  --only-known-rpc \
  --log /home/solana/validator.log \
  --rpc-port 8899 \
  --dynamic-port-range 8000-8100 \
  --init-complete-file /var/solana/data/init-completed \
  --limit-ledger-size  100000000 \
  --wal-recovery-mode skip_any_corrupted_record \
  --full-rpc-api \
  --enable-rpc-transaction-history \
  --enable-cpi-and-log-storage \
  --account-index program-id \
  --account-index spl-token-owner \
  --account-index spl-token-mint \
  --rpc-bind-address 10.10.5.2 \
  --rpc-send-leader-count 2 \
  --private-rpc \
  --rpc-threads 48 \
  --geyser-plugin-config /home/solana/bin/yellowstone-grpc-config.json \
  --minimal-snapshot-download-speed 50485760 \
  --rpc-send-service-max-retries 10 \
  --block-verification-method unified-scheduler \
  --unified-scheduler-handler-threads 8 \
  --health-check-slot-distance 25

This has been happening every 1-2 days on average, per RPC node, for the past 2 weeks. Sometimes, however, it happens multiple times per day on the same node. I don't see a pattern in how often it happens; it seems fairly random, though it does appear to be getting progressively worse. We also run a voting validator, and this crash has not happened once on that node. It seems to affect only the RPC nodes.
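To put numbers on the frequency described above, the panics can be counted per day straight from the validator log. This is a minimal sketch, not a vetted tool: it assumes each log line starts with a bracketed RFC 3339 timestamp such as `[2025-02-25T12:34:56.000000000Z`, and it counts only the `datapoint: panic` metrics lines so that each crash is counted once. The function name `count_panics_per_day` is made up for this example.

```shell
# Hypothetical helper: count panic datapoints per calendar day in a log file.
# Assumes lines begin with "[YYYY-MM-DDTHH:MM:SS..." (an assumption about the log format).
count_panics_per_day() {
  log="$1"
  # Keep only the metrics datapoint lines that report a panic.
  grep -F 'datapoint: panic' "$log" \
    | sed -n 's/^\[\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\)T.*/\1/p' \
    | sort \
    | uniq -c
}
```

For example, `count_panics_per_day /home/solana/validator.log` prints one line per day with the number of panics followed by the date. Lines that do not start with the assumed timestamp are skipped silently.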

Proposed Solution

I am not sure why this is happening so I do not have a solution to propose at this time. Please let me know if there is anything I can do on my side to help assist with solving this.

@steviez

steviez commented Mar 3, 2025

Please paste text in code blocks (triple ` characters to start and end) like below instead of sharing a screenshot. Regarding the actual panic, can you please share logs for ~10 min of runtime before one of these panics?
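One way to capture the log lines leading up to a panic is sketched below. It is only an illustration under stated assumptions: the function name `extract_pre_panic` is hypothetical, the window is measured in lines rather than minutes, and it stops at the first line in the file containing `panicked at ledger/src/blockstore.rs`.

```shell
# Hypothetical helper: print up to max_lines lines ending at the first panic line.
# Usage: extract_pre_panic <log-file> [max_lines]
extract_pre_panic() {
  log="$1"
  max_lines="${2:-50000}"
  pattern='panicked at ledger/src/blockstore.rs'

  # Line number of the first occurrence of the panic message.
  line=$(grep -n -F "$pattern" "$log" | head -n 1 | cut -d: -f1)
  if [ -z "$line" ]; then
    echo "no panic found in $log" >&2
    return 1
  fi

  # Everything up to and including the panic line, trimmed to the last max_lines.
  head -n "$line" "$log" | tail -n "$max_lines"
}
```

With the `--log` path from the startup arguments, this could be run as `extract_pre_panic /home/solana/validator.log 50000 > pre-panic.log`. Because the window is line-based, it may cover more or less than 10 minutes depending on log volume.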

@SVS-bigj
Author

SVS-bigj commented Mar 4, 2025

I have attached two log files. I am not "easily" able to export more than 50,000 lines at a time. The first file covers roughly 10 to 5 minutes before the panic, and the second picks up where the first ends and runs until the error occurs.

Ironically, we have not seen this error on any of our nodes in the past few days. Nothing has changed on our end in the meantime.

Explore-logs-2025-03-03 20_01_05.txt

Explore-logs-2025-03-03 19_54_29.txt
