Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC: Report Better Transaction/Block Cost Information #4815

Open
steviez opened this issue Feb 6, 2025 · 3 comments
Open

RPC: Report Better Transaction/Block Cost Information #4815

steviez opened this issue Feb 6, 2025 · 3 comments

Comments

@steviez
Copy link

steviez commented Feb 6, 2025

Problem

Among other transaction metadata, RPC nodes currently store a field called compute_units_consumed for each transaction:

pub compute_units_consumed: OptionSerializer<u64>,

The value is recorded in the Blockstore (and eventually BigTable) if the node is running with --enable-rpc-transaction-history. TransactionStatusService is the thread that populates the metadata structure and inserts into the Blockstore:

compute_units_consumed: Some(executed_units),

The executed_units field originally comes from Bank (several functions down the callstack):

let executed_units = processing_result.executed_units();

The issue is that executed units do not paint the full picture in regards to the block CU limit. This distinction is not widely known and furthermore, there is no other easy manner to direct query the "total cost" for each transaction.

Furthermore, while the more accurate number per transaction is desired, some aggregate block data might be of interest as well. Here are the fields that we currently report in a metric if we're a leader:

"cost_tracker_stats",
("bank_slot", bank_slot as i64, i64),
("block_cost", self.block_cost as i64, i64),
("vote_cost", self.vote_cost as i64, i64),
("transaction_count", self.transaction_count.0 as i64, i64),
("number_of_accounts", self.number_of_accounts() as i64, i64),
("costliest_account", costliest_account.to_string(), String),
("costliest_account_cost", costliest_account_cost as i64, i64),
(
"allocated_accounts_data_size",
self.allocated_accounts_data_size.0,
i64
),
(
"transaction_signature_count",
self.transaction_signature_count.0,
i64
),
(
"secp256k1_instruction_signature_count",
self.secp256k1_instruction_signature_count.0,
i64
),
(
"ed25519_instruction_signature_count",
self.ed25519_instruction_signature_count.0,
i64
),
(
"inflight_transaction_count",
self.in_flight_transaction_count.0,
i64
),
(
"secp256r1_instruction_signature_count",
self.secp256r1_instruction_signature_count.0,
i64
)

Some of these fields are obviously redundant and could be easily derived from examining all individual transactions. Something like the costliest_account_cost or allocated_accounts_data_size could be of interest but harder to derive from the tx's.

Proposed Solution

Several steps here:

  1. Figure out exactly what additional fields are desired:
    • The minimum here seems like adding the "total" cost for the block
    • The middle ground is seemingly the above + adding the "total" cost for each transaction
      • The sum over all transactions should match the value reported for the block
    • The maximum is all of the above + the inclusion of fields from the cost tracker metrics
  2. Figure out how data will be accessed/queried:
    • Tag it onto the existing calls getTransaction and getBlock OR
    • Create a new method for querying just the costs, ie getBlockCost OR
    • Add another filter cost to transactionDetails to limit output to cost OR
    • ... ?
  3. Figure out how data will be stored; this will likely be influenced by answers to above
    • The per transaction total cost could seemingly live with the existing TransactionStatusMeta
      • This field is stored at protobuf in the Blockstore, we might be do a "partial" decoding to avoid reading the entire block + TransactionMeta for each column
    • If the total block cost is sufficient for many use cases, we might consider storing the field individually.
      • This would probably look similar to block time + getBlockTime in that data is stored in an individual column and can be queried directly (ie we might add getBlockCost)
@steviez steviez changed the title RPC: Report More Indepth Transaction Cost Information RPC: Report Better Transaction/Block Cost Information Feb 6, 2025
@steviez steviez changed the title RPC: Report Better Transaction/Block Cost Information rpc: Report Better Transaction/Block Cost Information Feb 6, 2025
@steviez steviez changed the title rpc: Report Better Transaction/Block Cost Information RPC: Report Better Transaction/Block Cost Information Feb 6, 2025
@steviez
Copy link
Author

steviez commented Feb 6, 2025

My current thoughts on how we should proceed:

  1. I think total block cost + cost per transaction is a good starting point. I'm guessing the accounts related stuff would be of interest too, but I'll defer to others for that.
  2. I think a new getBlockCost makes sense, and is consistent with getBlockTime and getBlockHeight. Additionally, I think extending the result from getBlock to with full filter to include cost per tx makes sense. Up for debate on whether a new cost filter is desired / useful
  3. Given my thoughts on above, add a new column for block costs, and extend the existing TransactionMeta type to store the per-tx data

@mmcgee-jump
Copy link

Thanks Steve for putting that together! I think our preference, from a limited bit of experience as a consumer, is the following:

  • Data reported is this option: The middle ground is seemingly the above + adding the "total" cost for each transaction (we only care about per-transaction, and would ignore the overall block total). We simply do not care about the detail-level stuff, and I doubt any other consumer ever would, seems like something to add further down the line if there is ever an actual ask for it.
  • New API call overall seems silly, I don't think anyone is using, or probably would use this data except indexers which already call getBlock. Don't care about full vs cost in the query param. cost seems a little cleaner, and is a backwards compatible change, plus you already have like, rewards param.
  • You didn't suggest it, but just want to note, strongly opposed to not surfacing this via. getBlockCost. Making all the indexers have to call another RPC (adds overhead), and manage state between two sets of response data just to avoid sticking an extra field on an RPC call that already has a meta field with like, exactly the kind of data you want to return here, would not be wise.

I would also soft-deprecate the existing field as part of this PR, and then have a schedule to hard deprecate it at some point in future.

@steviez
Copy link
Author

steviez commented Feb 7, 2025

Thanks for the response @mmcgee-jump.

You didn't suggest it, but just want to note, strongly opposed to not surfacing this via. getBlockCost. Making all the indexers have to call another RPC (adds overhead), and manage state between two sets of response data just to avoid sticking an extra field on an RPC call that already has a meta field with like, exactly the kind of data you want to return here, would not be wise.

Given our previous conversations, I'm fairly confident that I'll be telling you information you already know. But for the sake of leaving a nice paper trail, let me give some context ...

Shred are indexed by (slot, index) in the Blockstore. And transactions are indexed in by their transaction signature. This means there is a multi-level lookup for getBlock. Supposing I call getBlock for slot X ...

  • We start by fetching all of the shreds for slot X and deserializing into a Vec<Entry>
  • For each transaction in that Vec<Entry>, we do a lookup by tx signature to grab the TransactionStatusMeta
  • We also do a couple small lookups to grab block height, block time and block rewards

As you can see, getBlock isn't exactly the cheapest query. The way I see it, there are a couple potential uses case for interacting with tx/block cost:

  1. Some folks might simply be looking to show block cost vs. slot number. This isn't too different than the TPS chart currently shown on explorer.solana.com
  2. Some folks (indexers) might actually be consuming each and every transaction. Given the delivery mechanism, this type of consumer can figure out the total block cost by summing across transactions

Storing a cost with each individual transaction will support use case 2. However, for use case 1., calling getBlock and just pulling the cost field out of each transaction is wasteful. So, the idea with adding a getBlockCost is add a "lite" method for the cases that are simply showing a high level telemetry value as opposed to in depth analysis on transactions.

Unfortunately, historical storage (BigTable) requires fetching the full block. This fetch is a little different than blockstore, but TLDR is we have to fetch the full thing even if we only care about cost. Unfortunate, but if people are printing real-time-ish dashboards, they should be hitting local storage on RPC nodes (and not BigTable).

As a final note, getBlockHeight and getBlockTime establish "precedent" for fetching these individual fields. So, I could see having both an extra column to fetch total block cost by slot AND extending the TransactionStatusMeta with per-transaction cost

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants