-
Notifications
You must be signed in to change notification settings - Fork 368
feat(ckbtc): consolidate small value UTXOs periodically #7891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
gregorydemay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @ninegua for starting this work! I have some understanding questions but the approach and the code generally looks good to me!
| /// fee corresponding to burning ckbtc from the fee subaccount | ||
| /// at the given ledger index. | ||
| #[serde(rename = "accepted_consolidate_utxos_request")] | ||
| AcceptedConsolidateUtxosRequest(ConsolidateUtxosRequest), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understanding question: why do we need to differentiate it from a regular AcceptedRetrieveBtcRequest? I have the impression it would be simpler if we could have the same flow/steps in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because ConsolidateUtxosRequest is different than RetrieveBtcRequst. Their UTXO selection algorithm is different.
rs/bitcoin/ckbtc/minter/src/main.rs
Outdated
| let _ = ic_ckbtc_minter::estimate_fee_per_vbyte(&IC_CANISTER_RUNTIME).await; | ||
| } | ||
|
|
||
| #[cfg(feature = "self_check")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
understanding question: why is-it under a feature flag? I would expect that logic to be called by a new task instantiated in setup_tasks, no?
Ideally we should not rely on such self_check endpoints at all, because we want to test the production wasm and it doesn't contain such endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this method I would have to resort to advancing the time and controlling the ticks in order to exactly trigger the consolidation, which is ad-hoc and fragile.
Besides, having a canister method that can return an error message allows more error cases to be checked in the same test. So I'm inclined to keep this method under self_check, just like the refresh_fee_percentiles method above. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this method I would have to resort to advancing the time and controlling the ticks in order to exactly trigger the consolidation, which is ad-hoc and fragile.
But right now the task is not scheduled anywhere, right? I think for integration tests we want to test the production wasm, in particular we want to be sure that the task runs as expected and is for example not hindered by other tasks (e.g. is they were to share the same guard)
Besides, having a canister method that can return an error message allows more error cases to be checked in the same test.
Totally, but this should be maybe better checked in a unit test, which we should be able to do thanks to the runtime abstraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in particular we want to be sure that the task runs as expected
Yes, that is a good point. I changed to using an upgrade to trigger the consolidation and removed the self_check method. It indeed has caught a missing schedule_task mistake! Thanks!
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| let utxos_guard = guard(input_utxos, restore_utxos); | ||
|
|
||
| // Burn transaction fee (bitcoin_fee) from fee collector's account. | ||
| let burn_memo = memo::BurnMemo::Convert { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we maybe have another memo type for this kind of transactions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another memo type would be a breaking change. Existing services that read the memo (e.g. our dashboard) would need to handle this new type. Do you think it is still worth adding a new type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Adding a new variant type in
memo::BurnMemowill not hinder the decoding of past values (thanks to minicbor_derive) - It's true that if we had a new type, then applications interpreting the memo will need to do something if they want to handle that new variant (if they track what's happening on the minter's fee acount)
- I think here we should have a new memo variant because it does not correspond at all the semantic of
memo::BurnMemo::Convertwhich is used whenThe minter processed a retrieve_btc requestand has aThe destination of the retrieve BTC request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Greg. Why isn't there a "big plus" emoji?!?
| "[consolidate_utxos]: failed to build conslidation transaction {:?}", | ||
| err | ||
| ); | ||
| restore_utxos(input_utxos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Why not declare the utxos_guard earlier and let this code path be automatically correctly handled by the guard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating utxos_guard earlier would require an extra clone of input_utxos.
| batch.insert(req); | ||
| } | ||
| } | ||
| BtcTransactionRequest::ConsolidateUtxos(_) => (), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My impression is that at this stage we could build a batch with a single transaction (the consolidation one) and go through the normal flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UTXO selection logic for RetrieveBtcRequest is to find the largest UTXO matching the given amount and then repeat, which is very different than ConsolidateUtxo.
So if we want to do as you suggested, the batch's type must change to be something that is an enum of either BTreeSet<RetrieveBtcRequest> or ConsolidateUtxosRequest. The handling after build_batch must do different things based on the enum. Then it is not very different than the current implementation. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My current impression is that there are a couple of "tricks" and that's why I'm wondering if we could not streamline it:
- The method
build_batchnow effectively removes all requests of typeBtcTransactionRequest::ConsolidateUtxoswhich seems unexpected (becausestd::mem::take(&mut self.pending_btc_requests)is called and in the case of consolidation the request is not put back). - This is actually ok because no consolidation request was ever inserted in
pending_btc_requests(that's the comment inaccept_consolidate_utxos_request)
Maybe the type of pending_btc_requests should not have been changed to Vec<BtcTransactionRequest> but be kept as Vec<RetrieveBtcRequest>? But that also feels weird since the consolidation request should be treated "like" other requests in terms of steps: building -> signing -> sending ->confirming, where the concrete implementation of those steps may change depending on whether this is a consolidation request.
What about build_batch returning a new type like BatchBtcTransactionRequests, which contains either a single consolidation request or a BTreeSet<RetrieveBtcRequest>? Then this batch is given to build_unsigned_transaction, which needs to do the correct utxo selection depending on whether this is a consolidation transaction. WDYT?
| main_address, | ||
| fee_millisatoshi_per_vbyte, | ||
| &fee_estimator, | ||
| ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is the transaction no longer push into the state push_in_flight_request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It still does, but moved into the function submit_unsigned_request.
gregorydemay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes, some additional comments/thoughts
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| let utxos_guard = guard(input_utxos, restore_utxos); | ||
|
|
||
| // Burn transaction fee (bitcoin_fee) from fee collector's account. | ||
| let burn_memo = memo::BurnMemo::Convert { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Adding a new variant type in
memo::BurnMemowill not hinder the decoding of past values (thanks to minicbor_derive) - It's true that if we had a new type, then applications interpreting the memo will need to do something if they want to handle that new variant (if they track what's happening on the minter's fee acount)
- I think here we should have a new memo variant because it does not correspond at all the semantic of
memo::BurnMemo::Convertwhich is used whenThe minter processed a retrieve_btc requestand has aThe destination of the retrieve BTC request
|
|
||
| // There will be two outputs: 1 normal output and 1 change output, each about | ||
| // half of the total value of the input UTXOs. It could be made into just one | ||
| // output, but two outputs make it easier to reuse the existing implementation | ||
| // of build_unsigned_transaction_from_inputs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, then we have I think the same understanding, I was just confused by that part of the comment
It could be made into just one
// output, but two outputs make it easier to reuse the existing implementation
// of build_unsigned_transaction_from_inputs.
since it has nothing to do with how we split the total_amount.
But if you think that is not a problem, I can make this change.
I wanted to look back at Thomas' simulations but I don't have access to the slides. I think we can leave as is for now and we can always revisit later if needed.
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| .await | ||
| .ok_or(ConsolidateUtxoError::EstimateFeeNotAvailable)?; | ||
|
|
||
| // Select UTXOs to consolidate. Note that they are not removed from available_utxos yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second part of the comment does not seem to be correct Note that they are not removed from available_utxos yet., since they are removed (as they should) by the method select_utxos_to_consolidate.
So maybe I would just remove that comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Removed.
| batch.insert(req); | ||
| } | ||
| } | ||
| BtcTransactionRequest::ConsolidateUtxos(_) => (), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My current impression is that there are a couple of "tricks" and that's why I'm wondering if we could not streamline it:
- The method
build_batchnow effectively removes all requests of typeBtcTransactionRequest::ConsolidateUtxoswhich seems unexpected (becausestd::mem::take(&mut self.pending_btc_requests)is called and in the case of consolidation the request is not put back). - This is actually ok because no consolidation request was ever inserted in
pending_btc_requests(that's the comment inaccept_consolidate_utxos_request)
Maybe the type of pending_btc_requests should not have been changed to Vec<BtcTransactionRequest> but be kept as Vec<RetrieveBtcRequest>? But that also feels weird since the consolidation request should be treated "like" other requests in terms of steps: building -> signing -> sending ->confirming, where the concrete implementation of those steps may change depending on whether this is a consolidation request.
What about build_batch returning a new type like BatchBtcTransactionRequests, which contains either a single consolidation request or a BTreeSet<RetrieveBtcRequest>? Then this batch is given to build_unsigned_transaction, which needs to do the correct utxo selection depending on whether this is a consolidation transaction. WDYT?
| kyt_provider : opt principal; | ||
| reimbursement_account : opt Account; | ||
| }; | ||
| accepted_consolidate_utxos_request : record { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I should continue the review first to make sense of it but why is it considered an accepted request? It is quite unlike a user-triggered BTC retrieval request, right? Why not call this event type something like utxo_consolidation?
| req.block_index(), | ||
| req.address() | ||
| .map(|x| x.display(s.btc_network)) | ||
| .unwrap_or("(minter)".to_string()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is using the "minter" string better than just using the minter's main Bitcoin address here?
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| /// Undoes changes we make to the ckBTC state when we construct a pending transaction. | ||
| /// We call this function if we fail to sign or send a Bitcoin transaction. | ||
| fn undo_sign_request(requests: BTreeSet<state::RetrieveBtcRequest>, utxos: Vec<Utxo>) { | ||
| fn undo_sign_request(requests: state::SubmittedWithdrawalRequests, utxos: Vec<Utxo>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: It sounds the function undoes a request to sign but obviously the signing itself cannot be undone. How about undo_withdrawal_request or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion! Renamed.
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| } | ||
| } | ||
|
|
||
| async fn submit_unsigned_request<R: CanisterRuntime>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously, unsigned requests are not submitted. Wouldn't it make sense to split it into signing (unsigned) requests and submitting (signed) requests? If it's more convenient to have it in one function, shouldn't it then be sign_and_submit_request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion! Renamed
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| } | ||
|
|
||
| #[derive(Debug)] | ||
| pub enum ConsolidateUtxoError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| pub enum ConsolidateUtxoError { | |
| pub enum ConsolidateUtxosError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
| mockall = { workspace = true } | ||
| pocket-ic = { path = "../../../../packages/pocket-ic" } | ||
| proptest = { workspace = true } | ||
| rand = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is randomness needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Rng trait is required in a test.
rs/bitcoin/ckbtc/minter/src/lib.rs
Outdated
| const MIN_CONSOLIDATION_INTERVAL: Duration = Duration::from_secs(24 * 3600); | ||
| let min_consolidation_utxo_required = | ||
| read_state(|s| s.min_utxo_consolidation_threshold) as usize; | ||
| assert!(min_consolidation_utxo_required > MAX_NUM_INPUTS_IN_TRANSACTION); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the point of this assertion. Is the idea to ensure that MAX_NUM_INPUTS_IN_TRANSACTION can be packed together?
Isn't the check below that the total number of UTXOs is over the threshold sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this is a bit redundant. But on the other hand, if the threshold is less than MAX_NUM_INPUTS_IN_TRANSACTION, then no consolidation is required because any transaction created by the minter will not exceed the MAX.
Anyway, I removed this assert. But this is still check in another place when setting the threshold through update or init.
| reason: WithdrawalReimbursementReason, | ||
| }, | ||
| ToConsolidate { | ||
| request: ConsolidateUtxosRequest, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if a new consolidation request is to be scheduled but the current one hasn't been confirmed yet? Does that mean that there can always be at most one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think currently in this PR there is any hard requirement on there being only at most one consolidation at all time, except that they would be created 24 hours apart. Is it better to avoid creating the second one when the first one is not yet finalized?
Another related question is, do we need a mechanism to cancel a previous consolidation request by making a new one with the same set of UTXOs but higher fee? It will be more complex to implement, especially considering that we want to charge the bitcoin transaction fee to the fee account (which means we would have to do another burn).
So there seems to be 3 options when it comes to creating a consolidation transaction when another one is already in flight but not yet finalized:
- Just create another one with the remaining available utxos, because both are valid and should eventually go through.
- Create another one with the same UTXO set but higher fee, so that the previous one will be cancelled;
- Do nothing or maybe send the previous one again, just to make sure it will still be in the mempool.
Which do you prefer? @THLO @gregorydemay
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question!
- Those consolidation transactions are costly (because they are large due to the large number of inputs) and is currently somewhat new. My impression is that the less risky approach would be
- Have at most one in flight consolidation transaction. This is to avoid creating multiple costly consolidation transactions which may not go through due to a bug.
- Had a metric to track the age of a consolidation transaction. We could have an alert in case age > threshold
- Regarding resubmission, one thing that I didn't have on my radar before is that resubmission here must be different because the minter cannot reduce the output amount to accommodate higher fees (we are basically solving the problem of withdrawal with guaranteed received amount). Is it already solved by the current implementation and do we have an integration test for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation won't try to resend the consolidation transaction to replace the previous one, since the request was never moved from in-flight back into the pending queue. So it is essentially option 1, optimistically assuming that the transaction will eventually go through.
rs/bitcoin/ckbtc/minter/src/state.rs
Outdated
| } | ||
|
|
||
| pub fn len(&self) -> usize { | ||
| pub fn count_retrieve_btc_request(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| pub fn count_retrieve_btc_request(&self) -> usize { | |
| pub fn count_retrieve_btc_requests(&self) -> usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
| match self { | ||
| Self::ToConfirm { requests } => requests.len(), | ||
| Self::ToCancel { requests, .. } => requests.len(), | ||
| Self::ToConsolidate { .. } => 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there is a request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
THis function counts only retrieve_btc requests.
| pub fn address(&self) -> Option<&BitcoinAddress> { | ||
| match self { | ||
| Self::RetrieveBtc(request) => Some(&request.address), | ||
| Self::ConsolidateUtxos(_) => None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why doesn't this function return the minter's Bitcoin address here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly because minter's address is not readily available in the state.
| .set_expiration(Duration::from_secs(expiration)); | ||
| } | ||
| if let Some(threshold) = min_utxo_consolidation_threshold { | ||
| if threshold > crate::MAX_NUM_INPUTS_IN_TRANSACTION as u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I'm not sure whether this check is needed but I guess it cannot hurt. 🤷♂️
(If the threshold is set to a low value, then all UTXOs are simply merged into 2, at which point the consolidation should definitely stop, so maybe the threshold should be greater than 2...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in another reply, if the threshold is less, then there is no reason to do any consolidation because no transactions would exceed MAX_NUM_INPUTS_IN_TRANSACTION.
rs/bitcoin/ckbtc/minter/src/tasks.rs
Outdated
| use std::time::Duration; | ||
|
|
||
| /// Interval of calling the consolidation task. | ||
| pub const CONSOLIDATION_TASK_INTERVAL: Duration = Duration::from_secs(3600); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have this constant when we already have MIN_CONSOLIDATION_INTERVAL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need to call the task at some interval. But this doesn't have to be global, so I moved it to a local definition instead.
| // Step 3: wait for the consolidation transaction to be submitted. | ||
| // Expect the corresponding burn index to transfer_index + 1. | ||
| let burn_index = transfer_index + 1; | ||
| let txid = ckbtc.await_btc_transaction(burn_index, MAX_NUM_INPUTS_IN_TRANSACTION * 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why times 6?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is entirely empirical. Signing a transaction with 1000 inputs does take a lot of ticks, but I'm also a bit surprised that I need to multiply the num of input by 6, even by 5 doesn't seem work.
Co-authored-by: Thomas Locher <[email protected]>
Co-authored-by: Thomas Locher <[email protected]>
Co-authored-by: Thomas Locher <[email protected]>
Co-authored-by: Thomas Locher <[email protected]>
Co-authored-by: Thomas Locher <[email protected]>
Co-authored-by: Thomas Locher <[email protected]>
DEFI-2479: Implement UTXO consolidation
Currently, large withdrawals of bitcoins fail as they would require too many inputs to build a standard Bitcoin transaction.
This PR implements a UTXO consolidation mechanism: if there are more than 10,000 UTXOs, the smallest 1000 UTXOs are combined into big ones. The consolidation will take place every 24 hours for now, and later can be tuned to happen less frequently.
It breaks candid compatibility due to addition of new log event types, which shouldn't affect external callers.