-
Notifications
You must be signed in to change notification settings - Fork 22
Cleanup Espresso Config #932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary of ChangesHello @lukeiannucci, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the configuration management for Espresso TEE service registration. It centralizes various transaction and retry-related parameters into global constants, thereby eliminating the need to pass these configurations through multiple structs and function signatures. This change simplifies the API of the TEE key manager and verifier components, reduces configuration overhead in related modules like the batch poster and Caff node, and promotes a more consistent approach to these operational settings. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request successfully cleans up the Espresso configuration by removing the EspressoRegisterServiceConfig struct and its associated fields and options. The parameters previously managed by this struct are now defined as constants within espressotee/espresso_tee_helpers.go. This change streamlines the configuration process and removes unused code, contributing to a cleaner and more maintainable codebase. All modifications are consistent across the affected files, including arbnode/batch_poster.go, arbnode/espresso_batch_poster_config.go, arbnode/espresso_caff_node.go, espresso/key-manager/key_manager.go, espressotee/espresso_tee_helpers.go, espressotee/espresso_verifier.go, and system_tests/caff_node_test.go. The removal of unused imports also contributes to the overall cleanup. The changes are well-executed and align with the pull request's objective.
|
✅ All tests successful. No failed tests were found. 📣 Thoughts on this report? Let Codecov know! | Powered by Codecov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/gemini review
| panic(err) | ||
| if registrationFailCount > espressotee.EspressoMaxRetries { | ||
| log.Crit("Espresso signer registration failed 5 times consecutively. Stopping.", "err", err) | ||
| b.fatalErrChan <- err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we would need to handle this error in node.go specifically on this line
So instead it should be:
if n.BatchPoster != nil {
err = n.BatchPoster.Start(ctx)
if err != nil {
return fmt.Errorf("failed to start batch poster: %w", err)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced this is needed, when we start the batch poster it is launching a new thread which is trying to post a batch and register the signer address, and it may take some time for it to register the address (say for example 5 retries, 20 seconds each). So, while this is going on we want to bring up any other components and we are never waiting around to see if batchPoster.Start(ctx) errors out. When we trigger the error through the channel, it should simply stop all components that are running. Does this make sense? What do others think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm yes that makes sense to me. I just wasn't sure if fataErrChan getting populated would automatically trigger a shutdown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its a little bit confusing so here it defers the StopAndWait()
nitro-espresso-integration/cmd/nitro/nitro.go
Line 772 in e8d0468
| deferFuncs = []func(){func() { currentNode.StopAndWait() }} |
so when this method exits this is called it stops all the components that were started:
nitro-espresso-integration/arbnode/node.go
Line 1562 in e8d0468
| func (n *Node) StopAndWait() { |
it only gets exited through a sig int or this channel:
nitro-espresso-integration/cmd/nitro/nitro.go
Line 794 in e8d0468
| case err = <-fatalErrChan: |
I also browsed through the codebase and sometimes as you mention some components return an error and sometimes they dont when using the fatalChanErr. So i think it is fine to not return an error here, but we can see what the team thinks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a reason why in Start() we dont wait for it to register the signer and do it in a background thread. It is a bit long to explain but essentially there can be dependencies on other components being started, in order for the registration to complete successfully. Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh understood
Thank you for clearing this up
varun-doshi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would also be nice to make the appropriate changes to the migrate-config-file.py according to the config changes you made but I would suggest waiting till this PR gets merged (which should be done by today I believe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I may be wrong but we need to update the BatchPoster.Start to return an error and use CallIterativelySafe instead of CallIteratively
Essentially, the structure needs to be similar to what ForceInclusionChecker does
Not required
| TxnsResubmissionInterval: DefaultEspressoBatchPosterConfig.TxnsResubmissionInterval, | ||
| MaxTransactionSize: DefaultEspressoBatchPosterConfig.TxSizeLimit, | ||
| ResubmitEspressoTxDeadline: DefaultEspressoBatchPosterConfig.ResubmitEspressoTxDeadline, | ||
| TxnsResubmissionInterval: DefaultEspressoBatchPosterConfig.TxnsMonitoringInterval, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why cant we also combine these two?
f.Duration(prefix+".txns-sending-interval", DefaultEspressoBatchPosterConfig.TxnsSendingInterval, "interval between sending transactions to Espresso Network")
f.Duration(prefix+".txns-monitoring-interval", DefaultEspressoBatchPosterConfig.TxnsMonitoringInterval, "time threshold after which a transaction will be automatically resubmitted if no response is received")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably can, I just noticed in the configs we currently have they tended to be different. I can look more into this. I think its a good idea to try and collapse them even more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i ended up making combining
TxnsSendingInterval and TxnsPollingInterval as these were always similar. The TxnsResubmissionInterval, i left different because if i understand it resubmits a transaction only if its not finalized by hotshot yet. For fast chains we would make TxnsSendingInterval and TxnsPollingInterval fast, like 125ms, so i dont think it makes sense to merge TxnsResubmissionInterval to the same value as hotshot may take a couple seconds - if not more for finalization
espresso/key-manager/key_manager.go
Outdated
| serviceType espressotee.ServiceType | ||
| registerSignerOpts espressotee.EspressoRegisterServiceOpts | ||
| userDataAttestationFile string | ||
| quoteFile string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quoteFile and useDataAttestationFile can also be hardcoded I think because they are only needed for SGX
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to do it in here, however this cause some problems as the tests were hard coded for tee type "SGX" and since the default for these fields were userDataAttestationFile and quoteFile were empty strings it would work as getAttestationQuote was returning empty bytes and not erroring out. I tried to change the tests to tee type "TESTS" but then the mock contract would revert the calls because enum was 2, and it is only aware of 0 and 1 (NITRO, SGX). Anyway I was able to hack it together, I think a more proper fix would be for mock contract to support TESTS as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of hacking can you just create a PR for mock contracts lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yea im getting to it lmao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should fix it:
dc42a15
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugh, it actually is harder than i thought as the nitro-contracts do this:
(uint256 hotshotHeight, bytes memory signature, IEspressoTEEVerifier.TeeType teeType) = abi.decode(
espressoMetadata,
(uint256, bytes, IEspressoTEEVerifier.TeeType)
);
meaning they only support espressotee enums, not mockespressotee enums
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see makes sense!
| SGXVerifierAddr string `koanf:"sgx-verifier-addr"` | ||
| BatchPosterAddr string `koanf:"batch-poster-addr"` | ||
| RecordPerformance bool `koanf:"record-performance"` | ||
| WaitForFinalization bool `koanf:"wait-for-finalization"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we like combine WaitForFinalization WaitForConfirmations and RequiredBlockDepth I feel like this can be like an enum or something? like if we are suppling RequiredBlockDepth its basically waiting for confirmations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I didnt look to much into caff node changes as that was a bit outside of my area of expertise, but i will do so now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checkout what i did here i removed WaitForConfirmations and if RequiredBlockDepth is greater than 0 we go inside the WaitForConfirmations code blocks
arbnode/espresso_caff_node.go
Outdated
| WaitForFinalization bool `koanf:"wait-for-finalization"` | ||
| WaitForConfirmations bool `koanf:"wait-for-confirmations"` | ||
| RequiredBlockDepth uint64 `koanf:"required-block-depth"` | ||
| BlocksToRead uint64 `koanf:"blocks-to-read"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make this constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which field? I can look into it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BlocksToRead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
The script optimisations PR was just merged. You can make appropriate changes to the script now 🫡 |
Thanks @varun-doshi let me know what you think of this commit: |
varun-doshi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin integration
git worktree add -d .worktree/backport-932-to-integration origin/integration
cd .worktree/backport-932-to-integration
git switch --create backport-932-to-integration
git cherry-pick -x 9837028f180efa8e763b6ab3500216cdca3e5ccb |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin integration-v3.5.6
git worktree add -d .worktree/backport-932-to-integration-v3.5.6 origin/integration-v3.5.6
cd .worktree/backport-932-to-integration-v3.5.6
git switch --create backport-932-to-integration-v3.5.6
git cherry-pick -x 9837028f180efa8e763b6ab3500216cdca3e5ccb |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin celestia-v3.5.6
git worktree add -d .worktree/backport-932-to-celestia-v3.5.6 origin/celestia-v3.5.6
cd .worktree/backport-932-to-celestia-v3.5.6
git switch --create backport-932-to-celestia-v3.5.6
git cherry-pick -x 9837028f180efa8e763b6ab3500216cdca3e5ccb |
|
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin celestia-v3.9.2
git worktree add -d .worktree/backport-932-to-celestia-v3.9.2 origin/celestia-v3.9.2
cd .worktree/backport-932-to-celestia-v3.9.2
git switch --create backport-932-to-celestia-v3.9.2
git cherry-pick -x 9837028f180efa8e763b6ab3500216cdca3e5ccb |
* start * more cleanup * remove unused * more fixes * add fatal err chan to tests * start addressing comments * address more comments * add back tx deadline * fix mock * hack * re add * fix script * fixes
* Cleanup Espresso Config (#932) * start * more cleanup * remove unused * more fixes * add fatal err chan to tests * start addressing comments * address more comments * add back tx deadline * fix mock * hack * re add * fix script * fixes * fixes * submodule
Closes #<ISSUE_NUMBER>
This PR:
We are working towards making the config part of PCR0 hash. In order to do so we are revisiting the config, and what needs to actually be configurable vs what shouldn't be.
Changes to config:
RegisterSignerOpts- This is now all constants as we never really need to change these.EspressoTxnLimit- This is now a constant.DangerousConfigHotshotBlockNum- This is removed as we are able to parse hotshot blocks much faster if there is a big gap.espressoTxnsPollingIntervalandespressoTxnsResubmissionInterval- These have been collapsed into one config calledespressoTxnsMonitoringInterval. I noticed these were normally the same input in the config.This PR does not:
Key places to review: