Conversation

jimmygchen
Member

@jimmygchen jimmygchen commented Oct 21, 2025

Issue Addressed

Addresses #8218

A simplified version of #8241 for the initial release.

I've tried to minimise the logic changes in this PR. Introducing the NodeCustodyType enum still results in quite a bit of diff, but the actual logic change in CustodyContext is quite small.

The main changes are in the CustodyContext struct:

  • combining the validator_custody_count and current_is_supernode fields into a single custody_group_count_at_head field. We persist the cgc derived from the initial CLI values into the custody_group_count_at_head field and only allow it to increase (same behaviour as before).
  • I noticed the above approach caused a backward compatibility issue, so I've made a fix and changed the approach slightly (which was actually what I originally had in mind):
    • when initialising, only override the validator_custody_count value if either the --supernode or --semi-supernode flag is used; otherwise leave it at the existing default of 0. Most other logic remains unchanged.
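
A minimal sketch of the initialisation rule described above, not the actual Lighthouse code. The `Spec` struct, the variant names, the `from_flags` and `initial_validator_custody_count` helpers, the flag precedence, and the "half of all custody groups" value for a semi-supernode are assumptions made for illustration; only `NodeCustodyType` and `get_custody_count_override` are names taken from this PR.

```rust
/// Hypothetical stand-in for the one `ChainSpec` field this sketch needs.
pub struct Spec {
    pub number_of_custody_groups: u64,
}

/// Custody mode selected on the CLI. Variant names are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum NodeCustodyType {
    Fullnode,
    SemiSupernode,
    Supernode,
}

impl NodeCustodyType {
    /// Map the two CLI flags to a custody type.
    /// Assumption: `--supernode` wins if both flags are passed.
    pub fn from_flags(is_supernode: bool, is_semi_supernode: bool) -> Self {
        if is_supernode {
            Self::Supernode
        } else if is_semi_supernode {
            Self::SemiSupernode
        } else {
            Self::Fullnode
        }
    }

    /// `None` means "no override": keep the existing default.
    /// Assumption: a semi-supernode custodies half of all custody groups.
    pub fn get_custody_count_override(&self, spec: &Spec) -> Option<u64> {
        match self {
            Self::Fullnode => None,
            Self::SemiSupernode => Some(spec.number_of_custody_groups / 2),
            Self::Supernode => Some(spec.number_of_custody_groups),
        }
    }
}

/// Initial `validator_custody_count`: overridden only when a flag is set,
/// otherwise left at the existing default of 0.
pub fn initial_validator_custody_count(node_custody_type: NodeCustodyType, spec: &Spec) -> u64 {
    node_custody_type
        .get_custody_count_override(spec)
        .unwrap_or(0)
}
```

Returning `Option<u64>` rather than a plain count keeps the "no flag given" case distinct from an explicit override, which is what lets a full node without validators keep its value at 0.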

All existing validator custody unit tests still pass, and I've added tests covering semi-supernode and restoring CustodyContext from disk.

Note: I've added a WARN log for when the user attempts to switch an existing node to --semi-supernode or --supernode. The switch currently has no effect, but once @eserilev's column backfill is merged, we should be able to support it quite easily.
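
To illustrate the behaviour in the note, here is a hedged sketch of restoring from disk. The function name, the `RestoreOutcome` type, and the exact comparison are hypothetical; the real code logs through the node's logger rather than returning a string. The only behaviour taken from this PR is that a persisted value is kept and an attempted switch produces a warning with no other effect.

```rust
/// Result of restoring the custody count from disk (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct RestoreOutcome {
    /// The count the node keeps using after restart.
    pub validator_custody_count: u64,
    /// Warning text to log, if the CLI asked for a higher count.
    pub warning: Option<String>,
}

/// Keep the persisted count. If the CLI override asks for more than what was
/// persisted, surface a warning instead of changing the value.
pub fn restore_validator_custody_count(
    persisted: u64,
    cgc_override: Option<u64>,
) -> RestoreOutcome {
    match cgc_override {
        Some(cgc_from_cli) if cgc_from_cli > persisted => RestoreOutcome {
            validator_custody_count: persisted,
            warning: Some(format!(
                "Changing custody type on an existing node has no effect yet \
                 (requested cgc {cgc_from_cli}, keeping {persisted})"
            )),
        },
        // No override, or an override that does not exceed the persisted value.
        _ => RestoreOutcome {
            validator_custody_count: persisted,
            warning: None,
        },
    }
}
```

For example, a node first started as a full node (persisted value 0) and restarted with a supernode override of 128 would keep 0 and emit the warning.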

Things to test

  • cgc in metadata / enr
  • cgc in metrics
  • subscribed subnets
  • getBlobs endpoint

@jimmygchen jimmygchen requested a review from jxs as a code owner October 21, 2025 05:27
Co-authored-by: pawanjay176 <[email protected]>
@jimmygchen jimmygchen force-pushed the semi-supernode-simple branch 2 times, most recently from 0ffed11 to a8580d0 Compare October 21, 2025 05:36
@jimmygchen jimmygchen force-pushed the semi-supernode-simple branch from a8580d0 to 0932f36 Compare October 21, 2025 05:39
@jimmygchen jimmygchen added ready-for-review The code is ready for review v8.0.0 Q4 2025 Fusaka Mainnet Release labels Oct 21, 2025
jimmygchen added a commit that referenced this pull request Oct 21, 2025
Squashed commit of the following:

commit 7767a2e
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 16:47:40 2025 +1100

    More test fixes and update help text.

commit 0932f36
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 16:14:46 2025 +1100

    Add tests for restoring custody context from persisted.

commit 5cc3186
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 16:07:47 2025 +1100

    Implement semi-supernode.

    Co-authored-by: pawanjay176 <[email protected]>
@jimmygchen jimmygchen mentioned this pull request Oct 21, 2025
@mergify mergify bot added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Oct 21, 2025
… validator should have `validator_custody_at_head` set to 0.
@jimmygchen jimmygchen added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Oct 21, 2025
let is_semi_supernode = parse_flag(cli_args, "semi-supernode");

client_config.chain.node_custody_type = if is_supernode {
client_config.network.subscribe_all_data_column_subnets = true;
Member Author

@jimmygchen jimmygchen Oct 21, 2025

I've just noticed that subscribe_all_data_column_subnets is redundant in the network config, because NetworkGlobals already has the sampling columns when initialised. I can remove it in a follow-up PR.

Member Author

Seems pretty straightforward to remove: jimmygchen@e623eb6

jimmygchen added a commit that referenced this pull request Oct 21, 2025
Squashed commit of the following:

commit 45a4315
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 23:15:58 2025 +1100

    Add tests for flags

commit 15569bc
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 22:40:49 2025 +1100

    Revert some changes to preserve existing behaviour. Full node with no validator should have `validator_custody_at_head` set to 0.

commit 7767a2e
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 16:47:40 2025 +1100

    More test fixes and update help text.

commit 0932f36
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 16:14:46 2025 +1100

    Add tests for restoring custody context from persisted.

commit 5cc3186
Author: Jimmy Chen <[email protected]>
Date:   Tue Oct 21 16:07:47 2025 +1100

    Implement semi-supernode.

    Co-authored-by: pawanjay176 <[email protected]>
@jimmygchen
Member Author

Some devnet-3 testing:

  • All three types of nodes able to sync to head
  • Sampling subnets correctly set
  • Gossip topics correctly subscribed
  • Able to fetch blobs from supernode and semi-supernode

@sigp sigp deleted a comment from mergify bot Oct 21, 2025
@mergify mergify bot added waiting-on-author The reviewer has suggested changes and awaits thier implementation. and removed ready-for-review The code is ready for review labels Oct 21, 2025
@jimmygchen jimmygchen added ready-for-review The code is ready for review and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Oct 21, 2025
@mergify

mergify bot commented Oct 21, 2025

Some required checks have failed. Could you please take a look @jimmygchen? 🙏

@mergify mergify bot added the waiting-on-author The reviewer has suggested changes and awaits thier implementation. label Oct 21, 2025
Member

@pawanjay176 pawanjay176 left a comment

This is a neat idea. The changes are minimal, so we can be confident that it doesn't mess with anything before the release. I have also tested this on devnet-3 and am convinced it works as advertised.
Just a few questions:

spec: &ChainSpec,
) -> Self {
let cgc_override = node_custody_type.get_custody_count_override(spec);
if let Some(cgc_from_cli) = cgc_override
Member

I wonder if we should just log a warn or explicitly refuse to start here and ask the user to resync if they want to change the mode.

Member Author

As discussed earlier, we'll trigger backfill in this case for better UX. Mind if I do it in a separate PR?

Member

Sounds good.

@jimmygchen
Member Author

Ah, I need to pull unstable. The tests are failing because of a new change from unstable.

@jimmygchen jimmygchen added ready-for-merge This PR is ready to merge. and removed waiting-on-author The reviewer has suggested changes and awaits thier implementation. labels Oct 22, 2025
@mergify mergify bot added the queued label Oct 22, 2025
@mergify

mergify bot commented Oct 22, 2025

This pull request has been removed from the queue for the following reason: conflict with pull request ahead.

The pull request conflicts with at least one pull request ahead in queue.

There is nothing you can do for now. If the pull request ahead in the queue is merged, this pull request will become conflicting and you'll have to update it.
If the pull request ahead is not merged, you can requeue this pull request with a @mergifyio requeue comment.

@michaelsproul
Member

Are we waiting for custody backfill to merge before merging this? Don't we want to add the feature to start custody backfill when the node changes to a semi-supernode/supernode?

@eserilev
Member

If a node's cgc increases on startup, we just need two things for custody backfill to do its thing:

  • data column custody info needs to be updated to reflect the cgc change
  • CustodyContext::validator_registrations::epoch_validator_custody_requirements needs to be updated to reflect the cgc change
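
As a rough illustration of the second point, the sketch below models a per-epoch record of custody requirements that is updated when the cgc increases. It is not the real `epoch_validator_custody_requirements` structure: the type name, the use of a `BTreeMap` keyed by plain `u64` epochs, and both method names are assumptions made for this example.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-epoch record of the custody group count (cgc).
/// Epochs are assumed to be registered in non-decreasing order.
#[derive(Default)]
pub struct EpochCustodyRequirements {
    by_epoch: BTreeMap<u64, u64>,
}

impl EpochCustodyRequirements {
    /// Record that `cgc` applies from `effective_epoch` onwards.
    /// Returns `false` and records nothing if `cgc` is not an increase.
    pub fn register_increase(&mut self, effective_epoch: u64, cgc: u64) -> bool {
        // The entry with the highest epoch holds the latest requirement.
        let latest = self.by_epoch.values().next_back().copied().unwrap_or(0);
        if cgc <= latest {
            return false;
        }
        self.by_epoch.insert(effective_epoch, cgc);
        true
    }

    /// The cgc in force at `epoch`, or `None` if nothing was registered yet.
    pub fn requirement_at(&self, epoch: u64) -> Option<u64> {
        self.by_epoch
            .range(..=epoch)
            .next_back()
            .map(|(_, cgc)| *cgc)
    }
}
```

With a record like this, a backfill process could ask which cgc applied at a given epoch and fetch the columns that are missing for the epochs before the increase took effect.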

@jimmygchen
Member Author

@mergify requeue

@mergify

mergify bot commented Oct 22, 2025

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@jimmygchen jimmygchen added the ready-for-merge This PR is ready to merge. label Oct 22, 2025
@mergify mergify bot added queued and removed dequeued labels Oct 22, 2025
@jimmygchen
Member Author

> Are we waiting for custody backfill to merge before merging this? Don't we want to add the feature to start custody backfill when the node changes to a semi-supernode/supernode?

Oh yeah, I've discussed this with Pawan and I'm going to make a follow-up PR so it's a bit easier to review:
#8254 (comment)

mergify bot added a commit that referenced this pull request Oct 22, 2025
@mergify mergify bot merged commit 43c5e92 into sigp:unstable Oct 22, 2025
37 checks passed
@mergify mergify bot removed the queued label Oct 22, 2025
mergify bot pushed a commit that referenced this pull request Oct 23, 2025
Open PRs to include for the release
- #7907
- #8247
- #8251
- #8253
- #8254
- #8265
- #8269
- #8266


  


Co-Authored-By: Jimmy Chen <[email protected]>


Labels

ready-for-merge This PR is ready to merge. v8.0.0 Q4 2025 Fusaka Mainnet Release


4 participants