Commit

Merge branch 'main' into feat-fixed-celestiaorg#3078
abhirajprasad authored Jan 15, 2025
2 parents eb069b1 + 9d906c7 commit 2b92ab6
Showing 22 changed files with 65 additions and 52 deletions.
1 change: 1 addition & 0 deletions Makefile
@@ -334,6 +334,7 @@ mptcp-disable: disable-mptcp
CONFIG_FILE ?= ${HOME}/.celestia-app/config/config.toml
SEND_RECV_RATE ?= 10485760 # 10 MiB

## configure-v3: Modifies config file in-place to conform to v3.x recommendations.
configure-v3:
@echo "Using config file at: $(CONFIG_FILE)"
@if [ "$$(uname)" = "Darwin" ]; then \
5 changes: 4 additions & 1 deletion app/default_overrides.go
@@ -90,7 +90,7 @@ func (stakingModule) DefaultGenesis(cdc codec.JSONCodec) json.RawMessage {
})
}

-// stakingModule wraps the x/staking module in order to overwrite specific
+// slashingModule wraps the x/slashing module in order to overwrite specific
// ModuleManager APIs.
type slashingModule struct {
slashing.AppModuleBasic
@@ -294,5 +294,8 @@ func DefaultAppConfig() *serverconfig.Config {
cfg.StateSync.SnapshotInterval = 1500
cfg.StateSync.SnapshotKeepRecent = 2
cfg.MinGasPrices = fmt.Sprintf("%v%s", appconsts.DefaultMinGasPrice, BondDenom)

+const mebibyte = 1048576
+cfg.GRPC.MaxRecvMsgSize = 20 * mebibyte
return cfg
}
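
The byte values behind these defaults can be checked with a few lines of Go. This is only a sketch: the helper `mib` is a hypothetical name introduced for illustration and does not appear in the repository. It shows that 10 MiB matches the `SEND_RECV_RATE` value of 10485760 in the Makefile, and that the new gRPC receive limit of 20 MiB is 20971520 bytes.

```go
package main

import "fmt"

// mebibyte is the number of bytes in one MiB (2^20), as in the diff above.
const mebibyte = 1048576

// mib converts a whole number of MiB to bytes.
// Hypothetical helper, for illustration only.
func mib(n int) int { return n * mebibyte }

func main() {
	fmt.Println(mib(10)) // p2p send and receive rate
	fmt.Println(mib(20)) // gRPC max receive message size
}
```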
8 changes: 8 additions & 0 deletions app/default_overrides_test.go
@@ -64,6 +64,9 @@ func TestDefaultAppConfig(t *testing.T) {
assert.Equal(t, uint64(1500), cfg.StateSync.SnapshotInterval)
assert.Equal(t, uint32(2), cfg.StateSync.SnapshotKeepRecent)
assert.Equal(t, "0.002utia", cfg.MinGasPrices)

+mebibyte := 1048576
+assert.Equal(t, 20*mebibyte, cfg.GRPC.MaxRecvMsgSize)
}

func TestDefaultConsensusConfig(t *testing.T) {
@@ -89,6 +92,11 @@ func TestDefaultConsensusConfig(t *testing.T) {
}
assert.Equal(t, want, *got.Mempool)
})
+t.Run("p2p overrides", func(t *testing.T) {
+const mebibyte = 1048576
+assert.Equal(t, int64(10*mebibyte), got.P2P.SendRate)
+assert.Equal(t, int64(10*mebibyte), got.P2P.RecvRate)
+})
}

func Test_icaDefaultGenesis(t *testing.T) {
2 changes: 1 addition & 1 deletion docs/architecture/adr-001-abci++-adoption.md
@@ -237,7 +237,7 @@ func SplitShares(txConf client.TxConfig, squareSize uint64, data *core.Data) ([]
for _, rawTx := range data.Txs {
... // decode the transaction

-// write the tx to the square if it normal
+// write the tx to the square if it is normal
if !hasWirePayForBlob(authTx) {
success, err := sqwr.writeTx(rawTx)
if err != nil {
2 changes: 1 addition & 1 deletion docs/architecture/adr-004-qgb-relayer-security.md
@@ -18,7 +18,7 @@ In fact, the QGB smart contract is designed to update the data commitments as fo
- Check if the data commitment is signed using the current valset _(this is the problematic check)_
- Then, other checks + commit

-So, if a relayer is up to date, it will submit data commitment and will pass the above checks.
+So, if a relayer is up-to-date, it will submit data commitment and will pass the above checks.

Now, if the relayer is missing some data commitments or valset updates, then it will start catching up the following way:

@@ -64,7 +64,7 @@ The commitment is still the same but we need to use the bottom subtree roots for

Given a square size k, the biggest message that you can construct that is affected by the proposed non-interactive default rules has a size (k/2)². If you construct a message that is bigger than (k/2)² the `minSquareSize` will be k. If the minSquareSize is k in a square of size k then the current non-interactive default rules are equivalent to the proposed non-interactive default rules, because the message starts always at the beginning of a row. In other words, if you have k² shares in a message the worst constructible message is a quarter of that k²/4, because that is the size of the next smaller square.

-If you choose k²/4 as the worst constructible message it would still have O(sqrt(n)) subtree roots. This is because the size of the message is k²/4 with a width of k and a length of k/4. This means the number of rows the message fills approaches O(sqrt(n)). Therefore we need to find a message where the number of rows is log(n) of the size of the message.
+If you choose k²/4 as the worst constructible message it would still have O(sqrt(n)) subtree roots. This is because the size of the message is k²/4 with a width of k and a length of k/4. This means the number of rows the message fills approaches O(sqrt(n)). Therefore, we need to find a message where the number of rows is log(n) of the size of the message.

With k being the square size and n being the number of shares and r being the number of rows, we want to find a message so that:
k * r = n & log(n) = r => k = n/log(n)
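
Written out, the condition above substitutes the second constraint into the first. This is only a restatement of the line above; the base of the logarithm is not fixed by the text and affects constants only.

```latex
k \cdot r = n, \qquad r = \log(n) \quad\Longrightarrow\quad k = \frac{n}{\log(n)}
```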
@@ -179,7 +179,7 @@ Light Nodes have additional access to row and column roots from the Data Availab

### Total Proof Size for Partial Nodes

-Partial nodes in this context are light clients that may download all of the data in the reserved namespace. They check that the data behind the PFB was included in the `DataRoot`, via blob inclusion proofs.
+Partial nodes in this context are light clients that may download all the data in the reserved namespace. They check that the data behind the PFB was included in the `DataRoot`, via blob inclusion proofs.

For this analysis, we take the result from the light nodes and scale them up to fill the whole square. We ignore for now the reserved namespace and what space it might occupy.
For the proposed non-interactive default rules we are also creating 1 more message that could practically fit into a square. This is because the current non-interactive default rules fit one more message if we construct it this way and don't adjust the first and last messages.
2 changes: 1 addition & 1 deletion docs/architecture/adr-010-remove-wire-msg-pay-for-blob.md
@@ -130,4 +130,4 @@ Consider an incremental approach for this and related changes:
## References
-- [ADR 080: square size independent message commitments](./adr-008-square-size-independent-message-commitments.md)
+- [ADR 008: square size independent message commitments](./adr-008-square-size-independent-message-commitments.md)
@@ -78,7 +78,7 @@ Note: the blue nodes are additional nodes that are needed for the Merkle proofs.

![PFB Merkle Proof](./assets/adr011/pfd-merkle-proof.png)

-Let's assume a square size of k. The amount of blue nodes from the shares to ROW1 is O(log(k)). The amount of blue nodes from ROW1 to the `DataRoot` is also O(log(k). You will have to include the shares themselves in the proof.
+Let's assume a square size of k. The amount of blue nodes from the shares to ROW1 is O(log(k)). The amount of blue nodes from ROW1 to the `DataRoot` is also O(log(k)). You will have to include the shares themselves in the proof.
Share size := 512 bytes
NMT-Node size := 32 bytes + 2\*8 bytes = 48 bytes
MT-Node size := 32 bytes
8 changes: 4 additions & 4 deletions docs/architecture/adr-018-network-upgrades.md
@@ -39,9 +39,9 @@ All upgrades (barring social hard forks) are to be rolling upgrades. That is nod

## Detailed Design

-The design depends on a versioned state machine whereby the app version displayed in each block and agreed upon by all validators is the version that the transactions are both validated and executed against. If the celestia state machine is given a block at version 1 it will execute it with the v1 state machine if consensus provides a v2 block, all the transactions will be executed against the v2 state machine.
+The design depends on a versioned state machine whereby the app version displayed in each block and agreed upon by all validators is the version that the transactions are both validated and executed against. If the celestia state machine is given a block at version 1, it will execute it with the v1 state machine if consensus provides a v2 block, all the transactions will be executed against the v2 state machine.

-Given this, a node can at any time spin up a v2 binary which will immediately be able to continue validating and executing v1 blocks as if it were a v1 machine.
+Given this, a node can at any time spin up a v2 binary, which will immediately be able to continue validating and executing v1 blocks as if it were a v1 machine.

### Configured Upgrade Height

@@ -51,12 +51,12 @@ The height of the v1 -> v2 upgrade will initially be supplied via CLI flag (i.e.
- Given the uncertainty in scheduling, the system must be able to handle changes to the upgrade height that most commonly would come in the form of delays. Embedding the upgrade schedule in the binary is convenient for node operators and avoids the possibility for user errors. However, binaries are static. If the community wished to push back the upgrade by two weeks there is the possibility that some nodes would not rerun the new binary thus we'd get a split between nodes running the old schedule and nodes running the new schedule. To overcome this, proposers will only propose a version change in the first round of each height, thus allowing transactions to still be committed even under circumstances where there is no consensus on upgrading. Secondly, we define a range in which nodes will attempt to upgrade the app version and failing this will continue to run the current version. Lastly, the binary will have the ability to manually specify the app version height mapping and override the built-in values either through a flag or in the `app.toml` config. This is expected to be used in testing and in emergency situations only. Another example to keep in mind is if a quorum outright rejects an upgrade. If some of the validators are for the change they should have some way to continue participating in the network. Therefore we employ a range that nodes will attempt to upgrade and afterwards will continue on normally with the new binary however running the older version.
- The system needs to be tolerant of unexpected faults in the upgrade process. This can be:
- The community/contributors realize there is a bug in the new version after the binary has been released. Node operators will need to downgrade back to the previous version and restart their node.
-- There is a halting bug in the migration or in processing of the first transactions. This most likely would be in the form of an apphash mismatch. This becomes more problematic with delayed execution as the block (with v2 transactions) has already been committed. Immediate execution has the advantage of the apphash mismatch being realised before the data is committed. It's still however feasible to over come this but it involves nodes rolling back the previous state and re-executing the transactions using the v1 state machine (which will skip over the v2 transactions). This means node operators should be able to manually override the app version that the proposer will propose with. Lastly, if state migrations occurred between v2 and v1, a reverse migration would need to be performed which would make things especially difficult. If we are unable to fallback to the previous version and continue then the other option is to remain halted until the bug is patched and the network can update and continue
+- There is a halting bug in the migration or in processing of the first transactions. This most likely would be in the form of an apphash mismatch. This becomes more problematic with delayed execution as the block (with v2 transactions) has already been committed. Immediate execution has the advantage of the apphash mismatch being realized before the data is committed. It's still however feasible to overcome this but it involves nodes rolling back the previous state and re-executing the transactions using the v1 state machine (which will skip over the v2 transactions). This means node operators should be able to manually override the app version that the proposer will propose with. Lastly, if state migrations occurred between v2 and v1, a reverse migration would need to be performed which would make things especially difficult. If we are unable to fallback to the previous version and continue then the other option is to remain halted until the bug is patched and the network can update and continue
- There is a bug that is detected that could halt the chain but hasn't yet. There are other things we can develop to combat such scenarios. One thing we can do is develop a circuit breaker similar to the designs proposed in [Cosmos SDK](https://github.com/cosmos/cosmos-sdk/tree/main/x/circuit). This can disable certain message types or modules either in `CheckTx` or `ProcessProposal`. This violates the consistency property between `PrepareProposal` and `ProcessProposal` but so long as a quorum are the same, will still allow the chain to progress (inconsistency here can be interpreted as byzantine).

### Future Work: Signaled Upgrade Height

-Preconfigured upgrade paths are vulnerable to halts. There is no indication that a quorum has in fact upgraded and that when the proposer proposes a block with the message to change version, that consensus will be reached. To mitigate this risk, the upgrade height can instead be signaled by validators. A version of `VoteExtension`s may be the most effective at ensuring this. Validators upon start up will automatically signal a version upgrade when they go to vote (i.e. `ExtendedVote`) so long as the latest supported version differs from the current network version. In `VerifyVoteExtension`, the version will be parsed and persisted (although not part of state). There is no verification. Upon a certain threshold which must be at least 2/3+ but could possibly be greater, the next proposer, who can support this version will propose a block with the `MsgVersionChange` that the quorum have agreed to. The rest works as before.
+Preconfigured upgrade paths are vulnerable to halts. There is no indication that a quorum has in fact upgraded and that when the proposer proposes a block with the message to change version, that consensus will be reached. To mitigate this risk, the upgrade height can instead be signaled by validators. A version of `VoteExtension`s may be the most effective at ensuring this. Validators upon start-up will automatically signal a version upgrade when they go to vote (i.e. `ExtendedVote`), so long as the latest supported version differs from the current network version. In `VerifyVoteExtension`, the version will be parsed and persisted (although not part of state). There is no verification. Upon a certain threshold which must be at least 2/3+ but could possibly be greater, the next proposer, who can support this version will propose a block with the `MsgVersionChange` that the quorum have agreed to. The rest works as before.
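
The threshold check mentioned above could be sketched as follows. This assumes "2/3+" means strictly more than two thirds of the total voting power, which is an interpretation rather than something the ADR spells out. The function name `hasUpgradeQuorum` and the use of plain `int64` voting power are hypothetical.

```go
package main

import "fmt"

// hasUpgradeQuorum reports whether the voting power that signaled a version
// strictly exceeds 2/3 of the total voting power.
// Assumption: "2/3+" is read as "more than two thirds"; a higher threshold
// would need a different comparison. Integer arithmetic avoids rounding,
// but the inputs must be small enough that 3*signaled does not overflow.
func hasUpgradeQuorum(signaled, total int64) bool {
	if total <= 0 || signaled < 0 {
		return false
	}
	return 3*signaled > 2*total
}

func main() {
	fmt.Println(hasUpgradeQuorum(66, 100))
	fmt.Println(hasUpgradeQuorum(67, 100))
}
```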

For better performance, `VoteExtensions` should be modified such that empty messages don't require a signature (which is currently the case for v0.38 of [CometBFT](https://github.com/cometbft/cometbft/blob/91ffbf9e45afb49d34a4af91b031e14653ee5bd8/privval/file.go#L324))

4 changes: 2 additions & 2 deletions docs/maintainers/docker.md
@@ -2,15 +2,15 @@

## Context

-Github Actions should automatically build and publish a Docker image for each release. If Github Actions failed, you can manually build and publish a Docker image using this guide.
+GitHub Actions should automatically build and publish a Docker image for each release. If GitHub Actions failed, you can manually build and publish a Docker image using this guide.

## Prerequisites

1. Navigate to <https://github.com/settings/tokens> and generate a new token with the `write:packages` scope.

## Steps

-1. Verify that a Docker image with the correct tag doesn't already exist for the release you're trying to create publish on [GHCR](https://github.com/celestiaorg/celestia-app/pkgs/container/celestia-app/versions)
+1. Verify that a Docker image with the correct tag doesn't already exist for the release you're trying to publish on [GHCR](https://github.com/celestiaorg/celestia-app/pkgs/container/celestia-app/versions)

1. In a new terminal

10 changes: 5 additions & 5 deletions docs/release-notes/release-notes.md
@@ -9,7 +9,7 @@ This guide provides notes for major version releases. These notes may be helpful
#### Enabling BBR and MCTCP

Consensus node operators must enable the BBR (Bottleneck Bandwidth and Round-trip propagation time) congestion control algorithm. See [#3774](https://github.com/celestiaorg/celestia-app/pull/3774).
-if using linux in docker, kubernetes, a vm or baremetal, this can be done by calling
+If using Linux in Docker, Kubernetes, a VM or bare-metal, this can be done by calling

```sh
make enable-bbr
@@ -31,18 +31,18 @@ If the config file is not in the default spot, it can be provided using:
make configure-v3 CONFIG_FILE=path/to/other/config.toml
```

-**Alternatively**, the configurations can be changed manually. This involves updating the mempool TTLs and the send and the receive rates.
+**Alternatively**, the configurations can be changed manually. This involves updating the mempool TTLs and the send and receive rates.

- Configuring Bandwidth Settings
-- update `recv_rate` and `send_rate` in your TOML config file to 10MiB (10485760).
+- Update `recv_rate` and `send_rate` in your TOML config file to 10MiB (10485760)
- Extend TTLs
-- update `ttl-num-blocks` in your TOML config file to 12.
+- Update `ttl-num-blocks` in your TOML config file to 12

#### Signaling Upgrades

- Upgrades now use the `x/signal` module to coordinate the network to an upgrade height.

-The following command can be used, if you are a validator in the active set, to signal to upgrade to v3
+The following command can be used, if you are a validator in the active set, to signal to upgrade to v3:

```bash
celestia-appd tx signal signal 3 <plus transaction flags>
11 changes: 6 additions & 5 deletions go.mod
@@ -4,7 +4,7 @@ go 1.23.1

require (
cosmossdk.io/errors v1.0.1
-cosmossdk.io/math v1.4.0
+cosmossdk.io/math v1.5.0
github.com/celestiaorg/blobstream-contracts/v3 v3.1.0
github.com/celestiaorg/go-square v1.1.1
github.com/celestiaorg/go-square/v2 v2.1.0
@@ -33,8 +33,8 @@ require (
github.com/tendermint/tm-db v0.6.7
golang.org/x/exp v0.0.0-20240904232852-e7e105dedf7e
google.golang.org/genproto/googleapis/api v0.0.0-20241015192408-796eee8c2d53
-google.golang.org/grpc v1.69.2
-google.golang.org/protobuf v1.36.1
+google.golang.org/grpc v1.69.4
+google.golang.org/protobuf v1.36.2
gopkg.in/yaml.v2 v2.4.0
k8s.io/apimachinery v0.32.0
)
@@ -68,6 +68,7 @@ require (
github.com/chzyer/readline v1.5.1 // indirect
github.com/cilium/ebpf v0.12.3 // indirect
github.com/cockroachdb/apd/v2 v2.0.2 // indirect
+github.com/cockroachdb/apd/v3 v3.2.1 // indirect
github.com/cockroachdb/errors v1.11.3 // indirect
github.com/cockroachdb/fifo v0.0.0-20240606204812-0bbfbd93a7ce // indirect
github.com/cockroachdb/logtags v0.0.0-20230118201751-21c54148d20b // indirect
@@ -253,8 +254,8 @@ require (
)

replace (
-github.com/cosmos/cosmos-sdk => github.com/celestiaorg/cosmos-sdk v1.26.0-sdk-v0.46.16
+github.com/cosmos/cosmos-sdk => github.com/celestiaorg/cosmos-sdk v1.26.1-sdk-v0.46.16
github.com/gogo/protobuf => github.com/regen-network/protobuf v1.3.3-alpha.regen.1
github.com/syndtr/goleveldb => github.com/syndtr/goleveldb v1.0.1-0.20210819022825-2ae1ddf74ef7
-github.com/tendermint/tendermint => github.com/celestiaorg/celestia-core v1.44.1-tm-v0.34.35
+github.com/tendermint/tendermint => github.com/celestiaorg/celestia-core v1.44.2-tm-v0.34.35
)