Commit

revised architecture changes
skundu42 committed Jul 2, 2024
1 parent 3dfd300 commit 5b1bd8e
Showing 15 changed files with 58 additions and 66 deletions.
37 changes: 17 additions & 20 deletions content/docs/architecture/cyclic-operations.mdx
@@ -9,7 +9,7 @@ Cyclic operations are a crucial component of the Shardeum blockchain network, en

### 2. Node List and Cycle Chain

In Shardeum, maintaining an up-to-date and consistent node list is fundamental. Each node keeps a list of every other node, known as the **node list**, and a history of changes made to this list, referred to as the **cycle chain**. Nodes suggest changes to the node list through signed messages called updates. These updates are collected and processed within the cycles.
In Shardeum, maintaining an up-to-date and consistent node list is fundamental. Each node keeps a list of every other node, known as the **node list**, and a history of changes made to this list, referred to as the **cycle chain**. Nodes suggest changes to the node list through signed messages called cycle transactions. These changes are collected and processed within the cycles.

However, nodes only keep a limited history of cycle records at the validator level. When a node syncs with the network, it first receives a snapshot of each list. Following this, it uses the cycle process to update the list. Nodes do not maintain a full history because the sync-to-snapshot process suffices.
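
The snapshot-then-update flow described above can be sketched as follows. This is a minimal illustration, not the shardus-core implementation: the `NodeInfo` and `CycleRecord` shapes, their field names, and the `syncFromSnapshot()` helper are all assumptions made for this example.

```typescript
// Hypothetical shapes; the real cycle records carry many more fields.
interface NodeInfo {
  id: string;
  status: "syncing" | "active";
}

interface CycleRecord {
  counter: number;     // cycle number
  joined: NodeInfo[];  // nodes added to the node list in this cycle
  activated: string[]; // ids of nodes that became active in this cycle
  removed: string[];   // ids of nodes removed in this cycle
}

type NodeList = Record<string, NodeInfo>;

// Apply the changes recorded in one cycle to the node list (mutates the list).
function applyCycle(nodeList: NodeList, record: CycleRecord): void {
  for (const node of record.joined) {
    nodeList[node.id] = { ...node };
  }
  for (const id of record.activated) {
    const node = nodeList[id];
    if (node) node.status = "active";
  }
  for (const id of record.removed) {
    delete nodeList[id];
  }
}

// Start from a snapshot taken at `snapshotCounter`, then replay only the
// cycle records that are newer than the snapshot, in cycle order.
function syncFromSnapshot(
  snapshot: NodeInfo[],
  snapshotCounter: number,
  records: CycleRecord[]
): NodeList {
  const nodeList: NodeList = {};
  for (const node of snapshot) {
    nodeList[node.id] = { ...node };
  }
  const newer = records
    .filter((r) => r.counter > snapshotCounter)
    .sort((a, b) => a.counter - b.counter);
  for (const record of newer) {
    applyCycle(nodeList, record);
  }
  return nodeList;
}
```

Because the snapshot already reflects every earlier cycle, records at or before the snapshot's cycle number are skipped, which is why a full history is not needed.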

@@ -31,41 +31,38 @@ This process ensures the network remains consistent and synchronized, handling t

### 4. Phases of the Cycle

* **Quarter 1: Update Phase Start**
**Quarter 1: Update Phase Start**
* **Function:** [`runQ1()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/CycleCreator.ts#L496)
* **Detailed Operation:**
* Nodes use the `runQ1()` function to start collecting updates. This phase allows nodes to gather all the proposed changes, ensuring they have a comprehensive view of the network's current state.
* Nodes collect these updates for a specified duration, ensuring that all potential changes are captured for the current cycle.

Detailed Operation:
* Nodes use the `runQ1()` and `runQ2()` functions to gather the proposed changes, ensuring they have a comprehensive view of the network's current state.
* [`sendRequests()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/Lost.ts#L582) in `p2p/Lost.ts` is invoked:
* **Lost Node Reporting:** It checks and reports nodes that are deemed lost based on certain conditions, using aggregated data collected until the first quarter (Q1).
* **Scheduled Removals:** It handles and processes scheduled node removals, ensuring nodes are removed as necessary and deletes them from the tracking list.
* **Lost Node Reporting:** It tells the checker nodes to check on some nodes that were unresponsive.
* **Scheduled Removals:** It handles and processes scheduled node removals, ensuring nodes are removed as necessary and deletes them from the tracking list.
* **Gossiping Status Updates:** It gossips information about nodes that have been verified as down to other nodes in the network, and manages self-refutation messages to update the network about nodes that are back online.
* [`sendRequests()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/Apoptosis.ts#L296) from `p2p/Apoptosis.ts` is invoked:
* When a node decides to exit the network, it sends an Apoptosis message to around three other active nodes to notify them of its departure, allowing for immediate removal from their node lists. This message, which can be sent at any time, is verified and stored to be gossiped in the next quarter 1, ensuring the exiting node is quickly removed from the network's node list without waiting for the usual discovery process.
* [`sendRequests()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/Active.ts#L262) from `p2p/Active.ts` is invoked:
* This function handles the process of sending queued requests for a node to become active in the network. If a request is queued and the node is not restricted from becoming active, it signs the request and attempts to add it to the list of active transactions. It then gossips the signed request to other nodes. If the node does not become active within the expected cycle duration, it retries the request. If the node goes active, it stops the retry process.
* [`sendRequests()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/LostArchivers/index.ts#L199) from `p2p/LostArchivers/index.ts` is invoked:
* This function manages the `lostArchiversMap`. It logs the current state of the map for debugging, then iterates through each entry to handle different statuses. For 'reported' entries, it creates and sends an investigation message and removes the entry. For 'down' statuses that haven't been gossiped, it creates and gossips a down message, marking it as gossiped. Similarly, for 'up' statuses, it creates and gossips an up message, marking it as gossiped. This ensures accurate and timely updates about the status of archivers in the network.
* [`sendRequests()`](https://github.com/shardeum/shardus-core/blob/0106966cd63fa7836f9d02ada452094d7ea84974/src/p2p/Join/index.ts#L604) from `p2p/Join/index.ts` is invoked:
* This function processes queued syncing and standby refresh requests by signing and validating them, and then gossiping the relevant messages to the network.
* Config queue changes and debug logic updates are handled in a [listener](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/shardus/index.ts#L934-L947) inside the `shardus.ts` [`start()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/shardus/index.ts#L435) function.
* Changes in network coverage are calculated, and summaries of previous cycles are processed, by a [listener](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/state-manager/index.ts#L3703-L3740) in `state-manager.ts` `startShardCalculations()`.
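
The handling of `lostArchiversMap` entries described in the Quarter 1 steps can be sketched as below. This is an illustration under stated assumptions, not the code in `p2p/LostArchivers/index.ts`: the entry shape, the `OutgoingMessage` type, and the `processLostArchivers()` helper are hypothetical, and the function returns the messages instead of signing and gossiping them.

```typescript
type ArchiverStatus = "reported" | "down" | "up";

// Hypothetical entry shape, keyed by the archiver's public key.
interface LostArchiverEntry {
  status: ArchiverStatus;
  gossiped: boolean;
}

// Hypothetical message shape; real messages are signed before sending.
interface OutgoingMessage {
  kind: "investigate" | "down" | "up";
  target: string;
}

// Walk every entry once and decide which message, if any, to send for it.
function processLostArchivers(
  entries: Record<string, LostArchiverEntry>
): OutgoingMessage[] {
  const outgoing: OutgoingMessage[] = [];
  for (const target of Object.keys(entries)) {
    const entry = entries[target];
    const status = entry.status;
    if (status === "reported") {
      // Ask an investigator to check the archiver, then drop the entry.
      outgoing.push({ kind: "investigate", target });
      delete entries[target];
    } else if (!entry.gossiped) {
      // "down" and "up" statuses are gossiped once, then marked as gossiped.
      outgoing.push({ kind: status, target });
      entry.gossiped = true;
    }
  }
  return outgoing;
}
```

Marking an entry as gossiped makes the function idempotent across cycles: calling it again without new reports produces no further messages.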
* **Quarter 2: Update Phase End**

**Quarter 2: Update Phase End**
* **Function:** [`runQ2()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/CycleCreator.ts#L526)
* **Detailed Operation:**

Detailed Operation:
* The `runQ2()` function marks the end of this collection period, consolidating the updates for synchronization.
* Node selection is triggered by [`executeNodeSelection()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/Join/v2/select.ts#L28-L38).
* **Quarter 3: Cycle Sync Start**
* **Function:** [`runQ3()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/CycleCreator.ts#L561)
* **Detailed Operation:**

Detailed Operation:
* During `runQ3()`, nodes compare their collected updates by exchanging hashes (`cycle_tx_hash`). This step ensures all nodes have a consistent view of the updates.
* Nodes validate these updates by checking the `cycle_tx_hash` and creating a cycle certificate based on the highest value votes from other nodes. This certificate ensures that only valid updates are applied.
* Nodes gossip their certificates to ensure the network agrees on the cycle's contents.
* Cycle data is [cleaned up](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/state-manager/index.ts#L3742-L3770) every 5 cycles.
* **Quarter 4: Cycle Finalization**
* **Function:** [`runQ4()`](https://github.com/shardeum/shardus-core/blob/e8c14ce4ba19785145646e840082acb57ec8ce3b/src/p2p/CycleCreator.ts#L634)
* **Detailed Operation:**
* The `runQ4()` function handles the final certification of the updates. Nodes certify the agreed updates, creating a cycle marker and certificate to log the cycle's changes.
* This phase ensures the network's integrity by preventing double voting and other malicious activities through a robust consensus mechanism.
* Nodes prune the cycle chain to keep it within a manageable size, typically retaining a set number of recent cycles.

Detailed Operation:
* The `runQ4()` function involves comparing cycle certificates with other nodes until the best one is identified.
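
The idea of comparing certificates until the best one is found can be sketched as follows. The `CycleCert` shape, the numeric `score`, and the tie-break on the marker string are assumptions made for this example; the source does not spell out how shardus-core ranks certificates.

```typescript
// Hypothetical certificate shape: a cycle marker plus a comparable score.
interface CycleCert {
  marker: string;
  score: number;
}

// A candidate wins on a higher score; ties are broken by the smaller marker
// so that every node resolves the comparison the same way (an assumption).
function isBetter(candidate: CycleCert, current: CycleCert): boolean {
  if (candidate.score !== current.score) {
    return candidate.score > current.score;
  }
  return candidate.marker < current.marker;
}

// Start from the node's own certificate and keep whichever is better.
function pickBestCert(own: CycleCert, received: CycleCert[]): CycleCert {
  return received.reduce(
    (best, cert) => (isBetter(cert, best) ? cert : best),
    own
  );
}
```

A total, deterministic ordering matters here: if two nodes see the same set of certificates in a different order, both must still settle on the same one.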

![Cycle](/img/new/i8.jpg)

7 changes: 4 additions & 3 deletions content/docs/architecture/node-lifecycle.mdx
@@ -18,15 +18,16 @@ In the network, nodes exist in two primary states:

* **Staking and Joining**: To enter the standby list, nodes must first stake and submit a join request. This process involves creating a staking transaction, which is validated and recorded on the blockchain. The staking information is stored in the node and the operator account. Each staking transaction is associated with a unique address and includes a JSON blob containing the staking instructions, encoded in Base64. This allows the system to easily recognize and validate staking transactions without processing them through the EVM. Once validated, the user's balance and the node and operator accounts are updated. This process is secured by requiring nodes to present signed certificates, ensuring that only verified nodes can enter the queue. The request to join the standby list can be seen at [`addStandbyJoinRequests()`](https://github.com/shardeum/shardus-core/blob/8e69984a098c8ed387ca8f0684b313a75f85aec4/src/p2p/Join/v2/index.ts#L107).
* **Rotation:** Nodes are periodically rotated out of active duty as a measure of security and efficiency. The network chooses which nodes to rotate based on their age (expired nodes). The number of nodes that will be replaced with ones from the standby list is a configurable parameter. The expired and removed nodes are returned from [`getExpiredRemoved()`](https://github.com/shardeum/shardus-core/blob/8e69984a098c8ed387ca8f0684b313a75f85aec4/src/p2p/Rotation.ts#L118).
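
Age-based rotation with a configurable cap can be sketched like this. The `ActiveNode` shape and the `maxAge` and `maxRotated` parameters are hypothetical names chosen for the example; they are not taken from `getExpiredRemoved()`.

```typescript
// Hypothetical view of an active node: its id and when it became active.
interface ActiveNode {
  id: string;
  activeSince: number; // timestamp in seconds
}

// Pick the nodes whose active time exceeds `maxAge`, oldest first,
// and rotate out at most `maxRotated` of them in one cycle.
function selectExpired(
  nodes: ActiveNode[],
  now: number,
  maxAge: number,
  maxRotated: number
): ActiveNode[] {
  return nodes
    .filter((node) => now - node.activeSince > maxAge) // filter() copies, so the input is not reordered
    .sort((a, b) => a.activeSince - b.activeSince)
    .slice(0, maxRotated);
}
```

Capping the number of removals per cycle keeps the active set stable even when many nodes expire at once; the remaining expired nodes are picked up in later cycles.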
![Cycle](/img/new/i8.jpg)

![Cycle](/img/new/iframe2000.jpg)

### 4. Node Selection and Activation

* **Deterministic Random Selection**: The network employs a deterministic algorithm to select nodes from the standby list. This selection method prevents manipulation and maintains a smooth transition from standby to active states. This selection not only picks the node but also assigns it an ID, which determines the address space it will cover and who its neighbors will be. Active validators create consensus on the list’s integrity by hashing it each cycle. This consensus prevents any single node from manipulating its chance of selection and subsequently landing on the same shard by generating new public-private key pairs in bulk. The selection process employs 'future numbers,' which means that the actual random numbers determining which nodes are selected and their placement are not generated until later in the selection phase. This setup makes it virtually impossible for any node to exploit the system and prevents attacks in which one party gains control of 51% of the consensus and harms targeted accounts. The selection process is executed in [`selectNodes()`](https://github.com/shardeum/shardus-core/blob/8e69984a098c8ed387ca8f0684b313a75f85aec4/src/p2p/Join/v2/select.ts#L47).
* **Syncing Process:** When a node is selected to become active, it first enters a syncing mode to prevent disruptions in case it is chosen but turns off unexpectedly, stalling the process. The node begins by syncing cycle data from the network, followed by syncing account data relevant to it from other nodes. After completing this process, the node notifies the network that it has finished syncing. If a node fails to complete syncing within a specified threshold (typically double the median sync time of active nodes), it is subject to slashing and removal from the network. The syncing process can be examined in [`sync()`](https://github.com/shardeum/shardus-core/blob/8e69984a098c8ed387ca8f0684b313a75f85aec4/src/p2p/Sync.ts#L97).
* **Activation Process:** Once syncing is complete, nodes do not immediately become active due to the complexities involved in sharding calculations. These calculations could significantly disrupt the address space, creating conflicts over which nodes possess the necessary data. To maintain a steady rate of activation, only a certain number of nodes are allowed to go active in each cycle. This helps to smooth out potential waves and inconsistencies. The data synced from the queued nodes is discarded, as the likelihood of a node being assigned to the same shard in the next cycle is very low. Additionally, the time it might take for this node to be randomly selected again could be substantial. The activation process can be seen in [`updateRecord()`](https://github.com/shardeum/shardus-core/blob/8e69984a098c8ed387ca8f0684b313a75f85aec4/src/p2p/Active.ts#L146).
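
A deterministic selection that every node can reproduce can be sketched as below. This is an illustration, not the logic in `selectNodes()`: the `seed` stands in for a value that only becomes known after the standby list is fixed (the 'future number' idea), and a small FNV-1a-style hash is used purely as a stand-in for the cryptographic hash a real network would need.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units. Not cryptographic;
// used here only so the example has no dependencies.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, keeping 32 bits.
    hash =
      (hash +
        (hash << 1) +
        (hash << 4) +
        (hash << 7) +
        (hash << 8) +
        (hash << 24)) >>>
      0;
  }
  return hash >>> 0;
}

// Score every standby key against the seed and take the `count` lowest scores.
// Ties fall back to comparing keys so the order never depends on input order.
function selectFromStandby(
  standbyKeys: string[],
  seed: string,
  count: number
): string[] {
  return standbyKeys
    .map((key) => ({ key, score: fnv1a(seed + ":" + key) }))
    .sort(
      (a, b) =>
        a.score - b.score || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0)
    )
    .slice(0, count)
    .map((entry) => entry.key);
}
```

Since the score depends on a seed the node cannot know when it generates its key, producing key pairs in bulk does not let it steer its own selection in this sketch.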


![Cycle](/img/new/lifecycle2.jpg)

### 5. Handling Node Failures and Loss Detection

@@ -54,7 +55,7 @@ The lost node detection system kicks into action when a node (Node A) attempts t
* **Finality of Status**: If Node B fails to refute its status, it is permanently removed from the active node list, ensuring that only functional and honest nodes participate in network activities.
* The status of the node is updated in [`updateRecord()`](https://github.com/shardeum/shardus-core/blob/8e69984a098c8ed387ca8f0684b313a75f85aec4/src/p2p/Lost.ts#L429).

![Cycle](/img/new/i31.jpg)
![Cycle](/img/new/lifecycle1.jpg)

#### 5.5 Prevention of Abuse
