| 1 | +# Block Node Nano-Service Approach |
| 2 | + |
| 3 | +## Abstract |
| 4 | + |
| 5 | +To date, the Block Node has been developed under pressure and with changing, incomplete, or inaccurate requirements. |
| 6 | +As a result, the system is tightly interconnected, and it is difficult to change |
| 7 | +one segment of the system without also impacting unrelated segments. To address this, and to ensure |
| 8 | +the Block Node can be extended, enhanced, and deployed in the manner required for all |
| 9 | +currently identified usage, this document details a modified structure for the system. |
| 10 | + |
| 11 | +## Revised Diagram |
| 12 | + |
| 13 | + |
| 14 | + |
| 15 | +## Definitions |
| 16 | + |
| 17 | +<dl> |
| 18 | +<dt>Event</dt> |
| 19 | +<dd>A Java object defined by the Messaging service that enables every other service |
| 20 | +to publish a service-defined object with content specific to that service. Note that |
| 21 | +the name of this object is not defined in this document; "Event" is a generic term.</dd> |
| 22 | +<dt>Service</dt> |
| 23 | +<dd>A Java module that is deployed with a particular installation of the Hiero |
| 24 | +Block Node. In most cases these modules are housed in independent jars to make |
| 25 | +adding and removing services (in a custom deployment) easier.</dd> |
| 26 | +</dl> |
| 27 | + |
| 28 | +## Core Concepts |
| 29 | + |
| 30 | +1. Helidon and API objects are restricted to the API layer. |
| 31 | + * The less we flow these externally defined interfaces and classes through |
| 32 | + the system, the more easily we can potentially make outward facing API |
| 33 | + changes without reworking the internal design. |
| 34 | +2. Most services are not required to be present in every deployment of the Block Node. |
| 35 | +3. No service should depend on classes or interfaces from another service. |
| 36 | + * That service might not be deployed, and each service should be removable |
| 37 | + without breaking other services. The exception is the Messaging service. |
| 38 | + * To this end, services should be independent modules with clearly defined |
| 39 | + and carefully controlled interactions. |
| 40 | +4. Two services are required for all deployments: Messaging and Status. |
| 41 | + * Only Messaging should offer any internal API at all. |
| 42 | + * Messaging is where data and events are published (presumably |
| 43 | + via LMAX Disruptor instances) to the other services. |
| 44 | + * The Status service is only required because it is specified as always |
| 45 | + present for a Block Node client to query via the gRPC API. |
| 46 | + * The Messaging service should _not_ have any external (i.e. gRPC) API. |
| 47 | + * We must remain vigilant to avoid packing these two services with interfaces or extra classes. |
| 48 | + * These services should be as slim as possible, and interactions between |
| 49 | + services should be based on a very limited set of `Record` messages ("Events") |
| 50 | + that are passed between services (blindly) by the Messaging service |
| 51 | + rather than interfaces or direct method calls. |
| 52 | +5. There is an assumption in this approach that Messaging offers both "push" |
| 53 | + and "pull" options for receiving messages, and each service may choose the |
| 54 | + most appropriate interaction for that specific service. |
| 55 | + * A persistence service, for instance, might use "push" for simplicity and |
| 56 | + because it does not benefit from holding items within Messaging, but |
| 57 | + a streaming client service might use "pull" in order to allow each of |
| 58 | + many remote clients to be receiving data at slightly varying rates and |
| 59 | + more easily switch from live to historical and back if a particular |
| 60 | + client falls behind and later "catches up". |
| 61 | +6. Most services both publish and observe the service "event" messages. |
| 62 | + * By listening for events, any service can react to changes in any other |
| 63 | + service, but also behave reasonably when another service does not exist. |
| 64 | + * Publishing an event (rather than calling an API) makes it easy for each |
| 65 | + service to focus entirely on its own function and not try to work out the |
| 66 | + highly complex possible interactions with all other possible services. |
| 67 | + * Some services (e.g. Archive service) won't make sense if _nothing_ publishes |
| 68 | + a particular event, but even then the service need not be concerned with |
| 69 | + the how/what/why of an event, and need only react if and when an event is |
| 70 | + encountered with the relevant type and content. |
| 71 | +7. Many services will _also_ listen to the main data messages (`List<BlockItem>`), |
| 72 | + which are the primary data flowing through the system. |
| 73 | + * Note that the Publisher service is also not required, so this flow of data might |
| 74 | + be empty, or might be produced from some other source. |
| 75 | + * There _might_ also be a stream of "historical" blocks used to serve client |
| 76 | + requests for those blocks. This is still to be determined. |
| 77 | +8. Configuration for a service is entirely restricted to that service, and does |
| 78 | + not determine which "version" of a service is loaded or whether a service is running. |
| 79 | + * It _might_ help multiple loaded "versions" identify a conflict. |
| 80 | +9. The JVM `ServiceLoader` is used to load every service that is present; this |
| 81 | + may include multiple services of the same "type" (e.g. multiple archive |
| 82 | + services, multiple persistence services, etc.). |
| 83 | + * It is up to the particular services to ensure that either multiple |
| 84 | + different versions cooperate properly or an error is published on |
| 85 | + startup that multiple incompatible services are loaded. Generally it's |
| 86 | + cleanest if multiple services of the same type are able to work |
| 87 | + independently without issues. If that isn't possible, a |
| 88 | + service-specific configuration is a good alternative. |
| 89 | + |
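The core concepts above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: every name in it (`MessagingSketch`, `ServiceEvent`, `Messaging`, `BlockNodeService`, `InMemoryMessaging`, `startAll`, and the `"BlockPersisted"` event type) is hypothetical and not defined by this document, and plain in-memory collections stand in for the LMAX Disruptor instances that Messaging would presumably use. It shows a record-based event, a Messaging API offering both "push" and "pull" delivery, and discovery of services through the JVM `ServiceLoader`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.ServiceLoader;
import java.util.function.Consumer;

public class MessagingSketch {
    /** Hypothetical "Event": a record with a type label and service-defined content. */
    public record ServiceEvent(String type, Object content) {}

    /** Hypothetical internal API; only the Messaging service would offer one. */
    public interface Messaging {
        void publish(ServiceEvent event);
        void subscribe(Consumer<ServiceEvent> handler); // "push" delivery
        Queue<ServiceEvent> openQueue();                // "pull" delivery
    }

    /** Hypothetical plugin contract that each service jar would implement. */
    public interface BlockNodeService {
        void start(Messaging messaging);
    }

    /** Single-threaded stand-in for a Disruptor-backed implementation. */
    public static class InMemoryMessaging implements Messaging {
        private final List<Consumer<ServiceEvent>> handlers = new ArrayList<>();
        private final List<Queue<ServiceEvent>> queues = new ArrayList<>();

        @Override
        public void publish(ServiceEvent event) {
            handlers.forEach(handler -> handler.accept(event)); // push to handlers
            queues.forEach(queue -> queue.add(event));          // hold for pullers
        }

        @Override
        public void subscribe(Consumer<ServiceEvent> handler) {
            handlers.add(handler);
        }

        @Override
        public Queue<ServiceEvent> openQueue() {
            Queue<ServiceEvent> queue = new ArrayDeque<>();
            queues.add(queue);
            return queue;
        }
    }

    /** Starts every service implementation the ServiceLoader can find; returns the count. */
    public static int startAll(Messaging messaging) {
        int started = 0;
        for (BlockNodeService service : ServiceLoader.load(BlockNodeService.class)) {
            service.start(messaging);
            started++;
        }
        return started;
    }

    public static void main(String[] args) {
        InMemoryMessaging messaging = new InMemoryMessaging();
        // "Push": the handler runs as soon as an event is published.
        messaging.subscribe(event -> System.out.println("pushed: " + event.type()));
        // "Pull": the consumer drains its own queue at its own pace.
        Queue<ServiceEvent> queue = messaging.openQueue();
        int started = startAll(messaging);
        messaging.publish(new ServiceEvent("BlockPersisted", 42L));
        System.out.println("services started: " + started + ", pulled: " + queue.poll());
    }
}
```

In this sketch a service never references another service's classes; it sees only `Messaging` and the event records. Removing a service jar simply means the `ServiceLoader` finds one fewer implementation.
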
| 90 | +## Expected Benefits |
| 91 | + |
| 92 | +1. Services are decomposed into small units; what is often thought of as a single |
| 93 | + process is accomplished by multiple nano-services. This makes each such |
| 94 | + service both simple and focused. This also makes adding, removing, and |
| 95 | + modifying these services much easier and faster. |
| 96 | + * It's also much easier to test services with nothing more than a mock of the |
| 97 | + "Messaging" service, which further improves velocity. |
| 98 | +2. Composing services may be easier to reason about than composing interfaces, |
| 99 | + and systems composed of independent services are easier to modify and revise |
| 100 | + than systems with many interacting method calls or complex module |
| 101 | + interactions. |
| 102 | +3. It is much easier to reason about concurrency for a single focused service |
| 103 | + than it is for a larger and more interconnected set of components. |
| 104 | + |
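The testing benefit listed above might look like the following sketch, in which a service is exercised with nothing but a hand-written mock of Messaging. All names here (`MockMessagingSketch`, `Messaging`, `ArchiveService`, `RecordingMessaging`, and the `BlockPersisted` and `BlockArchived` events) are assumptions made for illustration; the document does not define the real Messaging API or any event types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class MockMessagingSketch {
    /** Hypothetical events; the real event types are not defined in this document. */
    public record BlockPersisted(long blockNumber) {}
    public record BlockArchived(long blockNumber) {}

    /** Hypothetical, deliberately tiny Messaging API. */
    public interface Messaging {
        void publish(Object event);
        void subscribe(Consumer<Object> handler);
    }

    /** Service under test: reacts to one event type and publishes another. */
    public static class ArchiveService {
        public void start(Messaging messaging) {
            messaging.subscribe(event -> {
                if (event instanceof BlockPersisted persisted) {
                    // The actual archiving work would happen here.
                    messaging.publish(new BlockArchived(persisted.blockNumber()));
                }
                // Any other event is ignored, so unknown publishers are harmless.
            });
        }
    }

    /** Mock: records what the service publishes and lets a test deliver events. */
    public static class RecordingMessaging implements Messaging {
        public final List<Object> published = new ArrayList<>();
        private final List<Consumer<Object>> handlers = new ArrayList<>();

        @Override
        public void publish(Object event) {
            published.add(event);
        }

        @Override
        public void subscribe(Consumer<Object> handler) {
            handlers.add(handler);
        }

        /** Test hook: simulates some other service publishing an event. */
        public void deliver(Object event) {
            handlers.forEach(handler -> handler.accept(event));
        }
    }

    public static void main(String[] args) {
        RecordingMessaging mock = new RecordingMessaging();
        new ArchiveService().start(mock);
        mock.deliver(new BlockPersisted(42));
        mock.deliver("an event this service does not understand");
        System.out.println("published by service: " + mock.published);
    }
}
```

Because the service depends only on the Messaging interface and event records, the test needs no other service, no gRPC layer, and no mocking framework.
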
| 105 | +## Considerations and Possible Concerns |
| 106 | + |
| 107 | +1. Sending messages between services is not as efficient as calling a method. |
| 108 | + * This is true, but publishing a message consumed by an unknown set of (potentially) |
| 109 | + several services is significantly more efficient than trying to manage an uncertain |
| 110 | + (and possibly large) number of direct method calls. We are electing to |
| 111 | + prioritize emergent behavior and capability over direct-call efficiency. |
| 112 | +2. Some services may not make any sense without other services. For example, |
| 113 | + a Content Proof service might not be able to function without a State |
| 114 | + Management service and/or State Snapshot service. |
| 115 | + * If a particular service requires other services, it should document the |
| 116 | + expected events (e.g. publish "Need Snapshot For Instant{date/time}" and |
| 117 | + expect "Deliver Snapshot For Instant{date1/time1}") and also document |
| 118 | + its behavior if the response-type event is not published. |
| 119 | + * Every service should function, at least to the level of not throwing |
| 120 | + exceptions, regardless of which other services are, or are not, present. |
| 121 | + * While a service may require certain _messages_ to function correctly |
| 122 | + (e.g. a "Deliver Snapshot For Instant..." message in the example above), |
| 123 | + the service _must not_ concern itself with _what_ produces those messages |
| 124 | + or _how_. This ensures that all services function as intended even if |
| 125 | + other services are replaced with completely different, but _compatible_ |
| 126 | + services. |
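
The request/response pattern in the second concern can be sketched as follows. It assumes hypothetical names throughout (`SnapshotRequestSketch`, `Bus`, `NeedSnapshot`, `DeliverSnapshot`, `requestSnapshot`, `registerProvider`) and uses a trivial in-memory bus in place of the Messaging service. The point it illustrates is that the requesting service waits for a response _message_, does not care what produces it, and degrades without throwing when nothing answers.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class SnapshotRequestSketch {
    /** Hypothetical request and response events. */
    public record NeedSnapshot(Instant instant) {}
    public record DeliverSnapshot(Instant instant, String snapshot) {}

    /** Minimal stand-in for the Messaging service. */
    public static class Bus {
        private final List<Consumer<Object>> handlers = new CopyOnWriteArrayList<>();

        public void subscribe(Consumer<Object> handler) {
            handlers.add(handler);
        }

        public void publish(Object event) {
            handlers.forEach(handler -> handler.accept(event));
        }
    }

    /**
     * Publishes a request, then waits up to timeoutMillis for a matching response.
     * Returns Optional.empty() instead of throwing when no response is published.
     * For brevity this subscribes on every call; a real service would subscribe once.
     */
    public static Optional<String> requestSnapshot(Bus bus, Instant instant, long timeoutMillis) {
        CompletableFuture<Optional<String>> reply = new CompletableFuture<>();
        bus.subscribe(event -> {
            if (event instanceof DeliverSnapshot delivered && delivered.instant().equals(instant)) {
                reply.complete(Optional.of(delivered.snapshot()));
            }
        });
        bus.publish(new NeedSnapshot(instant));
        return reply.completeOnTimeout(Optional.empty(), timeoutMillis, TimeUnit.MILLISECONDS).join();
    }

    /** Hypothetical provider; any compatible service could answer instead. */
    public static void registerProvider(Bus bus) {
        bus.subscribe(event -> {
            if (event instanceof NeedSnapshot needed) {
                bus.publish(new DeliverSnapshot(needed.instant(), "snapshot-data"));
            }
        });
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2024-01-01T00:00:00Z");

        Bus withoutProvider = new Bus();
        System.out.println("no provider:   " + requestSnapshot(withoutProvider, instant, 50));

        Bus withProvider = new Bus();
        registerProvider(withProvider);
        System.out.println("with provider: " + requestSnapshot(withProvider, instant, 50));
    }
}
```

The timeout value and the choice to return an empty `Optional` are illustrative only; each service would document its own behavior for a missing response, as described above.
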