Randall Hauch edited this page Jan 26, 2015 · 7 revisions

Debezium is composed of several software components. The lightweight driver implements the public API and is instantiated within the applications that are directly exposed to and used by mobile apps. The driver's public methods are asynchronous and largely involve creating request messages, registering the caller's response handler (if appropriate), writing the messages to the appropriate stream, and monitoring the partial-responses stream for responses. All other functionality is implemented by the services described on this page.

Unless otherwise specified, all services are single-threaded. Multiple instances of a service can be deployed so that each instance consumes a distinct subset of the partitions of the consumed streams; the maximum number of instances corresponds to the number of partitions in the consumed stream. That means any given partition will be consumed by only one service instance, and a service instance consumes only one message at a time.

Also, pay particular attention to the way the consumed streams are partitioned. Many services use a local database to store information, often keyed by the partition key of the consumed stream. Because of the way streams are partitioned, any message with a given partition key will always be stored in the same stream partition, and the same service instance will always process all such messages. In this way, the services can persist information in a local database, resulting in a shared-nothing architecture that maximizes scalability, availability, and performance. In the prototypes, each database is also backed by its own stream, so all of the information can be recovered even if the local database fails or is lost.
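This co-location guarantee follows from deterministic, key-based partition assignment: the same key always hashes to the same partition, so one service instance sees every message for that key. A minimal Python sketch, purely illustrative (Debezium's actual partitioner may differ):

```python
# Illustrative sketch, not Debezium's implementation: deterministic
# key-based partitioning. Every message with the same partition key lands
# in the same partition, so the single service instance that owns that
# partition sees all messages for that key and can safely keep local state.
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a partition key to a stable partition number."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same entity ID always maps to the same partition across calls,
# processes, and restarts, as long as num_partitions is unchanged.
```

Note that this stability holds only while the partition count is fixed; repartitioning a stream redistributes keys.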

The services are:

Device service

The device service consumes the connections stream, maintains a database of all device tokens for each username, and (proposed) writes out summary messages every time a device is added or removed.

Entity batch service

The entity batch service consumes a single stream of batch requests and for each batch simply writes each patch to the output stream. The incoming batch requests are recorded directly from client requests, but since they contain requests for multiple entities (potentially in different zones and collections) they are not suitable for direct processing. Instead, this service breaks up the batches into patches that are easily partitioned and processed by downstream services.

Consumes: entity-batches, partitioned by database ID

Outputs: entity-patches, partitioned by entity ID

Processing logic:

  1. Consume the batch request message
  2. For each patch request within the batch request:
    1. Write the message to the entity-patches stream, partitioning it by entity ID
  3. Mark the input message as consumed
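The steps above amount to a simple fan-out. A minimal Python sketch, assuming a hypothetical message shape in which a batch carries a `patches` list and each patch names its target entity via an `entityId` field:

```python
# Illustrative sketch, not Debezium's implementation. The message shape
# (a "patches" list, an "entityId" per patch) is a hypothetical stand-in.
def split_batch(batch: dict) -> list:
    """Break a batch request into (partition key, message) pairs keyed by
    entity ID, so each patch can be routed to the partition (and therefore
    the service instance) responsible for that entity."""
    return [(patch["entityId"], patch) for patch in batch["patches"]]
```

Because the output is keyed by entity ID, all patches for one entity land in one partition of entity-patches regardless of which batch they arrived in.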

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, some input messages that were only partially processed when the crash occurred will be reprocessed, potentially resulting in some duplicated output messages. Therefore, downstream services must be tolerant of duplicate messages (which is likely not to be a problem, since it will at most imply an entity changed twice when it only changed once).

This service stores no information locally, and merely consumes the incoming stream.

Entity storage service

The entity storage service is perhaps one of the most important services within Debezium. It is responsible for processing incoming entity patch requests and recording/updating the state for each entity. The patch requests are partitioned by entity ID, so a single service will be responsible only for those entities within the assigned partition(s). The service is stateful and uses a local database to maintain a cache of the latest state of each entity it sees. This database is also backed by a replicated and partitioned log (stream) so that the entire local database can be rebuilt should the database be corrupted or lost.

Consumes: entity-patches, partitioned by entity ID

Outputs:

Processing logic:

  1. Consume the patch request message
  2. Create a response message into which results will be placed
  3. Read from the local database the identified entity (if it exists)
  4. Based upon the type of request:
    1. For a read request:
      1. If the entity does not exist, record an error in the response message
      2. Otherwise, write into the response message the entity's current representation
    2. For an update request:
      1. If the entity does not exist, create a new entity representation; otherwise, record the entity's current (non-updated) representation in the response message's before field
      2. Attempt to apply the patch to the entity
      3. If the patch failed, record the error in the response message
      4. If the patch succeeded, store the updated entity and write to the response message the new representation and timestamp information
  5. Write the response message to the output streams
  6. Mark the input message as consumed
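The read/update handling above can be sketched as follows. This is an illustrative Python model, not Debezium's implementation: the `op` and `fields` message fields are hypothetical, and a plain dict stands in for the stream-backed local database.

```python
# Illustrative sketch of the entity storage loop. "store" stands in for the
# local database; patch operations are reduced to a single "set fields"
# form for brevity. Field names (op, fields, before, after) are assumptions.
def process_patch(store: dict, patch: dict) -> dict:
    entity_id = patch["entityId"]
    response = {"requestId": patch["requestId"]}
    current = store.get(entity_id)
    if patch["op"] == "read":
        if current is None:
            response["error"] = "entity does not exist"
        else:
            response["after"] = current
    else:  # update request
        # Record the pre-update representation (None if the entity is new)
        response["before"] = dict(current) if current is not None else None
        entity = dict(current) if current is not None else {}
        entity.update(patch["fields"])          # apply the patch
        store[entity_id] = entity               # persist the updated entity
        response["after"] = entity
    return response
```

Re-running the same update patch leaves the store unchanged, which is the idempotence property the failure-tolerance discussion below relies on.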

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. If the service crashes after consuming an input message but before marking it as consumed, then when the service is restarted it will reprocess that same input message. If the patch is a read request, the response is simply regenerated and (re)written to the output streams; if the patch is an update, the patch is re-applied to the entity, and since patches are idempotent there will be no net change to the entity, though a response message will be generated and (re)written to the output streams.

As mentioned above, this service maintains a local database to cache the latest state of each entity it sees. While the local database is itself not replicated, if the database crashes or is lost, the database can be rebuilt from the persisted and replicated log of database changes.

Should this service's local database and its persistent and replicated log be completely lost, the service's state could be rebuilt by replaying all the recorded patch requests in the entity-patches input stream. If that stream is also lost, the state can be rebuilt from the upstream entity-batches stream.

Schema storage service

The schema storage service is responsible for processing incoming schema patch requests and recording/updating the state for each schema, and it is nearly identical to the entity storage service. The schema patch requests are partitioned by database ID, so a single service instance will be responsible only for those schemas within the assigned partition(s). The service is stateful and uses a local database to maintain a cache of the latest state of each schema it sees. This database is also backed by a replicated and partitioned log (stream) so that the entire local database can be rebuilt should the database be corrupted or lost.

Consumes: schema-patches, partitioned by database ID

Outputs:

Processing logic:

  1. Consume the patch request message
  2. Create a response message into which results will be placed
  3. Read from the local database the identified schema (if it exists)
  4. Based upon the type of request:
    1. For a read request:
      1. If the schema does not exist, record an error in the response message
      2. Otherwise, write into the response message the schema's current representation
    2. For an update request:
      1. If the schema does not exist, create a new schema representation; otherwise, record the schema's current (non-updated) representation in the response message's before field
      2. Attempt to apply the patch to the schema
      3. If the patch failed, record the error in the response message
      4. If the patch succeeded, store the updated schema and write to the response message the new representation and timestamp information
  5. Write the response message to the output streams
  6. Mark the input message as consumed

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. If the service crashes after consuming an input message but before marking it as consumed, then when the service is restarted it will reprocess that same input message. If the patch is a read request, the response is simply regenerated and (re)written to the output streams; if the patch is an update, the patch is re-applied to the schema, and since patches are idempotent there will be no net change to the schema, though a response message will be generated and (re)written to the output streams.

As mentioned above, this service maintains a local database to cache the latest state of each schema it sees. While the local database is itself not replicated, if the database crashes or is lost, the database can be rebuilt from the persisted and replicated log of database changes.

Should this service's local database and its persistent and replicated log be completely lost, the service's state could be rebuilt by replaying all the recorded patch requests in the schema-patches input stream, which records all of the client requests to read or modify the schemas.

Schema learning partitioning service

The schema learning partitioning service consumes two different streams, entity updates and schema updates, and simply rewrites these messages to a different stream with a different partitioning scheme that puts an entity type and all associated instances into the same partition. This is more suitable for the schema learning service.

Consumes: entity-updates and schema-updates

Outputs: schema-learning, partitioned by entity type name

Processing logic: For messages from entity-updates:

  1. Consume the update message
  2. If learning is enabled on this entity:
    1. Write the message to the schema-learning stream, partitioning it by entity type name
  3. Mark the input message as consumed

For messages from schema-updates:

  1. Consume the update message
  2. If learning is enabled on this entity:
    1. For each entity type within the schema:
      1. Generate a message with the entity type representation in the after field
      2. Write the message to the schema-learning stream, partitioning it by entity type name
  3. Mark the input message as consumed
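The schema-update fan-out above can be sketched as follows. This is an illustrative Python sketch; the `entityTypes` field name is a hypothetical message shape, not Debezium's actual wire format.

```python
# Illustrative sketch, not Debezium's implementation: fan a schema update
# out into one message per entity type, keyed by entity type name, so each
# lands in the partition owned by the learner for that type. The
# "entityTypes" field is a hypothetical message shape.
def repartition_schema_update(schema_msg: dict) -> list:
    return [
        (type_name, {"entityType": type_name, "after": type_def})
        for type_name, type_def in sorted(schema_msg["entityTypes"].items())
    ]
```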

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, some input messages that were only partially processed when the crash occurred will be reprocessed, potentially resulting in some duplicated output messages. Therefore, downstream services must be tolerant of duplicate messages (which is likely not to be a problem, since it will at most imply an entity changed twice when it only changed once).

This service stores no information locally, and merely consumes the incoming stream.

Schema learning service

The schema learning service consumes the schema-learning stream and maintains a learning model for each of the entity types it sees. If the incoming message is a schema update, then the update is immediately applied to the entity type within the learning model. On the other hand, if the incoming message is an entity update, the entity patch is fed through the learning model, which may change as a result. When the learning model does change, it is recorded as a patch to the schema, and this schema patch is immediately written back to the schema-patches stream. See the Schema learning page for a more thorough description of how schema learning is implemented and its capabilities and limitations.

Consumes: schema-learning, partitioned by entity type name

Outputs: schema-patches, partitioned by database ID

Processing logic:

  1. Consume a message from schema-learning, partitioned by entity type name
  2. If the message represents an entity update response:
    1. Find the learning model for the entity type
    2. Apply the patch to the learning model
    3. If the learning model changed the underlying entity type schema:
      1. Write the resulting patch to the schema-patches stream, partitioned by database ID
      2. Update the persisted cache of the learning model
  3. If the message represents a change to an entity type's definition:
    1. Find the learning model for the entity type
    2. Rebuild a new learning model using the entity type's new definition
    3. Update the persisted cache of the learning model
  4. Mark that the input message was consumed

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, if the input message resulted in a schema patch and that schema patch was written just before the crash, then upon restart this same patch would be written again. However, the schema storage service is tolerant of duplicate messages with no ill effects.

The only information stored locally is a cached representation of the entity type definition. This information is stored in a local database, but it is also backed by a dedicated, replicated, and partitioned stream. Therefore, should the local database become lost or corrupt, the information can be recovered from the dedicated stream.

Total loss of this service's local database, should it occur, does not affect any of the persisted entity or schema information.

Response accumulator service

The response accumulator service processes all partial-response messages and aggregates the partial results for each request. When all of a request's parts have completed, the aggregate response is written to the complete-response stream. Consumers of this stream will therefore see results only when all parts have completed.

Consumes: partial-response, partitioned by request ID

Outputs: complete-response, partitioned by request ID

Processing logic:

  1. Consume a message from partial-response, partitioned by request ID
  2. If the request contained only one part, forward the message immediately to the output stream
  3. Otherwise:
    1. Find in the local cache/database the aggregate response message for this request
    2. If no aggregate response message exists, create an empty one
    3. Add the consumed response message to the aggregate
    4. If responses for all of the request's parts have been added to the aggregate:
      1. Write the aggregate to the output stream
      2. Remove the aggregate response from the local cache/database
    5. Otherwise, write the aggregate response back to the local cache/database
  4. Mark that the input message was consumed
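The aggregation logic above can be sketched as follows. This is an illustrative Python sketch; the `requestId` and `parts` field names are assumptions, and an in-memory dict stands in for the stream-backed local database.

```python
# Illustrative sketch, not Debezium's implementation. "cache" stands in for
# the local database backed by its own replicated stream; "parts" is a
# hypothetical field giving the total number of parts in the request.
class ResponseAccumulator:
    def __init__(self):
        self.cache = {}  # request ID -> list of partial responses so far

    def accept(self, partial: dict):
        """Return the completed aggregate response, or None while parts
        are still outstanding."""
        if partial["parts"] == 1:
            # Single-part requests are forwarded immediately
            return {"requestId": partial["requestId"], "responses": [partial]}
        agg = self.cache.setdefault(partial["requestId"], [])
        if partial not in agg:  # tolerate duplicate deliveries
            agg.append(partial)
        if len(agg) == partial["parts"]:
            del self.cache[partial["requestId"]]  # complete: drop from cache
            return {"requestId": partial["requestId"], "responses": agg}
        return None
```

Note the duplicate check: because upstream services may redeliver messages after a crash, the accumulator must not count the same part twice.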

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, some input messages that were only partially processed when the crash occurred will be reprocessed, potentially resulting in some duplicated output messages. Therefore, downstream services must be tolerant of duplicate messages.

The only information stored locally is the incomplete aggregate response messages. Once an aggregate is complete, it is removed from the local store. This information is stored in a local database, but it is also backed by a dedicated, replicated, and partitioned stream. Therefore, should the local database become lost or corrupt, the information can be recovered from the dedicated stream.

Total loss of this service's local database, should it occur, does not affect any of the persisted entity or schema information. It may, however, cause a number of messages that were already partially complete at the time to never be considered as finished. This could be averted by reprocessing the incoming stream from some point in time before the data loss.

Changes in zone service

The changes in zone service processes entity updates, converts them to summaries of what changed, and writes them out to a stream partitioned by zone ID. It also monitors any changed $zoneSubscription entities and writes them out to the output stream partitioned by the ID of the zone that is the subject of the subscription (rather than the zone in which the $zoneSubscription entity is stored).

Consumes: entity-updates, partitioned by entity ID

Outputs: zone-changes, partitioned by zone ID

Processing logic:

  1. Consume a message from entity-updates, partitioned by entity ID
  2. Extract the zone ID of the changed entity
  3. If the changed entity is from the built-in $zoneSubscription collection:
    1. Read the ID of the zone that is the subject of the subscription
    2. Write the entity update to the zone-changes stream, partitioned by the subject zone ID
  4. Create a summary message of the changed entity (including zone subscription entities), determining whether the entity was created, updated, or deleted
  5. Write the summary message to the zone-changes stream, partitioned by the changed entity's zone ID
  6. Mark that the input message was consumed
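The created/updated/deleted determination in the summary step can be sketched by inspecting an update's before and after representations. This is an illustrative Python sketch; the field names are assumptions about the message shape, not Debezium's actual format.

```python
# Illustrative sketch, not Debezium's implementation: classify an entity
# update from its before/after fields (hypothetical message shape).
def summarize_change(update: dict) -> dict:
    before, after = update.get("before"), update.get("after")
    if before is None and after is not None:
        action = "created"       # no prior representation existed
    elif before is not None and after is None:
        action = "deleted"       # representation removed
    else:
        action = "updated"       # both representations present
    return {"entityId": update["entityId"],
            "zoneId": update["zoneId"],
            "action": action}
```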

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, some input messages that were only partially processed when the crash occurred will be reprocessed, potentially resulting in some duplicated output messages. Therefore, downstream services must be tolerant of duplicate messages (which is likely not to be a problem, since it will at most imply an entity changed twice when it only changed once).

This service stores no information locally, and merely consumes the incoming stream.

Zone watch service

The zone watch service is responsible for tracking zone subscriptions and using them to identify to which devices each entity change should be forwarded.

Consumes: zone-changes, partitioned by zone ID

Outputs: changes-by-device, partitioned by device ID

Processing logic:

  1. Consume a message from zone-changes, partitioned by zone ID
  2. If the message is a summary of an entity change:
    1. Generate a unique notification identifier for this change
    2. Create a notification summary message for the entity change
    3. Find in the local cache/store the list of all devices that have subscriptions to this zone
    4. For each device, write the notification summary message to changes-by-device, using the device ID as the key and partition key
  3. If the message is a change to a subscription for a specific device:
    1. Read from the cache/store the zone's list of subscriptions
    2. Remove or update the subscription in the zone's list of subscriptions
    3. Write the list of subscriptions for the zone back to the cache/store
  4. Mark that the input message was consumed
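The subscription tracking and fan-out above can be sketched as follows. This is an illustrative Python sketch; an in-memory dict stands in for the stream-backed local store, and the message shape is assumed.

```python
# Illustrative sketch, not Debezium's implementation. "subs" stands in for
# the local store of per-zone subscriptions, which in the real service is
# backed by a dedicated, replicated, partitioned stream.
class ZoneWatch:
    def __init__(self):
        self.subs = {}  # zone ID -> set of subscribed device IDs

    def subscribe(self, zone_id: str, device_id: str):
        self.subs.setdefault(zone_id, set()).add(device_id)

    def unsubscribe(self, zone_id: str, device_id: str):
        self.subs.get(zone_id, set()).discard(device_id)

    def fan_out(self, change: dict):
        """Yield one (device ID, notification) pair per device subscribed
        to the changed entity's zone; the device ID is the partition key."""
        for device_id in sorted(self.subs.get(change["zoneId"], ())):
            yield device_id, {"deviceId": device_id, "change": change}
```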

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, some input messages that were only partially processed when the crash occurred will be reprocessed, potentially resulting in some duplicated output messages. Therefore, downstream services must be tolerant of duplicate messages.

The only information stored locally is the list of subscriptions for each zone (seen by the service instance). This information is stored in a local database, but it is also backed by a dedicated, replicated, and partitioned stream. Therefore, should the local database become lost or corrupt, the information can be recovered from the dedicated stream.

Total loss of this service's local database, should it occur, does not affect any of the persisted entity or schema information. All zone subscriptions can be rebuilt from the $zoneSubscription entities.

Coalescing service

The coalescing service is responsible for coalescing all of the notifications sent to a specific device.

Consumes: changes-by-device, partitioned by device ID

Outputs: device-notifications, partitioned by device ID

Processing logic:

  1. Consume message from changes-by-device, partitioned by device ID
  2. Find in the local cache/store the list of all notifications for the device; if none is found, create an empty document
  3. Add the notification to the list, ensuring that the notification does not appear more than once
  4. Write the list back to the cache/store
  5. Build an output message with the total number of notifications in the list, the device, the user, etc.
  6. Write the output message to device-notifications, partitioned by device ID
  7. Mark that the input message was consumed
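The deduplicating coalescing step can be sketched as follows. This is an illustrative Python sketch; the `notificationId` field is assumed to be the unique notification identifier generated by the zone watch service, and the output message is reduced to a count for brevity.

```python
# Illustrative sketch, not Debezium's implementation: add a notification to
# the device's pending list unless one with the same identifier is already
# present (a duplicate delivery), then build the outgoing summary message.
def coalesce(pending: list, notification: dict) -> dict:
    if all(n["notificationId"] != notification["notificationId"]
           for n in pending):
        pending.append(notification)
    return {"deviceId": notification["deviceId"], "count": len(pending)}
```

Because duplicate deliveries are absorbed here, reprocessing after a crash inflates neither the list nor the count reported to the device.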

Failure tolerance: This service is tolerant of crashes and will not lose or corrupt any data. After restarting the service following a crash, some input messages that were only partially processed when the crash occurred will be reprocessed, potentially resulting in some duplicated output messages. Therefore, downstream services must be tolerant of duplicate messages.

The only information stored locally is the list of pending notifications for each device (seen by the service instance). This information is stored in a local database, but it is also backed by a dedicated, replicated, and partitioned stream. Therefore, should the local database become lost or corrupt, the information can be recovered from the dedicated stream.

Total loss of this service's local database, should it occur, does not affect any of the persisted entity or schema information.