Cluster configuration

Overview

Corda uses the Typesafe Config configuration library, extended with a notion of secret configuration values, where secret management can be implemented by a range of secrets service backends. This repository provides one implementation of the secrets service, which stores encrypted secrets. External add-ons can also be loaded via OSGi; these could, for instance, read and write to external secrets stores.

SmartConfig inherits the basic approach of typesafe config, which is to take data sources (e.g. JSON files) at run time, and provide convenient access functions which throw exceptions if the requested configuration is missing, or has the wrong type. This makes it concise and convenient to pass around SmartConfig instances and look up data on them where needed, without needing explicit type conversion or error checking code. This approach to configuration is dynamic; neither the IDE nor the compiler can tell if the wrong name or type is used. It feels more like Python than Java.
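The dynamic style described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the SmartConfig API: the `DynamicConfig` class, `ConfigError`, and the accessor names are hypothetical. It only shows the idea of typed accessors that raise when a path is missing or holds the wrong type.

```python
import json


class ConfigError(Exception):
    """Raised when a path is missing or holds a value of the wrong type."""


class DynamicConfig:
    """Hypothetical stand-in for a config object backed by parsed JSON."""

    def __init__(self, data):
        self._data = data

    def _lookup(self, path):
        # Walk a dotted path such as "session.messageResendWindow".
        node = self._data
        for part in path.split("."):
            if not isinstance(node, dict) or part not in node:
                raise ConfigError(f"No configuration value at '{path}'")
            node = node[part]
        return node

    def get_int(self, path):
        value = self._lookup(path)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ConfigError(f"Value at '{path}' is not an integer")
        return value

    def get_string(self, path):
        value = self._lookup(path)
        if not isinstance(value, str):
            raise ConfigError(f"Value at '{path}' is not a string")
        return value


config = DynamicConfig(json.loads('{"session": {"messageResendWindow": 500}}'))
window = config.get_int("session.messageResendWindow")
```

A misspelled path or a call to `get_string` on the same key is only detected when the line runs, which is the trade-off the paragraph above describes.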

The following diagram shows the end-to-end process for creating or updating the configuration of a Corda cluster.

Configuration management process flow diagram

A cluster configuration is updated as follows:

An HTTP client sends an HTTP request to the RPC worker to update the cluster configuration.
The RPC worker validates the configuration against the JSON schema for the config section. If validation fails, an error is returned to the client.
The RPC worker writes a configuration management request to the configuration management topic on the Kafka bus.
The DB worker reads the configuration management request from the configuration management topic on the Kafka bus.
The DB worker validates the config against the JSON schema for the config section. If validation fails, an error is returned to the RPC worker via the RPC response topic.
The DB worker updates the configuration tables in the cluster database (no defaults applied).
The DB worker applies default values for the config based on the JSON schema definition for the config section.
The DB worker writes the updated configuration with defaults applied to the configuration topic on the Kafka bus.
The DB worker writes a success response to the configuration management response topic on the Kafka bus.
The RPC worker reads the response from the configuration management response topic on the Kafka bus.
The RPC worker notifies the HTTP client of the success of their request.
Workers are notified of the new or updated configuration from the configuration topic on the Kafka bus using the configuration read service.
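The validation and defaulting steps in the list above can be sketched as follows. This is a rough illustration in Python, not the workers' implementation: it handles only a tiny subset of JSON Schema (`type`, `properties`, `default`), and the schema fragment, including the `exampleTimeout` property and both default values, is made up for the example.

```python
import copy

# Hypothetical schema fragment; property names and defaults are illustrative.
SCHEMA = {
    "type": "object",
    "properties": {
        "session": {
            "type": "object",
            "properties": {
                "messageResendWindow": {"type": "integer", "default": 500},
                "exampleTimeout": {"type": "integer", "default": 10000},
            },
        }
    },
}

_TYPES = {"object": dict, "integer": int, "string": str, "boolean": bool}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value is valid."""
    expected = _TYPES[schema["type"]]
    if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
        return [f"{path}: expected {schema['type']}"]
    errors = []
    if schema["type"] == "object":
        properties = schema.get("properties", {})
        for key, child in value.items():
            if key not in properties:
                errors.append(f"{path}.{key}: unknown key")
            else:
                errors.extend(validate(child, properties[key], f"{path}.{key}"))
    return errors


def apply_defaults(value, schema):
    """Return a copy of value with schema defaults filled in for absent keys."""
    result = copy.deepcopy(value)
    for key, child in schema.get("properties", {}).items():
        if child["type"] == "object":
            result[key] = apply_defaults(result.get(key, {}), child)
        elif key not in result and "default" in child:
            result[key] = child["default"]
    return result


submitted = {"session": {"messageResendWindow": 250}}
errors = validate(submitted, SCHEMA)
stored = submitted                              # persisted without defaults
published = apply_defaults(submitted, SCHEMA)   # published with defaults applied
```

Keeping `stored` free of defaults while `published` carries them mirrors the split between the database write and the configuration topic write in the steps above.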

The process uses three separate Kafka topics:

Configuration management topic — holds requests to modify the current state of the cluster configuration.
Configuration management response topic — holds responses to requests made on the topic above.
Configuration topic — a compacted topic that holds the current state of the cluster configuration.
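To see why a compacted topic can hold "the current state", consider the following sketch. It assumes records are keyed by configuration section and models compaction as keeping the most recent value per key. The section name `example.section` and the record values are illustrative, and the function is a simplified model rather than Kafka client code.

```python
def compact(records):
    """Keep only the most recent value for each key, as log compaction does."""
    latest = {}
    for key, value in records:
        latest[key] = value
    return latest


# (key, value) pairs in publication order.
log = [
    ("corda.flow", '{"session":{"messageResendWindow":500}}'),
    ("example.section", '{"enabled":true}'),
    ("corda.flow", '{"session":{"messageResendWindow":250}}'),
]
current_state = compact(log)
```

A worker that starts late and reads the topic from the beginning ends up with the same `current_state` as one that has been running throughout.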

The sections below describe this process in more detail.

HTTP requests for configuration updates

The RPC worker exposes an HTTP interface for managing cluster configuration. The endpoints for this interface are defined by the ConfigRPCOps interface, which is implemented by the ConfigRPCOpsImpl class. The HttpRpcGateway component discovers this implementation class at worker start-up and automatically serves its endpoints.

There is a single endpoint for configuration management:

/api/v1/config

Requests to this endpoint are expected to take the form of POST requests with the following body:

{
    "request": {
        "section": "configSection",
        "config": " \"key1\"=val1 \n \"key2\"=val2",
        "schemaVersion": {
            "major": 1,
            "minor": 0
        },
        "version": -1
    }
}

For example:

{
    "request": {
        "section": "corda.flow",
        "config": "session { \"messageResendWindow\" = 500 }",
        "schemaVersion": {
            "major": 1,
            "minor": 0
        },
        "version": -1
    }
}

Where:

section — the section of the configuration to be updated.
config — the updated configuration in JSON or HOCON format.
schemaVersion — the schema version of the configuration.
version — the version number used for optimistic locking. For an existing section, the request fails unless this value matches the version stored in the database for that section. For a new section, for which no configuration has yet been stored, the value must be -1.

These requests are automatically mapped to HTTPUpdateConfigRequest objects for handling by ConfigRPCOpsImpl.
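Because the config field carries JSON or HOCON text inside a JSON string, the inner quotes must be escaped. The sketch below builds the body with Python's standard json module, which handles the escaping. The helper name `build_update_request` is hypothetical, and sending the request is left out because the host and authentication details are not covered here.

```python
import json


def build_update_request(section, config, version, schema_major=1, schema_minor=0):
    """Serialise a configuration update body in the shape described above."""
    body = {
        "request": {
            "section": section,
            "config": config,
            "schemaVersion": {"major": schema_major, "minor": schema_minor},
            "version": version,
        }
    }
    return json.dumps(body)


# version=-1 because this sketch assumes the section has no stored configuration yet.
payload = build_update_request(
    "corda.flow",
    'session { "messageResendWindow" = 500 }',
    version=-1,
)
```

The resulting `payload` string is what would be sent as the POST body to /api/v1/config.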

Successful requests will receive a response with a success code (2xx) that contains the updated configuration in JSON format. For example:

{
    "section": "configSection",
    "config": " \"key1\"=val1 \n \"key2\"=val2",
    "schemaVersion": {
        "major": 1,
        "minor": 0
    },
    "version": 0
}

Unsuccessful requests receive a response with an error code (4xx or 5xx).

These responses are automatically mapped from HTTPUpdateConfigResponse objects.

Publication of configuration update requests by the RPC worker

ConfigRPCOpsImpl holds a reference to a running RPCSender<ConfigurationManagementRequest, ConfigurationManagementResponse>. For each incoming HTTP configuration update request to the RPC worker, the connection is held open and the RPC sender is used to publish a message to the config.management.request Kafka topic. This message uses the ConfigurationManagementRequest Avro schema:

{
  "type": "record",
  "name": "ConfigurationManagementRequest",
  "namespace": "net.corda.data.config",
  "fields": [
    {
      "name": "section",
      "type": "string",
      "doc": "Section of the configuration to update."
    },
    {
      "name": "config",
      "type": "string",
      "doc": "Updated configuration in JSON or HOCON format."
    },
    {
      "name": "schemaVersion",
      "type": "net.corda.data.config.ConfigurationSchemaVersion",
      "doc": "Schema version of the updated configuration."
    },
    {
      "name": "updateActor",
      "type": "string",
      "doc": "ID of RPC user that requested the configuration update."
    },
    {
      "name": "version",
      "type": "int",
      "doc": "Version of the configuration for optimistic locking."
    }
  ]
}

The RPC worker then awaits a response on the config.management.request.resp topic. This message uses the ConfigurationManagementResponse Avro schema:

{
  "type": "record",
  "name": "ConfigurationManagementResponse",
  "namespace": "net.corda.data.config",
  "fields": [
    {
      "name": "success",
      "type": "boolean",
      "doc": "Whether the request was successful."
    },
    {
      "name": "exception",
      "type": [
        "null",
        "net.corda.data.ExceptionEnvelope"
      ],
      "doc": "The cause of failure if the request was unsuccessful."
    },
    {
      "name": "section",
      "type": "string",
      "doc": "The configuration section for which an update was requested."
    },
    {
      "name": "config",
      "type": "string",
      "doc": "The current configuration in JSON format for the given section."
    },
    {
      "name": "schemaVersion",
      "type": "net.corda.data.config.ConfigurationSchemaVersion",
      "doc": "The current configuration's schema version for the given section."
    },
    {
      "name": "version",
      "type": "int",
      "doc": "The current configuration's optimistic-locking version for the given section."
    }
  ]
}

If the success field is true, the configuration update request was successful and a success HTTP response is sent to the HTTP client. Otherwise, a failure HTTP response is sent, based on the error type and error message in the exception field.

The HTTP connection is then closed.
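The translation from a ConfigurationManagementResponse to an HTTP response might look like the sketch below. It is an illustration only: the error type names, the status codes they map to, and the `errorType` and `errorMessage` keys of the exception envelope are assumptions, since the actual mapping is not described here.

```python
# Hypothetical mapping from error type to HTTP status code.
STATUS_BY_ERROR_TYPE = {
    "ExampleValidationError": 400,
    "ExampleVersionConflictError": 409,
}


def to_http_response(response):
    """Return (status_code, body) for a ConfigurationManagementResponse-like dict."""
    if response["success"]:
        body = {
            "section": response["section"],
            "config": response["config"],
            "schemaVersion": response["schemaVersion"],
            "version": response["version"],
        }
        return 200, body
    exception = response.get("exception") or {}
    # Unknown error types fall back to a generic server error.
    status = STATUS_BY_ERROR_TYPE.get(exception.get("errorType"), 500)
    return status, {"error": exception.get("errorMessage", "Unknown error")}


success = to_http_response({
    "success": True,
    "exception": None,
    "section": "corda.flow",
    "config": "{}",
    "schemaVersion": {"major": 1, "minor": 0},
    "version": 0,
})
conflict = to_http_response({
    "success": False,
    "exception": {"errorType": "ExampleVersionConflictError", "errorMessage": "stale version"},
    "section": "corda.flow",
    "config": "{}",
    "schemaVersion": {"major": 1, "minor": 0},
    "version": 3,
})
```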

Persistence of configuration updates by the DB worker

The DB worker uses two tables in the cluster database to manage configuration: config and config_audit. These tables are created using the following Liquibase scripts:

<createTable tableName="config" schemaName="${schema.name}">
    <column name="section" type="VARCHAR(255)">
        <constraints nullable="false"/>
    </column>
    <column name="config" type="TEXT">
        <constraints nullable="false"/>
    </column>
    <column name="schema_version_major" type="INT">
        <constraints nullable="false"/>
    </column>
    <column name="schema_version_minor" type="INT">
        <constraints nullable="false"/>
    </column>
    <column name="update_ts" type="DATETIME">
        <constraints nullable="false"/>
    </column>
    <column name="update_actor" type="VARCHAR(255)">
        <constraints nullable="false"/>
    </column>
    <column name="version" type="INT">
        <constraints nullable="false"/>
    </column>
</createTable>
<addPrimaryKey columnNames="section" constraintName="config_pk" tableName="config"
               schemaName="${schema.name}"/>

...

<createTable tableName="config_audit" schemaName="${schema.name}">
    <column name="change_number" type="SERIAL">
        <constraints nullable="false"/>
    </column>
    <column name="section" type="VARCHAR(255)">
        <constraints nullable="false"/>
    </column>
    <column name="config" type="TEXT">
        <constraints nullable="false"/>
    </column>
    <column name="config_version" type="INT">
        <constraints nullable="false"/>
    </column>
    <column name="update_ts" type="DATETIME">
        <constraints nullable="false"/>
    </column>
    <column name="update_actor" type="VARCHAR(255)">
        <constraints nullable="false"/>
    </column>
</createTable>
<addPrimaryKey columnNames="change_number" constraintName="config_audit_pk" tableName="config_audit"
               schemaName="${schema.name}"/>
<createSequence sequenceName="config_audit_id_seq"/>

The DB worker listens for incoming configuration-management requests using an RPCSubscription<ConfigurationManagementRequest, ConfigurationManagementResponse> that consumes ConfigurationManagementRequest messages from the config.management.request topic. These messages are handled by the ConfigWriterProcessor.

For each message, the DB worker creates a corresponding ConfigEntity and ConfigAuditEntity, and attempts to persist them to the database. The only non-technical reason the update might fail is optimistic locking. The config table contains a version column, which the database automatically increments for each successful update. If the version field of a configuration update request does not exactly match the current version in the database, the request is rejected.
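The optimistic-locking check can be sketched with an in-memory SQLite database. This is a simplified model, not the DB worker's code: it uses raw SQL instead of entities, omits the schema version columns, increments the version in application code, and the names `open_db`, `update_config`, and `VersionConflict` are hypothetical. A section with no stored row is treated as having version -1, matching the request format described earlier.

```python
import sqlite3
from datetime import datetime, timezone


class VersionConflict(Exception):
    """Raised when the request's version does not match the stored version."""


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE config (section TEXT PRIMARY KEY, config TEXT NOT NULL, "
        "update_ts TEXT NOT NULL, update_actor TEXT NOT NULL, version INTEGER NOT NULL)"
    )
    db.execute(
        "CREATE TABLE config_audit (change_number INTEGER PRIMARY KEY AUTOINCREMENT, "
        "section TEXT NOT NULL, config TEXT NOT NULL, config_version INTEGER NOT NULL, "
        "update_ts TEXT NOT NULL, update_actor TEXT NOT NULL)"
    )
    return db


def update_config(db, section, config, actor, version):
    """Apply one update; return the new version or raise VersionConflict."""
    now = datetime.now(timezone.utc).isoformat()
    with db:  # commits on success, rolls back if an exception escapes
        row = db.execute(
            "SELECT version FROM config WHERE section = ?", (section,)
        ).fetchone()
        stored = row[0] if row else -1
        if version != stored:
            raise VersionConflict(f"expected version {stored}, got {version}")
        new_version = stored + 1
        db.execute(
            "INSERT OR REPLACE INTO config "
            "(section, config, update_ts, update_actor, version) VALUES (?, ?, ?, ?, ?)",
            (section, config, now, actor, new_version),
        )
        db.execute(
            "INSERT INTO config_audit "
            "(section, config, config_version, update_ts, update_actor) VALUES (?, ?, ?, ?, ?)",
            (section, config, new_version, now, actor),
        )
    return new_version


db = open_db()
first = update_config(db, "corda.flow", '{"a": 1}', "admin", version=-1)
second = update_config(db, "corda.flow", '{"a": 2}', "admin", version=first)
```

A third call that reuses `version=first` would raise `VersionConflict`, because the stored version has moved on. This is the situation optimistic locking guards against when two clients edit the same section concurrently.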

Publication of configuration updates by the DB worker

If the database tables are updated successfully, the DB worker publishes a message to the config topic. The message's key is the section of the configuration that is updated and the message itself follows the Configuration Avro schema:

{
  "type": "record",
  "name": "Configuration",
  "namespace": "net.corda.data.config",
  "fields": [
    {
      "name": "value",
      "type": "string"
    },
    {
      "name": "version",
      "type": "string"
    },
    {
      "name": "schemaVersion",
      "type": "net.corda.data.config.ConfigurationSchemaVersion",
      "doc": "Schema version for this configuration."
    }
  ]
}

Other workers can consume the message off the topic via the ConfigurationReadService component to learn the current state of the cluster configuration.

If the persistence to the database and the publication to the config topic succeed, the DB worker responds to the RPC worker by publishing a ConfigurationManagementResponse message to the config.management.request.resp topic with the success field set to true. Otherwise, it publishes a message with the success field set to false, with the exception field documenting the cause of the failure.
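The success and failure paths described above can be summarised in a short sketch. It is a simplified model in Python. The `persist` and `publish` callables are hypothetical stand-ins for the database write and the Kafka publication, defaults are not applied, the `errorType` and `errorMessage` keys are assumed, and on failure the response simply echoes the request's values rather than looking up the current stored configuration.

```python
def _response(success, record, exception):
    return {
        "success": success,
        "exception": exception,
        "section": record["section"],
        "config": record["config"],
        "schemaVersion": record["schemaVersion"],
        "version": record["version"],
    }


def handle_request(request, persist, publish):
    """Persist, then publish, then build a ConfigurationManagementResponse-like dict."""
    try:
        stored = persist(request)
        # The Configuration schema declares version as a string.
        publish(stored["section"], {
            "value": stored["config"],
            "version": str(stored["version"]),
            "schemaVersion": stored["schemaVersion"],
        })
    except Exception as error:
        envelope = {"errorType": type(error).__name__, "errorMessage": str(error)}
        return _response(False, request, envelope)
    return _response(True, stored, None)


published = []
request = {
    "section": "corda.flow",
    "config": "{}",
    "schemaVersion": {"major": 1, "minor": 0},
    "updateActor": "admin",
    "version": -1,
}


def fake_persist(req):
    return {**req, "version": req["version"] + 1}


def failing_persist(req):
    raise RuntimeError("version mismatch")


def record_publish(key, value):
    published.append((key, value))


ok = handle_request(request, fake_persist, record_publish)
failed = handle_request(request, failing_persist, record_publish)
```

Nothing is published when persistence fails, so the configuration topic only ever reflects updates that reached the database.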
