
Does Enclaver support mTLS / self-signed certificates? #184

@mderriey

Hi, it's me again 👋

I'm spiking running a production-grade Vault cluster in Enclaver.

I'm having issues joining a second node to the cluster. It fails at the very last step, where the existing leader node needs to communicate with the new-joining node over mTLS.
The client certificate is self-signed and generated by Vault, as described in this excerpt from the official documentation:

[...]
For the request forwarding method, the servers need direct communication with each other. In order to perform this securely, the active node also advertises, via the encrypted data store entry, a newly-generated private key (ECDSA-P521) and a newly-generated self-signed certificate designated for client and server authentication. Each standby uses the private key and certificate to open a mutually-authenticated TLS 1.2 connection to the active node via the advertised cluster address. When client requests come in, the requests are serialized, sent over this TLS-protected communication channel, and acted upon by the active node. The active node then returns a response to the standby, which sends the response back to the requesting client.

Unfortunately, this communication fails with the following error message from Vault:

{
  "@level": "error",
  "@message": "failed to heartbeat to",
  "@module": "storage.raft",
  "@timestamp": "2023-12-01T09:15:23.527220Z",
  "backoff time": 2500000000,
  "error": "dial tcp 10.1.54.175:8201: connect: network is unreachable",
  "peer": "10.1.54.175:8201"
}
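
Note that `connect: network is unreachable` is reported by the TCP connect call itself, so it occurs before any TLS handshake starts. A probe like the one below could be run from inside the enclave to check the plain TCP path to port 8201 separately from mTLS. This is only a minimal sketch: `probe_tcp` is a hypothetical helper name, not part of Vault or Enclaver, and it assumes a Python interpreter is available in the image.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a plain TCP connect (no TLS) and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # ENETUNREACH corresponds to "network is unreachable" in the Vault log
        if exc.errno == errno.ENETUNREACH:
            return "network unreachable"
        if exc.errno == errno.ECONNREFUSED:
            return "connection refused"
        return f"error: {exc}"


if __name__ == "__main__":
    # Peer address taken from the log entry above
    print(probe_tcp("10.1.54.175", 8201))
```

If this also reports "network unreachable", the failure is at the network layer rather than in certificate validation.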

Things I've confirmed:

  • The IP address is correct.

  • The nodes can communicate over HTTP on port 8200, since prior to that last step, the new-joining node makes an HTTP call to the existing leader node to submit its desire to join the cluster.

  • The Enclaver manifest file allows both ingress on port 8201 for the existing leader and egress to the VPC CIDR for the new-joining node:

    # https://edgebit.io/enclaver/docs/0.x/manifest/
    version: v1
    name: "enclaver-vault"
    
    sources:
      # Name and tag of the Docker container that contains the application code
      app: "$SOURCE_DOCKER_IMAGE_NAME"
    
    # Name and tag of the Docker container outputted from the build process
    target: "$TARGET_DOCKER_IMAGE_NAME"
    
    ingress:
      # Vault listens on both 8200 (API) and 8201 (node-to-node communication)
      - listen_port: 8200
      - listen_port: 8201
    
    egress:
      allow:
        # IMDS
        - 169.254.169.254
        # EC2 APIs for auto-join discovery
        - ec2.*.amazonaws.com
        # VPC CIDR
        - 10.1.0.0/16
        # EC2 host (I don't think we need this one)
        - host
    
    kms_proxy:
      listen_port: 9999
    
    defaults:
      memory_mb: 2000

  • I tried the same setup by running the "bare" source Docker images, and the node-to-node communication works fine, i.e. the second node completed joining the cluster.
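
As a sanity check on the manifest's egress allow list, the peer address from the error log can be tested against the VPC CIDR entry. This is a minimal sketch of the address arithmetic only; `covered_by` is a hypothetical helper, and it says nothing about how Enclaver itself evaluates the `egress.allow` rules.

```python
import ipaddress


def covered_by(address: str, cidr: str) -> bool:
    """Return True if the IP address falls inside the given CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    # Peer IP from the Vault log and VPC CIDR from the manifest
    print(covered_by("10.1.54.175", "10.1.0.0/16"))  # True
```

So the address itself is within the allowed range, which suggests the CIDR value in the manifest is not the problem.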

Do you know if there's something in Enclaver that would prevent this from happening, or if maybe there's a way to make this work?

Thanks, please let me know if you need additional information.
