Datahub connection to Opensearch #16217

@malishariat20

I am deploying DataHub in Kubernetes.
I created an IAM user with a policy that allows access to Amazon OpenSearch Service in the same AWS account as DataHub.
I also configured the access policy in OpenSearch to allow that IAM user.
When I test the IAM user credentials manually (for example using kubectl exec into the pod or using AWS CLI), the IAM user can successfully access OpenSearch.
awscurl --service es --region ap-southeast-2 -k "https://search-datahub-opensearch-5fgezbrrqkpem4kgtxtk3grltu.ap-southeast-2.es.amazonaws.com/_cluster/health"

result:

{"cluster_name":"855460960717:datahub-opensearch","status":"green","timed_out":false,"number_of_nodes":1,"number_of_data_nodes":1,"discovered_master":true,"discovered_cluster_manager":true,"active_primary_shards":12,"active_shards":12,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":0,"delayed_unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":100.0}
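For context, the awscurl check above succeeds because awscurl signs the request with AWS Signature Version 4 for the `es` service. The following is a minimal standard-library sketch of that signing step for a GET with an empty body. The function name `sigv4_headers` and the credential values in the usage note are placeholders and not part of DataHub or awscurl. The sketch assumes a simple path with no query string and a UTC timestamp.

```python
import datetime
import hashlib
import hmac
from typing import Dict, Optional
from urllib.parse import urlparse


def _sign(key: bytes, msg: str) -> bytes:
    """One HMAC-SHA256 step of the SigV4 key derivation."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sigv4_headers(
    url: str,
    region: str,
    access_key: str,
    secret_key: str,
    now: datetime.datetime,
    service: str = "es",
    session_token: Optional[str] = None,
) -> Dict[str, str]:
    """Return the headers to attach to a SigV4-signed GET with an empty body.

    Simplification: the URL must have a plain path and no query string,
    and `now` must be a UTC datetime.
    """
    parsed = urlparse(url)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # Headers covered by the signature, in sorted lowercase order.
    to_sign = {"host": parsed.netloc, "x-amz-date": amz_date}
    if session_token:
        to_sign["x-amz-security-token"] = session_token
    names = sorted(to_sign)
    canonical_headers = "".join(f"{n}:{to_sign[n]}\n" for n in names)
    signed_headers = ";".join(names)

    payload_hash = hashlib.sha256(b"").hexdigest()
    canonical_request = "\n".join(
        ["GET", parsed.path or "/", "", canonical_headers, signed_headers, payload_hash]
    )

    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join(
        [
            "AWS4-HMAC-SHA256",
            amz_date,
            scope,
            hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
        ]
    )

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _sign(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, service, "aws4_request"):
        key = _sign(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    # The HTTP client sets Host itself, so only the extra headers are returned.
    headers = {k: v for k, v in to_sign.items() if k != "host"}
    headers["Authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers
```

Passing the returned headers to an HTTP client (for example `urllib.request.Request(url, headers=...)`) with the IAM user's real keys should reproduce the signed call. A request sent without the `Authorization` header is anonymous from the domain's point of view.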

However, when the DataHub pod runs, it returns an error when trying to connect to OpenSearch.
2026/02/12 23:55:06 Waiting for https://search-datahub-opensearch-5fgezbrrqkpem4kgtxtk3grltu.ap-southeast-2.es.amazonaws.com:443: unexpected HTTP status code: 403.

I have already set insecure: "true" to skip SSL verification, but I still get the same error.

Here is how OpenSearch is configured in datahub.values.yaml:

datahub:
  global:
    # Production endpoints
    elasticsearch:
      host: "***"
      # insecure=true required due to corporate SSL inspection which replaces AWS OpenSearch certificates
      insecure: "true"
      port: "443"
      useSSL: "true"
      region: "ap-southeast-2"
      engineType: "opensearch"
      iam:
        enabled: true

Here are the specs for the Elasticsearch setup job:

  elasticsearchSetupJob:
    extraEnvs:
      - name: USE_AWS_ELASTICSEARCH
        value: "true"
      - name: OPENSEARCH_USE_AWS_IAM_AUTH
        value: "true"

  datahubUpgrade:
    extraEnvs:
      - name: USE_AWS_ELASTICSEARCH
        value: "true"
      - name: OPENSEARCH_USE_AWS_IAM_AUTH
        value: "true"
    extraVolumes:
      - name: datahub-tls-keystore
        secret:
          secretName: kafka-datahub
    extraVolumeMounts:
      - name: datahub-tls-keystore
        mountPath: /tls/keystore
        readOnly: true

I have set OPENSEARCH_USE_AWS_IAM_AUTH to true so that the IAM user and its credentials are used to connect to OpenSearch.

IAM policy

access_policies = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect    = "Allow"
        Principal = "*"
        Action    = "es:*"
        Resource  = "arn:aws:es:${var.aws_region}:${data.aws_caller_identity.datahub.account_id}:domain/datahub-opensearch/*"
      }
    ]
  })

The OpenSearch domain has the following domain-level access policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::855460960717:user/datahub-opensearch"
      },
      "Action": "es:ESHttp*",
      "Resource": "arn:aws:es:ap-southeast-2:855460960717:domain/datahub-opensearch/*"
    }
  ]
}
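To make the effect of this access policy concrete, here is a deliberately simplified sketch of how its single Allow statement would match a caller. It is not AWS's actual policy evaluation (it ignores Deny statements, conditions, identity-based policies and case-insensitive action matching), and the helper names and the `role/some-other-role` ARN are hypothetical. Under those assumptions, it suggests that only requests signed as the listed IAM user would be allowed, while an unsigned request or one signed by a different identity would not match.

```python
from fnmatch import fnmatchcase
from typing import Optional

# The domain-level access policy quoted above.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::855460960717:user/datahub-opensearch"},
            "Action": "es:ESHttp*",
            "Resource": "arn:aws:es:ap-southeast-2:855460960717:domain/datahub-opensearch/*",
        }
    ],
}


def _as_list(value):
    """Policy fields may be a single value or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def principal_matches(principal, caller_arn: Optional[str]) -> bool:
    """caller_arn is None for an unsigned (anonymous) request."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        allowed = _as_list(principal.get("AWS", []))
        return any(p == "*" or p == caller_arn for p in allowed)
    return False


def policy_allows(policy, caller_arn: Optional[str], action: str, resource: str) -> bool:
    """True if any Allow statement matches the caller, action and resource."""
    for stmt in _as_list(policy["Statement"]):
        if (
            stmt.get("Effect") == "Allow"
            and principal_matches(stmt.get("Principal"), caller_arn)
            and any(fnmatchcase(action, pat) for pat in _as_list(stmt.get("Action", [])))
            and any(fnmatchcase(resource, pat) for pat in _as_list(stmt.get("Resource", [])))
        ):
            return True
    return False


HEALTH = "arn:aws:es:ap-southeast-2:855460960717:domain/datahub-opensearch/_cluster/health"
USER = "arn:aws:iam::855460960717:user/datahub-opensearch"
OTHER = "arn:aws:iam::855460960717:role/some-other-role"  # hypothetical identity

print(policy_allows(POLICY, USER, "es:ESHttpGet", HEALTH))   # signed as the IAM user
print(policy_allows(POLICY, None, "es:ESHttpGet", HEALTH))   # unsigned request
print(policy_allows(POLICY, OTHER, "es:ESHttpGet", HEALTH))  # signed as another identity
```

If the component that logs the 403 either does not sign its request or signs with an identity other than the IAM user, a mismatch like the second or third case would be one possible explanation worth ruling out.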

Please help me to resolve the issue.

Thanks
