TKG management cluster creation fails due to missing compatibility image #865

@zcahana

Bug description

When attempting to create a TKG (2.4.1) management cluster via the tanzu CLI, the command fails with the following error:

$ tanzu management-cluster create k8s-tkg-c9c5-management \
    --file ~/tkg/management-cluster.yaml \
    --use-existing-bootstrap-cluster kind-k8s-tkg-c9c5-bootstrap-create \
    --timeout 2h --verbose 9
Downloading TKG compatibility file from 'projects.registry.vmware.com/tkg/tkg-compatibility'
Error: unable to ensure prerequisites: unable to ensure tkg BOM file: failed to download TKG compatibility file from the registry: failed to list TKG compatibility image tags: GET https://projects.registry.vmware.com/v2/tkg/tkg-compatibility/tags/list?n=1000: NAME_UNKNOWN: Repository name not known to registry.; map[name:tkg/tkg-compatibility]

Manually installing the management-cluster plugin also fails, at the plugin initialization phase, with a similar message:

$ tanzu plugin install management-cluster --target kubernetes --version v0.31.1
[i] Installing plugin 'management-cluster:v0.31.1' with target 'kubernetes'
[i] Plugin binary for 'management-cluster:v0.31.1' found in cache
[!] Warning: Failed to initialize plugin '"management-cluster"' after installation. Downloading TKG compatibility file from 'projects.registry.vmware.com/tkg/tkg-compatibility'
Error: unable to ensure prerequisites: unable to ensure tkg BOM file: failed to download TKG compatibility file from the registry: failed to list TKG compatibility image tags: GET https://projects.registry.vmware.com/v2/tkg/tkg-compatibility/tags/list?n=1000: NAME_UNKNOWN: Repository name not known to registry.; map[name:tkg/tkg-compatibility]
[ok] successfully installed 'management-cluster' plugin
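
Both outputs fail on the same registry request, so it may help to check that request outside the tanzu CLI. The following is a minimal sketch, not part of the original report: `tags_url` and `is_name_unknown` are hypothetical helper names, the URL layout is copied from the error message above, and the JSON error shape (`"code": "NAME_UNKNOWN"`) is assumed to follow the standard container registry v2 error format.

```shell
# Build the tags-list URL that appears in the error message above.
# $1 = registry host, $2 = repository path (both taken from the log).
tags_url() {
  printf 'https://%s/v2/%s/tags/list?n=1000\n' "$1" "$2"
}

# Return 0 if a registry response body carries the NAME_UNKNOWN error code.
# Assumes the registry answers with a JSON body of the form
# {"errors":[{"code":"NAME_UNKNOWN", ...}]}.
is_name_unknown() {
  printf '%s' "$1" | grep -q '"code"[[:space:]]*:[[:space:]]*"NAME_UNKNOWN"'
}

# Usage (requires network access and curl):
#   body=$(curl -s "$(tags_url projects.registry.vmware.com tkg/tkg-compatibility)")
#   if is_name_unknown "$body"; then echo "repository not found in registry"; fi
```

If the check reports the repository as missing, the failure is on the registry side rather than in the local cluster configuration.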

Expected behavior

The tanzu management-cluster create command is expected to finish successfully and set up a ready-to-use TKG management cluster.

Steps to reproduce the bug / Relevant debug output

  1. Install dependencies (docker, kind, tanzu CLI)
  2. Create bootstrap cluster via kind:
kind create cluster --name k8s-tkg-c9c5-bootstrap-create && \
kind export kubeconfig --name k8s-tkg-c9c5-bootstrap-create
  3. Set up the required network infrastructure (VPC, subnets, ...)
  4. Create the management cluster config file:
#! ---------------------------------------------------------------------
#! Basic cluster creation configuration
#! ---------------------------------------------------------------------
CLUSTER_NAME: k8s-tkg-c9c5-management
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: aws
ENABLE_AUDIT_LOGGING: false
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13
ENABLE_CEIP_PARTICIPATION: "false"


#! ---------------------------------------------------------------------
#! Node configuration
#! AWS-only MACHINE_TYPE settings override cloud-agnostic SIZE settings.
#! ---------------------------------------------------------------------
CONTROL_PLANE_MACHINE_TYPE: c5.xlarge
NODE_MACHINE_TYPE: c5.xlarge


#! ---------------------------------------------------------------------
#! AWS configuration
#! ---------------------------------------------------------------------
AWS_REGION: us-east-1
AWS_NODE_AZ: us-east-1a
AWS_SSH_KEY_NAME: k8s-tkg-c9c5-ssh-key
AWS_VPC_ID: vpc-027b500695410d42a
AWS_PRIVATE_SUBNET_ID: subnet-0ff40c0fc677cadad
AWS_PUBLIC_SUBNET_ID: subnet-0b182561fd01c9fce
BASTION_HOST_ENABLED: "true"


#! ---------------------------------------------------------------------
#! Identity management configuration
#! ---------------------------------------------------------------------
IDENTITY_MANAGEMENT_TYPE: none


TKG_HTTP_PROXY_ENABLED: "false"


#! ---------------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------------
ENABLE_MHC: "false"
ENABLE_MHC_CONTROL_PLANE: true
ENABLE_MHC_WORKER_NODE: true
MHC_UNKNOWN_STATUS_TIMEOUT: 30m
MHC_FALSE_STATUS_TIMEOUT: 30m
  5. Execute the command:
tanzu management-cluster create k8s-tkg-c9c5-management \
  --file ~/tkg/management-cluster.yaml \
  --use-existing-bootstrap-cluster kind-k8s-tkg-c9c5-bootstrap-create \
  --timeout 2h --verbose 9
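
Before running the steps above, it can be useful to confirm that the tools from step 1 are actually on the PATH. This is a small sketch under that assumption only; `missing_deps` is a hypothetical helper name and is not part of the tanzu CLI or kind.

```shell
# Print the name of every command that is not found on PATH.
# Returns non-zero if at least one command is missing.
missing_deps() {
  rc=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   missing_deps docker kind tanzu || echo "install the tools listed above first"
```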

Output of tanzu version

I reproduced this with both tanzu v1.0.0 and v1.5.3:

$ tanzu version
version: v1.0.0
buildDate: 2023-08-08
sha: 006d0429
$ tanzu version
version: v1.5.3
buildDate: 2025-01-29
sha: f73b9ec
arch: amd64

Environment where the bug was observed (cloud, OS, etc)

AWS EC2 instance with ubuntu-server 24.04 (bootstrap machine running tanzu CLI).
