The Operator injected with proxy failed to start. #32

Closed · qiuming520 opened this issue Feb 5, 2024 · 12 comments

@qiuming520

General Question

My Operator: mysql-operator
k8s version: v1.21.5
controller-mesh: v0.1.2
shardingconfig-root:

apiVersion: ctrlmesh.kusionstack.io/v1alpha1
kind: ShardingConfig
metadata:
  name: sharding-root
  namespace: test-mysql
spec:
  root:
    prefix: mysql-operator
    targetStatefulSet: mysql-operator
    canary:
      replicas: 1
      inNamespaces:
      - demo-0a
      - demo-0b
      - demo-0c
    auto:
      everyShardReplicas: 2
      shardingSize: 2
    resourceSelector:
    - relateResources:
      - apiGroups:
        - '*'
        resources:
        - '*'
  controller:
    leaderElectionName: mysql-operator-leader-election

My Operator has two containers (operator and orchestrator). Without patching in the labels required by ctrlmesh (ctrlmesh.kusionstack.io/enable-proxy: "true", ctrlmesh.kusionstack.io/watching: "true"), it starts successfully. After patching the labels in, ctrlmesh-proxy and the operator container start normally, but the orchestrator container fails to start with the following error:

Error starting command: `--kubeconfig=/etc/kubernetes/kubeconfig/fake-kubeconfig.yaml` - fork/exec --kubeconfig=/etc/kubernetes/kubeconfig/fake-kubeconfig.yaml: no such file or directory

Checking the pod with kubectl get pod/mysql-operator-0 -oyaml shows that /etc/kubernetes/kubeconfig/fake-kubeconfig.yaml has been mounted.
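
For reference, a patch adding those labels to the pod template would look roughly like this (a sketch; it assumes the ctrlmesh webhook picks the labels up from the StatefulSet pod template):

# assumption: the proxy-injection labels belong on spec.template of the target StatefulSet
kubectl -n test-mysql patch statefulset mysql-operator --type strategic -p \
  '{"spec":{"template":{"metadata":{"labels":{
    "ctrlmesh.kusionstack.io/enable-proxy":"true",
    "ctrlmesh.kusionstack.io/watching":"true"}}}}}'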

Who can help me?

@qiuming520 (Author)

/assign @wu8685

@Eikykun (Member) commented Feb 5, 2024

Thx for your report, @qiuming520

Please check:

  • Whether the configmap named fake-kubeconfig has been automatically created in namespace test-mysql;
  • Whether the file /etc/kubernetes/kubeconfig/fake-kubeconfig.yaml exists in the container directory.

If neither of those turns out to be the problem, I may need more information, such as the complete pod YAML, pod events, etc.
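
For example, something along these lines (pod, container, and namespace names taken from this issue):

# 1. Was the fake-kubeconfig ConfigMap created in the operator's namespace?
kubectl -n test-mysql get configmap fake-kubeconfig -o yaml

# 2. Is the file visible inside the running operator container?
kubectl -n test-mysql exec mysql-operator-0 -c operator -- \
  ls -l /etc/kubernetes/kubeconfig/fake-kubeconfig.yaml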

@qiuming520 (Author)

@Eikykun

  • The ConfigMap named fake-kubeconfig already exists.
    [screenshot]

  • My Operator has two containers (operator and orchestrator). In the operator container, which starts successfully, I can see the file /etc/kubernetes/kubeconfig/fake-kubeconfig.yaml. The orchestrator container fails to start, so I cannot exec into it to check its files. I also tried manually mounting /etc/kubernetes/kubeconfig/fake-kubeconfig.yaml into my Operator without injecting ctrlmesh-proxy: it starts successfully and both containers have the file. After injecting ctrlmesh-proxy, the orchestrator fails to start.

@qiuming520 (Author) commented Feb 5, 2024

This is my Operator's pod YAML (attached): mysql-operator.txt

@wu8685 (Contributor) commented Feb 8, 2024

Can you provide the last log of the orchestrator container when it fails to start with ctrlmesh-proxy injected? You can get it with kubectl logs pod/<your-operator-pod-name> -c orchestrator -p.

@qiuming520 (Author)

This is the log:
[screenshot]

@Eikykun (Member) commented Feb 18, 2024

The orchestrator's manager process might not have permission to read /etc/kubernetes/kubeconfig/fake-kubeconfig.yaml. You can try removing the securityContext from the pod and starting the manager process inside the orchestrator container with root privileges to troubleshoot this issue. @qiuming520
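
For example, one way to run the orchestrator container as root for this test would be a patch along these lines (a sketch; runAsUser: 0 is just one option, and you may need to adjust names and namespace to your setup):

# run the orchestrator container as root (uid 0) for troubleshooting only
kubectl -n test-mysql patch statefulset mysql-operator --type strategic -p \
  '{"spec":{"template":{"spec":{"containers":[
     {"name":"orchestrator","securityContext":{"runAsUser":0,"runAsGroup":0}}]}}}}'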

@qiuming520 (Author)

> You can try removing the securityContext from the pod and starting the manager process inside the orchestrator container with root privileges to troubleshoot this issue

I have added a securityContext to both of my containers (operator and orchestrator):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql-operator
  namespace: test-mysql-1
spec:
  template:
  ... ...
    spec:
      containers:
      - name: operator
        securityContext:
          privileged: true
  ... ...
      - name: orchestrator
        securityContext:
          privileged: true
  ... ...
      securityContext: {}
  ... ...

As you can see in the screenshot below, the file (/etc/kubernetes/kubeconfig/fake-kubeconfig.yaml) exists in the operator container and is accessible with root privileges.
[screenshot]

But the issue still exists~ @Eikykun

@Eikykun (Member) commented Feb 18, 2024

I tried to reproduce this issue using two sample operator containers, but was not successful. Can you provide a scenario that reproduces this issue in kind? @qiuming520

@qiuming520 (Author)

> I tried to reproduce this issue using two sample operator containers, but was not successful. Can you provide a scenario that reproduces this issue in kind?

Of course! My Operator is on GitHub: https://github.com/bitpoke/mysql-operator

My scenario: some pods use the canary namespaces, and some pods use the normally released namespaces.
shardingconfig-root:

apiVersion: ctrlmesh.kusionstack.io/v1alpha1
kind: ShardingConfig
metadata:
  name: sharding-root
  namespace: test-mysql-1
spec:
  root:
    prefix: mysql-operator
    targetStatefulSet: mysql-operator
    canary:
      replicas: 1
      inNamespaces:
      - mysql-01
      - mysql-02
      - mysql-03
    auto:
      everyShardReplicas: 1
      shardingSize: 1
    resourceSelector:
    - relateResources:
      - apiGroups:
        - '*'
        resources:
        - '*'
  controller:
    leaderElectionName: mysql-operator-leader-election

@Eikykun (Member) commented Feb 18, 2024

I briefly looked at the orchestrator container, and it seems it may not be a standard operator implemented with controller-runtime, since it does not use the --kubeconfig argument. We may need to understand how the orchestrator container actually operates here. If a container is not an operator, supporting it may require additional work in a new version.
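
One way to see how the injected argument landed on each container is to compare their commands and args in the pod spec, for example (pod and namespace names from this thread):

# print each container's command/args as set on the pod
kubectl -n test-mysql-1 get pod mysql-operator-0 -o \
  jsonpath='{range .spec.containers[*]}{.name}{": command="}{.command}{" args="}{.args}{"\n"}{end}'

If a container relies entirely on its image entrypoint and does not accept that flag, the injected --kubeconfig=... argument can end up being executed as if it were the command itself, which would match the fork/exec error above.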

@qiuming520 (Author)

Thx for your reply, it helped me clear up my confusion~ @Eikykun
