Crashes on self-hosted with panic: runtime error: integer divide by zero #182

@orhun

Description

My setup is the following:

  • Anteon Self Hosted running via Docker compose
  • Kubernetes running via microk8s
  • Alaz installed via kubectl, fails with the following error:
```
$ kubectl logs -n anteon alaz-daemonset-sskqx

{"level":"info","tag":"v0.11.3","time":1723187890,"message":"alaz tag"}
{"level":"info","time":1723187890,"message":"k8sCollector initializing..."}
{"level":"info","time":1723187890,"message":"Connected successfully to CRI using endpoint unix:///proc/1/root/run/containerd/containerd.sock"}
panic: runtime error: integer divide by zero

goroutine 47 [running]:
github.com/ddosify/alaz/aggregator.(*ClusterInfo).handleSocketMapCreation(0xc0002dc5b0)
	/app/aggregator/cluster.go:89 +0x33d
created by github.com/ddosify/alaz/aggregator.newClusterInfo in goroutine 1
	/app/aggregator/cluster.go:59 +0x1a9
```
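For context, the panic above is Go's standard integer divide (or modulo) by zero runtime error. I don't have the `cluster.go:89` source in front of me, so the exact divisor is an assumption, but a common pattern in this kind of code is sharding work across a count that can unexpectedly be zero. A minimal sketch of the failure class and the guard that prevents it (the `shardFor` function and its parameters are hypothetical, not Alaz's actual API):

```go
package main

import (
	"fmt"
	"runtime"
)

// shardFor distributes an item (e.g. a socket keyed by pid) across a
// number of workers. If workers is 0 -- say, a miscomputed CPU or
// channel count -- the expression pid % workers panics at runtime with
// "integer divide by zero", exactly like the stack trace above.
// Guarding the divisor turns the panic into a recoverable error.
func shardFor(pid, workers int) (int, error) {
	if workers <= 0 {
		return 0, fmt.Errorf("invalid worker count: %d", workers)
	}
	return pid % workers, nil
}

func main() {
	// Normal case: spread across the available CPUs.
	fmt.Println(shardFor(1234, runtime.NumCPU()))

	// Degenerate case: returns an error instead of panicking.
	fmt.Println(shardFor(1234, 0))
}
```

Whether Alaz's divisor comes from a CPU count, a map size, or something else entirely would need the actual source at `aggregator/cluster.go:89`; the point is only that the divisor reaching that line is zero on this setup.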
```
$ kubectl describe pod -n anteon alaz-daemonset-sskqx

Name:             alaz-daemonset-sskqx
Namespace:        anteon
Priority:         0
Service Account:  alaz-serviceaccount
Node:             thinkpad/192.168.1.38
Start Time:       Fri, 09 Aug 2024 10:01:44 +0300
Labels:           app=alaz
                  controller-revision-hash=6f9d87bfc4
                  pod-template-generation=1
Annotations:      cni.projectcalico.org/containerID: 003a6554ea84ff581daee5b353ccf9b6619a8febdb6302ce34a566764f0e45f3
                  cni.projectcalico.org/podIP: 10.1.19.183/32
                  cni.projectcalico.org/podIPs: 10.1.19.183/32
Status:           Running
IP:               10.1.19.183
IPs:
  IP:           10.1.19.183
Controlled By:  DaemonSet/alaz-daemonset
Containers:
  alaz-pod:
    Container ID:  containerd://c6c904add2264b0016798d11550f2ff05e683fe713c681c3f3a415e31de9f07c
    Image:         ddosify/alaz:v0.11.3
    Image ID:      docker.io/ddosify/alaz@sha256:08dbbb8ba337ce340a8ba8800e710ff5a2df9612ea258cdc472867ea0bb97224
    Port:          8181/TCP
    Host Port:     0/TCP
    Args:
      --no-collector.wifi
      --no-collector.hwmon
      --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
      --collector.netclass.ignored-devices=^(veth.*)$
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 09 Aug 2024 10:18:10 +0300
      Finished:     Fri, 09 Aug 2024 10:18:11 +0300
    Ready:          False
    Restart Count:  8
    Limits:
      memory:  1Gi
    Requests:
      cpu:     1
      memory:  400Mi
    Environment:
      TRACING_ENABLED:             true
      METRICS_ENABLED:             true
      LOGS_ENABLED:                false
      BACKEND_HOST:                http://bore.pub:39548/api-alaz
      LOG_LEVEL:                   1
      MONITORING_ID:               7c6a484a-ec47-46a6-946d-4071ff6cf883
      SEND_ALIVE_TCP_CONNECTIONS:  false
      NODE_NAME:                    (v1:spec.nodeName)
    Mounts:
      /sys/kernel/debug from debugfs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-df6xh (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       False
  ContainersReady             False
  PodScheduled                True
Volumes:
  debugfs:
    Type:          HostPath (bare host directory volume)
    Path:          /sys/kernel/debug
    HostPathType:
  kube-api-access-df6xh:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Warning  BackOff  3m54s (x68 over 18m)  kubelet  Back-off restarting failed container alaz-pod in pod alaz-daemonset-sskqx_anteon(a3d74951-574e-4149-8db3-9749a627f5fd)
```
alaz.yaml:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alaz-serviceaccount
  namespace: anteon
---
# For alaz to keep track of changes in cluster
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alaz-role
  namespace: anteon
rules:
- apiGroups:
  - "*"
  resources:
  - pods
  - services
  - endpoints
  - replicasets
  - deployments
  - daemonsets
  - statefulsets
  verbs:
  - "get"
  - "list"
  - "watch"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alaz-role-binding
  namespace: anteon
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alaz-role
subjects:
- kind: ServiceAccount
  name: alaz-serviceaccount
  namespace: anteon
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: alaz-daemonset
  namespace: anteon
spec:
  selector:
    matchLabels:
      app: alaz
  template:
    metadata:
      labels:
        app: alaz
    spec:
      hostPID: true
      containers:
      - env:
        - name: TRACING_ENABLED
          value: "true"
        - name: METRICS_ENABLED
          value: "true"
        - name: LOGS_ENABLED
          value: "false"
        - name: BACKEND_HOST
          value: http://bore.pub:39548/api-alaz
        - name: LOG_LEVEL
          value: "1"
        # - name: EXCLUDE_NAMESPACES
        #   value: "^anteon.*"
        - name: MONITORING_ID
          value: 7c6a484a-ec47-46a6-946d-4071ff6cf883
        - name: SEND_ALIVE_TCP_CONNECTIONS  # Send undetected protocol connections (unknown connections)
          value: "false"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        args:
        - --no-collector.wifi
        - --no-collector.hwmon
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/pods/.+)($|/)
        - --collector.netclass.ignored-devices=^(veth.*)$
        image: ddosify/alaz:v0.11.3
        imagePullPolicy: IfNotPresent
        name: alaz-pod
        ports:
        - containerPort: 8181
          protocol: TCP
        resources:
          limits:
            memory: 1Gi
          requests:
            cpu: "1"
            memory: 400Mi
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        # needed for linking ebpf trace programs
        volumeMounts:
        - mountPath: /sys/kernel/debug
          name: debugfs
          readOnly: false
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: alaz-serviceaccount
      serviceAccountName: alaz-serviceaccount
      terminationGracePeriodSeconds: 30
      # needed for linking ebpf trace programs
      volumes:
      - name: debugfs
        hostPath:
          path: /sys/kernel/debug
```

The only thing I did differently from the documentation was using bore.pub instead of ngrok, which I don't think should be a problem.

I'm running Arch Linux with kernel 6.10.1-arch1-1.
