[BC behavior change][ABI change] Add container aware cpu num #6075

Open
dixyes wants to merge 1 commit into swoole:master from dixyes:container-awareness

dixyes commented Jun 5, 2026

Member

Background

Swoole derives SW_CPU_NUM from the real CPU count and uses it as the default value for both reactor_num and worker_num. Some large nodes have 256 or more CPU cores, so a default server in MODE_PROCESS spawns 256 threads * 256 processes, which adds massive overhead from context switching.

In containerized environments such as Docker or Kubernetes, a container's CPU usage is usually capped at a fixed value. Swoole nevertheless sizes itself for every host core, based on the hardware count, which also causes significant unnecessary overhead.

Implementation

The logic is split into two functions. available_cpu_num returns the number of CPUs available to the process (semantically close to the physical CPU count). container_aware_cpu_num returns the number of CPUs available inside a containerized environment.

available_cpu_num is now used for SW_CPU_NUM, while container_aware_cpu_num determines the default values of reactor_num and worker_num.

available_cpu_num

On Windows it uses GetProcessAffinityMask, on Linux sched_getaffinity, on FreeBSD cpuset_getaffinity, and on NetBSD _sched_getaffinity. Each call retrieves the cpuset mask, from which the number of available CPUs is counted.

Other operating systems fall back to sysconf(_SC_NPROCESSORS_ONLN).
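The Linux path and the generic fallback described above might look roughly like the following. This is a minimal sketch, not the code in this PR: the function name available_cpu_num_sketch is hypothetical, only the Linux and sysconf branches are shown, and clamping the result to at least 1 is an assumption.

```cpp
#include <sched.h>   // sched_getaffinity, cpu_set_t, CPU_COUNT (Linux, glibc)
#include <unistd.h>  // sysconf

// Hypothetical helper for illustration only; not the PR's implementation.
// Counts the CPUs in the calling process's affinity mask on Linux and
// falls back to the number of online CPUs everywhere else.
int available_cpu_num_sketch() {
#if defined(__linux__)
    cpu_set_t set;
    CPU_ZERO(&set);
    // pid 0 means "the calling thread". A fixed-size cpu_set_t only covers
    // CPU_SETSIZE CPUs; if the call fails we simply use the fallback below.
    if (sched_getaffinity(0, sizeof(set), &set) == 0) {
        int n = CPU_COUNT(&set);
        if (n > 0) {
            return n;
        }
    }
#endif
    long online = sysconf(_SC_NPROCESSORS_ONLN);
    // Assumption: never report fewer than one CPU.
    return online > 0 ? static_cast<int>(online) : 1;
}
```

Running a program that calls this under `taskset -c 0,1` should report 2 regardless of how many cores the host has, which is the difference between the affinity mask and a plain online-CPU count.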

container_aware_cpu_num

On Linux:

  • Uses procfs to read the mount information and locate the cgroup / cgroup2 mount points.
  • Walks the cgroup hierarchy to identify and confirm the cgroup the process belongs to.
  • Computes the available CPU count from the cgroup's cpu.max (v2) or cpu.cfs_quota_us and cpu.cfs_period_us (v1), then rounds it to an integer.
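The parsing and arithmetic behind the Linux steps can be sketched as below. All function names are hypothetical and the code works on strings rather than reading the real files, so it is not the PR's implementation. Two assumptions are made that the description does not state: the quota is rounded up, and 0 is returned to mean "no limit".

```cpp
#include <exception>
#include <sstream>
#include <string>

// Hypothetical helpers for illustration only; names are not from the PR.

// In /proc/self/cgroup the cgroup v2 entry has the form "0::<path>".
// Returns <path> (relative to the cgroup2 mount point), or an empty
// string if the line is not a v2 entry.
std::string cgroup2_path_from_line(const std::string &line) {
    const std::string prefix = "0::";
    if (line.compare(0, prefix.size(), prefix) != 0) {
        return "";
    }
    return line.substr(prefix.size());
}

// Converts a CFS quota/period pair (both in microseconds) to a CPU count.
// Assumption: round up. Returns 0 when there is no limit (cgroup v1
// reports cpu.cfs_quota_us as -1 in that case) or the input is invalid.
int cpus_from_quota(long long quota_us, long long period_us) {
    if (quota_us <= 0 || period_us <= 0) {
        return 0;
    }
    return static_cast<int>((quota_us + period_us - 1) / period_us);
}

// cgroup v2 cpu.max holds "<quota> <period>" or "max <period>".
// Returns 0 when the quota is "max" (unlimited) or cannot be parsed.
int cpus_from_cpu_max(const std::string &content) {
    std::istringstream in(content);
    std::string quota;
    long long period = 0;
    if (!(in >> quota >> period) || quota == "max") {
        return 0;
    }
    try {
        return cpus_from_quota(std::stoll(quota), period);
    } catch (const std::exception &) {
        return 0;  // quota field was not a number
    }
}
```

For example, a container started with `docker run --cpus=1.5` typically sees `150000 100000` in cpu.max, which this sketch maps to 2 CPUs. A caller would treat a return value of 0 as "fall back to the affinity-based count".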

On Windows, it uses QueryInformationJobObject to retrieve the limit and rounds it to an integer.

On FreeBSD, it uses rctl_rule_get to retrieve the limit and rounds it to an integer. This is best effort only: it does not actually work inside a jail, because calling this function there is not permitted.

Verification

On Linux: verified under cgroup v1 and v2 using cgexec and docker run.
On FreeBSD: verified using rctl.
On Windows: verified using Docker in Windows container mode.

Extra

I have also implemented this for other versions/branches. I did not open separate PRs for them; they can be merged or cherry-picked directly if needed:

6.2
6.1
6.0
5.1
5.0
4.8

dixyes force-pushed the container-awareness branch from b1a9d00 to 95b1698 on June 5, 2026 03:05