kaxil
Member

@kaxil kaxil commented Oct 16, 2025

Workers no longer import the full kubernetes client library (~32-42 MB) when performing routine operations like secret masking and DAG serialization. The kubernetes client is only imported when actually processing kubernetes objects.
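The deferred-import idea can be sketched as follows. This is a minimal illustration, not the code from this PR: `looks_like_k8s_object` and `to_maskable` are hypothetical names, and the sketch assumes that inspecting the module name of an object's class is a good enough hint that it comes from the kubernetes client.

```python
def looks_like_k8s_object(obj) -> bool:
    """Cheap pre-check that never imports kubernetes (hypothetical helper).

    It only inspects the module name of the object's class.
    """
    module = type(obj).__module__ or ""
    return module == "kubernetes" or module.startswith("kubernetes.")


def to_maskable(obj):
    """Return a plain dict for V1EnvVar objects, anything else unchanged.

    Hypothetical helper for illustration only.
    """
    if looks_like_k8s_object(obj):
        # Deferred import: the cost is paid only once a kubernetes
        # object is actually seen by this process.
        from kubernetes.client import V1EnvVar

        if isinstance(obj, V1EnvVar):
            return obj.to_dict()
    return obj


# Plain values pass straight through without touching the kubernetes package.
print(to_maskable({"password": "s3cret"}))
```

Because the check relies on `type(obj).__module__`, a worker that never handles a kubernetes object never executes the `from kubernetes.client import ...` line.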

With the default of 32 LocalExecutor workers, this could reduce memory usage by approximately 1 GB in deployments where not every worker handles Kubernetes objects.
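The 1 GB figure follows from simple multiplication. The sketch below reproduces it, assuming every worker process pays the full import cost on its own (no pages shared between processes); `estimate_savings_mb` is a hypothetical helper name.

```python
def estimate_savings_mb(workers: int, per_worker_mb: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) total savings in MB across all workers."""
    low_mb, high_mb = per_worker_mb
    return workers * low_mb, workers * high_mb


# 32 workers, each avoiding roughly 32-42 MB of imported modules.
low, high = estimate_savings_mb(32, (32, 42))
print(f"Estimated savings: {low}-{high} MB")
```

The lower bound is 1024 MB, which is where the "approximately 1 GB" estimate comes from; actual savings depend on how memory is shared between worker processes.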

Part of #56641 (Kudos to @wjddn279 for investigation)

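```py
import sys
import tracemalloc

# The measurement is only valid if kubernetes has not been imported yet.
assert 'kubernetes' not in sys.modules

tracemalloc.start()
snapshot_before = tracemalloc.take_snapshot()

# The import under test: importing one model loads the whole client package.
from kubernetes.client import V1EnvVar

snapshot_after = tracemalloc.take_snapshot()

# Compare allocations made between the two snapshots, grouped by traceback.
top_stats = snapshot_after.compare_to(snapshot_before, 'traceback')
print("[ Top 10 differences ]")
for stat in top_stats[:10]:
    print(stat)

# Sum the size differences to get the total memory attributed to the import.
total = sum(stat.size_diff for stat in top_stats)
print(f"\nTotal memory increase: {total / 1024 / 1024:.2f} MB")
```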

Output: Total memory increase: 41.62 MB



@kaxil kaxil added this to the Airflow 3.1.1 milestone Oct 16, 2025
@kaxil kaxil requested a review from potiuk October 16, 2025 02:07
@kaxil kaxil merged commit 4926999 into apache:main Oct 16, 2025
62 checks passed
@kaxil kaxil deleted the skip-k8s-client-import branch October 16, 2025 10:40
snreddygopu pushed a commit to Teradata/airflow that referenced this pull request Oct 16, 2025