[BUG]: ddtrace-run uvicorn ... fails ECS (FARGATE) health check #11913

Open
viktorvsk-dualentry opened this issue Jan 13, 2025 · 5 comments
Labels
bug Profiling Continous Profling

Comments

@viktorvsk-dualentry

viktorvsk-dualentry commented Jan 13, 2025

Tracer Version(s)

2.19.0rc1

Python Version(s)

3.12

Pip Version(s)

pip 24.2 (poetry 2.0.0)

Bug Report

Hi! I have a very strange issue when I try to enable the profiling feature. I host a Django app on ECS (Fargate). I run it with the following command:

poetry run ddtrace-run uvicorn --app-dir=./myapp core.asgi:application --host 0.0.0.0 --port 8000 --workers 1 --lifespan off --log-level debug

These are the envs set:

DD_API_KEY=...
DD_PROFILING_ENABLED=true
DD_ENV=dev
DD_SERVICE=web

What happens in this case is the following:

  1. I can see (some) profiles on the Datadog dashboard
  2. But the container never starts: it fails the ALB health check and goes into an infinite deployment loop

(I omit the setup of datadog-agent because everything else works, including traces, logs, etc., but let me know if there is something there that could be helpful.)

But if I set ENVs to

DD_API_KEY=...
DD_PROFILING_ENABLED=false
DD_ENV=dev
DD_SERVICE=web

And then:

  1. Wait until the service is up and running, verify no profiles sent and everything works as expected
  2. Create one "SSH" session into ECS instance and run DD_PROFILING_ENABLED=true poetry run ddtrace-run uvicorn --app-dir=./myapp core.asgi:application --host 0.0.0.0 --port 8001 --workers 1 --lifespan off --log-level debug (notice, the port is different)
  3. Create another "SSH" session into ECS and run wget http://127.0.0.1:8001/healthcheck - all works fine (and even Profile traces appear on the Datadog side)
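
The manual check in step 3 could also be scripted to measure how long the server takes to answer its first health check, which is the number that matters for the ALB. This is only a minimal sketch: the helper names `probe_http` and `seconds_until_healthy` are hypothetical, the URL is the one from step 3, and the deadline and polling interval are arbitrary choices.

```python
import http.client
import time
import urllib.request


def probe_http(url, timeout=2.0):
    """Return True if a GET to `url` answers with HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts and non-2xx responses all count as "not healthy yet".
        return False


def seconds_until_healthy(probe, deadline=300.0, interval=1.0,
                          clock=time.monotonic, sleep=time.sleep):
    """Call `probe()` until it returns True.

    Returns the elapsed seconds, or None if `deadline` seconds pass first.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= deadline:
            return None
        sleep(interval)


# Example usage (start this right before launching uvicorn on port 8001):
#   elapsed = seconds_until_healthy(lambda: probe_http("http://127.0.0.1:8001/healthcheck"))
#   print("never became healthy" if elapsed is None else f"healthy after {elapsed:.1f}s")
```

Running it once with DD_PROFILING_ENABLED=true and once with DD_PROFILING_ENABLED=false would give two startup times that can be compared against the ALB health check settings.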

I've already tried a lot of things to narrow the issue down. What could I be missing that prevents the ALB health check from succeeding when I turn on DD_PROFILING_ENABLED=true for my app? Is there maybe a better way to debug it?

Thanks in advance!

P.S. I don't want to overcomplicate things, but the most frustrating part is that if I run it locally on a Mac connected to the same network, everything works fine. Also, I have DD_LOG_LEVEL=DEBUG set, but I only see 2-5 errors related to something different and some DEBUG logs not related to the profiler. It looks like the problem is somewhere on the Fargate/ALB side of the ddtrace integration.

Reproduction Code

No response

Error Logs

No response

Libraries in Use

aiohappyeyeballs 2.4.4 Happy Eyeballs for asyncio
aiohttp 3.11.11 Async http client/server framework (asyncio)
aiosignal 1.3.2 aiosignal: a list of registered asynchronous callbacks
amqp 5.3.1 Low-level AMQP client for Python (fork of amqplib).
annotated-types 0.7.0 Reusable constraint types to use with typing.Annotated
anthropic 0.42.0 The official Python library for the anthropic API
anyio 4.8.0 High level compatibility layer for multiple asynchronous event loop implementations
asgiref 3.8.1 ASGI specs, helper code, and adapters
asttokens 3.0.0 Annotate AST trees with source code positions
attrs 24.3.0 Classes Without Boilerplate
billiard 4.2.1 Python multiprocessing fork with improvements and bugfixes
boto3 1.35.97 The AWS SDK for Python
boto3-stubs 1.35.97 Type annotations for boto3 1.35.97 generated with mypy-boto3-builder 8.8.0
boto3-stubs-full 1.35.97 All-in-one type annotations for boto3 1.35.97 generated with mypy-boto3-builder 8.8.0
botocore 1.35.97 Low-level, data-driven core of boto 3.
botocore-stubs 1.35.97 Type annotations and code completion for botocore
bytecode 0.16.0 Python module to generate and modify bytecode
celery 5.4.0 Distributed Task Queue.
certifi 2024.12.14 Python package for providing Mozilla's CA Bundle.
cffi 1.17.1 Foreign Function Interface for Python calling C code.
charset-normalizer 3.4.1 The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet.
click 8.1.8 Composable command line interface toolkit
click-didyoumean 0.3.1 Enables git-like did-you-mean feature in click
click-plugins 1.1.1 An extension module for click to enable registering CLI commands via setuptools entry-points.
click-repl 0.3.0 REPL plugin for Click
contextlib2 21.6.0 Backports and enhancements for the contextlib module
cryptography 43.0.3 cryptography is a package which provides cryptographic recipes and primitives to Python developers.
currencyapicom 0.1.1 CurrencyAPI Python Client
dataclasses-json 0.6.7 Easily serialize dataclasses to and from JSON.
ddtrace 2.19.0rc1 Datadog APM client library
decorator 5.1.1 Decorators for Humans
defusedxml 0.7.1 XML bomb protection for Python stdlib modules
deprecated 1.2.15 Python @deprecated decorator to deprecate old python classes, functions or methods.
distro 1.9.0 Distro - an OS platform information API
django 5.1.4 A high-level Python web framework that encourages rapid development and clean, pragmatic design.
django-cors-headers 4.6.0 django-cors-headers is a Django application for handling the server headers required for Cross-Origin Resource Sharing (CORS).
django-dirtyfields 1.9.5 Tracking dirty fields on a Django model instance.
django-extensions 3.2.3 Extensions for Django
django-ipware 7.0.1 A Django application to retrieve user's IP address
django-ninja 1.3.0 Django Ninja - Fast Django REST framework
django-ninja-extra 0.21.8 Django Ninja Extra - Class Based Utility and more for Django Ninja(Fast Django REST framework)
django-sequences 3.0 Generate gapless sequences of integer values.
django-structlog 8.1.0 Structured Logging for Django
django-stubs 5.1.1 Mypy stubs for Django
django-stubs-ext 5.1.1 Monkey-patching and extensions for django-stubs
envier 0.6.1 Python application configuration via the environment
everapi 0.1.1 Everapi Python Client
executing 2.1.0 Get the currently executing AST node of a frame, and other information
finch-api 1.11.0 The official Python library for the Finch API
frozenlist 1.5.0 A list-like structure which implements collections.abc.MutableSequence
gevent 24.11.1 Coroutine-based network library
gprof2dot 2024.6.6 Generate a dot graph from the output of several profilers.
greenlet 3.1.1 Lightweight in-process concurrent programming
h11 0.14.0 A pure-Python, bring-your-own-I/O implementation of HTTP/1.1
httpcore 1.0.7 A minimal low-level HTTP client.
httpx 0.27.2 The next generation HTTP client.
httpx-sse 0.4.0 Consume Server-Sent Event (SSE) messages with HTTPX.
idna 3.10 Internationalized Domain Names in Applications (IDNA)
importlib-metadata 8.5.0 Read metadata from Python packages
iniconfig 2.0.0 brain-dead simple config-ini parsing
injector 0.22.0 Injector - Python dependency injection framework, inspired by Guice
ipython 8.31.0 IPython: Productive Interactive Computing
jedi 0.19.2 An autocompletion tool for Python that can be used for text editors.
jellyfish 1.1.3 Approximate and phonetic matching of strings.
jiter 0.8.2 Fast iterable JSON parser.
jmespath 1.0.1 JSON Matching Expressions
jsonpatch 1.33 Apply JSON-Patches (RFC 6902)
jsonpointer 3.0.0 Identify specific nodes in a JSON document (RFC 6901)
kombu 5.4.2 Messaging library for Python.
langchain 0.3.14 Building applications with LLMs through composability
langchain-anthropic 0.2.4 An integration package connecting AnthropicMessages and LangChain
langchain-community 0.3.14 Community contributed LangChain integrations.
langchain-core 0.3.29 Building applications with LLMs through composability
langchain-openai 0.2.14 An integration package connecting OpenAI and LangChain
langchain-text-splitters 0.3.5 LangChain text splitting utilities
langgraph 0.2.62 Building stateful, multi-actor applications with LLMs
langgraph-checkpoint 2.0.9 Library with base interfaces for LangGraph checkpoint savers.
langgraph-sdk 0.1.51 SDK for interacting with LangGraph API
langsmith 0.2.10 Client library to connect to the LangSmith LLM Tracing and Evaluation Platform.
markupsafe 3.0.2 Safely add untrusted strings to HTML/XML markup.
marshmallow 3.25.1 A lightweight library for converting complex datatypes to and from native Python datatypes.
matplotlib-inline 0.1.7 Inline Matplotlib backend for Jupyter
msgpack 1.1.0 MessagePack serializer
multidict 6.1.0 multidict implementation
mypy 1.14.1 Optional static typing for Python
mypy-extensions 1.0.0 Type system extensions for programs checked with the mypy type checker.
numpy 2.2.1 Fundamental package for array computing in Python
openai 1.59.6 The official Python library for the openai API
opentelemetry-api 1.29.0 OpenTelemetry Python API
orjson 3.10.14 Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
packaging 24.2 Core utilities for Python packages
parso 0.8.4 A Python Parser
pexpect 4.9.0 Pexpect allows easy control of interactive console applications.
pluggy 1.5.0 plugin and hook calling mechanisms for python
prompt-toolkit 3.0.48 Library for building powerful interactive command lines in Python
propcache 0.2.1 Accelerated property cache
protobuf 5.29.3
psycopg 3.2.3 PostgreSQL database adapter for Python
psycopg-binary 3.2.3 PostgreSQL database adapter for Python -- C optimisation distribution
psycopg2 2.9.10 psycopg2 - Python-PostgreSQL Database Adapter
ptyprocess 0.7.0 Run a subprocess in a pseudo terminal
pure-eval 0.2.3 Safely evaluate AST nodes without side effects
pycparser 2.22 C parser in Python
pycurl 7.45.4 PycURL -- A Python Interface To The cURL library
pydantic 2.9.2 Data validation using Python type hints
pydantic-core 2.23.4 Core functionality for Pydantic validation and serialization
pydantic-settings 2.7.1 Settings management using Pydantic
pygments 2.19.1 Pygments is a syntax highlighting package written in Python.
pyjwt 2.9.0 JSON Web Token implementation in Python
pytest 8.3.4 pytest: simple powerful testing with Python
pytest-profiling 1.8.1 Profiling plugin for py.test
python-dateutil 2.9.0.post0 Extensions to the standard Python datetime module
python-dotenv 1.0.1 Read key-value pairs from a .env file and set them as environment variables
python-ipware 3.0.0 A Python package to retrieve user's IP address
pytz 2024.2 World timezone definitions, modern and historical
pyyaml 6.0.2 YAML parser and emitter for Python
ramp-developer-api-client v1 openapi-clients/ramp-developer-api-client A client library for accessing Ramp Developer API
regex 2024.11.6 Alternative regular expression module, to replace re.
requests 2.32.3 Python HTTP for Humans.
requests-toolbelt 1.0.0 A utility belt for advanced users of python-requests
ruff 0.4.10 An extremely fast Python linter and code formatter, written in Rust.
s3transfer 0.10.4 An Amazon S3 Transfer Manager
sentry-sdk 2.19.2 Python client for Sentry (https://sentry.io)
setuptools 75.8.0 Easily download, build, install, upgrade, and uninstall Python packages
six 1.17.0 Python 2 and 3 compatibility utilities
sniffio 1.3.1 Sniff out which async library your code is running under
sqlalchemy 2.0.37 Database Abstraction Library
sqlparse 0.5.3 A non-validating SQL parser.
stack-data 0.6.3 Extract data from python stack frames and tracebacks for informative displays
structlog 24.4.0 Structured Logging for Python
tenacity 9.0.0 Retry code until it succeeds
tiktoken 0.8.0 tiktoken is a fast BPE tokeniser for use with OpenAI's models
tqdm 4.67.1 Fast, Extensible Progress Meter
traitlets 5.14.3 Traitlets Python configuration system
types-awscrt 0.23.6 Type annotations and code completion for awscrt
types-pyyaml 6.0.12.20241230 Typing stubs for PyYAML
types-requests 2.32.0.20241016 Typing stubs for requests
types-s3transfer 0.10.4 Type annotations and code completion for s3transfer
typing-extensions 4.12.2 Backported and Experimental Type Hints for Python 3.8+
typing-inspect 0.9.0 Runtime inspection utilities for typing module.
tzdata 2024.2 Provider of IANA time zone data
urllib3 2.3.0 HTTP library with thread-safe connection pooling, file post, and more.
uvicorn 0.27.1 The lightning-fast ASGI server.
vine 5.1.0 Python promises.
wcwidth 0.2.13 Measures the displayed width of unicode strings in a terminal
werkzeug 3.1.3 The comprehensive WSGI web application library.
workos 5.11.0 WorkOS Python Client
wrapt 1.17.1 Module for decorators, wrappers and monkey patching.
xmltodict 0.14.2 Makes working with XML feel like you are working with JSON
yarl 1.18.3 Yet another URL library
zipp 3.21.0 Backport of pathlib-compatible object wrapper for zip files
zope-event 5.0 Very basic event publishing system
zope-interface 7.2 Interfaces for Python

Operating System

FARGATE

@marktengi

We're experiencing a similar issue in our Fargate deployment. We use uWSGI, and I noticed that startup times spiked significantly when updating ddtrace from 2.18.0 to 2.18.1 (on Python 3.11.8), which caused the ALB health checks to fail and the container to be cycled continuously. Raising the number of allowable failed health checks let the container come up and start replying to health check requests before it got killed.

It seems that this is an interaction between DD_PROFILING_ENABLED and DD_IAST_ENABLED on 2.18.1. Disabling either of these options or dropping back down to 2.18.0 results in a more reasonable startup time.

examples

# DD_PROFILING_ENABLED=true; DD_IAST_ENABLED=true; ddtrace==2.18.0
WSGI app 0 (mountpoint='') ready in 27 seconds on interpreter 0x7fb4607bb018 pid: 25 (default app)

# DD_PROFILING_ENABLED=true; DD_IAST_ENABLED=true; ddtrace==2.18.1
WSGI app 0 (mountpoint='') ready in 125 seconds on interpreter 0x7fa2561c5018 pid: 23 (default app)

# DD_PROFILING_ENABLED=true; DD_IAST_ENABLED=false; ddtrace==2.18.1
WSGI app 0 (mountpoint='') ready in 32 seconds on interpreter 0x7fc7e2ecc018 pid: 22 (default app)

# DD_PROFILING_ENABLED=false; DD_IAST_ENABLED=true; ddtrace==2.18.1
WSGI app 0 (mountpoint='') ready in 21 seconds on interpreter 0x7fd283ad7018 pid: 23 (default app)

# DD_PROFILING_ENABLED=false; DD_IAST_ENABLED=false; ddtrace==2.18.1
WSGI app 0 (mountpoint='') ready in 7 seconds on interpreter 0x7f1398dc6018 pid: 24 (default app)
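
To make the comparison easier to read, here is a small sketch that pulls the "ready in N seconds" value out of each uWSGI line above and expresses it relative to the run with both features disabled. The regular expression and the helper name `startup_seconds` are my own; the log lines are trimmed copies of the ones above.

```python
import re

READY_RE = re.compile(r"ready in (\d+) seconds")


def startup_seconds(log_line):
    """Extract the startup time from a uWSGI 'ready in N seconds' line, or None."""
    match = READY_RE.search(log_line)
    return int(match.group(1)) if match else None


# (profiling enabled, IAST enabled, ddtrace version) -> trimmed uWSGI line from the examples
LINES = {
    (True, True, "2.18.0"): "WSGI app 0 (mountpoint='') ready in 27 seconds",
    (True, True, "2.18.1"): "WSGI app 0 (mountpoint='') ready in 125 seconds",
    (True, False, "2.18.1"): "WSGI app 0 (mountpoint='') ready in 32 seconds",
    (False, True, "2.18.1"): "WSGI app 0 (mountpoint='') ready in 21 seconds",
    (False, False, "2.18.1"): "WSGI app 0 (mountpoint='') ready in 7 seconds",
}

timings = {config: startup_seconds(line) for config, line in LINES.items()}
baseline = timings[(False, False, "2.18.1")]  # both features disabled

for (profiling, iast, version), seconds in timings.items():
    ratio = seconds / baseline
    print(f"profiling={profiling!s:5} iast={iast!s:5} ddtrace=={version}: "
          f"{seconds:3d}s ({ratio:.1f}x baseline)")
```

With both features enabled, 2.18.1 takes 125 seconds against 27 seconds on 2.18.0, which is the jump that pushed us past the health check limit.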

Our uwsgi.ini file is configured as specified in the docs.

@viktorvsk-dualentry viktorvsk-dualentry changed the title from "[BUG]:" to "[BUG]: ddtrace-run uvicorn ... fails ECS (FARGATE) health check" Jan 14, 2025
@viktorvsk-dualentry
Author

I had already experienced this and increased the timeout to 20 seconds, which I thought was already a lot because service deployment went from 3 minutes to 6 minutes, but wow, I didn't think that setting the ALB timeout to 60 would fix it. Now the deployment takes 13 minutes though... (I also downgraded ddtrace to 2.18.0.)

Interestingly enough, Celery was not affected at any point; this deploys in 3 minutes:

poetry run ddtrace-run celery -q --workdir=./myapp -A core.celery.app worker -l debug -P gevent -c 100 $@

But uvicorn (ASGI) takes a lot of time.

Thanks a lot @marktengi !

I'm not sure whether I should close this since a workaround has been found, or whether we should leave it open, because going from 3 minutes to 13 for a deployment and from 30 seconds to 70 for health checks just to get profiles doesn't seem like a great trade in general.
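
For anyone tuning the same settings, the trade-off can be estimated with a rough calculation. This sketch assumes that a new task is replaced once it fails `unhealthy_threshold` consecutive checks spaced `interval_s` apart, and that an ECS health check grace period (healthCheckGracePeriodSeconds) simply adds to that window. That is a simplification of how ALB and ECS actually behave, the function names are hypothetical, and the interval, threshold and grace period below are illustrative values, not the settings used in this thread.

```python
def startup_budget_seconds(interval_s, unhealthy_threshold, grace_period_s=0):
    """Rough time a new task has to start answering before it gets replaced.

    Simplified model: grace period, then `unhealthy_threshold` failed checks
    spaced `interval_s` seconds apart.
    """
    return grace_period_s + interval_s * unhealthy_threshold


def fits_budget(startup_s, interval_s, unhealthy_threshold, grace_period_s=0):
    """True if a task needing `startup_s` seconds fits in the estimated budget."""
    return startup_s <= startup_budget_seconds(interval_s, unhealthy_threshold, grace_period_s)


# 125 s is the slowest startup reported above; the other numbers are illustrative.
print(fits_budget(125, interval_s=30, unhealthy_threshold=2))                      # False: budget is 60 s
print(fits_budget(125, interval_s=30, unhealthy_threshold=2, grace_period_s=120))  # True: budget is 180 s
```

Under this model a grace period buys startup time without slowing down every later health check, whereas raising the interval or threshold also delays detection of tasks that become unhealthy after startup.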

@JohnnyBeet

@viktorvsk-dualentry I also experienced similar issues with a Fargate-hosted Django app (in my case it's WSGI). I think more people will run into this issue, so it's worth keeping this ticket open.

@JohnnyBeet

Seems to be no longer an issue in 2.18.2

@taegyunkim taegyunkim added the Profiling Continous Profling label Jan 22, 2025
@taegyunkim taegyunkim self-assigned this Jan 22, 2025
@taegyunkim
Contributor

tl;dr: Please use 2.19.1+ or 2.18.2+, which have fixes for profiling performance/overhead issues.

@viktorvsk-dualentry Do you mind trying 2.19.1? It includes PRs that might fix the slow startup time for uvicorn workers.

@JohnnyBeet [2.18.2] includes this change: "Removes a system call from the memory allocation profiler, used to detect forks, which ran on every allocation and resulted in a significant slowdown." Without this fix, the memory profiler would have caused the issue.
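
A minimal sketch for checking whether a given ddtrace version falls in the ranges named in the tl;dr (2.18.2+ on the 2.18 line, or 2.19.1+). Treating 2.19.0 and its release candidates as not fixed is an assumption based only on those two version numbers, and the helper names are hypothetical.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Parse the leading 'X.Y.Z' of a version string, ignoring suffixes such as 'rc1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_profiling_fix(version):
    """True for 2.18.2+ on the 2.18 line, or 2.19.1 and later (assumed ranges)."""
    release = release_tuple(version)
    if release >= (2, 19, 1):
        return True
    return (2, 18, 2) <= release < (2, 19, 0)


def installed_ddtrace_version():
    """Return the installed ddtrace version string, or None if it is not installed."""
    try:
        return metadata.version("ddtrace")
    except metadata.PackageNotFoundError:
        return None


print(has_profiling_fix("2.19.0rc1"))  # False: the version from the original report
print(has_profiling_fix("2.18.2"))     # True
```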


4 participants