fix: protect Crawl4AI nginx exposure #75

Merged: botAGI merged 1 commit into main from codex/fix-unauthenticated-api-exposure-issue on Jun 1, 2026

botAGI (Owner) commented on Jun 1, 2026

Motivation

  • Prevent unauthenticated access to the Crawl4AI REST API which allowed API discovery, submission of arbitrary crawl jobs and SSRF against internal/metadata endpoints.
  • Ensure that when Authelia is enabled it cannot be silently bypassed by the Crawl4AI nginx vhost and that Crawl4AI host binding does not default to 0.0.0.0.

Description

  • Require Authelia before enabling the Crawl4AI nginx blocks by updating the activation condition in lib/config.sh so the Crawl4AI markers are only stripped when both ENABLE_CRAWL4AI and ENABLE_AUTHELIA are true.
  • Add Authelia auth_request /authelia-auth handling and an internal /authelia-auth endpoint to the Crawl4AI dedicated-port and mDNS nginx vhosts in templates/nginx.conf.template so requests to :11235 and agmind-crawl.local can be authenticated when enabled.
  • Default the published host bind for the Crawl4AI port to loopback: add CRAWL4AI_BIND_ADDR, change the port mapping in templates/docker-compose.yml to ${CRAWL4AI_BIND_ADDR:-127.0.0.1}:${EXPOSE_CRAWL4AI_PORT:-11235}:11235, and add CRAWL4AI_BIND_ADDR=127.0.0.1 to templates/env.lan.template.
  • Add a regression unit test tests/unit/test_crawl4ai_auth_exposure.sh that asserts the nginx activation condition, presence of auth_request and /authelia-auth, explicit /health handling, and the default loopback binding, and update golden fixtures/checksums accordingly.
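
A minimal sketch of the activation condition and the loopback default described above, assuming a line-prefix marker scheme. The marker string #__CRAWL4AI__ and the function names render_crawl4ai_blocks and crawl4ai_port_mapping are hypothetical; only the if condition and the ${VAR:-default} expressions are taken from this PR. The real generate_nginx_config logic may differ.

```shell
#!/usr/bin/env bash

# Hypothetical marker prefix; the real template's marker syntax is not shown in this PR.
CRAWL4AI_MARKER='#__CRAWL4AI__'

# Reads an nginx template on stdin and writes the rendered config to stdout.
render_crawl4ai_blocks() {
    if [[ "${ENABLE_CRAWL4AI:-false}" == "true" && "${ENABLE_AUTHELIA:-false}" == "true" ]]; then
        # Both flags are true: strip the marker so the Crawl4AI lines become active.
        sed "s/^${CRAWL4AI_MARKER}//"
    else
        # Otherwise delete the marked lines, leaving no Crawl4AI server block.
        sed "/^${CRAWL4AI_MARKER}/d"
    fi
}

# Prints the host:port:container mapping with the same defaults as the compose template.
crawl4ai_port_mapping() {
    printf '%s:%s:11235\n' "${CRAWL4AI_BIND_ADDR:-127.0.0.1}" "${EXPOSE_CRAWL4AI_PORT:-11235}"
}
```

With neither variable set, crawl4ai_port_mapping prints 127.0.0.1:11235:11235. A wider bind such as 0.0.0.0 has to be requested explicitly through CRAWL4AI_BIND_ADDR.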

Testing

  • Ran bash tests/unit/test_crawl4ai_auth_exposure.sh which passed and validates the new activation condition, auth_request markers and default bind settings.
  • Ran bash tests/unit/test_nginx_no_static_proxy_pass.sh which passed and confirmed nginx proxy patterns are unchanged.
  • Verified sha256 checksums for updated golden fixtures with sha256sum -c checksums.sha256 which succeeded for the updated fixtures.
  • Performed a static check bash -n lib/config.sh tests/unit/test_crawl4ai_auth_exposure.sh which returned without syntax errors.
  • bash tests/unit/test_compose_security_invariants.sh was skipped because python3 and PyYAML were missing from the environment, and bash tests/golden/run.sh full_lan --update was skipped because the Docker CLI and Docker Compose were unavailable.
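
The static checks in this list could be scripted roughly as follows. Note that bash -n treats only its first file argument as the script to parse and passes any remaining arguments as positional parameters, so checking several files takes one invocation per file. The helper names syntax_check_all and assert_contains are hypothetical and are not taken from the repository's test suite.

```shell
#!/usr/bin/env bash

# Syntax-check each script separately; returns non-zero if any file fails to parse.
syntax_check_all() {
    local file status=0
    for file in "$@"; do
        if ! bash -n "$file"; then
            echo "FAIL: syntax error in $file" >&2
            status=1
        fi
    done
    return "$status"
}

# Succeeds only if FILE contains NEEDLE as a literal string (no regex matching).
assert_contains() {
    local file="$1" needle="$2"
    if grep -qF -- "$needle" "$file"; then
        echo "ok: $needle"
    else
        echo "FAIL: '$needle' not found in $file" >&2
        return 1
    fi
}
```

For example, syntax_check_all lib/config.sh tests/unit/test_crawl4ai_auth_exposure.sh parses both files, and assert_contains templates/nginx.conf.template 'auth_request /authelia-auth' fails loudly if the directive is ever dropped from the template.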

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9c2e47fc6e

Comment thread lib/config.sh
# Security: the Crawl4AI REST API accepts arbitrary crawl URLs and can be
# abused as an SSRF primitive. Only expose it through nginx when Authelia is
# active; otherwise keep the backend reachable solely on the Docker network.
if [[ "${ENABLE_CRAWL4AI:-false}" == "true" && "${ENABLE_AUTHELIA:-false}" == "true" ]]; then

[P2] Don't silently disable Crawl4AI for agents installs

For DEPLOY_PROFILE=agents (and custom installs that select Crawl4AI but decline Authelia), templates/services/registry.yaml still implies ENABLE_CRAWL4AI=true without ENABLE_AUTHELIA=true, and the wizard still advertises agmind-crawl.local. With this new condition, generate_nginx_config deletes every Crawl4AI server block in that configuration, while compose only publishes nginx's 11235 port rather than the Crawl4AI container itself, so the selected service starts but has no advertised/reachable host route. Either force Authelia when Crawl4AI is enabled, or stop enabling/advertising Crawl4AI in non-Authelia profiles.
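
The first option, forcing Authelia on whenever Crawl4AI is selected, might look like the sketch below. The function name resolve_crawl4ai_auth and the warning text are hypothetical, and this is not necessarily the fix that was adopted.

```shell
#!/usr/bin/env bash

# Hypothetical helper: if Crawl4AI is enabled without Authelia, turn Authelia on
# and warn, so the nginx activation condition can still be satisfied.
resolve_crawl4ai_auth() {
    if [[ "${ENABLE_CRAWL4AI:-false}" == "true" && "${ENABLE_AUTHELIA:-false}" != "true" ]]; then
        echo "warning: Crawl4AI requires Authelia; setting ENABLE_AUTHELIA=true" >&2
        ENABLE_AUTHELIA=true
    fi
}
```

A helper like this would have to run before the nginx config is generated, so that the activation condition sees the adjusted value. The warning keeps the change visible instead of silently altering the selected profile.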

botAGI merged commit 993535c into main on Jun 1, 2026 (13 of 16 checks passed)
botAGI deleted the codex/fix-unauthenticated-api-exposure-issue branch on June 1, 2026, 20:46
botAGI added a commit that referenced this pull request on Jun 2, 2026