
@Darkheir Darkheir commented Dec 15, 2025

This PR introduces rate limiting for logs sent by connectors and triggers.

It rate-limits sending the same log (same message and same level) to the API.

Summary by Sourcery

Introduce rate limiting for trigger logs to avoid repeatedly sending identical log entries to the API.

Enhancements:

  • Add per-level log rate limiting configuration to the Trigger class to control how often identical messages are sent.

Tests:

  • Add unit test to verify that identical logs are rate limited and only sent once within the configured window.
  • Adjust existing batch-full log test to use unique messages per log entry.


sourcery-ai bot commented Dec 15, 2025

Reviewer's Guide

Implements per-message, per-level log rate limiting in Trigger, storing last emission times to avoid sending identical logs too frequently, and adds tests to validate the new behavior and preserve existing batching semantics.

Sequence diagram for Trigger.log with rate limiting

sequenceDiagram
    actor Connector
    participant Trigger

    Connector->>Trigger: log(message, level, args, kwargs)
    activate Trigger
    Trigger->>Trigger: compute key = level:message
    Trigger->>Trigger: last_log_time = _last_log_time.get(key, 0)
    Trigger->>Trigger: rate_limit = LOGS_RATE_LIMIT.get(level, 0)
    Trigger->>Trigger: now = time.time()
    alt within rate limit
        Trigger-->>Connector: return (skip sending log)
    else not rate limited
        Trigger->>Trigger: _last_log_time[key] = now
        Trigger->>Trigger: append log dict to _logs
        Trigger-->>Connector: return
    end
    deactivate Trigger

Class diagram for updated Trigger log rate limiting

classDiagram
    class Trigger {
        <<ModuleItem>>
        +ClassVar dict~str,int~ LOGS_MAX_BATCH_SIZE
        +ClassVar int LOGS_MAX_DELTA
        +ClassVar dict~str,int~ LOGS_RATE_LIMIT
        -bool _critical_log_sent
        -dict~str,float~ _last_log_time
        -list~dict~ _logs
        +log(message str, level LogLevelStr, args, kwargs) None
    }

File-Level Changes

Change Details Files
Introduce per-level, per-message log rate limiting in Trigger.log and track last emission timestamps.
  • Add LOGS_RATE_LIMIT class variable defining per-level rate limits in seconds, with no limit for critical logs.
  • Initialize an instance-level _last_log_time dict to remember when each (level,message) combination was last logged.
  • In log(), compute a key from level and message, look up last log time and configured rate limit, and skip queuing the log if called again within the rate limit window.
  • Update the last-log timestamp when a log is accepted into the internal _logs buffer.
sekoia_automation/trigger.py
Extend and adjust tests to cover log rate limiting and avoid accidental batching effects from identical messages.
  • Add test_trigger_log_rate_limit to assert that repeated identical info logs within the rate limit are only sent once to the API when batching is disabled.
  • Modify test_trigger_log_batch_full to use unique messages in the loop, ensuring all logs are included in the batch and not suppressed by rate limiting.
tests/test_trigger.py
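The change described above can be sketched as follows. This is a minimal illustration, not the PR's exact code: the class and attribute names follow the diagrams, and the per-level rate-limit values are assumptions.

```python
import time


class Trigger:
    # Assumed per-level windows in seconds; critical logs are never limited.
    LOGS_RATE_LIMIT = {
        "debug": 3600,
        "info": 3600,
        "warning": 300,
        "error": 300,
        "critical": 0,
    }

    def __init__(self) -> None:
        # Last emission time per "level:message" key.
        self._last_log_time: dict[str, float] = {}
        self._logs: list[dict] = []

    def log(self, message: str, level: str = "info") -> None:
        key = f"{level}:{message}"
        rate_limit = self.LOGS_RATE_LIMIT.get(level, 0)
        now = time.time()
        if rate_limit and now - self._last_log_time.get(key, 0.0) < rate_limit:
            return  # identical log inside its window: skip queuing
        self._last_log_time[key] = now
        self._logs.append({"date": now, "level": level, "message": message})
```

With this shape, two back-to-back `log("x", "info")` calls queue only one entry, while `critical` logs (rate limit 0) always pass through.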


@Darkheir Darkheir force-pushed the feat/logs_rate_limit branch from 08b8a78 to 4e5e324 Compare December 15, 2025 14:46

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes - here's some feedback:

  • The _last_log_time dict grows a new entry for every distinct level:message combination and is never cleaned up, which could lead to unbounded memory growth over long‑running triggers; consider adding a pruning strategy (e.g., periodic cleanup of old keys) or a size‑bounded cache.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `_last_log_time` dict grows a new entry for every distinct `level:message` combination and is never cleaned up, which could lead to unbounded memory growth over long‑running triggers; consider adding a pruning strategy (e.g., periodic cleanup of old keys) or a size‑bounded cache.

## Individual Comments

### Comment 1
<location> `sekoia_automation/trigger.py:79` </location>
<code_context>
         self._secrets: dict[str, Any] = {}
         self._stop_event = Event()
         self._critical_log_sent = False
+        self._last_log_time: dict[str, float] = {}
         self._logs: list[dict] = []

</code_context>

<issue_to_address>
**suggestion (performance):** Consider bounding or periodically cleaning `_last_log_time` to avoid unbounded growth over long‑running triggers.

Since `_last_log_time` keeps an entry per `level:message` and is never pruned, a long‑running trigger with many distinct messages can leak memory. Please add some eviction (e.g., max size, age-based pruning, or clearing after successful batch sends) to keep this dict bounded over time.

Suggested implementation:

```python
    # Time to wait for stop event to be received
    _STOP_EVENT_WAIT = 120

    def _prune_last_log_time(self) -> None:
        """Prune `_last_log_time` to avoid unbounded growth.

        Removes old entries based on age and bounds the dict size.
        """
        if not getattr(self, "_last_log_time", None):
            return

        now = time.time()

        # Age-based pruning
        max_age = getattr(self, "LOGS_LAST_LOG_TIME_MAX_AGE", 0) or 0
        if max_age > 0:
            keys_to_delete = [
                key
                for key, ts in self._last_log_time.items()
                if now - ts > max_age
            ]
            for key in keys_to_delete:
                self._last_log_time.pop(key, None)

        # Size-based pruning
        max_size = getattr(self, "LOGS_LAST_LOG_TIME_MAX_SIZE", 0) or 0
        if max_size <= 0 or len(self._last_log_time) <= max_size:
            return

        # Remove oldest entries (smallest timestamp) until within bounds
        surplus = len(self._last_log_time) - max_size
        for key, _ in sorted(
            self._last_log_time.items(), key=lambda kv: kv[1]
        )[:surplus]:
            self._last_log_time.pop(key, None)

```

```python
        self._secrets: dict[str, Any] = {}
        self._stop_event = Event()
        self._critical_log_sent = False
        # rate-limit state: last log time per "level:message"
        self._last_log_time: dict[str, float] = {}

```

```python
import time
import requests
    LOGS_MAX_BATCH_SIZE = 50
    LOGS_MAX_DELTA = 5  # seconds
    # Bound `_last_log_time` growth
    LOGS_LAST_LOG_TIME_MAX_SIZE = 1000
    LOGS_LAST_LOG_TIME_MAX_AGE = 3600  # seconds

```

To fully implement the eviction and make it effective, you should also:

1. Call `_prune_last_log_time()` in the method that updates `self._last_log_time` (i.e., wherever you do `self._last_log_time[key] = time.time()` for log rate limiting), directly after updating the timestamp:
   - Example pattern to adjust:
   ```python
   last = self._last_log_time.get(key)
   # ... rate limit logic ...
   self._last_log_time[key] = now
   ```
   should become:
   ```python
   last = self._last_log_time.get(key)
   # ... rate limit logic ...
   self._last_log_time[key] = now
   self._prune_last_log_time()
   ```

2. Optionally, if you have a method that clears `_logs` after a successful batch send, you can also call `_prune_last_log_time()` there to keep the in-memory state small over time, especially for long-running triggers.

You will need to locate the exact logging/rate-limiting method(s) in this file and apply the above pattern to integrate `_prune_last_log_time()` at the right points.
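An alternative to the age/size pruning sketched above — purely illustrative, not what the PR ships — is to make `_last_log_time` itself a size-bounded mapping, so no separate pruning pass is needed:

```python
from collections import OrderedDict


class BoundedLogTimes(OrderedDict):
    """Dict of last-log timestamps that evicts its oldest entry once
    max_size keys are stored (a simple LRU-style bound)."""

    def __init__(self, max_size: int = 1000):
        super().__init__()
        self.max_size = max_size

    def __setitem__(self, key, value):
        if key in self:
            self.move_to_end(key)  # refresh recency on re-log
        super().__setitem__(key, value)
        while len(self) > self.max_size:
            self.popitem(last=False)  # drop the least recently set entry


times = BoundedLogTimes(max_size=2)
times["info:a"] = 1.0
times["info:b"] = 2.0
times["info:c"] = 3.0  # evicts "info:a"
```

Dropping a key early only means the next occurrence of that message is sent once more, so the bound trades a little extra traffic for predictable memory.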
</issue_to_address>

### Comment 2
<location> `tests/test_trigger.py:317-326` </location>
<code_context>
     assert log_request["message"] == "test message"


+def test_trigger_log_rate_limit(mocked_trigger_logs):
+    trigger = DummyTrigger()
+    trigger.LOGS_MAX_BATCH_SIZE = 0
+
+    assert mocked_trigger_logs.call_count == 0
+
+    trigger.log("test message", "info")
+    trigger.log("test message", "info")
+    assert mocked_trigger_logs.call_count == 1
+    assert len(mocked_trigger_logs.last_request.json()["logs"]) == 1
+
+
</code_context>

<issue_to_address>
**suggestion (testing):** Add a test that verifies the rate limit window elapsing allows the same log to be sent again.

This test only verifies that consecutive identical logs are rate-limited. To also validate the time-based behavior, please add a test that controls the rate limit window (e.g., by temporarily setting `LOGS_RATE_LIMIT["info"]` to a small value or monkeypatching `time.time`) and asserts that: (1) the first call is sent, (2) an immediate second call is suppressed, and (3) after `rate_limit` seconds (simulated), a third call with the same message/level is sent again.
</issue_to_address>
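The three-step behavior this comment asks for can be sketched self-contained. `RateLimitedLogger` and its values are illustrative stand-ins for `Trigger`, not the library's API; the clock is controlled with `unittest.mock.patch` rather than real waiting.

```python
import time
from unittest.mock import patch


class RateLimitedLogger:
    LOGS_RATE_LIMIT = {"info": 3600, "critical": 0}  # assumed windows

    def __init__(self):
        self._last_log_time = {}
        self.sent = []

    def log(self, message, level="info"):
        key = f"{level}:{message}"
        limit = self.LOGS_RATE_LIMIT.get(level, 0)
        now = time.time()
        if limit and now - self._last_log_time.get(key, 0.0) < limit:
            return  # suppressed: identical log inside the window
        self._last_log_time[key] = now
        self.sent.append(message)


logger = RateLimitedLogger()
with patch("time.time") as fake_time:
    fake_time.return_value = 1_000_000.0
    logger.log("boom")  # (1) first call is sent
    logger.log("boom")  # (2) immediate repeat is suppressed
    fake_time.return_value = 1_000_000.0 + 3601
    logger.log("boom")  # (3) window elapsed, sent again
```

In the real test suite the same pattern would apply, asserting on `mocked_trigger_logs.call_count` instead of a `sent` list.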




github-actions bot commented Dec 15, 2025

Test Results

259 tests  +1   255 ✅ +1   2m 6s ⏱️ +3s
  1 suites ±0     4 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 600751d. ± Comparison against base commit 5afad7f.



codecov bot commented Dec 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.58%. Comparing base (5afad7f) to head (600751d).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #204      +/-   ##
==========================================
+ Coverage   91.55%   91.58%   +0.02%     
==========================================
  Files          49       49              
  Lines        3187     3197      +10     
==========================================
+ Hits         2918     2928      +10     
  Misses        269      269              


@Darkheir Darkheir force-pushed the feat/logs_rate_limit branch from 4e5e324 to 6a4b636 Compare December 15, 2025 14:57

socket-security bot commented Dec 15, 2025

Review the following changes in direct dependencies.

Diff: Added cachetools@6.2.3 — Supply Chain Security: 100, Vulnerability: 100, Quality: 100, Maintenance: 100, License: 100


@Darkheir Darkheir force-pushed the feat/logs_rate_limit branch from 6a4b636 to d96e57c Compare December 15, 2025 15:10
otetard previously approved these changes Dec 15, 2025
Signed-off-by: Darkheir <[email protected]>
@Darkheir Darkheir force-pushed the feat/logs_rate_limit branch from d96e57c to 600751d Compare December 15, 2025 15:38
@Darkheir Darkheir merged commit 8af8ab4 into main Dec 15, 2025
11 checks passed
@Darkheir Darkheir deleted the feat/logs_rate_limit branch December 15, 2025 15:43
