Skip to content

add cache to optimize type checking #2737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

MintsInc
Copy link
Member

@MintsInc MintsInc commented Aug 1, 2025

What does this PR do?

Context

Jira ticket

The validate_and_convert_types function was causing severe performance degradation when check_return_type=True, resulting in 5.0x slowdown for API calls.

Root Cause: Exponential validation overhead due to deeply nested API responses. A single dashboard response triggered 108,516+ validation calls with recursive validation of the same nested structures repeatedly.

Failed Optimization Attempts

Early Exit Strategies: Primitive type checks, inheritance checks, and exact type matching were ineffective because inputs were raw JSON data requiring conversion to typed models, not already-typed instances.

The core issue was that we couldn't avoid the validation work through early exits, but we could avoid repeating identical validation work through caching.

Changes

ApiClient-Level Validation Cache

  1. Cache Infrastructure: Added _validation_cache to ApiClient.__init__() with configurable size limit (default: 1000 entries)

  2. Cache Integration: Modified ApiClient.deserialize() to pass cache to validate_and_convert_types()

  3. Validation Function Updates: Added request_cache parameter throughout validation call chain with _make_hashable() helper for unhashable types

  4. Gentle FIFO Memory Management: When cache is full, evict 25% of entries while keeping 75% most recent

Design Decisions:

  • ApiClient-level scope: Cache persists across requests within same client instance
  • Conservative defaults: 1000 entry limit balances memory vs. performance
  • Zero breaking changes: Transparent caching, existing APIs unchanged

Tests

Performance Test Setup

Using the exact reproduction case from the original issue:

# main.py - Original test case from the ticket
def get_dashboard(check_return_type, count=10):
    configuration = Configuration(check_return_type=check_return_type)
    with ApiClient(configuration) as api_client:
        api_instance = DashboardsApi(api_client)
        for i in range(count):
            api_instance.get_dashboard("kct-7gx-vz7")  # Specific dashboard ID

# Executed with: dd-auth -- python main.py

Performance Results:

# Before optimization
check_return_type=True  - 15.0s
check_return_type=False - 3.0s   # 5.0x slowdown

# After optimization  
check_return_type=True  - 4.33s
check_return_type=False - 3.09s  # 1.4x slowdown

Additional Notes

Review checklist

Please check relevant items below:

  • This PR includes all newly recorded cassettes for any modified tests.

  • This PR does not rely on API client schema changes.

    • The CI should be fully passing.
  • Or, this PR relies on API schema changes and this is a Draft PR to include tests for that new functionality.

    • Note: CI shouldn't be run on this Draft PR, as its expected to fail without the corresponding schema changes.

Copy link
Member Author

MintsInc commented Aug 1, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@MintsInc MintsInc marked this pull request as ready for review August 1, 2025 15:04
@MintsInc MintsInc requested review from a team as code owners August 1, 2025 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant