Skip to content

fix(embeddings): recover missing FastEmbed ONNX cache#1955

Open
Pouyanpi wants to merge 1 commit into
developfrom
stack/fastembed-missing-onnx-cache
Open

fix(embeddings): recover missing FastEmbed ONNX cache#1955
Pouyanpi wants to merge 1 commit into
developfrom
stack/fastembed-missing-onnx-cache

Conversation

@Pouyanpi

@Pouyanpi Pouyanpi commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

Description

Retry FastEmbed initialization with an explicit local cache when ONNXRuntime reports a missing model.onnx from a stale cache.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced robustness of embedding model initialization with automatic recovery from missing model cache scenarios.

Retry FastEmbed initialization with an explicit local cache when ONNXRuntime reports a missing model.onnx from a stale cache.
@Pouyanpi Pouyanpi added this to the v0.23.0 milestone Jun 1, 2026
@Pouyanpi Pouyanpi self-assigned this Jun 1, 2026
@Pouyanpi Pouyanpi added the bug Something isn't working label Jun 1, 2026
@greptile-apps

greptile-apps Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR widens FastEmbed initialization error handling so that onnxruntime.NoSuchFile (and similar cache-miss errors) trigger the existing .cache fallback, not just the older ValueError path. It also refactors the error-matching into a dedicated helper and fixes raise ex to raise for proper traceback preservation.

  • fastembed.py: Catches Exception instead of ValueError, extracts _is_missing_onnx_model_error to match the three known missing-model message patterns, and safely copies kwargs before injecting cache_dir.
  • test_embeddings_fastembed.py: Adds two unit tests for the recovery and re-raise paths; imports onnxruntime.capi.onnxruntime_pybind11_state.NoSuchFile at module level, which can cause test-collection failures in environments where onnxruntime is not installed.

Confidence Score: 4/5

The production change is safe and well-scoped; the only rough edge is in the test file.

The production code is clean: the exception-type widening is necessary, the string-match helper is appropriately specific, and the raise fix is a strict improvement. The only concern is the top-level onnxruntime import in the test file, which can silently break test collection in environments without onnxruntime installed — a condition that did not exist before this PR.

tests/test_embeddings_fastembed.py — the hard onnxruntime import at module level.

Important Files Changed

Filename Overview
nemoguardrails/embeddings/providers/fastembed.py Introduces _is_missing_onnx_model_error helper to detect stale ONNX cache errors by string matching; widens the caught exception type from ValueError to Exception (needed since NoSuchFile is not a ValueError); correctly uses raise instead of raise ex to preserve the original traceback; copies kwargs before mutating to avoid side effects.
tests/test_embeddings_fastembed.py Adds two new unit tests covering the cache-recovery path and unrelated-error re-raise. The top-level from onnxruntime... import NoSuchFile creates a hard dependency that can break test-collection in environments without onnxruntime; using RuntimeError with the same message would be sufficient since the production helper only checks the string content.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant FastEmbedEmbeddingModel
    participant Embedding as fastembed.TextEmbedding
    participant Helper as _is_missing_onnx_model_error

    Caller->>FastEmbedEmbeddingModel: "__init__(embedding_model, **kwargs)"
    FastEmbedEmbeddingModel->>Embedding: "Embedding(embedding_model, **kwargs)"
    alt Success
        Embedding-->>FastEmbedEmbeddingModel: model instance
    else Exception raised
        Embedding-->>FastEmbedEmbeddingModel: raises Exception (e.g. NoSuchFile)
        FastEmbedEmbeddingModel->>Helper: _is_missing_onnx_model_error(ex)
        alt ONNX cache miss detected
            Helper-->>FastEmbedEmbeddingModel: True
            FastEmbedEmbeddingModel->>Embedding: "Embedding(embedding_model, cache_dir=".cache", **kwargs)"
            Embedding-->>FastEmbedEmbeddingModel: model instance (downloads to .cache)
        else Unrelated error
            Helper-->>FastEmbedEmbeddingModel: False
            FastEmbedEmbeddingModel-->>Caller: raise (original exception + traceback)
        end
    end
    FastEmbedEmbeddingModel->>FastEmbedEmbeddingModel: "self.embedding_size = len(embed("test"))"
    FastEmbedEmbeddingModel-->>Caller: model ready
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
tests/test_embeddings_fastembed.py:18-21
Top-level import of `onnxruntime.capi.onnxruntime_pybind11_state.NoSuchFile` makes the entire test module unimportable in any environment that lacks `onnxruntime`. Since `_is_missing_onnx_model_error` only inspects the string message (not the exception type), there's no need to use the real ONNXRuntime class — a plain `RuntimeError` with the same message text exercises exactly the same code path without the hard dependency.

```suggestion
import pytest

from nemoguardrails.embeddings.providers.fastembed import FastEmbedEmbeddingModel
```

### Issue 2 of 2
tests/test_embeddings_fastembed.py:37-40
After removing the top-level `NoSuchFile` import, the test can use a plain `RuntimeError` with the identical message. The production helper only string-matches on the message, so this change preserves full test coverage while eliminating the `onnxruntime` hard dependency.

```suggestion
            if len(calls) == 1:
                raise RuntimeError(
                    "[ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/model.onnx failed. File doesn't exist"
                )
```

Reviews (1): Last reviewed commit: "fix(embeddings): recover missing FastEmb..." | Re-trigger Greptile

Comment on lines 18 to 21
import pytest
from onnxruntime.capi.onnxruntime_pybind11_state import NoSuchFile

from nemoguardrails.embeddings.providers.fastembed import FastEmbedEmbeddingModel

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Top-level import of onnxruntime.capi.onnxruntime_pybind11_state.NoSuchFile makes the entire test module unimportable in any environment that lacks onnxruntime. Since _is_missing_onnx_model_error only inspects the string message (not the exception type), there's no need to use the real ONNXRuntime class — a plain RuntimeError with the same message text exercises exactly the same code path without the hard dependency.

Suggested change
import pytest
from onnxruntime.capi.onnxruntime_pybind11_state import NoSuchFile
from nemoguardrails.embeddings.providers.fastembed import FastEmbedEmbeddingModel
import pytest
from nemoguardrails.embeddings.providers.fastembed import FastEmbedEmbeddingModel
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/test_embeddings_fastembed.py
Line: 18-21

Comment:
Top-level import of `onnxruntime.capi.onnxruntime_pybind11_state.NoSuchFile` makes the entire test module unimportable in any environment that lacks `onnxruntime`. Since `_is_missing_onnx_model_error` only inspects the string message (not the exception type), there's no need to use the real ONNXRuntime class — a plain `RuntimeError` with the same message text exercises exactly the same code path without the hard dependency.

```suggestion
import pytest

from nemoguardrails.embeddings.providers.fastembed import FastEmbedEmbeddingModel
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +37 to +40
if len(calls) == 1:
raise NoSuchFile(
"[ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/model.onnx failed. File doesn't exist"
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 After removing the top-level NoSuchFile import, the test can use a plain RuntimeError with the identical message. The production helper only string-matches on the message, so this change preserves full test coverage while eliminating the onnxruntime hard dependency.

Suggested change
if len(calls) == 1:
raise NoSuchFile(
"[ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/model.onnx failed. File doesn't exist"
)
if len(calls) == 1:
raise RuntimeError(
"[ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/model.onnx failed. File doesn't exist"
)
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/test_embeddings_fastembed.py
Line: 37-40

Comment:
After removing the top-level `NoSuchFile` import, the test can use a plain `RuntimeError` with the identical message. The production helper only string-matches on the message, so this change preserves full test coverage while eliminating the `onnxruntime` hard dependency.

```suggestion
            if len(calls) == 1:
                raise RuntimeError(
                    "[ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /tmp/model.onnx failed. File doesn't exist"
                )
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@codecov

codecov Bot commented Jun 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds automatic fallback recovery when FastEmbedEmbeddingModel encounters a missing ONNX model cache during initialization. A new helper function detects the specific error pattern, triggers a retry with an explicit cache directory, and allows unrelated errors to propagate. Two unit tests verify both the recovery path and correct error re-raising behavior.

Changes

ONNX Model Cache Recovery

Layer / File(s) Summary
Detection helper and retry logic
nemoguardrails/embeddings/providers/fastembed.py
_is_missing_onnx_model_error helper classifies exceptions by inspecting messages for "model.onnx" and missing-file indicators. Model initialization replaces ValueError-only handling with generalized Exception handling that uses the helper to decide whether to retry with cache_dir=".cache" or re-raise.
Recovery and error handling tests
tests/test_embeddings_fastembed.py
Test imports add NoSuchFile and define a fake embedding vector stub. test_recovers_from_missing_onnxruntime_model_cache monkeypatches initialization to fail then succeed and verifies retry with cache directory. test_reraises_unrelated_fastembed_errors validates that non-missing-ONNX errors are propagated unchanged.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(embeddings): recover missing FastEmbed ONNX cache' directly and specifically describes the main change: handling missing ONNX cache errors in FastEmbed by retrying with an explicit cache directory.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Test Results For Major Changes ✅ Passed Minor bug fix with comprehensive test coverage included; changes are localized (+13/-4) and don't affect numerics, convergence, or normal performance.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch stack/fastembed-missing-onnx-cache

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/test_embeddings_fastembed.py (1)

19-19: ⚡ Quick win

Unconditional module imports are covered by default deps (CI), but still make tests require onnxruntime/fastembed

tests/test_embeddings_fastembed.py imports NoSuchFile and FastEmbedEmbeddingModel at module import time, but pyproject.toml declares both onnxruntime and fastembed under [tool.poetry.dependencies] (non-optional), and CI installs them with poetry install --with dev before running pytest—so collection should not fail in the standard test environment. If the goal is to allow running these tests without the embedding stack, make the imports lazy (inside tests) or use pytest.importorskip.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_embeddings_fastembed.py` at line 19, The test imports (NoSuchFile
and FastEmbedEmbeddingModel) at module scope causing test collection to fail
when onnxruntime/fastembed aren't present; make these imports conditional by
moving them into the test functions or using pytest.importorskip at the top of
the test file to skip the module when dependencies are missing (e.g., call
pytest.importorskip("onnxruntime") and pytest.importorskip("fastembed") or wrap
imports of NoSuchFile and FastEmbedEmbeddingModel inside the specific tests that
use them), ensuring tests/test_embeddings_fastembed.py can be collected without
the embedding stack.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/test_embeddings_fastembed.py`:
- Line 19: The test imports (NoSuchFile and FastEmbedEmbeddingModel) at module
scope causing test collection to fail when onnxruntime/fastembed aren't present;
make these imports conditional by moving them into the test functions or using
pytest.importorskip at the top of the test file to skip the module when
dependencies are missing (e.g., call pytest.importorskip("onnxruntime") and
pytest.importorskip("fastembed") or wrap imports of NoSuchFile and
FastEmbedEmbeddingModel inside the specific tests that use them), ensuring
tests/test_embeddings_fastembed.py can be collected without the embedding stack.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 91a656f4-5503-44a3-95dd-76c7833a0f0f

📥 Commits

Reviewing files that changed from the base of the PR and between a6fc06f and 50673f9.

📒 Files selected for processing (2)
  • nemoguardrails/embeddings/providers/fastembed.py
  • tests/test_embeddings_fastembed.py

@Pouyanpi Pouyanpi closed this Jun 22, 2026
@Pouyanpi Pouyanpi reopened this Jun 22, 2026
@github-actions github-actions Bot added status: needs triage New issues that have not yet been reviewed or categorized. size: S labels Jun 22, 2026
self.model = Embedding(embedding_model, cache_dir=".cache", **kwargs)
if _is_missing_onnx_model_error(ex):
fallback_kwargs = dict(kwargs)
fallback_kwargs["cache_dir"] = ".cache"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we make this an absolute path eg:

fallback_kwargs["cache_dir"] = str(Path.home() / ".cache" / "fastembed")

my concern is that is we use a relative path, it will fail in docker containers since cwd is usually / or read only

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or alternatively, log a warning when overriding the value

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size: S status: needs triage New issues that have not yet been reviewed or categorized.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants