Commit c306bee

fix: adding missing abbreviations files for SentenceSplitter (#8660)
* adding missing abbreviations files for SentenceSplitter
* fixing tests path
1 parent 91619a7 commit c306bee

5 files changed, +2075 -2 lines changed

Diff for: .pre-commit-config.yaml

+1
@@ -26,6 +26,7 @@ repos:
     rev: v2.3.0
     hooks:
       - id: codespell
+        exclude: "haystack/data/abbreviations"
         args: ["--toml", "pyproject.toml"]
         additional_dependencies:
           - tomli

Diff for: haystack/components/preprocessors/sentence_tokenizer.py

+1 -1
@@ -228,7 +228,7 @@ def _read_abbreviations(lang: Language) -> List[str]:
         :param lang: The language to read the abbreviations for.
         :returns: List of abbreviations.
         """
-        abbreviations_file = Path(__file__).parent.parent / f"data/abbreviations/{lang}.txt"
+        abbreviations_file = Path(__file__).parent.parent.parent / f"data/abbreviations/{lang}.txt"
         if not abbreviations_file.exists():
             logger.warning("No abbreviations file found for {language}. Using default abbreviations.", language=lang)
             return []
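
To see why the extra `.parent` matters, the path arithmetic can be sketched outside the library. This is an illustration only: `abbreviations_path` is a hypothetical helper, not part of Haystack, and the language code "en" is an assumed example value. The module path is taken from the diff header, and the expected data directory from the codespell `exclude` entry in this commit.

```python
from pathlib import PurePosixPath

# Module location as shown in the diff header of this commit.
MODULE = PurePosixPath("haystack/components/preprocessors/sentence_tokenizer.py")


def abbreviations_path(module_file: PurePosixPath, levels_up: int, lang: str) -> PurePosixPath:
    """Hypothetical helper: climb `levels_up` parents from the module file,
    then append the relative data path used in `_read_abbreviations`."""
    base = module_file
    for _ in range(levels_up):
        base = base.parent
    return base / f"data/abbreviations/{lang}.txt"


# "en" is an assumed language code, used only for illustration.
old = abbreviations_path(MODULE, 2, "en")  # before the fix: .parent.parent
new = abbreviations_path(MODULE, 3, "en")  # after the fix: .parent.parent.parent

print(old)  # haystack/components/data/abbreviations/en.txt
print(new)  # haystack/data/abbreviations/en.txt
```

With two `.parent` steps the lookup lands under `haystack/components/`, so `abbreviations_file.exists()` would be false and the function would log the warning and return an empty list. Three steps reach the package root, matching the `haystack/data/abbreviations` directory named in the pre-commit change.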
