Merge #301
301: Add math symbols to default separators r=ManyTheFish a=phillitrOSU

# Pull Request

## Related issue
Fixes #300

## What does this PR do?
- Adds all math symbols from https://www.compart.com/en/unicode/category/Sm to the default separator list.
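
To illustrate the effect of treating math symbols as separators, here is a minimal sketch. It is not charabia's implementation: the `SEPARATORS` subset and the `split_words` helper are hypothetical names chosen for this example, and the real tokenizer does considerably more than split on a fixed list.

```rust
/// Hypothetical subset of separators, for illustration only.
/// The real list lives in `charabia/src/separators.rs` and is much longer.
const SEPARATORS: &[&str] = &[" ", "+", "−", "×", "÷", "≤", "≥"];

/// Returns the non-empty pieces of `text` found between separators.
/// (Assumed helper for this sketch, not part of charabia's API.)
fn split_words(text: &str) -> Vec<&str> {
    let mut words = Vec::new();
    let mut start = 0; // byte offset where the current word began

    for (idx, ch) in text.char_indices() {
        let end = idx + ch.len_utf8();
        let piece = &text[idx..end];
        if SEPARATORS.contains(&piece) {
            // Close the current word, if any, and skip past the separator.
            if start < idx {
                words.push(&text[start..idx]);
            }
            start = end;
        }
    }

    // Keep whatever follows the last separator.
    if start < text.len() {
        words.push(&text[start..]);
    }
    words
}

fn main() {
    // With "×", "≤" and "+" treated as separators, the numbers become separate words.
    println!("{:?}", split_words("3×4 ≤ 12+1"));
}
```

Without the math symbols in the list, `3×4` and `12+1` would each stay a single word in this sketch, which is the kind of behavior the linked issue is about.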

## PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?

Thank you so much for contributing to Meilisearch!


Co-authored-by: Trevor Phillips <[email protected]>
Co-authored-by: Trevor Glenn Phillips <[email protected]>
meili-bors[bot] and phillitrOSU authored Jul 25, 2024
2 parents 9f27d85 + 953eb2e commit 81f0a43
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion charabia/src/separators.rs
@@ -8,6 +8,7 @@
/// - Pi Initial Punctuation
/// - Po Other Punctuation
/// - Ps Open Punctuation
/// - Sm Math Symbol
/// - Zl Line Separator
/// - Zp Paragraph Separator
/// - Zs Space Separator
@@ -59,7 +60,19 @@ pub const DEFAULT_SEPARATORS: &[&str] = &[
"𑪠", "𑪡", "𑪢", "𑱁", "𑱂", "𑱃", "𑱄", "𑱅", "𑱰", "𑱱", "𑻷", "𑻸", "𑿿", "𒑰", "𒑱", "𒑲", "𒑳", "𒑴", "𖩮",
"𖩯", "𖫵", "𖬷", "𖬸", "𖬹", "𖬺", "𖬻", "𖭄", "𖺗", "𖺘", "𖺙", "𖺚", "𖿢", "𛲟", "𝪇", "𝪈", "𝪉", "𝪊", "𝪋",
"𞥞", "𞥟", "\n", "\r", "\u{2029}", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ", " ",
" ", " ", "`", "\t"
" ", " ", "`", "\t", "+", "-", "±", "×", "÷", "−", "∓", "∔", "∕", "∖", "∗", "∘", "∙", "√", "∛",
"∜", "∝", "∞", "∟", "∠", "∡", "∢", "∣", "∤", "∥", "∧", "∨", "∩", "∪", "∫", "∬", "∭", "∮", "∯",
"∰", "∱", "∲", "∳", "∴", "∵", "∶", "∷", "∸", "∹", "∺", "∻", "∼", "∽", "∾", "∿", "≀", "≁", "≂",
"≃", "≄", "≅", "≆", "≇", "≈", "≉", "≊", "≋", "≌", "≍", "≎", "≏", "≐", "≑", "≒", "≓", "≔", "≕",
"≖", "≗", "≘", "≙", "≚", "≛", "≜", "≝", "≞", "≟", "≠", "≡", "≢", "≣", "≤", "≥", "≦", "≧", "≨",
"≩", "≪", "≫", "≬", "≭", "≮", "≯", "≰", "≱", "≲", "≳", "≴", "≵", "≶", "≷", "≸", "≹", "≺",
"≻", "≼", "≽", "≾", "≿", "⊀", "⊁", "⊂", "⊃", "⊄", "⊅", "⊆", "⊇", "⊈", "⊉", "⊊", "⊋", "⊌", "⊍",
"⊏", "⊐", "⊑", "⊒", "⊓", "⊔", "⊕", "⊖", "⊗", "⊘", "⊙", "⊚", "⊛", "⊜", "⊝", "⊞", "⊟", "⊠", "⊡",
"⊢", "⊣", "⊤", "⊥", "⊦", "⊧", "⊨", "⊩", "⊪", "⊫", "⊬", "⊭", "⊮", "⊯", "⊰", "⊱", "⊲", "⊳", "⊴",
"⊵", "⊶", "⊷", "⊸", "⊹", "⊺", "⊻", "⊼", "⊽", "⊾", "⊿", "⋀", "⋁", "⋂", "⋃", "⋄", "⋅", "⋆", "⋇",
"⋈", "⋉", "⋊", "⋋", "⋌", "⋍", "⋎", "⋏", "⋐", "⋑", "⋒", "⋓", "⋔", "⋕", "⋖", "⋗", "⋘", "⋙", "⋚",
"⋛", "⋜", "⋝", "⋞", "⋟", "⋠", "⋡", "⋢", "⋣", "⋤", "⋥", "⋦", "⋧", "⋨", "⋩", "⋪", "⋫", "⋬", "⋭",
"⋮", "⋯", "⋰", "⋱", "⋲", "⋳", "⋴", "⋵", "⋶", "⋷", "⋸", "⋹", "⋺", "⋻", "⋼", "⋽", "⋾", "⋿"
];

#[rustfmt::skip]