Ignore HTML tags at search #119
Replies: 7 comments 8 replies
-
Hello @mishushakov! Thanks for your report and your suggestion!
-
Might be related to meilisearch/meilisearch#955
-
Those problems are not completely related, but I'd assume that one solution would solve both. In my case, I'd like to run search queries against HTML pages, but MeiliSearch doesn't recognise XML/HTML, so it indexes not only the text content but also the tags, which should have been stripped.
-
I'm still in need of this particular feature. Any progress?
-
Hello everyone! 👋 Quick update: we plan to work on a way to customize the tokenizer separators in the near term (ideally v1.2). Would you find it valuable to list HTML/XML tags as separators in an index's settings, so that the engine does not index them as searchable terms? Is this a critical need, given that, as a workaround, you can already store a stripped version of the HTML in a dedicated field (listed as a searchableAttribute) and keep the HTML version for display (not listed as a searchableAttribute)? Do you have difficulties with this workaround? Thanks for your help! cc @ManyTheFish
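For anyone who wants to try the workaround described above, here is a minimal sketch in Python using only the standard library. The field names `content_html` and `content_text` and the helpers `strip_html` and `build_document` are hypothetical, chosen for illustration; `searchableAttributes` is the index setting mentioned in the comment. The sketch only builds the document and the settings payload, and how you send them to your instance depends on your client.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes; drops tags and the bodies of <script>/<style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join text nodes with a space, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    """Return the plain-text content of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_document(doc_id, html):
    """Keep the HTML for display and add a stripped copy for searching.

    The field names are assumptions made for this example.
    """
    return {
        "id": doc_id,
        "content_html": html,              # displayed, not searched
        "content_text": strip_html(html),  # searched
    }


# Settings payload: only the stripped field is listed as searchable.
SETTINGS = {"searchableAttributes": ["content_text"]}

doc = build_document(
    1, '<p>Search with <a href="https://google.com">google</a> &amp; friends</p>'
)
print(doc["content_text"])
```

Note that this sketch inserts a space between adjacent text nodes, so `foo<b>bar</b>` becomes `foo bar`; adjust the joining rule if inline tags can split words in your content.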
-
Hello, any news about this feature? I need to search text wrapped in HTML tags. It's weird for the user to have to type
-
If I have a text:
If I search "google" it works. Is this expected, or is there currently a setting to make it work?
-
Is your feature request related to a problem? Please describe.
The search should ignore HTML tags
Describe the solution you'd like
MeiliSearch should strip HTML tags when processing search requests as Algolia does
Describe alternatives you've considered
Manually stripping HTML tags and storing the resulting text in the index