Add HTML indexing capabilities #505
Replies: 2 comments 1 reply
-
Hi @martonx! We have already discussed this, and our current position is that we do not handle user data cleaning. However, we will probably reconsider it once our essential work in progress is finished; it could take the shape of an indexing pipeline that accepts multiple routines such as trimming, HTML sanitization, third-party translation, etc. For now, the workaround is to have two fields: one in HTML format for displaying the search result, and an HTML-sanitized version for searching (the latter must be listed in the searchable attributes).
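The two-field workaround above can be sketched as follows. This is only a minimal illustration using Python's standard library: the field names `content_html` and `content_text` and the helper names are hypothetical, and the tag stripping is deliberately simple rather than a full sanitizer.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space so adjacent blocks do not run together,
        # then collapse all whitespace runs into single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_document(doc_id, html):
    """Build a document with one field for display and one for searching."""
    return {
        "id": doc_id,
        "content_html": html,                # shown in search results
        "content_text": html_to_text(html),  # the only field that is searched
    }


# Settings payload restricting search to the cleaned field (assumed field name).
SETTINGS = {"searchableAttributes": ["content_text"]}
```

With this layout, the HTML version is still returned with each hit, but markup such as tag names and CSS classes never ends up in the searchable text.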
-
Hey @martonx! This is related to #474. Don't hesitate to upvote it and add an explanation of your use case!
-
I had a discussion in the Q&A section, and I have a feature proposal based on it: meilisearch/meilisearch#2650 (reply in thread)
I hope you'll like the idea of indexing HTML content. The basic idea is to send the content as HTML and have the indexer index only the extracted, cleaned text, not the HTML elements, classes, tags, etc.