Add HTML indexing capabilities #505

martonx · 2022-08-02T19:02:13Z

I had a discussion in the Q&A section, and I have a feature proposal based on it: meilisearch/meilisearch#2650 (reply in thread)

I hope you'll like the idea of indexing HTML content. The basic idea is to send the content as HTML, and the indexer would index only the scraped, cleaned text, not the HTML elements, classes, tags, etc.

gmourier · 2022-08-08T10:38:29Z

Maintainer

Hi @martonx

We have already discussed this, and our current position is that Meilisearch does not handle cleaning of user data. However, this will probably be reconsidered once we finish our essential work in progress. It could take the shape of an indexing pipeline that accepts multiple routines such as trimming, HTML sanitization, translation through a third party, etc.

For now, the workaround is to have two fields: one in HTML format for displaying the search result, and an HTML-sanitized version for searching (the latter must be specified in the searchable attributes).
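As a rough illustration of that workaround, here is a minimal sketch that derives a plain-text field from an HTML field before the document is sent for indexing. It uses only Python's standard-library `html.parser`. The field names `content_html` and `content_text` and the helpers `html_to_text` and `build_document` are hypothetical, chosen for this example only, and this is not a full sanitizer.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with spaces, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def build_document(doc_id, html):
    """Build a document with one display field and one search field.

    Field names are assumptions for this sketch: `content_html` is kept
    for display, `content_text` is the one to list in the index's
    searchable attributes.
    """
    return {
        "id": doc_id,
        "content_html": html,
        "content_text": html_to_text(html),
    }
```

Joining text chunks with a space keeps words from fusing across block-level tags, at the cost of splitting a word that has inline markup in the middle of it. If that matters for your content, a dedicated HTML-to-text library may be a better fit.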

ManyTheFish · 2022-08-09T10:00:22Z

Collaborator

Hey @martonx! This is related to #474. Don't hesitate to upvote it and add some explanation of your use case!

1 reply

gmourier Aug 9, 2022
Maintainer

Thanks @ManyTheFish and @martonx, I'm locking this one in favor of #474.
