Make max_analyzed_offset configurable and apply it on search #20

Open
wants to merge 1 commit into
base: master

andreasferber

Elasticsearch has a per-index setting, index.highlight.max_analyzed_offset, that governs how much of the indexed data is analyzed for search result snippet highlighting. Searches have a corresponding max_analyzed_offset option. Without that search option, a search that finds results in indexed media files larger than the default setting (1,000,000 characters, about 1 MB) fails with an error.

This PR makes the index setting configurable and ensures that a matching option is set when querying the index, which avoids the error described above.
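
To illustrate where the two values live, here is a minimal sketch of the request bodies involved, written as plain Python dicts rather than the plugin's actual PHP code. The field name `content`, the helper names, and the example limit of 5,000,000 are assumptions made for illustration only; the setting and option names are the ones mentioned above.

```python
import json

# Assumed example value; not taken from the PR.
EXAMPLE_MAX_OFFSET = 5_000_000


def index_settings_body(max_offset: int) -> dict:
    """Settings body carrying the per-index highlight limit."""
    return {"settings": {"index.highlight.max_analyzed_offset": max_offset}}


def search_body(query_text: str, max_offset: int) -> dict:
    """Search body that sets the matching highlight option on the query.

    "content" is a hypothetical field name standing in for whatever
    field holds the indexed page or media text.
    """
    return {
        "query": {"match": {"content": query_text}},
        "highlight": {
            "max_analyzed_offset": max_offset,
            "fields": {"content": {}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(index_settings_body(EXAMPLE_MAX_OFFSET), indent=2))
    print(json.dumps(search_body("wiki", EXAMPLE_MAX_OFFSET), indent=2))
```

Both bodies take the same number here, which mirrors the "matching option" idea of this PR; whether the search value should instead be strictly lower is discussed further down.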

@splitbrain
Member

What's the exact error you get? I would assume that elastic simply does not return a search snippet, so the error would be some undefined field warning in the plugin itself?

@andreasferber
Author

I don't remember the exact error message, but it was a genuine Elasticsearch error, not something on the PHP/DokuWiki end.

See also the Elasticsearch documentation:

max_analyzed_offset
By default, the maximum number of characters analyzed for a highlight request is bounded by the value defined in the index.highlight.max_analyzed_offset setting, and when the number of characters exceeds this limit an error is returned. [...]

(from https://www.elastic.co/guide/en/elasticsearch/reference/current/highlighting.html#max-analyzed-offset)

@splitbrain
Member

If I understand this correctly, there are two values: a maximum set at index level (index.max) and a maximum set at query level (query.max).

If no query.max is set and the index can't find a highlight before index.max, an error is thrown.

If a query.max is set and the index can't find a highlight before query.max, no highlight is returned and no error is thrown.

query.max should be smaller than index.max, but your change suggests that even a query.max > index.max will suppress the error.

I have not been able to reproduce this issue yet, because I haven't found the right combination of document and query.

Since the plugin creates the index, I think it should set a sensible index.max and query.max. I don't think an option would be needed.
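
One way to read "sensible index.max and query.max" is to derive the query-level value from the index-level one so that it always stays below it. The sketch below is only an illustration of that idea in Python, not the plugin's code; the function name and the choice of "index limit minus one" as the ceiling are assumptions.

```python
# Elasticsearch's documented default for index.highlight.max_analyzed_offset.
DEFAULT_INDEX_MAX = 1_000_000


def highlight_limits(index_max=DEFAULT_INDEX_MAX, query_max=None):
    """Return (index_max, query_max) with query_max kept below index_max.

    Hypothetical helper: if no query-level value is requested, or the
    requested one is too large, fall back to index_max - 1.
    """
    if index_max < 2:
        raise ValueError("index_max must be at least 2")
    ceiling = index_max - 1
    if query_max is None or query_max > ceiling:
        query_max = ceiling
    if query_max < 1:
        raise ValueError("query_max must be positive")
    return index_max, query_max


if __name__ == "__main__":
    print(highlight_limits())                       # defaults only
    print(highlight_limits(5_000_000, 10_000_000))  # oversized request is clamped
    print(highlight_limits(2_000, 500))             # smaller request is kept
```

With a helper like this there is no user-facing option: the index is created with the first value and every search sends the second.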
