Does qdrant support prefiltering on the fields in the payload before doing the vector embedding match? #322
If there are lots of documents, we would want to first filter them on some fields (category, etc.) and then do vector similarity on a text field. Does Qdrant support this kind of pre-filtering? Or does it run vector similarity over all documents and then filter based on the fields specified in the query?
Replies: 1 comment 8 replies
Hi @dingusagar, thanks for the great question! The short answer: we do filtering during the vector search.
More detailed answer: it depends.
For different search scenarios, the Qdrant query planner prefers different search strategies. If the filter is restrictive, it becomes much faster to just retrieve the vectors that match the filtering conditions and then re-score them by similarity.
In more complicated scenarios, it performs the search using the vector index. But unlike other implementations, we do not calculate the filtering condition for all the documents in advance; we check conditions dynamically during the traversal of the HNSW graph. That allows us to reduce the number of condition checks by an order of magnitude compared with the usual pre-filtering approach.
There are more details on how we treat HNSW filtering in the documentation: https://qdrant.tech/documentation/indexing/#filtrable-index
P.S. I am going to convert this issue into a discussion.
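To make the two strategies in the reply concrete, here is a toy Python sketch. It is not Qdrant's implementation: the point layout, the `NEIGHBORS` graph, and both function names are made up for illustration, and the graph search is a plain best-first traversal over a hand-written neighbor graph rather than real HNSW. It only shows the idea that the filter can be evaluated lazily, for the points the traversal actually reaches, instead of for every document up front.

```python
import heapq
import math

# Hypothetical toy data: id -> (vector, payload). Not a Qdrant data structure.
POINTS = {
    0: ([1.0, 0.0], {"category": "a"}),
    1: ([0.9, 0.1], {"category": "b"}),
    2: ([0.7, 0.3], {"category": "a"}),
    3: ([0.0, 1.0], {"category": "b"}),
    4: ([0.1, 0.9], {"category": "a"}),
}
# Hypothetical hand-written neighbor graph standing in for a vector index.
NEIGHBORS = {0: [1, 2], 1: [0, 2], 2: [1, 4], 3: [4], 4: [2, 3]}
QUERY = [1.0, 0.0]


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def filter_then_rescore(points, query, predicate, limit):
    """Strategy for a restrictive filter: select matches, then score only those."""
    scored = [
        (cosine(vec, query), pid)
        for pid, (vec, payload) in points.items()
        if predicate(payload)
    ]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:limit]]


def graph_search_lazy_filter(points, neighbors, entry, query, predicate, limit):
    """Best-first graph traversal that checks the filter only on visited points.

    Assumption of this toy: non-matching points are still traversed, but they
    are never returned. Returns (ids, number_of_filter_checks).
    """
    def sim(pid):
        return cosine(points[pid][0], query)

    checks = 0
    visited = {entry}
    candidates = [(-sim(entry), entry)]  # most similar candidate pops first
    best = []  # min-heap of (similarity, id), filter-passing points only

    while candidates:
        neg_s, pid = heapq.heappop(candidates)
        s = -neg_s
        # Stop once the closest remaining candidate cannot improve the results.
        if len(best) == limit and s < best[0][0]:
            break
        checks += 1  # the filter is evaluated here, during traversal
        if predicate(points[pid][1]):
            heapq.heappush(best, (s, pid))
            if len(best) > limit:
                heapq.heappop(best)
        for n in neighbors[pid]:
            if n not in visited:
                visited.add(n)
                heapq.heappush(candidates, (-sim(n), n))

    return [pid for _, pid in sorted(best, reverse=True)], checks
```

Calling both functions with a predicate such as `lambda payload: payload.get("category") == "a"` and `limit=2` returns the same ids on this toy data, but the graph version starting from entry point `1` evaluates the predicate for only part of the collection. How Qdrant's query planner actually chooses between the strategies is not covered by this sketch; the documentation link in the reply is the place to look.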