Boosting scores based on Array Attribute Matching in Vespa #35844

im-atulya · 2026-02-11T13:33:20Z

I am working with a schema where a field mcat_tree is defined as an array of strings (e.g. "mcat_tree": ["189194", "192170", "PID 19217", "189194R", "189194P"]). It is defined as an attribute field for memory/scale reasons. My goal is simple: in the ranking phase, I want to check whether a specific string ("191984"), or any one of the strings from an input array (["12345", "191984"]) passed as a query parameter, exists in that array. If it does, I want to apply a significant boost (e.g., +100 to the score).

The Problem: While searching/filtering in YQL using contains works perfectly, I’ve struggled to find a clean, performant way to do this inside a rank-profile using only the array attribute. Nothing I have tried has worked, for one reason or another. The only solution that has worked reliably is duplicating the data into a mapped tensor: field mcat_tree_tensor type tensor(mcat_tree{})

And then ranking using: sum(query(my_param) * attribute(mcat_tree_tensor))
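For readers following along, the mirrored-tensor workaround described above might look roughly like the sketch below. The schema name `product` and the profile name `mcat_boost_mirrored` are hypothetical; only `mcat_tree`, `mcat_tree_tensor` and `query(my_param)` come from the post, and the cell values are assumed to be 1.0 on both the document and the query side.

```
# Sketch only: "product" and "mcat_boost_mirrored" are made-up names.
schema product {
    document product {
        # Original array attribute, used for filtering with "contains" in YQL
        field mcat_tree type array<string> {
            indexing: attribute | summary
        }
        # Mirrored copy as a mapped tensor: one cell per string, value 1.0 (assumed)
        field mcat_tree_tensor type tensor(mcat_tree{}) {
            indexing: attribute | summary
        }
    }
    rank-profile mcat_boost_mirrored {
        inputs {
            # Query-side labels, e.g. {{mcat_tree:12345}:1,{mcat_tree:191984}:1}
            query(my_param) tensor(mcat_tree{})
        }
        first-phase {
            # With 1.0 cells on both sides, this is the number of shared labels
            expression: sum(query(my_param) * attribute(mcat_tree_tensor))
        }
    }
}
```

The product only has cells for labels present in both tensors, so the sum is 0 when nothing overlaps and positive otherwise.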

Questions for the Community:

Is this the intended pattern? Is creating a mirrored tensor the recommended "Vespa way" for ranking against array elements, or is there a way to use the array attribute directly in an expression that I’m missing?

Memory Bloat: Tensors are powerful but memory-heavy. If I have millions of documents with 10-20 strings per array, and 3 array fields per document, is the memory overhead of a mapped tensor the "price of entry" for this logic?

Future Roadmap: Is there a plan to allow simpler element-based checks in ranking expressions for array attributes (e.g., an in or contains operator) to avoid the tensor conversion?

I’d love to hear how others handle "parameter-based boosting against multivalued array type attributes" without ballooning up their RAM usage.

Answered by andreer

andreer · 2026-02-11T13:41:53Z

Collaborator

Hi!

Instead of duplicating the field you could use tensorFromLabels(attribute,dimension): see https://docs.vespa.ai/en/reference/ranking/rank-features.html#document-features.

But what you probably want is to use one of the multi-value attribute rank features: https://docs.vespa.ai/en/reference/ranking/rank-features.html#features-for-indexed-multivalue-string-fields
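As a rough illustration of the tensorFromLabels suggestion, the sketch below builds the mapped tensor from the array attribute at ranking time, so no mirrored field is stored. The schema name `product`, the profile name `mcat_boost`, the function name `mcat_overlap` and the use of `nativeRank` as the base score are all assumptions made for the example, not part of the original answer.

```
# Sketch only: names other than mcat_tree and query(my_param) are made up.
schema product {
    document product {
        field mcat_tree type array<string> {
            indexing: attribute | summary
        }
    }
    rank-profile mcat_boost {
        inputs {
            # e.g. input.query(my_param)={{mcat_tree:12345}:1,{mcat_tree:191984}:1}
            query(my_param) tensor(mcat_tree{})
        }
        # tensorFromLabels turns the array into tensor(mcat_tree{}) with value 1.0 per element
        function mcat_overlap() {
            expression: sum(query(my_param) * tensorFromLabels(attribute(mcat_tree), mcat_tree))
        }
        first-phase {
            # Flat +100 when at least one query label occurs in the array; nativeRank is a placeholder
            expression: nativeRank + if(mcat_overlap > 0, 100, 0)
        }
    }
}
```

Wrapping the overlap in if(...) keeps the boost at a flat +100 however many labels match; dropping the if would instead scale the score with the number of matching labels.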

omni-front · 2026-03-12T14:51:45Z

not sure about this one tbh. maybe check the Vespa docs or ask in their forums if no one else chimes in here.
