Pluggable Ranking Collectors #2

Open
joel-bernstein opened this issue Jan 13, 2014 · 10 comments

@joel-bernstein
Contributor

It would be useful if users could write and plug in their own ranking collectors.
This ticket points to two different design options:

First design:
The first design in this ticket allows users to inject ranking collectors into Heliosearch through an extension of the PostFilter and DelegatingCollector framework.

To add your own ranking collector, you extend the new abstract Ranker class and return a DelegatingCollector that collects a DocList/DocSet rather than filtering docs. The DelegatingCollector.finish() method is now passed a reference to the QueryResult so that it can add the DocList and DocSet when finish is called.

You can then plug in your Ranker impl through the QueryParserPlugin mechanism.

The SolrIndexSearcher checks after the call to DelegatingCollector.finish(QueryResult) to see if the QueryResult has a DocList set. If it finds the DocList, it returns without going through the steps of gathering the DocList from the TopDocsCollectors.
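
To make the flow above concrete, here is a minimal plain-Java sketch. It is not the code from the plugrank branch: `QueryResult`, `DelegatingCollector` and `Ranker` are simplified stand-ins for the real classes, a `List<Integer>` stands in for a DocList, and `TopScoreRanker`, `RankerSketch`, `getCollector(rows)` and `collect(doc, score)` are hypothetical names and signatures chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Solr's QueryResult; a List<Integer> replaces DocList.
class QueryResult {
    private List<Integer> docList;
    List<Integer> getDocList() { return docList; }
    void setDocList(List<Integer> docList) { this.docList = docList; }
}

// Stand-in for DelegatingCollector; collect(doc, score) is an assumed signature.
abstract class DelegatingCollector {
    abstract void collect(int doc, float score);
    // In the first design, finish() is handed the QueryResult.
    abstract void finish(QueryResult result);
}

// Stand-in for the abstract Ranker class; getCollector(rows) is hypothetical.
abstract class Ranker {
    abstract DelegatingCollector getCollector(int rows);
}

// Hypothetical Ranker that keeps the `rows` highest-scoring docs.
class TopScoreRanker extends Ranker {
    private static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    @Override
    DelegatingCollector getCollector(final int rows) {
        return new DelegatingCollector() {
            private final List<Hit> hits = new ArrayList<>();

            @Override
            void collect(int doc, float score) {
                hits.add(new Hit(doc, score)); // collect instead of filtering
            }

            @Override
            void finish(QueryResult result) {
                // Rank by descending score, then publish the top `rows` doc ids.
                hits.sort((a, b) -> Float.compare(b.score, a.score));
                List<Integer> top = new ArrayList<>();
                for (Hit h : hits.subList(0, Math.min(rows, hits.size()))) {
                    top.add(h.doc);
                }
                result.setDocList(top);
            }
        };
    }
}

class RankerSketch {
    // Mimics the searcher: if finish() set a DocList, return it right away.
    static List<Integer> search(Ranker ranker, int[] docs, float[] scores, int rows) {
        DelegatingCollector collector = ranker.getCollector(rows);
        for (int i = 0; i < docs.length; i++) {
            collector.collect(docs[i], scores[i]);
        }
        QueryResult result = new QueryResult();
        collector.finish(result);
        if (result.getDocList() != null) {
            return result.getDocList();
        }
        throw new IllegalStateException("default TopDocsCollector path omitted in this sketch");
    }

    public static void main(String[] args) {
        System.out.println(search(new TopScoreRanker(), new int[] {1, 2, 3},
                new float[] {0.5f, 2.0f, 1.0f}, 2));
    }
}
```

The point of the sketch is only the control flow: the collector owns the ranking, and the searcher short-circuits once the result carries a DocList.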

The QueryComponent.mergeIds() method has been changed from private to protected so it can be overridden to handle custom ranking logic when merging the docIds from distributed searches.

The initial impl was committed to the plugrank branch in my Heliosearch git repo.

https://github.com/joelbernstein2013/heliosearch/tree/plugrank
https://github.com/joelbernstein2013/heliosearch/compare/plugrank

Second design:

The second design allows users to inject ranking collectors through a new CollectorFactory interface. A CollectorFactory can be added to the QueryCommand through a custom search component. The initial impl shows how the SolrIndexSearcher uses the CollectorFactory to plug in a TopDocsCollector if the QueryCommand has the CollectorFactory set.
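
A rough plain-Java sketch of this second design follows. It is an illustration under assumptions, not the plugrank2 code: `RankingCollector` is a simplified stand-in for a TopDocsCollector, the `getCollector(rows)` signature on `CollectorFactory` is assumed, `QueryCommand` is reduced to the two fields needed here, and `ScoreOrderCollector`, `HighestDocIdCollector` and `FactorySearcherSketch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for a TopDocsCollector.
interface RankingCollector {
    void collect(int doc, float score);
    List<Integer> topDocs();
}

// Assumed shape of the CollectorFactory interface.
interface CollectorFactory {
    RankingCollector getCollector(int rows);
}

// Reduced stand-in for QueryCommand.
class QueryCommand {
    private final int rows;
    private CollectorFactory collectorFactory; // null means "use the default ranking"
    QueryCommand(int rows) { this.rows = rows; }
    int getRows() { return rows; }
    CollectorFactory getCollectorFactory() { return collectorFactory; }
    void setCollectorFactory(CollectorFactory f) { this.collectorFactory = f; }
}

// Default ranking in this sketch: highest score first.
class ScoreOrderCollector implements RankingCollector {
    private static final class Entry {
        final int doc;
        final float score;
        Entry(int doc, float score) { this.doc = doc; this.score = score; }
    }
    private final List<Entry> entries = new ArrayList<>();
    private final int rows;
    ScoreOrderCollector(int rows) { this.rows = rows; }
    public void collect(int doc, float score) { entries.add(new Entry(doc, score)); }
    public List<Integer> topDocs() {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> out = new ArrayList<>();
        for (Entry e : sorted.subList(0, Math.min(rows, sorted.size()))) {
            out.add(e.doc);
        }
        return out;
    }
}

// Hypothetical custom ranking: highest doc id first, ignoring the score.
class HighestDocIdCollector implements RankingCollector {
    private final List<Integer> docs = new ArrayList<>();
    private final int rows;
    HighestDocIdCollector(int rows) { this.rows = rows; }
    public void collect(int doc, float score) { docs.add(doc); }
    public List<Integer> topDocs() {
        List<Integer> sorted = new ArrayList<>(docs);
        sorted.sort(Collections.reverseOrder());
        return new ArrayList<>(sorted.subList(0, Math.min(rows, sorted.size())));
    }
}

class FactorySearcherSketch {
    // Mimics the searcher: use the factory's collector only if one is set.
    static List<Integer> search(QueryCommand cmd, int[] docs, float[] scores) {
        CollectorFactory factory = cmd.getCollectorFactory();
        RankingCollector collector = (factory != null)
                ? factory.getCollector(cmd.getRows())
                : new ScoreOrderCollector(cmd.getRows());
        for (int i = 0; i < docs.length; i++) {
            collector.collect(docs[i], scores[i]);
        }
        return collector.topDocs();
    }

    public static void main(String[] args) {
        QueryCommand cmd = new QueryCommand(2);
        // A custom search component would make this call in the real system.
        cmd.setCollectorFactory(HighestDocIdCollector::new);
        System.out.println(search(cmd, new int[] {1, 2, 3}, new float[] {0.5f, 2.0f, 1.0f}));
    }
}
```

In this shape the searcher needs only a null check on the factory, which is why the change to it stays small.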

I'm leaning towards this design. I think it's cleaner. The user would have to write a custom search component to set the CollectorFactory into the QueryCommand, but if you're advanced enough to write your own TopDocsCollector this should be no problem.

Initial impl was added to the plugrank2 branch on my Heliosearch fork.

https://github.com/joelbernstein2013/heliosearch/tree/plugrank2
https://github.com/joelbernstein2013/heliosearch/compare/plugrank2

@joel-bernstein
Contributor Author

I was looking for a way to link to the commit history for a branch. If anyone knows the trick to this let me know and I'll add the link to this ticket.

@VadimKirilchuk

Is this what you are searching for?
https://github.com/joelbernstein2013/heliosearch/compare/plugrank

@joel-bernstein
Contributor Author

Thanks Vadim. I'll use the link you provided and the link to the branch source tree for each ticket.

@VadimKirilchuk

OK. It's the "compare" button to the right of the "pull request" button on the branch tree page:
https://github.com/joelbernstein2013/heliosearch/tree/plugrank

It would be worth adding a test so that others can understand what you are trying to implement.

@ghost ghost assigned joel-bernstein Jan 13, 2014
@joel-bernstein
Contributor Author

I agree this needs a test for people to understand it.

yonik pushed a commit that referenced this issue Jan 15, 2014
As a fake commit, this also closes github pull requests #1 #2 #3 #6 #10

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1555587 13f79535-47bb-0310-9956-ffa450edef68
@yonik yonik closed this as completed in f60a042 Jan 15, 2014
@yonik
Member

yonik commented Jan 15, 2014

Reopening - looks like my merge-up of trunk closed this accidentally.

@yonik
Member

yonik commented Jan 15, 2014

I agree, approach 2 looks best (just looking at the changes to SolrIndexSearcher)

@VadimKirilchuk

Yes, not very cool. There should be a way to avoid it.

@yonik yonik reopened this Jan 15, 2014
@joel-bernstein
Contributor Author

Ok, seems like we've got a consensus for design #2.

I just added a pluggable MergeStrategy for handling the merge of the docIds from the shards.

joel-bernstein@ff2bd2b

As with the CollectorFactory, you need to add this class to the ResponseBuilder from a custom search component.

Now you can control both the local ranking and the distributed merge.
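
As a rough illustration of the merge side, here is a plain-Java sketch. It does not reproduce the linked commit: the `merge(shardResponses, rows)` signature on `MergeStrategy` is an assumption, `ShardDoc` and `ResponseBuilder` are reduced stand-ins for the real classes, and `MergeSketch`, `BY_SCORE` and `mergeWith` are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Reduced stand-in for a document returned by one shard.
class ShardDoc {
    final String id;
    final float score;
    ShardDoc(String id, float score) { this.id = id; this.score = score; }
}

// Assumed shape of the pluggable MergeStrategy.
interface MergeStrategy {
    List<String> merge(List<List<ShardDoc>> shardResponses, int rows);
}

// Reduced stand-in for ResponseBuilder.
class ResponseBuilder {
    private MergeStrategy mergeStrategy; // null means "use the default merge"
    MergeStrategy getMergeStrategy() { return mergeStrategy; }
    void setMergeStrategy(MergeStrategy s) { this.mergeStrategy = s; }
}

class MergeSketch {
    // Default merge in this sketch: highest score first across all shards.
    static final MergeStrategy BY_SCORE = (shards, rows) ->
            mergeWith(shards, rows, (a, b) -> Float.compare(b.score, a.score));

    // Flattens the shard responses, orders them, and keeps the first `rows` ids.
    static List<String> mergeWith(List<List<ShardDoc>> shards, int rows,
                                  Comparator<ShardDoc> order) {
        List<ShardDoc> all = new ArrayList<>();
        for (List<ShardDoc> shard : shards) {
            all.addAll(shard);
        }
        all.sort(order);
        List<String> ids = new ArrayList<>();
        for (ShardDoc d : all.subList(0, Math.min(rows, all.size()))) {
            ids.add(d.id);
        }
        return ids;
    }

    // Mimics the merge step: use the plugged-in strategy only if one is set.
    static List<String> mergeIds(ResponseBuilder rb, List<List<ShardDoc>> shards, int rows) {
        MergeStrategy strategy = rb.getMergeStrategy() != null ? rb.getMergeStrategy() : BY_SCORE;
        return strategy.merge(shards, rows);
    }

    public static void main(String[] args) {
        List<List<ShardDoc>> shards = Arrays.asList(
                Arrays.asList(new ShardDoc("b", 3.0f), new ShardDoc("c", 1.0f)),
                Arrays.asList(new ShardDoc("a", 2.0f)));
        ResponseBuilder rb = new ResponseBuilder();
        // A custom search component would make this call in the real system.
        rb.setMergeStrategy((s, rows) ->
                mergeWith(s, rows, Comparator.comparing((ShardDoc d) -> d.id)));
        System.out.println(mergeIds(rb, shards, 2));
    }
}
```

Pairing a custom collector on each shard with a matching merge like this keeps the local ordering and the distributed ordering consistent.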

Any thoughts on this design?

@joel-bernstein
Contributor Author

Next I'll write some tests to see how it plays out when you plug in a Collector and MergeStrategy.
