Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HSEARCH-5021 Improve support for custom value bridges for vector fields #3863

Conversation

marko-bekhta
Copy link
Member

@marko-bekhta marko-bekhta commented Dec 11, 2023

https://hibernate.atlassian.net/browse/HSEARCH-5021

This is just a draft for now so we can talk over the code 😃

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-5021-Improve-support-for-custom-value-bridges-for-vector-fields branch from 9c8b8b0 to ec08362 Compare December 11, 2023 16:50
@yrodiere
Copy link
Member

Hey @marko-bekhta is this ready for review?

@marko-bekhta
Copy link
Member Author

hmm, good question 😃, looking at the changes, I don't see something to add...

@marko-bekhta marko-bekhta marked this pull request as ready for review December 20, 2023 15:45
Copy link
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From https://hibernate.atlassian.net/browse/HSEARCH-5021:

At the moment (as of https://hibernate.atlassian.net/browse/HSEARCH-4948 ), if someone was to write a custom binder for vector fields with a custom field type:

  1. They need to know about the number of dimensions, because that’s a mandatory parameter of IndexFieldTypeFactory#asVector… but they don't have access to the number passed to @VectorField#dimension.
  2. If they want to force the number of dimensions (because their bridge always produces vectors with a certain number of dimensions), users of the binder will still need to pass something to @VectorField#dimension in the mapping. Even though that value will never be used…

Does this PR addresses any of that? I'm honestly not sure :x

Perhaps it would be best to add/find an integration test for each of those:

  1. A custom bridge that turns for example a List<Float> into a float[], and declares the field type explicitly as projectable by default, but doesn't know about the number of dimensions. Users must provide the number of dimensions in @VectorField.
  2. A custom bridge that turns for example a Triplet<Float, Float, Float> into a float[], and declares the field type explicitly with dimensions=3. Users must not be forced to provide a number of dimensions in @VectorField.

I think org.hibernate.search.integrationtest.mapper.pojo.mapping.definition.VectorFieldIT#customBridge_explicitFieldType already addresses the second item.

I can't find something about the first, though... Which I guess makes sense, since org.hibernate.search.engine.backend.types.dsl.IndexFieldTypeFactory#asVector forces you to pass a number of dimensions ATM?

@marko-bekhta
Copy link
Member Author

I can't find something about the first, though... Which I guess makes sense, since org.hibernate.search.engine.backend.types.dsl.IndexFieldTypeFactory#asVector forces you to pass a number of dimensions ATM?

Right. And the initial idea was to add annotation attributes (@VectorField ones in particular) to the parameters so that the user would have access to dimension value from the annotation and could pass it to asVector call ... but then you were saying that that would be a users-writes-their-own-annotation-and-processor case so I assumed that that covers it and nothing to be done about it 😄...

Now if we want to allow calls like asVector() without an initial dimension value ... I suppose we can do that, but that would mean reverting the changes we've done in the other PR... let me put something together and we can check it tomorrow.

@hibernate-github-bot
Copy link

hibernate-github-bot bot commented Dec 20, 2023

Thanks for your pull request!

This pull request appears to follow the contribution rules.

› This message was automatically generated.

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-5021-Improve-support-for-custom-value-bridges-for-vector-fields branch 2 times, most recently from 21b743a to fdf049c Compare December 21, 2023 07:47
Copy link
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now if we want to allow calls like asVector() without an initial dimension value ... I suppose we can do that

Thanks, looks like this addresses the issue completely! A few comments below.

but that would mean reverting the changes we've done in the other PR...

Yeah sorry about that. I guess I failed to consider this use case when I suggested that change in the initial PR...

the initial idea was to add annotation attributes (@Vectorfield ones in particular) to the parameters so that the user would have access to dimension value from the annotation and could pass it to asVector call ...

Right, and that created a few problems.

but then you were saying that that would be a users-writes-their-own-annotation-and-processor case so I assumed that that covers it and nothing to be done about it 😄...

Yeah I was just talking about the specific case where the binder needs to get the number of dimensions as a parameter. Not about cases where the binder doesn't need to know about the number of dimensions (because it's implied by input).

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-5021-Improve-support-for-custom-value-bridges-for-vector-fields branch from fdf049c to 32972ef Compare December 21, 2023 09:21
@marko-bekhta
Copy link
Member Author

yeah, no worries 😃 and sorry that I didn't catch it that we should make these changes from your initial review 🙈 😃. I've applied your suggestions and cleaned up PropertyMappingVectorFieldOptionsStepImpl a bit more.

@marko-bekhta marko-bekhta force-pushed the feat/HSEARCH-5021-Improve-support-for-custom-value-bridges-for-vector-fields branch from 32972ef to ccf283a Compare December 21, 2023 14:30
Copy link

Copy link
Member

@yrodiere yrodiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@yrodiere yrodiere merged commit b08e96e into hibernate:main Dec 21, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants