Store object metadata in metabases for SearchV2 service #3080

Merged
merged 1 commit into from
Feb 11, 2025

Conversation

cthulhu-rider
Contributor

@cthulhu-rider cthulhu-rider commented Jan 13, 2025

  • store metadata on Put
  • unfiltered
  • test scenarios for filtered
  • filtered
  • attributes
  • test sorting
  • removal
  • migration

@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 4 times, most recently from 9624b00 to 3540889 Compare January 14, 2025 13:17
@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 2 times, most recently from f16c1c1 to 8957af6 Compare January 16, 2025 15:38
@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 3 times, most recently from c29b7e5 to 00c7b83 Compare January 20, 2025 13:56
@carpawell carpawell mentioned this pull request Jan 20, 2025
@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 5 times, most recently from 22664c2 to 4bb5abf Compare January 24, 2025 11:30
@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 3 times, most recently from c2c593b to 0904251 Compare February 5, 2025 14:10
@cthulhu-rider
Contributor Author

It seems to work, except for the case when the primary filtered attribute is not the first requested one. Querying the primary attribute is almost always needed, e.g. with NE/PREFIX/NUM filters; it is redundant only with EQ. So, for now, requesting the primary attribute can be made mandatory. If the requirement is relaxed in the future, existing clients will not break, and they will be able to slightly optimize their search queries.

I also don't plan to support migration and GC in this PR, so as not to block testing with fresh storages.

I need some time to review TODOs, fix linter issues and tidy up the code, but overall it is ready.
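
The requirement described in the comment above could be checked roughly as follows. This is only a sketch: the `filter` type, the error value and `checkPrimaryAttribute` are hypothetical names, and it assumes that the rule applies only when both filters and attributes are present in the request.

```go
package main

import (
	"errors"
	"fmt"
)

// filter is a hypothetical, simplified stand-in for an SDK search filter.
type filter struct {
	Attr    string
	Matcher string // e.g. "EQ", "NE", "PREFIX", "NUM_GT"
	Value   string
}

// errPrimaryAttrNotFirst is a hypothetical error for the rule sketched here.
var errPrimaryAttrNotFirst = errors.New("primary filtered attribute must be requested first")

// checkPrimaryAttribute enforces the assumed rule: when both filters and
// attributes are given, the first requested attribute must be the attribute
// of the first (primary) filter.
func checkPrimaryAttribute(fs []filter, attrs []string) error {
	if len(fs) == 0 || len(attrs) == 0 {
		return nil
	}
	if attrs[0] != fs[0].Attr {
		return fmt.Errorf("%w: got %q, want %q", errPrimaryAttrNotFirst, attrs[0], fs[0].Attr)
	}
	return nil
}

func main() {
	fs := []filter{{Attr: "FileName", Matcher: "PREFIX", Value: "report"}}
	fmt.Println(checkPrimaryAttribute(fs, []string{"FileName", "Timestamp"}))
	fmt.Println(checkPrimaryAttribute(fs, []string{"Timestamp", "FileName"}))
}
```

If the requirement is relaxed later, requests that already pass this check remain valid, which matches the compatibility argument made in the comment.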

@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 3 times, most recently from eb9036f to 871972a Compare February 6, 2025 11:03

codecov bot commented Feb 6, 2025

Codecov Report

Attention: Patch coverage is 71.83099% with 180 lines in your changes missing coverage. Please review.

Project coverage is 22.92%. Comparing base (5279df2) to head (9e2e544).
Report is 19 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/local_object_storage/metabase/metadata.go | 72.36% | 115 Missing and 53 partials ⚠️ |
| pkg/local_object_storage/metabase/put.go | 55.00% | 6 Missing and 3 partials ⚠️ |
| pkg/local_object_storage/metabase/containers.go | 0.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3080      +/-   ##
==========================================
+ Coverage   22.47%   22.92%   +0.45%     
==========================================
  Files         751      750       -1     
  Lines       57804    58710     +906     
==========================================
+ Hits        12991    13462     +471     
- Misses      43929    44308     +379     
- Partials      884      940      +56     


@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 3 times, most recently from 8dffd53 to 5d513b6 Compare February 6, 2025 11:50
@cthulhu-rider cthulhu-rider marked this pull request as ready for review February 6, 2025 11:55
return nil, "", fmt.Errorf("empty attribute #%d", i)
}
if attrs[i] == object.FilterContainerID || attrs[i] == object.FilterID {
return nil, "", fmt.Errorf("prohibited attribute %s", attrs[i])
Member

These checks (including the ones above, of course) need to be separated out: they are about semantic request validation, so the Search handler should perform them first and, if they pass, run searches everywhere (including locally), where a failure can only have internal causes unrelated to the request itself.

Contributor Author

dropped
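
The separation suggested in this thread might look like the sketch below. The handler, the `searcher` type and the error value are hypothetical, and the two constants are assumed to mirror the SDK's `object.FilterContainerID` and `object.FilterID`; they are redeclared locally only to keep the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// Assumed values of the SDK's object.FilterContainerID and object.FilterID.
const (
	filterContainerID = "$Object:containerID"
	filterID          = "$Object:objectID"
)

// errInvalidRequest is a hypothetical marker for request validation failures.
var errInvalidRequest = errors.New("invalid search request")

// validateAttributes performs the semantic checks quoted in the diff above.
func validateAttributes(attrs []string) error {
	for i := range attrs {
		if attrs[i] == "" {
			return fmt.Errorf("%w: empty attribute #%d", errInvalidRequest, i)
		}
		if attrs[i] == filterContainerID || attrs[i] == filterID {
			return fmt.Errorf("%w: prohibited attribute %s", errInvalidRequest, attrs[i])
		}
	}
	return nil
}

// searcher is a hypothetical abstraction over a local or remote search backend.
type searcher func(attrs []string) ([]string, error)

// handleSearch validates the request once and only then queries the backends,
// so any backend error is an internal failure rather than a bad request.
func handleSearch(attrs []string, backends ...searcher) ([]string, error) {
	if err := validateAttributes(attrs); err != nil {
		return nil, err
	}
	var res []string
	for _, s := range backends {
		items, err := s(attrs)
		if err != nil {
			return nil, fmt.Errorf("internal search failure: %w", err)
		}
		res = append(res, items...)
	}
	return res, nil
}

func main() {
	local := func([]string) ([]string, error) { return []string{"obj1"}, nil }

	res, err := handleSearch([]string{"FileName"}, local)
	fmt.Println(res, err)

	res, err = handleSearch([]string{""}, local)
	fmt.Println(res, err)
}
```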

primSeekPrefix = primSeekKey[:1+len(primAttr)+len(utf8Delimiter)]
valID := primSeekKey[len(primSeekPrefix):]
if len(valID) <= oid.Size {
return nil, nil, fmt.Errorf("%w: too small VAL_OID len %d", errInvalidCursor, len(valID))
Member

Also a part of the initial search parameter check.

Contributor Author

@cthulhu-rider cthulhu-rider Feb 10, 2025

possible with #3080 (comment). Done


// Search selects up to count container's objects from the given container
// matching the specified filters.
func (db *DB) Search(cnr cid.ID, fs object.SearchFilters, attrs []string, cursor string, count uint16) ([]client.SearchResultItem, string, error) {
Member

should meta package use SDK's client package?

Contributor Author

The SDK is the most basic layer in the node

Member

sdk in general -- agree, but the client package -- not sure

@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch 5 times, most recently from 8085434 to c59f430 Compare February 10, 2025 11:21
primMatcher == object.MatchNumGT || primMatcher == object.MatchNumGE {
var err error
if primSeekKey, primSeekPrefix, err = seekKeyForAttribute(primAttr, fs[0].Value()); err != nil {
return nil, nil, fmt.Errorf("invalid primary filter value: %w", err)
Member

Shouldn't this be checked before going into searchInBucket?

Contributor Author

@cthulhu-rider cthulhu-rider Feb 10, 2025

It can be, but I don't think this matters much for now. I already have a TODO about this.

There is a need to serve the `ObjectService.SearchV2` RPC by the storage
node (SN). In order not to expand the structure and configuration of the
node, the best place to store metadata is the metabase.

Metabases are extended with per-container object metadata buckets. For
each object, the following indexes are created:
 - OID;
 - attribute->OID;
 - OID->attribute.

Integers are stored in a special form so that they can be compared
lexicographically without decoding.
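
One common way to get this property for fixed-width integers is sketched below. It is not necessarily the encoding used in this PR (which may need to cover a wider integer range); `encodeInt64`, `decodeInt64` and `lessEncoded` are hypothetical names. The idea is to flip the sign bit and write the value big-endian, so that byte-wise comparison agrees with numeric order.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// encodeInt64 maps v to 8 bytes whose lexicographic order equals the numeric
// order of v: flipping the sign bit moves negative values below positive ones,
// and big-endian layout puts the most significant byte first.
func encodeInt64(v int64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(v)^(1<<63))
	return b
}

// decodeInt64 reverses encodeInt64.
func decodeInt64(b []byte) int64 {
	return int64(binary.BigEndian.Uint64(b) ^ (1 << 63))
}

// lessEncoded compares two integers using only their encoded forms.
func lessEncoded(a, b int64) bool {
	return bytes.Compare(encodeInt64(a), encodeInt64(b)) < 0
}

func main() {
	vals := []int64{42, -7, 0, -1000, 1 << 40}
	keys := make([][]byte, len(vals))
	for i, v := range vals {
		keys[i] = encodeInt64(v)
	}
	// Sort the raw keys byte-wise, as an ordered key-value store would.
	sort.Slice(keys, func(i, j int) bool { return bytes.Compare(keys[i], keys[j]) < 0 })
	for _, k := range keys {
		fmt.Println(decodeInt64(k))
	}
}
```

With keys of this shape, numeric filters such as GT/GE can be served by seeking to the encoded bound and iterating, without decoding every stored value.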

A new `Search` method is provided: it filters a container's objects and
returns the requested attributes. The result count is limited, and the
operation is paged via a cursor. In other words, the method follows
SearchV2 behavior within a single metabase.
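
Cursor-based paging over sorted keys can be illustrated with an in-memory model. This is a simplified sketch, not the metabase implementation: `searchPage` is a hypothetical function, the sorted slice stands in for an ordered bucket, and the cursor is assumed to be the Base64 form of the last returned key.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"sort"
)

// searchPage returns up to count keys that follow the cursor position in the
// sorted slice keys, plus a cursor for the next page. An empty input cursor
// starts from the beginning; an empty output cursor means no more results.
func searchPage(keys []string, cursor string, count int) ([]string, string, error) {
	if count <= 0 {
		return nil, "", errors.New("count must be positive")
	}
	start := 0
	if cursor != "" {
		last, err := base64.StdEncoding.DecodeString(cursor)
		if err != nil {
			return nil, "", fmt.Errorf("invalid cursor: %w", err)
		}
		// Seek to the first key strictly greater than the last returned one.
		start = sort.SearchStrings(keys, string(last))
		if start < len(keys) && keys[start] == string(last) {
			start++
		}
	}
	end := start + count
	if end >= len(keys) {
		return keys[start:], "", nil
	}
	page := keys[start:end]
	next := base64.StdEncoding.EncodeToString([]byte(page[len(page)-1]))
	return page, next, nil
}

func main() {
	keys := []string{"a", "b", "c", "d", "e"}
	cursor := ""
	for {
		page, next, err := searchPage(keys, cursor, 2)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(page)
		if next == "" {
			return
		}
		cursor = next
	}
}
```

Because the cursor encodes a position in key order rather than an offset, a page boundary stays stable even if earlier keys are removed between calls.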

Refs #3058.

Signed-off-by: Leonard Lyubich <[email protected]>
@cthulhu-rider cthulhu-rider force-pushed the object-searchv2/metabases branch from c59f430 to 9e2e544 Compare February 10, 2025 14:16
if s == "" {
return nil, nil
}
b := make([]byte, 1+base64.StdEncoding.DecodedLen(len(s)))
Member

Isn't DecodeString about the same, but shorter?

Contributor Author

DecodeString allocates a new buffer on its own, which isn't desired here.
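
The difference discussed in this thread can be sketched as follows. `decodeWithPrefix` is a hypothetical helper, and the assumption is that the extra byte in `1+DecodedLen(...)` is reserved for a key prefix; with `Decode` the payload lands directly after that byte, whereas `DecodeString` would return a separate slice that then has to be copied into a larger buffer.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeWithPrefix decodes Base64 string s into a buffer that has one leading
// byte reserved for prefix, using a single allocation for the result.
func decodeWithPrefix(prefix byte, s string) ([]byte, error) {
	if s == "" {
		return nil, nil
	}
	// DecodedLen is an upper bound; the actual length is returned by Decode.
	b := make([]byte, 1+base64.StdEncoding.DecodedLen(len(s)))
	n, err := base64.StdEncoding.Decode(b[1:], []byte(s))
	if err != nil {
		return nil, fmt.Errorf("decode Base64: %w", err)
	}
	b[0] = prefix
	return b[:1+n], nil
}

func main() {
	b, err := decodeWithPrefix(0x02, "aGVsbG8=")
	fmt.Printf("%q %v\n", b, err)
}
```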


@cthulhu-rider cthulhu-rider merged commit 64dac4c into master Feb 11, 2025
21 of 22 checks passed
@cthulhu-rider cthulhu-rider deleted the object-searchv2/metabases branch February 11, 2025 08:48
3 participants