Reenable search result counts in portal #217

Open
7 tasks done
ACharbonneau opened this issue Aug 25, 2021 · 7 comments

Comments

@ACharbonneau

ACharbonneau commented Aug 25, 2021

Summary

For quite some time, the portal has disabled the calculation and display of search result counts to avoid resource exhaustion and timeout errors. We would like to restore the counts display as it is very helpful to the data exploration UX.

Status

Work continues on this, focusing on a different query and DB denormalization strategy. The work is broader than counts and should speed up recordset searches in general, with an initial goal of reenabling dynamic counts based on these faster queries.

We're integrating under the umbrella of "array ops" at the engineering level:

  • Extend the ermrest API with a new syntax that expresses facet choices as an existentially quantified list in the query predicate: "column equals any value in a, b, c, ...". These were previously expressed as a disjunction: "column = a OR column = b OR ...".
  • Extend ermrest to support another PostgreSQL index type and apply index-accelerated filter predicate operators when evaluating the new quantified list filter syntax on array columns.
  • Revise the portal model and ETL process to augment the core C2M2 entity types with denormalized and suitably indexed arrays of vocabulary terms for the facets.
  • Extend the chaise+ermrestjs UI layer to be able to use the new quantified value list filter syntax with ermrest.
  • Extend the chaise+ermrestjs UI layer to accept new configuration for a "fast filter source" to bypass the normal cross-tabular relationships to express facet filter criteria in terms of the denormalized array instead.
  • Adjust the portal configuration to enable the "fast filter sources" for most or all facets.
  • Adjust the portal configuration to enable result counts based on these accelerated queries.
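
The difference between the two predicate forms in the first two items can be sketched at the SQL level. This is only an illustration, not ermrest's actual syntax or implementation: the function names and the column names `anatomy` and `anatomy_ids` are hypothetical, and the `%s` placeholder style assumes a DB-API driver such as psycopg2 that adapts Python lists to PostgreSQL arrays. `= ANY (array)` and the array overlap operator `&&` are standard PostgreSQL; which operators ermrest ends up using is not stated in this issue.

```python
from typing import List, Sequence, Tuple


def _require_values(values: Sequence[str]) -> None:
    # An empty choice list has no sensible predicate in this sketch.
    if not values:
        raise ValueError("at least one facet choice is required")


def disjunction_predicate(column: str, values: Sequence[str]) -> Tuple[str, List[str]]:
    """Previous form: column = a OR column = b OR ... (one parameter per choice)."""
    _require_values(values)
    clause = " OR ".join(f"{column} = %s" for _ in values)
    return f"({clause})", list(values)


def any_predicate(column: str, values: Sequence[str]) -> Tuple[str, List[List[str]]]:
    """Quantified form on a scalar column: the whole list is a single array parameter."""
    _require_values(values)
    return f"{column} = ANY(%s)", [list(values)]


def array_overlap_predicate(column: str, values: Sequence[str]) -> Tuple[str, List[List[str]]]:
    """Quantified form on an array column: true if the column shares any element with the list."""
    _require_values(values)
    return f"{column} && %s::text[]", [list(values)]
```

Column names are interpolated directly here, so the sketch assumes they come from trusted model metadata rather than user input. The quantified forms keep the SQL text the same size regardless of how many choices the user selects, and the overlap form is the kind of predicate that an index on an array column can serve.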

These changes aim to shift the query regime used by the deriva stack so that it minimizes (and in many cases eliminates) the need for table joins. Instead, facets will express existentially quantified value list constraints against multiple array columns in the core C2M2 table (e.g. file or biosample), and PostgreSQL will be able to use a query plan that intersects the per-facet array columns' indexes.
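
As a rough conceptual model of that query plan, the sketch below keeps one toy inverted index per facet array column and intersects the per-facet matches to get the result set and its count. It is a pure-Python illustration, not how PostgreSQL or ermrest implement this; the class `ArrayIndex`, the function `matching_rows`, and the facet names used with them are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Set


class ArrayIndex:
    """Toy inverted index over one denormalized array column: term -> row ids."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[int]] = defaultdict(set)

    def add(self, row_id: int, terms: Iterable[str]) -> None:
        # Record the row under every vocabulary term in its array.
        for term in terms:
            self._postings[term].add(row_id)

    def match_any(self, choices: Iterable[str]) -> Set[int]:
        """Rows whose array contains at least one of the chosen terms."""
        matched: Set[int] = set()
        for term in choices:
            matched |= self._postings.get(term, set())
        return matched


def matching_rows(
    indexes: Mapping[str, ArrayIndex], facets: Mapping[str, List[str]]
) -> Set[int]:
    """Intersect per-facet matches; no lookup in any other table is needed."""
    per_facet = [indexes[name].match_any(choices) for name, choices in facets.items()]
    if not per_facet:
        raise ValueError("at least one facet constraint is required")
    return set.intersection(*per_facet)
```

With indexes built for, say, hypothetical `anatomy` and `assay` facets, the result count for a search is simply `len(matching_rows(indexes, {"anatomy": [...], "assay": [...]}))`, computed from the core table's own columns.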

Original issue text

The portal does not tell the user a count of how many records match the current query criteria.

I suspect this was part of Karl's optimization work, and I vaguely remember discussing it, but I don't remember the details. Unfortunately, it makes it impossible to tell how many results I have when I'm searching unless I get it below 25. That's really a problem. We need to find a way to give the user an idea what they've searched without making the portal crash :/

@karlcz

karlcz commented Aug 25, 2021

That's right, this was to avoid the expensive full scans which generate counts. It is trivial to turn back on, but it's an all-or-nothing choice I am afraid. We do not have any cheaper way to do it on the drawing board...

@karlcz karlcz self-assigned this Aug 25, 2021
@ACharbonneau

Given that we don't have any real users right now, I think I'd err on the side of not crashing the portal, but I would like to try to get something on the drawing board :)

@RLC-DCPPC

Need this if at all possible for April demo (maybe on collections page)

@RLC-DCPPC

This is related to a deliverable due by April 30, 2023

@karlcz karlcz changed the title from "Portal doesn't tell me how many results I have" to "Reenable search result counts in portal" May 24, 2022
@karlcz

karlcz commented Jul 26, 2022

I've updated the issue description with a summary of the current work on this result counts topic.

@karlcz

karlcz commented Dec 7, 2022

The initial work for this is deployed in the app-dev catalog "1" test environment. This uses the new "fast filter" queries for the main result set and the result set count in the recordset app.

We are continuing to investigate additional optimizations we might be able to apply in the chaise UI to make more use of the new "fast filter" query forms. This could reduce the cost of dynamic updates to the individual faceting controls and selection modals from the left filtering side-bar.

@karlcz

karlcz commented Dec 13, 2022

Another round of optimizations was deployed to app-dev. This needs more real-world performance testing under different uses of the recordset app, but it looks good so far.
