Add datetime range aliases for optimized index filtering by Gomez324 · Pull Request #537 · stac-utils/stac-fastapi-elasticsearch-opensearch

Gomez324 · 2025-11-23T04:16:33Z

Related Issue(s):

Description:

Until now, only the datetime field had aliases. This change adds aliases for start_datetime and end_datetime when USE_DATETIME=false, which enables optimized filtering when searching by these fields. It improves performance because Elasticsearch/OpenSearch can now route queries to the appropriate indices instead of scanning a larger number of them.

When USE_DATETIME=true, the system works as before with datetime-based aliases only.

Example with use_datetime=false:
Index A with aliases:
{
"start_datetime": "items_start_datetime_new-collection_2020-02-08",
"end_datetime": "items_end_datetime_new-collection_2020-02-16"
}
Index B with aliases:
{
"start_datetime": "items_start_datetime_new-collection_2020-02-12",
"end_datetime": "items_end_datetime_new-collection_2020-02-17"
}
Index C with aliases:
{
"start_datetime": "items_start_datetime_new-collection_2020-02-18",
"end_datetime": "items_end_datetime_new-collection_2020-02-20"
}

When a user searches in the range start_datetime/end_datetime = 2020-02-10 / 2020-02-16, Index A and Index B will be queried because both indices overlap with the requested range. Index C will be excluded because it does not intersect with that time window.

Previously, all indices could have been selected, but the new aliases allow the query engine to efficiently identify which indices overlap with the target range and avoid scanning unrelated ones, such as Index C.
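
The selection described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the index names, the `alias_date` and `select_indexes` helpers, and the assumption that each alias name ends in a `YYYY-MM-DD` date are all hypothetical and taken only from the example aliases.

```python
from datetime import date

# Hypothetical alias map copied from the example above: index name -> aliases.
INDEX_ALIASES = {
    "index_a": {
        "start_datetime": "items_start_datetime_new-collection_2020-02-08",
        "end_datetime": "items_end_datetime_new-collection_2020-02-16",
    },
    "index_b": {
        "start_datetime": "items_start_datetime_new-collection_2020-02-12",
        "end_datetime": "items_end_datetime_new-collection_2020-02-17",
    },
    "index_c": {
        "start_datetime": "items_start_datetime_new-collection_2020-02-18",
        "end_datetime": "items_end_datetime_new-collection_2020-02-20",
    },
}


def alias_date(alias: str) -> date:
    """Read the trailing YYYY-MM-DD date from an alias name (assumed format)."""
    return date.fromisoformat(alias.rsplit("_", 1)[-1])


def select_indexes(aliases: dict, search_start: date, search_end: date) -> list:
    """Return the names of indexes whose date range overlaps the search range."""
    selected = []
    for name, index_aliases in aliases.items():
        index_start = alias_date(index_aliases["start_datetime"])
        index_end = alias_date(index_aliases["end_datetime"])
        # Two closed ranges overlap when each one starts before the other ends.
        if index_start <= search_end and index_end >= search_start:
            selected.append(name)
    return selected


print(select_indexes(INDEX_ALIASES, date(2020, 2, 10), date(2020, 2, 16)))
# → ['index_a', 'index_b']
```

Index C is dropped because its start date (2020-02-18) falls after the end of the search range.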

To enable this feature, set USE_DATETIME=false in your configuration. If you want to keep the previous behavior with datetime aliases, set USE_DATETIME=true.
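
As a rough sketch of how such a flag might be read, the helper below parses USE_DATETIME from the environment. The `use_datetime` function and the set of accepted truthy strings are assumptions for illustration; the project's actual parsing is not shown in this PR page.

```python
import os


def use_datetime(default: bool = True) -> bool:
    """Hypothetical helper: read the USE_DATETIME flag from the environment."""
    raw = os.environ.get("USE_DATETIME")
    if raw is None:
        # Unset: fall back to the default (datetime-based aliases only).
        return default
    return raw.strip().lower() in {"true", "1", "yes"}


os.environ["USE_DATETIME"] = "false"
print(use_datetime())  # → False
```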

PR Checklist:

Code is formatted and linted (run pre-commit run --all-files)
Tests pass (run make test)
Documentation has been updated to reflect changes, if applicable
Changes are added to the changelog

Gomez324 · 2025-11-28T09:37:27Z

Hi @jonhealy1, will you have time soon to do a code review?

jonhealy1 · 2025-11-28T10:46:42Z

@Gomez324 I will make time this weekend. Can you fix the conflicts? Thanks

jonhealy1 · 2025-11-29T03:03:58Z

stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py

+                "gte": None,
+                "lte": datetime_search.get("lte") if not USE_DATETIME else None,
+            },
+        }


This added code complicates the core database logic by tightly coupling it to a specific indexing strategy. Please move this calculation into the IndexSelector (the actual consumer) to keep the core method focused solely on query construction.

jonhealy1 · 2025-11-29T03:05:54Z

stac_fastapi/opensearch/pyproject.toml

    "opensearch-py[async]~=2.8.0",
    "uvicorn~=0.23.0",
    "starlette>=0.35.0,<0.36.0",
+    "redis==6.4.0",


Redis should not be installed in the core package, as most users probably won't use Redis. It can be installed with pip install stac-fastapi-elasticsearch[redis] or with the dev extra.
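
On the Python side, an optional dependency is commonly guarded at import time. The following is a minimal sketch of that pattern, assuming a hypothetical `get_redis_client` helper; it is not taken from this repository.

```python
try:
    import redis  # only importable when the optional extra is installed
except ImportError:
    redis = None


def get_redis_client(url: str):
    """Hypothetical helper: build a client, or fail clearly if redis is absent."""
    if redis is None:
        raise RuntimeError(
            "Redis support is not installed; install the optional 'redis' extra."
        )
    # from_url builds the client lazily; no connection is opened here.
    return redis.Redis.from_url(url)
```

With this shape, the core package imports cleanly without Redis, and only code paths that actually request a client raise an error.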

jonhealy1 · 2025-11-29T03:06:32Z

stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py

+
+        if not datetime_search:
+            return search, result_metadata
+


See other comment on Elasticsearch version of this code.

jonhealy1 · 2025-11-29T03:11:24Z

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/search_engine/managers.py

            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
-                detail="Product datetime is required for indexing",
+                detail="Product 'start_datetime', 'datetime' and 'end_datetime' is required for indexing",


This validation logic violates the STAC specification in two ways:

It creates a mandatory requirement for start_datetime and end_datetime, which are optional fields in the spec.

It rejects items where datetime is null (but start/end are present), which is explicitly allowed for interval data.

Please refactor this to handle standard STAC items (single datetime) and interval items (null datetime) correctly.

@jonhealy1 I agree with you. However, if indexes are to be created based on start_datetime, then that field must always be required.

What if we tie this validation to the existing USE_DATETIME setting?

If USE_DATETIME=true (Default): We allow items that only have a datetime field. In these cases, we can derive the index partition name from the datetime field instead of raising a 400 error.

If USE_DATETIME=false: Then strict enforcement of start_datetime is appropriate.

This ensures we support standard STAC items (point-in-time) without forcing users to reconfigure or reformat their data.

@jonhealy1

Good idea. I'll need some more time to implement it, but it is doable.

If USE_DATETIME is true, then datetime is required, and the aliases will work as they do now using only datetime, so the migration tool will not be needed? And if it is false, then start_datetime and end_datetime are required, while datetime becomes optional?

@Gomez324 Sounds good! Yes, I think migration scripts would not be needed.
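
The rule agreed in this thread could look roughly like the sketch below. The function name `validate_temporal_fields` is hypothetical, and it raises `ValueError` to stay self-contained, whereas the reviewed code raises an `HTTPException` with status 400.

```python
from typing import Any, Mapping


def validate_temporal_fields(properties: Mapping[str, Any], use_datetime: bool) -> None:
    """Check that the fields needed to pick an index partition are present."""
    if use_datetime:
        # Point-in-time items: only datetime is needed for partitioning.
        required = ("datetime",)
    else:
        # Interval items: datetime may be null, start/end are mandatory.
        required = ("start_datetime", "end_datetime")
    missing = [field for field in required if properties.get(field) is None]
    if missing:
        raise ValueError(
            f"Missing required field(s) for indexing: {', '.join(missing)}"
        )


# An interval item with a null datetime passes when USE_DATETIME=false.
validate_temporal_fields(
    {
        "datetime": None,
        "start_datetime": "2020-02-08T00:00:00Z",
        "end_datetime": "2020-02-16T00:00:00Z",
    },
    use_datetime=False,
)
```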

jonhealy1 · 2025-11-29T03:17:08Z

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/index.py

+        datetime_alias = index_dict.get("datetime")
+
+        if not start_datetime_alias:
+            continue


This line effectively makes all existing production indexes invisible to the API. Current indexes do not have start_datetime aliases.

Where is the migration plan to backfill aliases on historical data?

Without a migration, this change breaks backwards compatibility and will return 0 results for existing datasets.
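
To make the concern concrete, the sketch below contrasts the strict skip from the diff with one possible fallback to the older datetime alias. The `visible_indexes` function, the `allow_legacy` flag, and the sample alias names are hypothetical; this is not the fix that was eventually merged.

```python
from typing import Dict, List


def visible_indexes(indexes: List[Dict[str, str]], allow_legacy: bool) -> List[Dict[str, str]]:
    """Return the alias maps that index selection would consider."""
    visible = []
    for index_dict in indexes:
        start_alias = index_dict.get("start_datetime")
        if not start_alias and allow_legacy:
            # Hypothetical fallback: treat the older datetime alias as the start.
            start_alias = index_dict.get("datetime")
        if not start_alias:
            continue  # the index is skipped, as in the reviewed diff
        visible.append(index_dict)
    return visible


# Assumed alias names, for illustration only.
legacy = {"datetime": "items_datetime_new-collection_2020-02-08"}
current = {
    "start_datetime": "items_start_datetime_new-collection_2020-02-12",
    "end_datetime": "items_end_datetime_new-collection_2020-02-17",
}

print(len(visible_indexes([legacy, current], allow_legacy=False)))  # → 1
print(len(visible_indexes([legacy, current], allow_legacy=True)))   # → 2
```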

jonhealy1 · 2025-11-29T03:20:03Z

stac_fastapi/elasticsearch/pyproject.toml

    "elasticsearch[async]~=8.19.1",
    "uvicorn~=0.23.0",
    "starlette>=0.35.0,<0.36.0",
+    "redis==6.4.0",


Same here - let's not install redis here. It's an optional feature.

jonhealy1 · 2025-11-30T02:54:50Z

@Gomez324 In the description for this PR, you state that Index B (12th-17th) lies outside the requested range (10th-16th) and would be skipped.

This description implies incorrect behavior. STAC API searches rely on Intersection, not Containment. Since Index B overlaps with the search window, it must be queried; otherwise, valid items from the 12th to the 16th would be hidden from the user.

Looking at the code in check_criteria, it appears you are correctly implementing intersection logic (which contradicts your description). Please update the PR description to avoid confusion, as the current example implies the feature is broken.
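
The difference between the two predicates can be shown with Index B and the search window from the description. This is an illustrative sketch; `intersects` and `contained` are hypothetical names and are not the `check_criteria` implementation.

```python
from datetime import date


def intersects(index_start: date, index_end: date, search_start: date, search_end: date) -> bool:
    """True when the index range and the search range share at least one day."""
    return index_start <= search_end and index_end >= search_start


def contained(index_start: date, index_end: date, search_start: date, search_end: date) -> bool:
    """True only when the index range lies entirely inside the search range."""
    return search_start <= index_start and index_end <= search_end


index_b = (date(2020, 2, 12), date(2020, 2, 17))
search = (date(2020, 2, 10), date(2020, 2, 16))

print(intersects(*index_b, *search))  # → True
print(contained(*index_b, *search))   # → False
```

Under containment, Index B would be dropped even though it may hold items from the 12th to the 16th, which is why intersection is the right test here.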

Gomez324 · 2025-12-02T13:48:19Z

Hey @jonhealy1, I've fixed the code according to the suggestions; it's ready for a CR.

jonhealy1

Thanks for the updates. Since stac-fastapi-core is a public library, changing the method signatures in core.py is technically a Breaking Change that would force a Major Version bump (v3.0.0), which we want to avoid right now.

Please revert the changes to stac_fastapi/core/core.py.

To keep your optimization, simply handle the pass-through in the Elasticsearch plugin:

In database_logic.py, return (search, datetime_string) to satisfy the existing tuple signature.

In client.py (execute_search), accept the string via the existing datetime_search argument.

This keeps the optimization self-contained in the Elasticsearch package and saves us from versioning headaches in Core.

Gomez324 · 2025-12-08T14:06:34Z

@jonhealy1 I'm not sure if this is what you meant, but I made the changes. Could you take a look?

jonhealy1

Just revert these two changes in core

stac_fastapi/core/stac_fastapi/core/core.py

jonhealy1

Nice work thanks

Gomez324 · 2025-12-08T15:45:29Z

Thanks, and thank you for the solid CR as well. I will let you know once our QA tests it.

Gomez324 · 2026-01-22T10:18:53Z

@jonhealy1 QA has already tested everything on our end, and I have their approval.

jonhealy1

@Gomez324 Looks great - just one question

jonhealy1 · 2026-01-22T11:34:29Z

stac_fastapi/tests/api/test_api_datetime_filtering.py

+#     all_aliases = set()
+#     for index_info in indices.values():
+#         all_aliases.update(index_info.get("aliases", {}).keys())
+#     assert all(alias in all_aliases for alias in expected_aliases)


@Gomez324 Should we delete these commented out tests?

@jonhealy1 Thanks for catching that, those tests should be uncommented. I’ve already fixed it
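
For reference, the alias check in the commented-out lines can be exercised on its own. The sketch below wraps the same logic in a hypothetical `missing_aliases` helper and runs it on sample data shaped like an Elasticsearch/OpenSearch get-alias response (index name mapped to an `aliases` object); the sample values are assumptions based on the example in the PR description.

```python
def missing_aliases(indices: dict, expected_aliases: list) -> set:
    """Return the expected aliases that no index in the response carries."""
    all_aliases = set()
    for index_info in indices.values():
        all_aliases.update(index_info.get("aliases", {}).keys())
    return set(expected_aliases) - all_aliases


# Assumed response shape and values, for illustration only.
sample_indices = {
    "index_a": {
        "aliases": {
            "items_start_datetime_new-collection_2020-02-08": {},
            "items_end_datetime_new-collection_2020-02-16": {},
        }
    },
    "index_without_aliases": {},
}

expected = [
    "items_start_datetime_new-collection_2020-02-08",
    "items_end_datetime_new-collection_2020-02-16",
]
assert not missing_aliases(sample_indices, expected)
```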

There is one more issue related to a race condition, but I’ll describe it in a separate issue together with a proposed solution (I didn’t want to include it in this MR because it would already be quite large, and this race condition was also present in the previous version).

jonhealy1

@Gomez324 Nice work! As soon as you create the issue you mentioned, I will merge this!

Gomez324 · 2026-01-22T12:29:46Z

@jonhealy1 #583

### v6.10.0 - 2026-01-22

#### Added

- Added Helm chart for ES or OS in-cluster deployment [#455](#455)
- Added configurable hidden item filtering via HIDE_ITEM_PATH environment variable. [#566](#566)

#### Changed

- Added `PUT /catalogs/{catalog_id}` endpoint to update existing catalogs. Allows modification of catalog metadata (title, description, etc.) while preserving internal fields like parent_ids and catalog relationships. [#573](#573)
- Added catalog poly-hierarchy support with hierarchical catalog endpoints (`GET /catalogs/{catalog_id}/catalogs` and `POST /catalogs/{catalog_id}/catalogs`), enabling unlimited nesting levels and allowing catalogs to belong to multiple parent catalogs simultaneously. Includes cursor-based pagination and performance optimizations. [#573](#573)
- Added end_datetime alias for datetime-based indexes with use_datetime=false, so that start_datetime/end_datetime queries select a smaller range of indexes (limiting the end) [#537](#537)

**PR Checklist:**

- [x] Code is formatted and linted (run `pre-commit run --all-files`)
- [x] Tests pass (run `make test`)
- [x] Documentation has been updated to reflect changes, if applicable
- [x] Changes are added to the changelog

…ed indexing strategy. (#595)

A fix to my previous MR (#537) related to indices based on datetime / start_datetime / end_datetime. During a rebase, one line was changed. Additionally, I’m adding a fallback and logging for index selection.

PR Checklist:

- [ ] Code is formatted and linted (run `pre-commit run --all-files`)
- [ ] Tests pass (run `make test`)
- [ ] Documentation has been updated to reflect changes, if applicable
- [ ] Changes are added to the changelog

Gomez324 requested a review from jonhealy1 November 23, 2025 04:53

Gomez324 force-pushed the CAT-1476 branch from b1ce7e5 to a654f29 Compare November 28, 2025 12:04

jonhealy1 requested changes Nov 29, 2025

Gomez324 force-pushed the CAT-1476 branch from 00ed491 to 39c38fe Compare December 2, 2025 13:39

ZSzCF reviewed Dec 4, 2025

jonhealy1 requested changes Dec 7, 2025

jonhealy1 requested changes Dec 8, 2025

jonhealy1 self-requested a review December 8, 2025 15:40

jonhealy1 previously approved these changes Dec 8, 2025

Gomez324 dismissed jonhealy1’s stale review via 917ddab December 8, 2025 20:03

Gomez324 force-pushed the CAT-1476 branch 3 times, most recently from a6508a5 to 0d50357 Compare December 15, 2025 20:11

Gomez324 added 11 commits December 15, 2025 21:38

before tests

25ed91c

Add additional temporal aliases

1dee8a7

fix

f945b1a

fix

c89bd7a

black

5b0e602

pre-commit

8d393de

cr

c82163c

cr

88cf861

core,database_logic

96e1974

cr

1aabf49

cache refresh during index removal

349a5ca

Gomez324 added 2 commits December 15, 2025 21:42

pre-commit run

28e7d2f

mappings

0272646

Gomez324 force-pushed the CAT-1476 branch from 0d50357 to 0272646 Compare December 15, 2025 20:44

Gomez324 added 8 commits December 15, 2025 22:42

blocking datetime updates

11c0e2e

item collection change

f3f5f6d

create index for early date or extend index for eraly date

1653d8c

test

c520bd2

is_closed_index and first_index

099ddee

unwanted alias removal

a2ad105

fix

10851e1

fixes

d3f5cce

Gomez324 force-pushed the CAT-1476 branch 2 times, most recently from 9ebaccd to d3f5cce Compare January 18, 2026 23:35

Gomez324 added 2 commits January 19, 2026 01:35

rebase

d6eb378

CAT-1476 - Added end_datetime alias for datetime-based indexes

9ac8317

Merge branch 'main' into CAT-1476

60febf6

jonhealy1 requested changes Jan 22, 2026

Gomez324 added 2 commits January 22, 2026 12:45

tests

1f5407f

pre-commit

815567d

jonhealy1 self-requested a review January 22, 2026 12:09

jonhealy1 approved these changes Jan 22, 2026

jonhealy1 merged commit 71fdca3 into main Jan 22, 2026
8 checks passed

jonhealy1 deleted the CAT-1476 branch January 22, 2026 12:31

jonhealy1 mentioned this pull request Jan 24, 2026

v6.10.0 #584

Merged

4 tasks

This was referenced Feb 1, 2026

Fixed end_datetime selection when splitting indices in a datetime-based indexing strategy #591

Closed

Fixed end_datetime selection when splitting indices in a datetime-based indexing strategy. #595

Merged
