Merge pull request #9 from Dewberry/bugfix/general_improvements
Bugfix/general improvements

slawler authored Mar 3, 2025
2 parents c726f5b + ef3b64b, commit b25b559

Showing 12 changed files with 252 additions and 91 deletions.
5 changes: 4 additions & 1 deletion .gitignore
@@ -171,4 +171,7 @@ cython_debug/
 .pypirc
 catalogs
 holding
-.DS_Store
+.DS_Store
+
+bighorn
+indian-creek
build-local.sh: file mode changed 100755 → 100644 (no content changes)
docs/make.sh: file mode changed 100755 → 100644 (no content changes)
17 changes: 16 additions & 1 deletion docs/source/change_log.rst
@@ -1,6 +1,21 @@
 .. note::
    Go to the `Releases <https://github.com/Dewberry/stormhub/releases.html>`__ page for a list of all releases.
 
+Release v0.1.0
+==============
+
+**Tag:** v0.1.0
+
+**Published at:** 2025-01-31T21:13:10Z
+
+**Author:** github-actions[bot]
+
+**Release Notes:**
+
+Summary^^^^^^^^
+This feature release adds routines for developing STAC catalogs using AORC data. Source code was ported from existing internal libraries and refactored for this initial release.
+
+
 Release v0.1.0rc1
 =================
 
@@ -12,6 +27,6 @@ Release v0.1.0rc1
 
 **Release Notes:**
 
-
+Initial code compilation from existing internal libraries, adapted for STAC catalogs.
 
 
38 changes: 19 additions & 19 deletions docs/source/tech_summary.rst
@@ -8,14 +8,14 @@ Storm Transposition Module
 
 Data Source
 -----------
-The Analysis Of Record for Calibration (AORC) dataset is available on the AWS `Registry of Open Data <https://registry.opendata.aws/noaa-nws-aorc/>`_, and provides the
-source data for precipitation and temperature data used in this module. (Other sources may be added but are not currently available). This gridded / houlry data is available for the CONUS
-beginning on 1972-02-01 and is updated regularly.
+The Analysis Of Record for Calibration (AORC) dataset is available on the AWS `Registry of Open Data <https://registry.opendata.aws/noaa-nws-aorc/>`_, and provides the
+source data for precipitation and temperature data used in this module. (Other sources may be added but are not currently available). This gridded / hourly data is available for the CONUS
+beginning on 1972-02-01 and is updated regularly.
 
 
 The primary inputs used for the data development workflow are as follows.
 
-Watershed = The waterhsed which will be used for hydrologic modeling.
+Watershed = The watershed which will be used for hydrologic modeling.
 
 Transposition Area = A region containing the watershed that has been developed as a hydro-meteorologically homogenous region.
 
@@ -46,10 +46,10 @@ cells of the watershed within the transposition region. The centroid location an…
 
 .. image:: ./images/2011-event.png
 
 
-3. **Iterate over the period of record or desired date range**: In order to process multiple dates for a range (from start_date - end_date), there is an optional argumen `check_every_n_hours`. If set to 1, the process will sum up the storm duration for every hour from the start_date
+3. **Iterate over the period of record or desired date range**: In order to process multiple dates for a range (from start_date - end_date), there is an optional argument `check_every_n_hours`. If set to 1, the process will sum up the storm duration for every hour from the start_date
 to the end_date. For a 72-hour event, this would require processing 350,400 datasets (every hour for the period) for 40 years of record and would represent the most precise estimate to aid in identifying the start hour for the event. To save in processing
 time and data, an alternate interval can be used. For example, selecting `check_every_n_hours` = 24 would result in 14,600 datasets processed for the same 40 year period.
 
 check_every_n_hours = 6 (This would get check the totals every 6 hours, or 4 times a day)
 
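The dataset counts quoted in step 3 follow directly from the record length. A quick sketch to verify the arithmetic (assuming a 40-year record at 8,760 hours per year, leap days ignored):

.. code-block:: python

    # Rough check of the dataset counts quoted above (assumption: 40 years
    # of hourly record at 365 * 24 = 8,760 hours per year; leap days ignored).
    HOURS_PER_YEAR = 365 * 24
    YEARS = 40

    for check_every_n_hours in (1, 6, 24):
        n_datasets = YEARS * HOURS_PER_YEAR // check_every_n_hours
        print(f"check_every_n_hours={check_every_n_hours:>2}: {n_datasets:,} datasets")

    # check_every_n_hours= 1: 350,400 datasets
    # check_every_n_hours= 6: 58,400 datasets
    # check_every_n_hours=24: 14,600 datasets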
@@ -67,15 +67,15 @@ After processing the data for every date in the requested date range, a csv is c…
 |1979-02-01T18 | 0.02 | 0.02 | 0.02 | -92.66397464708892 |40.50038658823523|
 +------------------------+------------+----------+----------+---------------------+-----------------+
 
-4. **Top events and date declustering** With the staticics in place, user settings can be used to create a STAC collection for the watershed / transpositon region / storm duration using the following inputs.
+4. **Top events and date declustering** With the statistics in place, user settings can be used to create a STAC collection for the watershed / transpositon region / storm duration using the following inputs.
 
 min_precip_threshold = 2 (Defaults to 1, this can be used to filter out events based on a minimum threshold)
 
 top_n_events = 440 (This will be the total # of events in the collection. 440 would represent the top 10 events for 44 years)
 
 To avoid double counting what is essentially the same storm because the N hour duration for several consecutive periods may result in a top storm, results of the query are iterated and added to a list,
 a process filters storms to be skipped if there is any temporal overlap with a storm already existing in the list (the overlap is determined using the start time and duration of the top storm). As shown
-in these images, these records are considered to be a single storm, and would be declustered, wherein the day with the greater mean precipitation would be included in the top storms collection and the other
+in these images, these records are considered to be a single storm, and would be declustered, wherein the day with the greater mean precipitation would be included in the top storms collection and the other
 would be dropped.


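A minimal sketch of the temporal-overlap filter described in step 4, assuming storms are pre-sorted by mean precipitation (descending) and represented as dicts with a start time; this illustrates the declustering logic and is not the stormhub implementation:

.. code-block:: python

    # Hedged sketch of the declustering pass (illustrative; not stormhub code).
    from datetime import datetime, timedelta

    def decluster(storms: list[dict], duration_hrs: int) -> list[dict]:
        """Keep a storm only if it does not overlap one already kept."""
        kept: list[dict] = []
        for storm in storms:
            start = storm["start"]
            end = start + timedelta(hours=duration_hrs)
            overlaps = any(
                start < k["start"] + timedelta(hours=duration_hrs) and k["start"] < end
                for k in kept
            )
            if not overlaps:
                kept.append(storm)
        return kept

    # The second record overlaps the first and is dropped.
    top = decluster(
        [
            {"start": datetime(2011, 4, 24, 6), "mean_precip": 3.1},
            {"start": datetime(2011, 4, 24, 12), "mean_precip": 2.9},
        ],
        duration_hrs=72,
    )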
@@ -91,7 +91,7 @@ would be dropped.
 
 
 5. The following additional arguments are available.
 
 .. code:: bash
 
     specific_dates # Can be provided to resume processing in the event of a failure or other use cases
@@ -102,23 +102,23 @@ would be dropped.
 
-Results
+Results
 -------
 
-A Storm Catalog is created containing a copy of the watershed, transposition domain, and the *valid transpositon domain* which is the space within the transposition domain wherein a
-watershed can be transposed without encountering null space (i.e. part of the watershed extending outside of the trnasposition domain).
+A Storm Catalog is created containing a copy of the watershed, transposition domain, and the *valid transpositon domain* which is the space within the transposition domain wherein a
+watershed can be transposed without encountering null space (i.e. part of the watershed extending outside of the transposition domain).
 
 .. image:: ./images/catalog.png
 
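The *valid transposition domain* described above amounts to a containment test; a sketch under assumed inputs (shapely polygons in a common CRS; an illustration of the idea, not how stormhub derives the domain):

.. code-block:: python

    # A candidate centroid is valid if the watershed, shifted to it, stays
    # entirely inside the transposition domain (no null space).
    from shapely.affinity import translate
    from shapely.geometry import Polygon

    def is_valid_centroid(watershed: Polygon, domain: Polygon, x: float, y: float) -> bool:
        dx = x - watershed.centroid.x
        dy = y - watershed.centroid.y
        return domain.contains(translate(watershed, xoff=dx, yoff=dy))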

-STAC Collections will be added to the catalog for each storm duration requested. The collection will include relevant data including summary statistics, plots, and other assets to rpovide
+STAC Collections will be added to the catalog for each storm duration requested. The collection will include relevant data including summary statistics, plots, and other assets to provide
 context and metadata for the data.
 
 .. image:: ./images/storm-collection.png
 
 
-The collection is compised of STAC Items, which provide links to source data and derivative products. For example, a model speciric timeseries file may be required for hydrologic modeling.
-These files can be created and added to the event item alongside metadata and other information. Assets may include additional data required for modeling (i.e. temperature data, also available via AORC).
+The collection is composed of STAC Items, which provide links to source data and derivative products. For example, a model specific timeseries file may be required for hydrologic modeling.
+These files can be created and added to the event item alongside metadata and other information. Assets may include additional data required for modeling (i.e. temperature data, also available via AORC).
 .. image:: ./images/storm-item.png
 
 
@@ -129,5 +129,5 @@ These files can be created and added to the event item alongside metadata and ot…
 This feature was evaluated and used in pilot projects, does not currently exist in this repository, but may be incorporated in the future.
 
 
-Where possible, `NOAA Atlas-14 precipitation frequency estimates <https://hdsc.nws.noaa.gov/hdsc/pfds/pfds_gis.html>`_ may be considered to normalize the average accumulation for each storm.
-.. image:: ./images/2yr03da.PNG
+Where possible, `NOAA Atlas-14 precipitation frequency estimates <https://hdsc.nws.noaa.gov/hdsc/pfds/pfds_gis.html>`_ may be considered to normalize the average accumulation for each storm.
+.. image:: ./images/2yr03da.png
13 changes: 7 additions & 6 deletions docs/source/user_guide.rst
@@ -2,7 +2,7 @@
 Getting Started
 ################
 
-This section provides a high level overview for using stormhub for production, including starting the stormhub server and create objects.
+This section provides a high level overview for using stormhub for production, including starting the stormhub server and creating objects.
 
 Installation
 ------------
@@ -18,20 +18,21 @@ have Python already installed and setup:
 
 Note that it is highly recommended to create a python `virtual environment
 <https://docs.python.org/3/library/venv.html>`_ to install, test, and run
-stormhub.
+stormhub. It is also recommended to avoid use of Windows Subsystem for Linux (WSL)
+as issues can arise with the parallel processing within stormhub.
 
 
 Starting the server
 -------------------
 
 For convenience, a local file server is provided. This server is not necessary for data
-production, but is useful for visualizing and exploring the data.
+production, but is useful for visualizing and exploring the data.
 
 **Start the stormhub file:**
 
 .. code-block:: bash
 
-    stormhub-server <path-to-local-dir>
+    stormhub-server <path-to-local-dir>
 
 Local file server is useful for interacting with STAC browser for viewing the data locally. This is not required....
@@ -44,7 +45,7 @@ Local file server is useful for interacting with STAC browser for viewing the da…
 Workflows
 ---------
 
-A config file shown in below includes the information required to create a new catalog.
+A config file shown below includes the information required to create a new catalog.
 
 .. code-block:: json
@@ -62,7 +63,7 @@
 }
 
-The following snippet provides an example of how to build and create a storm catalog. Requires an example watershed and transposition domain (examples availble in the `repo <https://github.com/Dewberry/stormhub/tree/main/catalogs/example-input-data>`_).
+The following snippet provides an example of how to build and create a storm catalog. Requires an example watershed and transposition domain (examples available in the `repo <https://github.com/Dewberry/stormhub/tree/main/catalogs/example-input-data>`_).
 
 .. code-block:: python
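Both the JSON body and the Python snippet are collapsed in this view. As a purely hypothetical sketch of the pattern the guide describes (read a config, point it at a watershed and a transposition domain, build a catalog), where every name below is an assumption rather than the stormhub API:

.. code-block:: python

    # Hypothetical sketch: the config keys and build_storm_catalog are
    # assumptions, not the stormhub API; see the collapsed snippet in the
    # repo docs for the real entry point.
    import json

    with open("config.json") as f:  # assumed file name
        config = json.load(f)

    def build_storm_catalog(watershed: str, domain: str, out_dir: str) -> None:
        """Placeholder for the catalog-building call shown in the guide."""
        ...

    build_storm_catalog(config["watershed"], config["transposition_domain"], "catalogs/")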
15 changes: 12 additions & 3 deletions stormhub/hydro_domain.py
@@ -1,11 +1,11 @@
 from datetime import datetime
 from typing import Any, Union
-
+import logging
 import fiona.errors
 import geopandas as gpd
 from pystac import Item
 from pystac.extensions.projection import ProjectionExtension
-from shapely.geometry import Polygon, mapping, shape
+from shapely.geometry import Polygon, mapping, shape, MultiPolygon
 
 HYDRO_DOMAIN_DESCRIPTION = "hydro_domain:description"
 HYDRO_DOMAIN_TYPE = "hydro_domain:type"
@@ -121,7 +121,16 @@ def load_geometry(self, geometry_source: Union[str, Polygon]) -> Polygon:
         else:
             raise ValueError("geometry_source must be a file path or a Polygon object")
 
-        if len(gdf) != 1 or not isinstance(gdf.geometry.iloc[0], Polygon):
+        if len(gdf) != 1:
+            raise ValueError("The geometry must contain a single polygon")
+
+        geometry = gdf.geometry.iloc[0]
+        if isinstance(geometry, MultiPolygon) and len(geometry.geoms) == 1:
+            logging.warning("Multipolygon type detected, attempting conversion to Polygon.")
+            geometry = geometry.geoms[0]
+            gdf.at[gdf.index[0], "geometry"] = geometry
+
+        if not isinstance(geometry, Polygon):
             raise ValueError("The geometry must contain a single polygon")
 
         try:
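The single-part MultiPolygon handling added in load_geometry is easy to demonstrate in isolation (standalone sketch; not stormhub code):

.. code-block:: python

    # Single-part MultiPolygons are common in GIS exports; unwrap to Polygon.
    from shapely.geometry import MultiPolygon, Polygon

    square = Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])
    geometry = MultiPolygon([square])  # one part wrapped in a multi

    if isinstance(geometry, MultiPolygon) and len(geometry.geoms) == 1:
        geometry = geometry.geoms[0]  # unwrap

    assert isinstance(geometry, Polygon)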
9 changes: 8 additions & 1 deletion stormhub/met/aorc/aorc.py
@@ -267,6 +267,13 @@ def aorc_thumbnail(
             edgecolor="gray",
         )
         ax.add_patch(valid_area_plt_polygon)
+        watershed_plt_polygon = patches.Polygon(
+            np.column_stack(self.watershed_geometry.exterior.coords.xy),
+            lw=0.7,
+            facecolor="none",
+            edgecolor="gray",
+        )
+        ax.add_patch(watershed_plt_polygon)
         transposed_watershed_plt_polygon = patches.Polygon(
             np.column_stack(self._transposed_watershed.exterior.coords.xy),
             lw=1,
@@ -283,7 +290,7 @@ def aorc_thumbnail(
         filename = f"{self.item_id}.thumbnail.png"
         fn = os.path.join(self.local_directory, filename)
         fig.savefig(fn, bbox_inches="tight")
-        asset = Asset(fn, media_type=MediaType.PNG, roles=["thumbnail"])
+        asset = Asset(filename, media_type=MediaType.PNG, roles=["thumbnail"])
         self.add_asset("thumbnail", asset)
         if return_fig:
             return fig
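The one-line Asset change above swaps the absolute local path (fn) for the bare file name. The likely intent, hedged: STAC tooling resolves a relative asset href against the owning item's location, so a name-only href keeps the catalog portable when the output directory moves. A minimal sketch:

.. code-block:: python

    # Sketch: a relative asset href (file name only) resolves against the
    # item's own href, keeping local catalogs relocatable.
    from pystac import Asset, MediaType

    asset = Asset("example-item.thumbnail.png", media_type=MediaType.PNG, roles=["thumbnail"])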
(The remaining 4 of the 12 changed files are not shown in this view.)