[1pt] Update CatFIM metadata API pulls and add duplicate site filtering, add CatFIM vis files #1379

Merged 21 commits on Jan 10, 2025
Commits
bb55fb8
Add dual API pull and site filtering to CatFIM.
EmilyDeardorff Dec 16, 2024
9dcb6ad
Add CatFIM analysis Jupyter Notebooks.
EmilyDeardorff Dec 16, 2024
353c869
Clean up CatFIM vis notebook.
EmilyDeardorff Dec 17, 2024
292c14f
Create mapping functions for CatFIM.
EmilyDeardorff Dec 18, 2024
3d53817
Putting back in pre-saved pickle file for metadata
RobHanna-NOAA Dec 20, 2024
df5e3d2
Update CatFIM metadata script.
EmilyDeardorff Dec 23, 2024
ed2b686
Merge branch 'dev-catfim-v2-2-site-filtering' of https://github.com/N…
EmilyDeardorff Dec 23, 2024
62cde94
Move CatFIM vis functions to separate file.
EmilyDeardorff Dec 24, 2024
2c9702f
Update CatFIM eval notebooks.
EmilyDeardorff Dec 30, 2024
00cfd68
Delete tools/catfim/notebooks/TEMP_get_metadata.ipynb
EmilyDeardorff Jan 2, 2025
fd50dae
Update CHANGELOG.md
EmilyDeardorff Jan 3, 2025
f64c771
Update CHANGELOG.md again
EmilyDeardorff Jan 3, 2025
670de91
Linting.
EmilyDeardorff Jan 3, 2025
7697026
Cleaned up eval catfim metadata notebook.
EmilyDeardorff Jan 9, 2025
c1dba78
Add CatFIM qgis preset to symbology folder.
EmilyDeardorff Jan 9, 2025
22f45e4
Update CHANGELOG.md to reflect symbology file and remove checkpoint f…
EmilyDeardorff Jan 9, 2025
cb158f4
Merge branch 'dev' of https://github.com/NOAA-OWP/inundation-mapping …
Jan 10, 2025
8b97ecb
Fixing merge with dev issues
Jan 10, 2025
7d8412b
Merge branch 'dev' of https://github.com/NOAA-OWP/inundation-mapping …
Jan 10, 2025
da1d3fd
merged with dev
Jan 10, 2025
de05c0f
fix changelog
Jan 10, 2025
917 changes: 917 additions & 0 deletions config/symbology/qgis/catfim_library.qml

Large diffs are not rendered by default.

20 changes: 19 additions & 1 deletion docs/CHANGELOG.md
@@ -1,7 +1,25 @@
All notable changes to this project will be documented in this file.
We follow the [Semantic Versioning 2.0.0](http://semver.org/) format.

## v4.5.13.6 - 2025-1-10 - [PR#1387](https://github.com/NOAA-OWP/inundation-mapping/pull/1387)
## v4.5.13.7 - 2025-01-10 - [PR#1379](https://github.com/NOAA-OWP/inundation-mapping/pull/1379)

There are many sites in non-CONUS regions (AK, PR, HI) where we would like to run CatFIM, but they are currently excluded because they are not NWM forecast points. This update brings back the dual API pull and adds code to filter duplicate (and NULL) LIDs out of the metadata lists.
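
For orientation, a minimal sketch of the filtering step, assuming each WRDS metadata record is a dict with an `identifiers.nws_lid` key (as handled in the `generate_categorical_fim_flows.py` diff below); the helper name and sample records are hypothetical:

```python
def filter_metadata_list(unfiltered_meta_list):
    """Drop records with a NULL nws_lid and keep only the first record
    seen for each LID. Hypothetical standalone helper; the PR implements
    the same loop inline in __load_nwm_metadata()."""
    unique_lids = set()
    output_meta_list = []
    for site in unfiltered_meta_list:
        nws_lid = site['identifiers']['nws_lid']
        if nws_lid is None:
            continue  # no LID available
        if nws_lid in unique_lids:
            continue  # duplicate LID, e.g. a site returned by both API pulls
        unique_lids.add(nws_lid)
        output_meta_list.append(site)
    return output_meta_list


# Example: records as they might come back from the two pulls concatenated
# (one pull for all forecast points, one for all AK/HI/PR points).
records = [
    {'identifiers': {'nws_lid': 'ABCA2'}},  # from the forecast-point pull
    {'identifiers': {'nws_lid': 'ABCA2'}},  # same site from the AK state pull
    {'identifiers': {'nws_lid': None}},     # NULL LID, dropped
]
assert len(filter_metadata_list(records)) == 1
```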

### Additions
- `inundation-mapping/tools/catfim/vis_categorical_fim.py`: Functions for reading in, processing, and visualizing CatFIM results.
- `inundation-mapping/tools/catfim/notebooks/vis_catfim_cross_section.ipynb`: A new Jupyter notebook for viewing and analyzing CatFIM results.
- `inundation-mapping/tools/catfim/notebooks/eval_catfim_metadata.ipynb`: A new Jupyter notebook for evaluating metadata and results from CatFIM runs.
- `inundation-mapping/config/symbology/qgis/catfim_library.qml`: Symbology preset for viewing the CatFIM library in QGIS (see the PyQGIS sketch below).
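
A minimal PyQGIS sketch of applying the preset, assuming it runs in the QGIS Python console; the layer path and name are hypothetical:

```python
from qgis.core import QgsProject, QgsVectorLayer

# Load a CatFIM library layer (path and layer name are hypothetical)
layer = QgsVectorLayer('/data/catfim/catfim_library.gpkg', 'CatFIM library', 'ogr')

# Apply the symbology preset added in this PR
msg, ok = layer.loadNamedStyle('config/symbology/qgis/catfim_library.qml')
if not ok:
    print(f'Style not applied: {msg}')

QgsProject.instance().addMapLayer(layer)
layer.triggerRepaint()
```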


### Changes

- `inundation-mapping/tools/catfim/generate_categorical_fim_flows.py`: Re-implements the dual API call and filters out duplicate sites.


<br/><br/>

## v4.5.13.6 - 2025-01-10 - [PR#1387](https://github.com/NOAA-OWP/inundation-mapping/pull/1387)

Fixes two issues in test_cases:
1. An error in `synthesize_test_cases` and `run_test_case` if any directories of the 5 benchmark sources (BLE, NWS, IFC, USGS, or ras2fim) do not exist. This issue was originally discovered and fixed in #1178, but is being elevated to its own PR here. Fixes #1386.
95 changes: 49 additions & 46 deletions tools/catfim/generate_categorical_fim_flows.py
@@ -646,7 +646,7 @@ def __load_nwm_metadata(

FLOG.trace(metadata_url)

all_meta_lists = []
output_meta_list = []
# Check to see if meta file already exists
# This feature means we can copy the pickle file to another enviro (AWS?) as it won't need to call
# WRDS unless we need a smaller or modified version. This one likely has all nws_lid data.
@@ -655,40 +655,31 @@
FLOG.lprint(f"Meta file already downloaded and exists at {nwm_metafile}")

with open(nwm_metafile, "rb") as p_handle:
all_meta_lists = pickle.load(p_handle)
output_meta_list = pickle.load(p_handle)

else:
meta_file = os.path.join(output_catfim_dir, "nwm_metafile.pkl")

FLOG.lprint(f"Meta file will be downloaded and saved at {meta_file}")

if lid_to_run != "all":
# single lid for now

# must_include_value variable not yet tested
# must_include_value = 'nws_data.rfc_forecast_point' if lid_to_run not in ['HI', 'PR', 'AK'] else None
all_meta_lists, ___ = get_metadata(
if lid_to_run != "all":  # TODO: Deprecate LID options (in favor of HUC list functionality)
# Single lid for now (deprecated)
output_meta_list, ___ = get_metadata(
metadata_url,
select_by='nws_lid',
selector=[lid_to_run],
must_include='nws_data.rfc_forecast_point',
upstream_trace_distance=nwm_us_search,
downstream_trace_distance=nwm_ds_search,
)
else:
# This gets all records including AK, HI and PR, but only if they have forecast points

# Note: Nov 2024: AK has 152 sites with forecast points, but after research,
# none of the AK sites have nws_data.rfc_forecast_point = False, so we can
# exclude them.
# Previously we allowed HI and PR sites to come in and most were failing.
# So, we will include HI and PR here as well.

# We cannot just filter out based on dup LIDs, as depending on which
# metadata load they are on, dup LID records will have different data.
else:
# Dec 2024: Running two API calls: one to get all forecast points, and another
# to get all points (non-forecast and forecast) for the OCONUS regions. Then,
# duplicate LIDs are removed.

# orig_meta_lists, ___ = get_metadata(
all_meta_lists, ___ = get_metadata(
# Get all forecast points
forecast_point_meta_list, ___ = get_metadata(
metadata_url,
select_by='nws_lid',
selector=['all'],
@@ -697,41 +688,53 @@
downstream_trace_distance=nwm_ds_search,
)

# If we decided to put HI and PR back in, we could do two loads: one
# with the flag and one without. Then iterate through the meta_lists
# results and filter out based on the full state value. Then
# call WRDS again with those specific states and simply concat them.
# Get all points for OCONUS regions (HI, PR, and AK)
oconus_meta_list, ___ = get_metadata(
metadata_url,
select_by='state',
selector=['HI', 'PR', 'AK'],
must_include=None,
upstream_trace_distance=nwm_us_search,
downstream_trace_distance=nwm_ds_search,
)

# Append the lists
unfiltered_meta_list = forecast_point_meta_list + oconus_meta_list

# filtered_meta_data = []
# for metadata in orig_meta_lists:
# df = pd.json_normalize(metadata)
# state = df['nws_data.state'].item()
# lid = df['identifiers.nws_lid'].item()
# if state.lower() not in ["alaska", "hawaii", "puerto rico"]:
# filtered_meta_data.append(metadata)
# print(f"len(all_meta_lists) is {len(all_meta_lists)}")

# must_include='nws_data.rfc_forecast_point',
# Filter the metadata list
output_meta_list = []
unique_lids, duplicate_lids = [], [] # TODO: remove?
duplicate_meta_list = []
nonelid_metadata_list = [] # TODO: remove

# Nov 2024: We used to call them site specific and may add them back in, but it is ok to leave them
for i, site in enumerate(unfiltered_meta_list):
nws_lid = site['identifiers']['nws_lid']

# islands_list, ___ = get_metadata(
# metadata_url,
# select_by='state',
# selector=['HI', 'PR'],
# must_include=None,
# upstream_trace_distance=nwm_us_search,
# downstream_trace_distance=nwm_ds_search,
# )
if nws_lid is None:
# No LID available
nonelid_metadata_list.append(site) # TODO: replace with Continue

# Append the lists
# all_meta_lists = filtered_all_meta_list + islands_list
elif nws_lid in unique_lids:
# Duplicate LID
duplicate_lids.append(nws_lid)
duplicate_meta_list.append(site) # TODO: remove extra lists

# print(f"len(all_meta_lists) is {len(all_meta_lists)}")
else:
# Unique/unseen LID that's not None
unique_lids.append(nws_lid)
output_meta_list.append(site)

FLOG.lprint(f'{len(duplicate_lids)} duplicate points removed.')
FLOG.lprint(f'Filtered metadata downloaded for {len(output_meta_list)} points.')

# ----------

with open(meta_file, "wb") as p_handle:
pickle.dump(all_meta_lists, p_handle, protocol=pickle.HIGHEST_PROTOCOL)
pickle.dump(output_meta_list, p_handle, protocol=pickle.HIGHEST_PROTOCOL)

return all_meta_lists
return output_meta_list


if __name__ == '__main__':