A prototype will let us test the .duckdb database format. We want to understand the user experience of this packaging from the perspective of lab members and end users running analyses. How does this new file format interact with existing ARTIS ecosystem tools? We need a real example that we can start querying to learn what other documentation or tools are needed, or whether this is even a good idea in the first place.
Future vision
Moving forward we will ingest new data and rerun the ARTIS model about once a year. The model has 2 "modes", depending on which data is input into it. These "modes" are:
FAO (Food and Agriculture Organization of the United Nations) and
SAU (Sea Around Us).
Both ARTIS model "modes" produce similarly named (if not the same) output files. Ultimately we will have 2 .duckdb databases to distinguish the two ARTIS database "modes". This will ensure that end users are querying and analyzing a set of files that all correspond with the exact same model run.
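As a rough illustration of keeping the two "modes" apart, the sketch below builds a file name following the `ARTIS_FAO_[model-version]_[run-date].duckdb` pattern from the to-do list. The function name, the version string format, and the ISO date format are assumptions made for this example, not settled conventions.

```python
from datetime import date

# The two model "modes" named in this issue.
VALID_MODES = ("FAO", "SAU")


def database_filename(mode: str, model_version: str, run_date: date) -> str:
    """Build a name like ARTIS_FAO_[model-version]_[run-date].duckdb.

    Assumption: the run date is written as YYYY-MM-DD; the issue does not
    specify a date format.
    """
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"unknown ARTIS mode: {mode!r}")
    return f"ARTIS_{mode}_{model_version}_{run_date.isoformat()}.duckdb"


if __name__ == "__main__":
    # "v1.0.0" is a placeholder version string, not a real ARTIS release.
    print(database_filename("FAO", "v1.0.0", date(2024, 9, 12)))
    print(database_filename("SAU", "v1.0.0", date(2024, 9, 12)))
```

Encoding mode, version, and run date in the name makes it harder to accidentally mix files from different model runs.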
To-do List
Create 2 .duckdb databases based on the table outlines below. [ARTIS tables descriptions in wiki](https://github.com/Seafood-Globalization-Lab/artis-model/wiki/ARTIS-Database-Tables)
ARTIS_FAO_[model-version]_[run-date].duckdb
trade
primary database table
sometimes also referred to as "snet"
KNB has this data in the data/trade/ directory divided up by HS version and year (e.g. artis_midpoint_HS02_2016.csv)
consumption
primary database table
KNB has this data in the data/consumption/ directory divided up by HS version and year (e.g. consumption_midpoint_HS12_2012.csv)
data/attribute tables/code_max_resolved.csv
data/attribute tables/countries.csv
data/attribute tables/products.csv
data/attribute tables/sciname.csv
data/attribute tables/standardized_fao_pop.csv
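One possible way to turn the KNB files listed above into tables is to let DuckDB read the per-HS-version, per-year CSVs through a glob. The sketch below only generates the SQL statements. The glob patterns are guesses based on the example file names, and the attribute table names are taken from the file stems; both are assumptions to check against the wiki table descriptions.

```python
# Hypothetical mapping of table name -> source file pattern, based on the
# KNB directory layout described in this issue. The glob patterns are
# assumptions inferred from the example file names.
FAO_SOURCES = {
    "trade": "data/trade/artis_midpoint_HS*_*.csv",
    "consumption": "data/consumption/consumption_midpoint_HS*_*.csv",
    "code_max_resolved": "data/attribute tables/code_max_resolved.csv",
    "countries": "data/attribute tables/countries.csv",
    "products": "data/attribute tables/products.csv",
    "sciname": "data/attribute tables/sciname.csv",
    "standardized_fao_pop": "data/attribute tables/standardized_fao_pop.csv",
}


def create_table_sql(table: str, pattern: str) -> str:
    """Return a CREATE TABLE statement that loads every CSV matching pattern."""
    escaped = pattern.replace("'", "''")  # escape single quotes for SQL
    # union_by_name aligns columns by name across the matched files.
    return (
        f"CREATE OR REPLACE TABLE {table} AS "
        f"SELECT * FROM read_csv_auto('{escaped}', union_by_name = true);"
    )


def build_statements(sources: dict) -> list:
    """One CREATE TABLE statement per entry, in insertion order."""
    return [create_table_sql(table, pattern) for table, pattern in sources.items()]


if __name__ == "__main__":
    for statement in build_statements(FAO_SOURCES):
        print(statement)
```

The printed statements could be run in the DuckDB CLI or through a client library against the FAO database file. Whether the HS version and year are already columns in each CSV should be checked before relying on a plain `SELECT *`.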
ARTIS_SAU_[model-version]_[run-date].duckdb
snet_midpoint_all_hs_all_years.parquet
2024_09_12_SAU_consumption_midpoint.parquet
files-for-duckdb/sql_database/baci.csv
files-for-duckdb/sql_database/code_max_resolved.csv
files-for-duckdb/sql_database/countries.csv
files-for-duckdb/sql_database/products.csv
files-for-duckdb/sql_database/sciname.csv
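The SAU files above mix Parquet and CSV, so a loader would need to pick the right DuckDB reader per file. The following is a minimal sketch under stated assumptions: the `snet` Parquet file is mapped to the `trade` table and the consumption Parquet file to `consumption` (inferred from the table descriptions above), and the function names and output file name are hypothetical. It uses the third-party `duckdb` Python package only inside `build_database`.

```python
from pathlib import PurePosixPath

# Assumed mapping of table name -> source file for the SAU database.
SAU_SOURCES = {
    "trade": "snet_midpoint_all_hs_all_years.parquet",
    "consumption": "2024_09_12_SAU_consumption_midpoint.parquet",
    "baci": "files-for-duckdb/sql_database/baci.csv",
    "code_max_resolved": "files-for-duckdb/sql_database/code_max_resolved.csv",
    "countries": "files-for-duckdb/sql_database/countries.csv",
    "products": "files-for-duckdb/sql_database/products.csv",
    "sciname": "files-for-duckdb/sql_database/sciname.csv",
}


def reader_for(path: str) -> str:
    """Pick the DuckDB table function that matches the file extension."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".parquet":
        return "read_parquet"
    if suffix == ".csv":
        return "read_csv_auto"
    raise ValueError(f"unsupported file type: {path}")


def load_sql(table: str, path: str) -> str:
    """Return a CREATE TABLE statement that loads one source file."""
    return (
        f"CREATE OR REPLACE TABLE {table} AS "
        f"SELECT * FROM {reader_for(path)}('{path}');"
    )


def build_database(db_path: str, sources: dict) -> None:
    """Create (or overwrite tables in) db_path from the given sources."""
    import duckdb  # third-party: pip install duckdb

    con = duckdb.connect(db_path)
    try:
        for table, path in sources.items():
            con.execute(load_sql(table, path))
    finally:
        con.close()


if __name__ == "__main__":
    for table, path in SAU_SOURCES.items():
        print(load_sql(table, path))
    # To actually build the file (placeholder name, source files required):
    # build_database("ARTIS_SAU_prototype.duckdb", SAU_SOURCES)
```

Once built, a quick `SHOW TABLES;` in the DuckDB CLI would confirm that all seven tables landed in the same file, which is the point of packaging one model run per database.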
Originally posted by @theamarks in Seafood-Globalization-Lab/artis-model#26 (comment)