Create ARTIS duckdb prototype #1

Open
16 tasks
theamarks opened this issue Feb 11, 2025 · 0 comments

theamarks commented Feb 11, 2025

Description

A prototype will let us test the .duckdb database format. We want to understand the user experience of this packaging for lab members and for end users running analyses, and how the new file format interacts with existing ARTIS ecosystem tools. We need a real example that we can start querying so we can identify what other documentation or tools are needed, and whether this is even a good idea in the first place.

Future vision

Moving forward, we will ingest new data and rerun the ARTIS model about once a year. The model has 2 "modes", depending on which data source is used as input:

  • FAO (Food and Agriculture Organization of the United Nations) and
  • SAU (Sea Around Us).

Both ARTIS model "modes" produce similarly named (if not identical) output files. Ultimately we will have 2 .duckdb databases, one per model "mode". Keeping them separate ensures that end users query and analyze a set of files that all correspond to the exact same model run.
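To make the one-database-per-mode idea concrete, here is a minimal sketch of a helper that builds a database file name following the `ARTIS_[MODE]_[model-version]_[run-date].duckdb` pattern used in the to-do list. The function name `artis_db_filename`, the version string `v1.0`, and the ISO date format are assumptions for illustration; the issue does not specify how the version or run date should be written.

```python
from datetime import date

# The two ARTIS model "modes" named in this issue.
VALID_MODES = ("FAO", "SAU")


def artis_db_filename(mode: str, model_version: str, run_date: date) -> str:
    """Build a name like ARTIS_[MODE]_[model-version]_[run-date].duckdb."""
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Assumption: the run date is written in ISO 8601 form (YYYY-MM-DD).
    return f"ARTIS_{mode}_{model_version}_{run_date.isoformat()}.duckdb"


# "v1.0" is a hypothetical version label, not a real ARTIS release.
print(artis_db_filename("FAO", "v1.0", date(2025, 2, 11)))
```

Rejecting unknown modes up front keeps a typo from silently producing a third database that mixes outputs from different runs.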

To-do List

Create 2 .duckdb databases based on the table outlines below. See the [ARTIS table descriptions in the wiki](https://github.com/Seafood-Globalization-Lab/artis-model/wiki/ARTIS-Database-Tables).

  • ARTIS_FAO_[model-version]_[run-date].duckdb

    • trade
      • primary database table
      • sometimes also referred to as "snet"
      • KNB has this data in the data/trade/ directory divided up by HS version and year (e.g. artis_midpoint_HS02_2016.csv)
    • consumption
      • primary database table
      • KNB has this data in the data/consumption/ directory divided up by HS version and year (e.g. consumption_midpoint_HS12_2012.csv)
    • code_max_resolved
      • attribute table
      • KNB data/attribute tables/code_max_resolved.csv
    • countries
      • attribute table
      • KNB data/attribute tables/countries.csv
    • products
      • attribute table
      • KNB data/attribute tables/products.csv
    • sciname
      • attribute table
      • KNB data/attribute tables/sciname.csv
    • standardized_fao_pop
      • attribute table
      • KNB data/attribute tables/standardized_fao_pop.csv
  • ARTIS_SAU_[model-version]_[run-date].duckdb

    • trade
      • primary database table
      • sometimes also referred to as "snet"
      • Google Drive folder has this data aggregated into a single file snet_midpoint_all_hs_all_years.parquet
    • consumption
      • primary database table
      • Google Drive folder has this data aggregated into a single file 2024_09_12_SAU_consumption_midpoint.parquet
    • baci
      • attribute table
      • Google Drive files-for-duckdb/sql_database/baci.csv
    • code_max_resolved
      • attribute table
      • Google Drive files-for-duckdb/sql_database/code_max_resolved.csv
    • countries
      • attribute table
      • Google Drive files-for-duckdb/sql_database/countries.csv
    • products
      • attribute table
      • Google Drive files-for-duckdb/sql_database/products.csv
    • sciname
      • attribute table
      • Google Drive files-for-duckdb/sql_database/sciname.csv
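One possible way to turn the lists above into a database is to generate one `CREATE TABLE ... AS SELECT` statement per table and run them through the `duckdb` Python package. The sketch below is not the lab's build script. The helper names (`create_table_sql`, `build_database`), the assumption that the source files sit in a local working directory under the paths shown, and the glob pattern for the per-year FAO trade CSVs are all hypothetical. DuckDB's `read_parquet` and `read_csv_auto` table functions do the loading.

```python
# Table name -> source file for the SAU database, copied from the list above.
# Assumption: the Google Drive files have been downloaded to these local paths.
SAU_SOURCES = {
    "trade": "snet_midpoint_all_hs_all_years.parquet",
    "consumption": "2024_09_12_SAU_consumption_midpoint.parquet",
    "baci": "files-for-duckdb/sql_database/baci.csv",
    "code_max_resolved": "files-for-duckdb/sql_database/code_max_resolved.csv",
    "countries": "files-for-duckdb/sql_database/countries.csv",
    "products": "files-for-duckdb/sql_database/products.csv",
    "sciname": "files-for-duckdb/sql_database/sciname.csv",
}


def create_table_sql(table: str, source: str) -> str:
    """Return a DuckDB statement that loads one source file into one table."""
    # Choose the DuckDB reader from the file extension.
    reader = "read_parquet" if source.endswith(".parquet") else "read_csv_auto"
    escaped = source.replace("'", "''")  # escape single quotes for SQL
    return f"CREATE OR REPLACE TABLE {table} AS SELECT * FROM {reader}('{escaped}');"


def build_database(db_path: str, sources: dict) -> None:
    """Create (or open) db_path and load every table listed in sources."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect(db_path)
    try:
        for table, source in sources.items():
            con.execute(create_table_sql(table, source))
    finally:
        con.close()


# Print the statements without touching any files.
for name, path in SAU_SOURCES.items():
    print(create_table_sql(name, path))

# The FAO trade data is split by HS version and year. DuckDB readers accept
# glob patterns, so one statement could cover every file, assuming they all
# share the same columns. The pattern itself is a guess from the example name.
print(create_table_sql("trade", "data/trade/artis_midpoint_*.csv"))
```

Calling `build_database("ARTIS_SAU_[model-version]_[run-date].duckdb", SAU_SOURCES)` with real values filled in would then write the SAU database, after which tables can be queried with ordinary SQL such as `SELECT * FROM trade LIMIT 10;`.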

Originally posted by @theamarks in Seafood-Globalization-Lab/artis-model#26 (comment)

@theamarks theamarks added the 🪄 enhancement New functionality or feature request label Feb 11, 2025
@theamarks theamarks moved this to 🏷 Ready in ARTIS Maintence & Analysis Feb 11, 2025
@theamarks theamarks added this to the Ship ARTIS in duckdb milestone Feb 11, 2025
@theamarks theamarks moved this from 🏷 Ready to 🏗 In Progress in ARTIS Maintence & Analysis Feb 18, 2025