Add fetch_available_statistical_variables
#229
Including tests for `fetch_available_statistical_variables` and the helper `group_variables_by_entity`
Awesome!! Great idea!
…On Mon, Mar 31, 2025 at 12:43 AM Jorge Rivera ***@***.***> wrote:
This PR is part of a group of PRs which will bring some key features from
the Data Commons website API to the client library.
*Fetch available statistical variables*: this is an implementation of this
feature of the DC website
<https://github.com/datacommonsorg/website/blob/master/server/lib/fetch.py#L344-L363>
.
In short, this PR:
- Adds a fetch_available_statistical_variables method to the
ObservationEndpoint. This method fetches all StatVars that have
observations for one or more entities.
It also includes tests for the method and for a helper function that
structures the data as entity: statvar_list.
Example usage:
from datacommons_client import DataCommonsClient

dc = DataCommonsClient(dc_instance="datacommons.one.org")
variables = dc.observation.fetch_available_statistical_variables(
    entity_dcids=["africa", "country/TGO"]
)
{'africa': ['Area_FloodEvent',
'eia/INTL.7-1-TJ.A',
'Annual_Emissions_CarbonDioxideEquivalent100YearGlobalWarmingPotential_SteelManufacturing',
'eia/INTL.35-12-QBTU.A',
'sdg/EN_MAT_DOMCMPG.PRODUCT--MF421',
'sdg/EN_MAT_DOMCMPG.PRODUCT--MF21',
'sdg/SG_DMK_PARLCC_LC.AGE--Y0T45__SEX--M__PARLIAMENTARY_COMMITTEES--PC_DEFENCE',
'eia/INTL.65-2-QBTU.A',
'eia/INTL.57-1-MT.A',
'eia/INTL.68-2-MT.A',
...
],
'country/TGO': ['sdg/SG_GEN_EQPWN',
'Count_Person_25OrMoreYears_Female_MastersDegreeOrHigher_AsFractionOf_Count_Person_25OrMoreYears_Female',
'sdg/SE_ADT_ACTS.TYPE_OF_SKILL--SKILL_ICTPST',
'Amount_Debt_GBP_LenderArabBankforEconomicDevinAfrica_AsAFractionOf_Amount_Debt_LenderArabBankforEconomicDevinAfrica',
'worldBank/UIS_R_2_GPV_G2',
'worldBank/UIS_GTVP_2_GPV_F',
'DiffRelativeToAvg_1980_2010_MaxTemp_Daily_Hist_5PctProb_Greater_Atleast1DayADecade_CMIP6_Ensemble_SSP245',
'MaxTemp_Daily_Hist_50PctProb_Greater_Atleast1DayAYear_CMIP6_HADGEM3-GC31-LL_SSP585',
'AmountPrincipalRepayment_Debt_LongTermExternalDebt_LenderEuropeanEconomicCommunity',
'worldBank/SE_SEC_NENR',
...
]}
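For readers who want a feel for how the entity : statvar_list structure above could be produced, here is a minimal sketch. It is not the helper from this PR: the function name group_variables_by_entity_sketch, the plain-dict input shape (variable mapped to the entities that have observations for it), and the placeholder variable names are all assumptions made for illustration.

```python
from collections import defaultdict


def group_variables_by_entity_sketch(by_variable):
    """Invert a variable -> entities mapping into entity -> variables.

    Hypothetical stand-in for the PR's helper; the real function may take
    a response object rather than a plain dict.
    """
    grouped = defaultdict(list)
    for variable, entities in by_variable.items():
        for entity in entities:
            grouped[entity].append(variable)
    return dict(grouped)


# Made-up sample input: placeholder variable names, not real availability data.
sample = {
    "StatVarA": ["africa"],
    "StatVarB": ["country/TGO"],
    "StatVarC": ["africa", "country/TGO"],
}

grouped = group_variables_by_entity_sketch(sample)
print(grouped)
```

Each entity ends up with the list of variables that mention it, in the order the variables were encountered.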
*Note*
Looking through the website code I discovered that it was possible for
variable.dcids to be empty. Keeping them empty is what enables getting
all the StatVars with observations. However, empty variable.dcids
contradicts the official docs
<https://docs.datacommons.org/api/rest/v2/observation.html#:~:text=for%20allowable%20values.-,variable.dcids,-REQUIRED>
which say it is required. FYI @kmoscoe <https://github.com/kmoscoe>.
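To make the note concrete, here is a minimal sketch of what a request body with an empty variable.dcids might look like. The helper name build_available_variables_payload is hypothetical, and the field layout (entity.dcids, variable.dcids, select) is assumed from the V2 observation REST docs rather than taken from this PR's diff. Nothing is sent over the network; the snippet only builds and prints the payload.

```python
import json


def build_available_variables_payload(entity_dcids):
    """Build an observation request body with no variable filter.

    Hypothetical helper: the exact payload the client library sends may differ.
    """
    return {
        "entity": {"dcids": list(entity_dcids)},
        # Left empty on purpose: per the note, this is what asks for every
        # StatVar with observations instead of a specific set.
        "variable": {"dcids": []},
        # Assumed: only entity and variable are selected, so no dates or
        # values are requested.
        "select": ["entity", "variable"],
    }


payload = build_available_variables_payload(["africa", "country/TGO"])
print(json.dumps(payload, indent=2))
```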
Commit Summary
- 24cbd05
<24cbd05>
Add `fetch_available_statistical_variables`
File Changes
(4 files <https://github.com/datacommonsorg/api-python/pull/229/files>)
- *M* datacommons_client/endpoints/observation.py
<https://github.com/datacommonsorg/api-python/pull/229/files#diff-2f6f5222e02c9429c1e19de9eb99b3efaf5abe87755205cbcf10538551d321bf>
(23)
- *M* datacommons_client/tests/endpoints/test_observation_endpoint.py
<https://github.com/datacommonsorg/api-python/pull/229/files#diff-ce05c9554817c89e5133ba60697155d3c158c166529631f95940e369eee7d301>
(49)
- *A* datacommons_client/tests/test_utils.py
<https://github.com/datacommonsorg/api-python/pull/229/files#diff-f51526f26e377e75f00640423bcca975db04077ee947d66615246f86873d1011>
(40)
- *M* datacommons_client/utils/data_processing.py
<https://github.com/datacommonsorg/api-python/pull/229/files#diff-61f66e6b6e1b8fa64c61cf98fcb394123e942f1f401e2a6b668a5a0b25dccb42>
(20)
Patch Links:
- https://github.com/datacommonsorg/api-python/pull/229.patch
- https://github.com/datacommonsorg/api-python/pull/229.diff
Wow, that is amazing, never knew it. Will fix the REST docs today!
Thanks for this addition, Jorge! This will be helpful when we introduce the new client library into the website.
This PR is part of a group of PRs which will bring some key features from the Data Commons website API to the client library (e.g. #230, #231).
Fetch available statistical variables: this is an implementation of this feature of the DC website.
In short, this PR:
- Adds a fetch_available_statistical_variables method to the ObservationEndpoint. This method fetches all StatVars that have observations for one or more entities.
It also includes tests for the method and for a helper function that structures the data as entity: statvar_list.
Example usage:
Note
Looking through the website code I discovered that it was possible for variable.dcids to be empty. Keeping them empty is what enables getting all the StatVars with observations. However, an empty variable.dcids contradicts the official docs, which say it is required. FYI @kmoscoe.