-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add fetch_entity_names
method
#230
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Jorge!
I think this feature is useful, and is a cleaner implementation than we have on the website today.
But, there's still some ambiguity around:
- If i request a particular language and have english as a fallback, how do i tell from the response if i'm seeing the requested language or the fallback?
- Some smaller places may not have a "name" field today in the KG- example: https://datacommons.org/browser/wikidataId/Q4104190 . This is arguably a data issue - maybe in this example we should be setting "name" to the Bulgarian name, but the issue remains. I think on the website we return the DCID instead of the name, but im not sure if that's the best behavior.
What do you think?
- magic strings to constants - change fallback logic to any language (not en boolean)
Thanks @dwnoble!
It's a very fair question. I don't have a simple suggestion... but can see two potential routes:
What do you think of either option? Or maybe you see another route?
In most of the REST API, if something doesn't exist it comes back empty. We don't currently infer or fill missing data on the client library side. I think I'd be in favour of keeping that behaviour consistent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dwnoble!
- If i request a particular language and have english as a fallback, how do i tell from the response if i'm seeing the requested language or the fallback?
It's a very fair question. I don't have a simple suggestion... but can see two potential routes:
- It's on my list to implement logging. Right now everything happens a bit too silently for my liking. So a solution could be logging a info or warning when the fallback is used.
- we add an additional key to the response, which tracks any entities for which the fallback was used. That means we still keep a convenient mapping of entity_id -> name, but if the 'fallback' key is present (or if it has items, depending on how we implement), then the fallback was used for the countries contained in that list.
What do you think of either option? Or maybe you see another route?
- Some smaller places may not have a "name" field today in the KG- example: https://datacommons.org/browser/wikidataId/Q4104190 . This is arguably a data issue - maybe in this example we should be setting "name" to the Bulgarian name, but the issue remains. I think on the website we return the DCID instead of the name, but im not sure if that's the best behavior.
In most of the REST API, if something doesn't exist it comes back empty. We don't currently infer or fill missing data on the client library side. I think I'd be in favour of keeping that behaviour consistent.
Agreed that things happen a bit too silently right now, and that our website api currently makes some assumptions that are probably not obvious to users. Extra logging sounds good, but I'd lean toward adding more metadata to the response. What do you think about something like this?
{
"africa": {
"value": "Afrika",
"property": "nameWithLanguage", // Could be 'name' if English was requested or used as fallback
"language": "de" // The actual language of the value returned
},
"wikidataId/Q2608785": { // Example using fallback
"value": "La Democracia", // The fallback name
"property": "name", // Assuming 'name' is English and was the fallback
// Optional: Explicitly note the request vs result?
// "requestedLanguage": "de"
}
// ... other entities
}
@dwnoble Here's an implementation (with additional tests) for the type of return you suggested.
It's definitely more explicit, and overall I think it's the right way to go. At first, I wanted something more directly usable for analysts who may want to quickly add names to the data they got. With this more complete dictionary, they would have to unpack Thanks for the suggestion, Dan! |
Accidentally hit close- apologies :) |
This PR is part of a group of PRs which will bring some key features from the Data Commons website API to the client library (e.g #229, #231)
Fetch entity names: this is an implementation of a few DC website features:
/api/place/name
name
property.nameWithLanguage
property (the website automatically resolves the user's locale, this method takes it as an argument)In short, this PR:
fetch_entity_names
method toNodeEndpoint
. This method takes one or moreentity_dcids
and fetches their name. It defaults to English, but other languages (fr
,es
, etc) can be selected using the optionallanguage
argument. Optionally, users can request to fallback to a particular language if the requested language is not available (to avoid sparse results in some languages).Example usage:
And with the optional fallback to english:
In the background, this method uses
name
ornameWithLanguage
depending on what the user requests.name
is much cheaper thannameWithLanguage
since it only contains one value per entity. The method only requests and parsesnameWithLanguage
if a non-english language is selected.A previous version used english as the fallback, available via a boolean. The latest version implements a more generic, Optional,
fallback_language
which takes a string (default toNone
for no fallback)