Description
- An example record for a dataset pointing to the individual files: http://identifiers.org/rest/metadata/reactome:R-HSA-446203
- Their REST API implementation: https://github.com/identifiers-org/rest
- List of their services endpoints: http://identifiers.org/restws
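A minimal sketch of how such a record could be addressed, assuming the metadata endpoint keeps the `rest/metadata/<prefix>:<id>` shape seen in the example URL above. The helper name `metadata_url` is hypothetical, and nothing is assumed here about the format of the response:

```python
from urllib.parse import quote

# Base taken from the example record URL above; whether the service keeps
# this layout is an assumption.
BASE = "http://identifiers.org/rest/metadata/"


def metadata_url(prefix: str, local_id: str) -> str:
    """Compose the metadata URL for a compact identifier (hypothetical helper)."""
    # Keep ':' literal so the result matches the example URL; other reserved
    # characters, including '/', are percent-encoded.
    return BASE + quote(f"{prefix}:{local_id}", safe=":")


print(metadata_url("reactome", "R-HSA-446203"))
```

Fetching and parsing the record is left out on purpose, since the structure of the returned metadata is not described here.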
Cons
- At the moment I do not see anything neuroimaging-related (according to their search), but the situation might change, especially if we register our `datalad:` prefix ("Register" DataLad with identifiers.org? datalad-registry#67)
Analysis/possible difficulties
- I do not yet see how to discover individual IDs/datasets for a particular prefix (I sent a question via their web interface; the answer was: not at the moment, but it sounded like an interesting feature to them, so it might come at some point)
- Not all prefixes relate to "datasets", but some are known as "(data) collections": https://www.ebi.ac.uk/miriam/main/collections/
- I do not think there is any versioning, but most probably it is assumed that an identifier points to an immutable dataset
- There will be a lot of datasets, so we would need some sensible structure/hierarchy. The first level would be the identifier. Then we could partition even further by splitting IDs on `/` and `-`.
- There seems to be no "filename" information provided, so we would have choices:
  - like the default git-annex behavior: just use the entire URL to compose a unique filename
  - take one from the URL (often from the Content-Disposition header field), but that might lead to conflicts since we would allow only a flat structure:
    - we could preanalyze the entire list of those first and see if conflicts arise. If there are conflicts, try to somehow deduce a disambiguating structure. But that is unreliable in case a dataset record changes, gaining more files etc.
    - just add a numeric index in addition, either arbitrary or based on some metadata?
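
The layout and filename options above can be sketched roughly as follows. This is only an illustration under stated assumptions: all function names are hypothetical, the first directory level is assumed to be the prefix, the "entire URL" variant is only loosely modeled on git-annex and does not reproduce its actual escaping, and the `example.com` URLs in the usage note are placeholders.

```python
import re
from collections import Counter
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import quote, urlsplit


def id_to_dir(prefix, local_id):
    """Prefix as the first level, then the ID split on '/' and '-'."""
    parts = [p for p in re.split(r"[/-]", local_id) if p]
    return PurePosixPath(prefix, *parts)


def name_from_url(url):
    """Option 1: flatten the entire URL into a single unique file name."""
    return quote(url, safe="")


def name_from_headers(url, content_disposition=None):
    """Option 2: Content-Disposition filename, else the last URL path component."""
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        fname = msg.get_filename()
        if fname:
            # Drop any directory part so the structure stays flat.
            return PurePosixPath(fname).name
    # Fall back to the flattened URL if the path has no final component.
    return PurePosixPath(urlsplit(url).path).name or name_from_url(url)


def disambiguate(names):
    """Append a numeric index to every name that occurs more than once."""
    totals = Counter(names)
    seen = Counter()
    result = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            stem, dot, ext = name.partition(".")
            result.append(f"{stem}_{seen[name]}{dot}{ext}")
        else:
            result.append(name)
    return result
```

For example, `id_to_dir("reactome", "R-HSA-446203")` gives `reactome/R/HSA/446203`, and `disambiguate(["data.csv", "readme.txt", "data.csv"])` gives `["data_1.csv", "readme.txt", "data_2.csv"]`. The index depends on the order of the file list, so it is exactly as fragile as noted above if a dataset record later gains files.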