Darwin Core data

Peter Desmet edited this page Aug 29, 2018 · 7 revisions

The Darwin Core data are the standardized data files generated by your mapping script. They represent the core and extension files of a Darwin Core Archive. Basically, they are the tasty dinner you just cooked with the recipe.

Where is it located?

The Darwin Core data can be found in the data/processed directory. Each file is named after the core or extension it represents. You can upload these files to the Integrated Publishing Toolkit (IPT) for data publication.

THESE FILES SHOULD NEVER BE EDITED MANUALLY. If you want to update them, adapt your mapping script instead, so that the workflow from source data to Darwin Core data remains reproducible.

Darwin Core Archive

A Darwin Core Archive for a checklist dataset can contain the following files:

| File | Status | Description |
| --- | --- | --- |
| taxon.csv | Required | Core data file for Darwin Core Taxon: each row is a taxon. |
| distribution.csv | Optional | Extension data file for Species Distribution: each row is a geographic distribution of a taxon. |
| description.csv | Optional | Extension data file for Taxon Description: each row is a text-based description of a taxon. |
| references.csv | Optional | Extension data file for Literature References: each row is a literature reference for a taxon. |
| speciesprofile.csv | Optional | Extension data file for Species Profile: each row contains a number of characteristics for a taxon. |
| vernacularname.csv | Optional | Extension data file for Vernacular Names: each row is a vernacular name for a taxon. |
| meta.xml | Required | Metadata file defining the structure and relationships between the core and any extensions. Can be automatically produced by the IPT. |
| eml.xml | Required | Metadata file describing the dataset. Can be automatically produced by the IPT. |
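
As a quick sanity check before uploading, the data files in the table above could be compared against what is actually present in data/processed. The following is a minimal sketch in Python, not part of the recipe itself: the function name `check_processed_dir` is hypothetical, and meta.xml and eml.xml are left out on the assumption that they are produced by the IPT rather than by the mapping script.

```python
from pathlib import Path

# Data file names and statuses, copied from the table above.
# meta.xml and eml.xml are omitted: this sketch assumes the IPT creates them.
DATA_FILES = {
    "taxon.csv": "Required",
    "distribution.csv": "Optional",
    "description.csv": "Optional",
    "references.csv": "Optional",
    "speciesprofile.csv": "Optional",
    "vernacularname.csv": "Optional",
}


def check_processed_dir(directory):
    """Return (present, missing_required) lists of file names.

    `present` holds every known data file found in `directory`;
    `missing_required` holds the required ones that were not found.
    """
    directory = Path(directory)
    present = [name for name in DATA_FILES if (directory / name).is_file()]
    missing_required = [
        name
        for name, status in DATA_FILES.items()
        if status == "Required" and name not in present
    ]
    return present, missing_required


if __name__ == "__main__":
    present, missing = check_processed_dir("data/processed")
    print("Found:", ", ".join(present) or "nothing")
    if missing:
        print("Missing required files:", ", ".join(missing))
```

The check only looks at file names. It does not validate the columns or content of the files.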

For detailed information on how to publish a Darwin Core Archive checklist, see the GBIF Best Practices in Publishing Species Checklists.

Creating Darwin Core checklist data

The checklist recipe is set up to create the files taxon.csv (see Taxon core) and distribution.csv (see Distribution extension). See Publishing data for how to upload those to the IPT (where you'll also create meta.xml and eml.xml).

For inspiration on how to create the other extension files listed above, see the examples. More GBIF-supported extensions are documented at http://rs.gbif.org/extension/gbif/1.0/.

Creating Darwin Core occurrence / sampling-event data

Although the checklist recipe was developed for checklist data (hence the name), its setup and structure can be used for occurrence and sampling-event data as well. We did not write documentation for that, but here's the gist of it:

Occurrence data

  1. Replace the Source data with your occurrence data file.
  2. Adapt the Mapping script to map to Darwin Core Occurrence and any of the extensions you want to use (e.g. Darwin Core Measurement or Facts).
  3. Write the data to data/processed/occurrence.csv (+ extension files).
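
The three steps above can be sketched as follows. This is an illustration in Python under stated assumptions, not the recipe's own mapping script: the source column names (`id`, `date`, `species`, `lat`, `lon`), the fixed `basisOfRecord` value and the function names are all hypothetical. Only the output terms are actual Darwin Core Occurrence terms.

```python
import csv
from pathlib import Path

# A small subset of Darwin Core terms, used as output columns.
OCCURRENCE_TERMS = [
    "occurrenceID",
    "basisOfRecord",
    "eventDate",
    "scientificName",
    "decimalLatitude",
    "decimalLongitude",
]


def map_occurrence(source_row):
    """Map one source record (a dict) to Darwin Core Occurrence terms.

    The source keys used here are hypothetical; adapt them to your own file.
    """
    return {
        "occurrenceID": source_row["id"],
        "basisOfRecord": "HumanObservation",  # assumption: all records are observations
        "eventDate": source_row["date"],
        "scientificName": source_row["species"],
        "decimalLatitude": source_row["lat"],
        "decimalLongitude": source_row["lon"],
    }


def write_occurrences(source_path, output_path="data/processed/occurrence.csv"):
    """Read the source CSV, map every row and write occurrence.csv.

    Returns the number of occurrence rows written.
    """
    output_path = Path(output_path)
    output_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(source_path, newline="", encoding="utf-8") as src, \
            open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=OCCURRENCE_TERMS)
        writer.writeheader()
        for row in csv.DictReader(src):
            writer.writerow(map_occurrence(row))
            count += 1
    return count
```

Calling `write_occurrences` with the path to your own source file regenerates occurrence.csv from scratch on every run, which keeps the workflow reproducible.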

Sampling-event data

  1. Replace the Source data with your sampling-event data file.
  2. Adapt the Mapping script to map to Darwin Core Event and any of the extensions you want to use (e.g. Darwin Core Occurrence and Darwin Core Measurement or Facts).
  3. Write the data to data/processed/event.csv (+ extension files).
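
For sampling-event data, the main difference is that one source file is typically split into an event core and one or more extensions linked by `eventID`. The sketch below illustrates that split in Python under stated assumptions: the source keys (`sample_id`, `date`, `protocol`, `species`, `count`), the way `occurrenceID` is built and the function names are hypothetical. The output columns are actual Darwin Core terms.

```python
import csv
from pathlib import Path

EVENT_TERMS = ["eventID", "eventDate", "samplingProtocol"]
OCCURRENCE_TERMS = [
    "occurrenceID",
    "eventID",
    "basisOfRecord",
    "scientificName",
    "individualCount",
]


def split_sampling_events(source_rows):
    """Split source records into event core rows and occurrence extension rows.

    Each distinct sample_id becomes one event; every source record becomes
    one occurrence that points back to its event through eventID.
    """
    events = {}  # keyed by eventID, so each event is kept only once
    occurrences = []
    for index, row in enumerate(source_rows, start=1):
        event_id = row["sample_id"]
        events.setdefault(
            event_id,
            {
                "eventID": event_id,
                "eventDate": row["date"],
                "samplingProtocol": row["protocol"],
            },
        )
        occurrences.append(
            {
                "occurrenceID": f"{event_id}:{index}",  # hypothetical ID scheme
                "eventID": event_id,
                "basisOfRecord": "HumanObservation",  # assumption
                "scientificName": row["species"],
                "individualCount": row["count"],
            }
        )
    return list(events.values()), occurrences


def write_csv(rows, fieldnames, path):
    """Write a list of dicts to a CSV file, creating the directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

The two returned lists could then be written with `write_csv(events, EVENT_TERMS, "data/processed/event.csv")` and, assuming the extension file follows the same naming pattern, `write_csv(occurrences, OCCURRENCE_TERMS, "data/processed/occurrence.csv")`.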