align names from GBIF download occurrence downloads used in Chesshire, P.R., Fischer, E.E., Dowdy, N.J., Griswold, T.L., Hughes, A.C., Orr, M.C., Ascher, J.S., Guzman, L.M., Hung, K.-L.J., Cobb, N.S. and McCabe, L.M. (2023), Completeness analysis for over 3000 United States bee species identifies persistent data gap. Ecography e06584. https://doi.org/10.1111/ecog.06584 #16

Open
jhpoelen opened this issue Mar 30, 2023 · 3 comments

@jhpoelen

jhpoelen commented Mar 30, 2023

from Chesshire, P.R., Fischer, E.E., Dowdy, N.J., Griswold, T.L., Hughes, A.C., Orr, M.C., Ascher, J.S., Guzman, L.M., Hung, K.-L.J., Cobb, N.S. and McCabe, L.M. (2023), Completeness analysis for over 3000 United States bee species identifies persistent data gap. Ecography e06584. https://doi.org/10.1111/ecog.06584

via https://figshare.com/projects/Completeness_analyses_for_over_3000_United_States_bee_species_identifies_persistent_data_gaps/138673

GBIF.org (3 February 2021) GBIF Occurrence Download https://doi.org/10.15468/dl.6cxfsw

GBIF.org (3 February 2021) GBIF Occurrence Download https://doi.org/10.15468/dl.b9rfa7

GBIF.org (3 February 2021) GBIF Occurrence Download https://doi.org/10.15468/dl.w2nndm

jhpoelen changed the title on Mar 30, 2023

from: align names from GBIF download occurrence downloads from Chesshire, P.R., Fischer, E.E., Dowdy, N.J., Griswold, T.L., Hughes, A.C., Orr, M.C., Ascher, J.S., Guzman, L.M., Hung, K.-L.J., Cobb, N.S. and McCabe, L.M. (2023), Completeness analysis for over 3000 United States bee species identifies persistent data gap. Ecography e06584. https://doi.org/10.1111/ecog.06584

to: align names from GBIF download occurrence downloads used in Chesshire, P.R., Fischer, E.E., Dowdy, N.J., Griswold, T.L., Hughes, A.C., Orr, M.C., Ascher, J.S., Guzman, L.M., Hung, K.-L.J., Cobb, N.S. and McCabe, L.M. (2023), Completeness analysis for over 3000 United States bee species identifies persistent data gap. Ecography e06584. https://doi.org/10.1111/ecog.06584

@jhpoelen

see related name alignment workflow configuration at:

https://github.com/jhpoelen/name-alignment-Chesshire-2023 (currently running)

@jhpoelen

jhpoelen commented Mar 30, 2023

@seltmann @jtmiller28 I made a name alignment configuration for three GBIF download DOIs at https://github.com/jhpoelen/name-alignment-Chesshire-2023. I almost forgot that Preston can resolve these DOIs to their associated data, so you can plug the download DOIs directly into the name alignment workflow using:

    - url: https://doi.org/10.15468/dl.b9rfa7
      enabled: true
      type: application/dwca
    - url: https://doi.org/10.15468/dl.6cxfsw
      enabled: true
      type: application/dwca
    - url: https://doi.org/10.15468/dl.w2nndm
      enabled: true
      type: application/dwca
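
If more downloads need to be added later, entries of this shape can be generated rather than typed by hand. The following is a minimal sketch: `render_entries` is a hypothetical helper, not part of the workflow, and it only renders the list items shown above. The rest of the configuration file is not shown in this thread, so nothing about it is assumed here.

```python
# Hypothetical helper: renders list entries in the same shape as the
# snippet above. It does not read or write the workflow's config file.
DOWNLOAD_DOIS = [
    "10.15468/dl.b9rfa7",
    "10.15468/dl.6cxfsw",
    "10.15468/dl.w2nndm",
]


def render_entries(dois, content_type="application/dwca", indent=4):
    """Return one url/enabled/type entry per DOI as indented YAML text."""
    pad = " " * indent
    lines = []
    for doi in dois:
        lines.append(f"{pad}- url: https://doi.org/{doi}")
        lines.append(f"{pad}  enabled: true")
        lines.append(f"{pad}  type: {content_type}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_entries(DOWNLOAD_DOIS))
```

The indentation default of four spaces matches the snippet as quoted here; it may need adjusting to wherever the list sits in the actual file.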

Fingers crossed that the workflow completes before GitHub Actions cuts it off . . .

@jhpoelen

jhpoelen commented Apr 3, 2023

Note that all three DOIs are marked for deletion as documented by associated GBIF metadata records:

<https://api.gbif.org/v1/occurrence/download/0182006-200613084148143> <http://purl.org/pav/hasVersion> <hash://sha256/1c5d8a7399793a634a0dde32f3a94ccf64199f010d7f93baa422c2e1dbb98b2f> <urn:uuid:cc39f686-2a6a-459b-b73f-beae32361598> .
<https://api.gbif.org/v1/occurrence/download/0182032-200613084148143> <http://purl.org/pav/hasVersion> <hash://sha256/23d7c875420bea71d24c1ec3ba127f91eff5b368744de14824de0fc4fc090bb2> <urn:uuid:073e2d93-1704-4157-a61a-a05170a115d5> .
<https://api.gbif.org/v1/occurrence/download/0182076-200613084148143> <http://purl.org/pav/hasVersion> <hash://sha256/6555d581e0ce75c77740811e547da726297d02369b149893faf531f132a2aff0> <urn:uuid:e2b784df-c64d-416e-ba86-f24b07135871> .

obtained on 2023-03-31.
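
To pair each download key with the content hash of its metadata record, the statements above can be split into their four terms. This is a rough sketch using a regular expression that only handles lines made of four `<...>` terms followed by a period, as in the three statements quoted here. It is not a general RDF parser, and the field names in the returned dictionary are my own labels.

```python
import re

# Matches exactly four angle-bracketed terms followed by a terminating period.
STATEMENT = re.compile(
    r"<(?P<subject>[^>]+)>\s+<(?P<predicate>[^>]+)>\s+"
    r"<(?P<object>[^>]+)>\s+<(?P<graph>[^>]+)>\s+\."
)


def parse_version_statement(line):
    """Split one statement into download key, content hash, and graph label."""
    match = STATEMENT.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a four-term statement: {line!r}")
    parts = match.groupdict()
    return {
        # Last path segment of the GBIF API URL is the download key.
        "download_key": parts["subject"].rsplit("/", 1)[-1],
        "predicate": parts["predicate"],
        "content_id": parts["object"],
        "graph_label": parts["graph"],
    }


line = (
    "<https://api.gbif.org/v1/occurrence/download/0182006-200613084148143> "
    "<http://purl.org/pav/hasVersion> "
    "<hash://sha256/1c5d8a7399793a634a0dde32f3a94ccf64199f010d7f93baa422c2e1dbb98b2f> "
    "<urn:uuid:cc39f686-2a6a-459b-b73f-beae32361598> ."
)

if __name__ == "__main__":
    print(parse_version_statement(line))
```

The `download_key` values line up with the `key` fields of the metadata records that follow.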

{
  "key": "0182006-200613084148143",
  "doi": "10.15468/dl.6cxfsw",
  "license": "http://creativecommons.org/licenses/by-nc/4.0/legalcode",
  "request": {
    "predicate": {
      "type": "and",
      "predicates": [
        {
          "type": "or",
          "predicates": [
            {
              "type": "equals",
              "key": "BASIS_OF_RECORD",
              "value": "PRESERVED_SPECIMEN",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "BASIS_OF_RECORD",
              "value": "UNKNOWN",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "BASIS_OF_RECORD",
              "value": "HUMAN_OBSERVATION",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "BASIS_OF_RECORD",
              "value": "MATERIAL_SAMPLE",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "BASIS_OF_RECORD",
              "value": "MACHINE_OBSERVATION",
              "matchCase": false
            }
          ]
        },
        {
          "type": "or",
          "predicates": [
            {
              "type": "equals",
              "key": "COUNTRY",
              "value": "US",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "COUNTRY",
              "value": "CA",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "COUNTRY",
              "value": "MX",
              "matchCase": false
            }
          ]
        },
        {
          "type": "or",
          "predicates": [
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "4345",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "4334",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7905",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7901",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7908",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7911",
              "matchCase": false
            }
          ]
        }
      ]
    },
    "sendNotification": true,
    "format": "DWCA",
    "type": "OCCURRENCE",
    "verbatimExtensions": []
  },
  "created": "2021-02-03T17:50:18.533+00:00",
  "modified": "2021-02-03T18:00:50.416+00:00",
  "eraseAfter": "2021-08-03T17:50:18.453+00:00",
  "status": "SUCCEEDED",
  "downloadLink": "https://api.gbif.org/v1/occurrence/download/request/0182006-200613084148143.zip",
  "size": 600597802,
  "totalRecords": 2472496,
  "numberDatasets": 196
}
{
  "key": "0182032-200613084148143",
  "doi": "10.15468/dl.b9rfa7",
  "license": "http://creativecommons.org/licenses/by/4.0/legalcode",
  "request": {
    "predicate": {
      "type": "and",
      "predicates": [
        {
          "type": "or",
          "predicates": [
            {
              "type": "equals",
              "key": "COUNTRY",
              "value": "US",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "COUNTRY",
              "value": "MX",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "COUNTRY",
              "value": "CA",
              "matchCase": false
            }
          ]
        },
        {
          "type": "equals",
          "key": "DATASET_KEY",
          "value": "e4d3fc77-1d94-495b-96ff-3fe8b8f7a3bd",
          "matchCase": false
        },
        {
          "type": "or",
          "predicates": [
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "4334",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7911",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "4345",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7908",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7905",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7901",
              "matchCase": false
            }
          ]
        }
      ]
    },
    "sendNotification": true,
    "format": "DWCA",
    "type": "OCCURRENCE",
    "verbatimExtensions": []
  },
  "created": "2021-02-03T18:21:59.548+00:00",
  "modified": "2021-02-03T18:32:45.439+00:00",
  "eraseAfter": "2021-08-03T18:21:59.474+00:00",
  "status": "SUCCEEDED",
  "downloadLink": "https://api.gbif.org/v1/occurrence/download/request/0182032-200613084148143.zip",
  "size": 47693201,
  "totalRecords": 178715,
  "numberDatasets": 1
}
{
  "key": "0182076-200613084148143",
  "doi": "10.15468/dl.w2nndm",
  "license": "http://creativecommons.org/licenses/by-nc/4.0/legalcode",
  "request": {
    "predicate": {
      "type": "and",
      "predicates": [
        {
          "type": "equals",
          "key": "DATASET_KEY",
          "value": "e05f6e7d-418e-4407-8e0f-7b8ccf21109e",
          "matchCase": false
        },
        {
          "type": "or",
          "predicates": [
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "4334",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "4345",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7911",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7908",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7901",
              "matchCase": false
            },
            {
              "type": "equals",
              "key": "TAXON_KEY",
              "value": "7905",
              "matchCase": false
            }
          ]
        }
      ]
    },
    "sendNotification": true,
    "format": "DWCA",
    "type": "OCCURRENCE",
    "verbatimExtensions": []
  },
  "created": "2021-02-03T19:18:46.687+00:00",
  "modified": "2021-02-03T19:20:03.899+00:00",
  "eraseAfter": "2021-08-03T19:18:46.611+00:00",
  "status": "SUCCEEDED",
  "downloadLink": "https://api.gbif.org/v1/occurrence/download/request/0182076-200613084148143.zip",
  "size": 2624689,
  "totalRecords": 11654,
  "numberDatasets": 1
}
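
For anyone who wants to check these records programmatically, here is a small sketch that tests whether a record is past its `eraseAfter` date and collects the values requested for a given predicate key. The function names are hypothetical, and `record` below is a trimmed copy of the third record (only the fields the sketch reads), written as a Python dict.

```python
from datetime import datetime, timezone


def collect_values(predicate, key):
    """Walk a nested and/or predicate tree and return sorted values for `key`."""
    found = set()
    if predicate.get("type") == "equals" and predicate.get("key") == key:
        found.add(predicate["value"])
    for child in predicate.get("predicates", []):
        found.update(collect_values(child, key))
    return sorted(found)


def past_erase_date(record, as_of):
    """True if `as_of` is later than the record's eraseAfter timestamp."""
    return as_of > datetime.fromisoformat(record["eraseAfter"])


# Trimmed copy of the record for 10.15468/dl.w2nndm (fields used below only).
record = {
    "key": "0182076-200613084148143",
    "doi": "10.15468/dl.w2nndm",
    "eraseAfter": "2021-08-03T19:18:46.611+00:00",
    "request": {"predicate": {"type": "and", "predicates": [
        {"type": "equals", "key": "DATASET_KEY",
         "value": "e05f6e7d-418e-4407-8e0f-7b8ccf21109e"},
        {"type": "or", "predicates": [
            {"type": "equals", "key": "TAXON_KEY", "value": value}
            for value in ["4334", "4345", "7911", "7908", "7901", "7905"]
        ]},
    ]}},
}

# Date the metadata records were obtained, per the note above.
obtained = datetime(2023, 3, 31, tzinfo=timezone.utc)

if __name__ == "__main__":
    print(record["doi"], "past eraseAfter:", past_erase_date(record, obtained))
    print("taxon keys:", collect_values(record["request"]["predicate"], "TAXON_KEY"))
```

Run against all three records, the same walk shows that each download requests the same six `TAXON_KEY` values (4334, 4345, 7901, 7905, 7908, 7911), just listed in a different order, and that all three `eraseAfter` dates fall in August 2021.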
