This repository provides a dockerised executor of the sema.subyt uplifting for the analysis results of emobon.
The packaged artefacts from this work are available at https://github.com/orgs/emo-bon/packages, or they can be built locally by following the steps in the developer docs.
To use this, one only needs to:
- decide which image to run
  - either a published release package
  - or a local build
- set the I/O for the process
  - mainly the ro-crate folder to work on (mapped as the docker volume /rocrateroot)
  - essential environment variables to pass
- pass all of the above in a call to docker run
In detail:
$ version="latest" # or pick an available release tag from https://github.com/orgs/emo-bon/packages
# (optionally) verify availability by manual pull
$ docker pull ghcr.io/emo-bon/emobon_arup:${version} # should pull the image without errors
# variable setting to inject
### Path to the root directory where analysis results are stored
$ rocrateroot="../path_to_analysis_results_repo/crate_results_folder_X"
### Base domain URL for the data repository
$ DOMAIN='https://data.emobon.embrc.eu'
### Name of the repository where the analysis results are stored
$ REPO_NAME='analysis-results-cluster01-crate'
### Unique identifier for the sample during analysis, assigned by emo-bon
$ REF_CODE='EMOBON00172'
### Accession number for the sample in the European Nucleotide Archive (ENA)
$ ENA_NR='test_ENAnummer'
### Identifier for the Observatory
$ OBS_ID='VB'
### Environmental package ID, categorizing the type of environment sampled: Wa (water), Se (sediment)
$ ENVPACKAGE_ID='Wa'
### Identifier for the material sample used in the analysis
$ SOURCE_MAT_ID='test_source_mat_id'
# actually run it
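$ docker run --rm --name "emo-bon_arup" \
    --volume "${rocrateroot}":/rocrateroot \
    --env DOMAIN="${DOMAIN}" \
    --env REPO_NAME="${REPO_NAME}" \
    --env REF_CODE="${REF_CODE}" \
    --env ENA_NR="${ENA_NR}" \
    --env OBS_ID="${OBS_ID}" \
    --env ENVPACKAGE_ID="${ENVPACKAGE_ID}" \
    --env SOURCE_MAT_ID="${SOURCE_MAT_ID}" \
    ghcr.io/emo-bon/emobon_arup:${version}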
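Since the run depends on several variables being set, a small pre-flight check can catch a forgotten one before the container starts. The following is a minimal sketch, not part of this project: the helper name check_arup_env is hypothetical, and the list of variables is simply the one used in the example above (the authoritative list is whatever the work.yml instructions refer to).

```shell
# Hypothetical helper (not shipped with this project): fail early if any of the
# variables from the example above is unset or empty in the current shell.
check_arup_env() {
  missing=0
  for name in DOMAIN REPO_NAME REF_CODE ENA_NR OBS_ID ENVPACKAGE_ID SOURCE_MAT_ID; do
    # Indirect lookup in POSIX sh: read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Exit status 0 when everything is set, 1 otherwise.
  return "$missing"
}
```

It could be chained in front of the actual call, as in check_arup_env && docker run ..., so that the container is only started when every variable has a value.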
Taking this route assumes you have this project checked out with git, and have built it locally. You might want to check the "developer info" section below for how to do that.
# (optionally) verify if you have the local image available
$ docker images |grep emobon_arup # should return a matching image
# variable setting to inject
$ rocrateroot="../path_to_analysis_results_repo/crate_results_folder_X"
$ source_mat_id="YourRefHere"
# actually run it
$ docker run --rm --name "emo-bon_arup" --volume ${rocrateroot}:/rocrateroot --env SOURCE_MAT_ID=${source_mat_id} emobon_arup:latest
The way to interact with this process is by
- providing actual content to work on by mapping some folder to the docker VOLUME /rocrateroot
- providing specific environment variables that can be picked up during execution.
These are explained in more detail below.
In addition to the above, the process is driven by extra files that are built into the docker image under the /arup folder: in particular /arup/work.yml, which contains the central instructions to be executed, and the templates in the /arup/templates/ folder.
The sources for these are in the root of this project.
All reading and writing of actual files happens in the folder that is mapped to the /rocrateroot volume.
It is expected to be the top of the folder that makes up the ro-crate of one so-called analysis-result, that is, the shared output of one MetaGoFlow execution.
In this folder the process looks for and uses the files listed as input in the work.yml instructions. In return it produces the files listed as output in the same folder.
See the work.yml instructions to learn which actual in and out paths, relative to /rocrateroot, are being referred to.
The content and values of the instructions in the work.yml file can be tuned through environment variables. With the YAML tag !resolve, string values can be made to replace occurrences of { env_variable_name } with the value of the environment variable of that name.
See the work.yml instructions to learn which { env_variable_name } placeholders are actually used.
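To find out quickly which variables a given work.yml expects, the placeholders can be listed with standard tools. This is a minimal sketch under stated assumptions: the function name list_placeholders is hypothetical, it treats the file as plain text rather than parsing YAML, and it assumes that placeholders look like { NAME } with optional spaces inside the braces.

```shell
# Hypothetical helper: print the distinct { NAME } placeholders found in a file.
# Plain-text match only; it does not check that the value carries the !resolve tag.
list_placeholders() {
  grep -o '{ *[A-Za-z_][A-Za-z0-9_]* *}' "$1" | tr -d '{} ' | sort -u
}
```

Run from a checkout of this project as list_placeholders work.yml, it prints one variable name per line, which can then be compared with the --env options passed to docker run.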
Additionally, the environment variable ARUP_WORK allows you to specify a path to a custom work.yml file of your own. This path should be absolute (expressed in docker-image space) or relative to /rocrateroot. This feature is mainly there to allow for testing our own build process; use with caution.
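The path rule for ARUP_WORK can be illustrated as follows. This sketch only restates the rule described above in shell form; it is not the code the image actually runs, and the function name arup_work_in_container is hypothetical.

```shell
# Illustration of the stated rule only (hypothetical helper, not the tool's code):
# map an ARUP_WORK value to the path it denotes inside the container.
arup_work_in_container() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;               # absolute: taken as-is in docker-image space
    *)  printf '%s\n' "/rocrateroot/$1" ;;  # relative: resolved against /rocrateroot
  esac
}
```

For example, a custom file kept inside the mapped crate folder at custom/work.yml (a hypothetical location) would be selected by adding --env ARUP_WORK=custom/work.yml to the docker run call, and would be read from /rocrateroot/custom/work.yml inside the container.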
To build your own local image, or to get involved in furthering this work, see the Contributors Guide.