Commit d4eb322

wow-such-code, sven1103, luiskuhn and KochTobi authored
Release/1.7.0 (#78)
* Adds completed version of Omero imaging data registration script:
  * General metadata is stored in openBIS (imaging experiments and samples)
  * Data and additional metadata key-value-pairs are stored in OMERO
* Add readme for Omero etl script
* Added some comments to code
* Minor code cleanup

Co-authored-by: Sven F <[email protected]>
Co-authored-by: luiskuhn <[email protected]>
Co-authored-by: luiskuhn <[email protected]>
Co-authored-by: Tobias Koch <[email protected]>
1 parent 67552ae commit d4eb322

7 files changed: +404 -93 lines changed

CHANGELOG.md

+5
@@ -1,5 +1,10 @@
 # Changelog

+## 1.7.0 2021-03-19
+
+* Provides fully tested functionality to register generic imaging data, with OMERO server support (v5.4.10). [Link to PR](https://github.com/qbicsoftware/etl-scripts/pull/78)
+* Uses an omero-importer-cli (with Bio-formats) for image file registration into an OMERO server instance
+* Uses an initial version of the openBIS-OMERO metadata model

 ## 1.6.0 2021-01-22

README.md

+54
@@ -44,6 +44,7 @@ Formats:
 - [NGS single-end / paired-end data with metadata (deprecated)](#ngs-single-end--paired-end-data-with-metadata)
 - [Attachment Data](#attachment-data)
 - [Mass Spectrometry mzML conversion and registration](#mass-spectrometry-mzml-conversion-and-registration)
+- [Imaging data with an OMERO server instance](#imaging-data-with-an-omero-server-instance)

 ### NGS single-end / paired-end data

@@ -279,3 +280,56 @@ Q_MS_LCMS_METHODS - openBIS code from the vocabulary of LCMS methods
 technical_replicate - free text to denote replicates

 workflow_type - DDA or DIA
+
+
+### Imaging data with an OMERO server instance
+
+**Responsible dropbox:**
+[QBiC-register-omero-metadata](drop-boxes/register-omero-metadata)
+
+**Resulting data model in openBIS**
+For each tissue sample multiple images (the data files) can be created, so multiple Q_BMI_GENERIC_IMAGING_RUN samples are created and attached to that tissue sample
+...Q_BIOLOGICAL_SAMPLE -> one Q_BMI_GENERIC_IMAGING_RUN per data file
+
+**Expected data structure**
+In every use case, the data structure needs to contain a top folder around the respective data in order to accommodate metadata files.
+
+The sample code found in the top folder is of type `Q_BIOLOGICAL_SAMPLE` (tissue imaging).
+
+**Valid file types**:
+Valid files in the folder are any imaging files that can be handled by the OMERO server
+
+**Incoming structure overview:**
+```
+QABCD002A8
+|-- QABCD002A8
+|   |-- Est-B1a.lif
+|   |-- Image_1.czi
+|   |-- Image_2.czi
+|   |-- Image7246.tif
+|   |-- metadata_3.tsv
+|   |-- rubisco_avg.mrc
+|   `-- tomogram_x.mrc
+|-- QABCD002A8.sha256sum
+`-- source_dropbox.txt
+```
+
+The metadata file, ending in `.tsv` has tab-separated columns:
+```
+IMAGE_FILE_NAME	IMAGING_MODALITY	IMAGED_TISSUE	INSTRUMENT_MANUFACTURER	INSTRUMENT_USER	IMAGING_DATE
+tomogram_x.mrc	NCIT_C18113	cell	FEI	Dr. Horrible	01.03.2021
+rubisco_avg.mrc	NCIT_C18113	cell	FEI	Max Mustermann	01.04.2021
+Image7246.tif	NCIT_C18216	leaf	Zeiss	Max Mustermann	23.02.2021
+Est-B1a.lif	NCIT_C17753	root	Zeiss	Max Mustermann	01.02.2021
+Image_1.czi	NCIT_C17753	leaf	Zeiss	Max Mustermann	11.02.2021
+Image_2.czi	NCIT_C17753	leaf	Zeiss	Max Mustermann	01.02.2021
+```
+
+column name | description
+--------------|----------------
+`IMAGE_FILE_NAME`| one of the file names found in the incoming folder per line
+`IMAGING_MODALITY`| Ontology Identifier for the imaging modality, currently from the [NCI Thesaurus](https://ncit.nci.nih.gov/ncitbrowser/pages/home.jsf?version=21.02d). **Examples:** NCIT_C18113 (Cryo-Electron Microscopy), NCIT_C18216 (Transmission Electron Microscopy), NCIT_C17753 (Confocal Microscopy)
+`IMAGED_TISSUE` | the imaged tissue
+`INSTRUMENT_MANUFACTURER` | the imaging instrument manufacturer
+`INSTRUMENT_USER` | the person who measured the data file using the imaging instrument
+`IMAGING_DATE` | the date of the measurement in dd.mm.yyyy format (days and months with leading zeroes)
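
Review note: the README fixes the column set and the `dd.mm.yyyy` date format, so a file could be checked before it is dropped into the incoming folder. The following is a minimal sketch of such a check, not part of the commit; the function name `validate_metadata_tsv` and the wording of the messages are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Column names as listed in the README table of this commit.
REQUIRED_COLUMNS = [
    "IMAGE_FILE_NAME", "IMAGING_MODALITY", "IMAGED_TISSUE",
    "INSTRUMENT_MANUFACTURER", "INSTRUMENT_USER", "IMAGING_DATE",
]


def validate_metadata_tsv(tsv_text):
    """Return a list of problems; an empty list means the text passed the checks.

    Hypothetical helper, not a function of the ETL scripts.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        date = row["IMAGING_DATE"] or ""
        try:
            datetime.strptime(date, "%d.%m.%Y")
            # strptime also accepts "1.3.2021", so enforce the leading zeroes by length
            well_formed = len(date) == 10
        except ValueError:
            well_formed = False
        if not well_formed:
            problems.append("line %d: IMAGING_DATE %r is not dd.mm.yyyy" % (line_no, date))
    return problems
```

Checking the length in addition to `strptime` is what enforces the "leading zeroes" requirement stated in the table.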

drop-boxes/register-attachments-dropbox/register-attachment-dropbox.py

+4 -4
@@ -84,10 +84,10 @@ def process(transaction):
         sa = transaction.getSampleForUpdate(sampleID)
         space = sa.getSpace()
         if not attachmentReady:
-            expID = '/' + space + '/' + project + '/'+ project+'_INFO'
-            exp = transaction.getExperimentForUpdate(expID)
-            if not exp:
-                exp = transaction.createNewExperiment(expID, "Q_PROJECT_DETAILS")
+            infoSampleID = "/"+space+"/"+code
+            sa = transaction.getSampleForUpdate(infoSampleID)
+            if not sa:
+                exp = transaction.createNewExperiment('/' + space + '/' + project + '/'+ project+'_INFO', "Q_PROJECT_DETAILS")
                 sa = transaction.createNewSample('/' + space + '/'+ code, "Q_ATTACHMENT_SAMPLE")
                 sa.setExperiment(exp)
         info = None

drop-boxes/register-omero-metadata/backendinterface.py

+52 -20
@@ -14,7 +14,6 @@

 """

-
 def omero_connect(usr, pwd, host, port):
     """
     Connects to the OMERO Server with the provided username and password.
@@ -182,9 +181,7 @@ def register_image_file_with_dataset_id(file_path, dataset_id, usr, pwd, host, p
     ds_id = dataset_id

     if ds_id != -1:
-
         cmd = "omero-importer -s " + host + " -p " + str(port) + " -u " + usr + " -w " + pwd + " -d " + str(int(ds_id)) + " " + file_path
-
         proc = subprocess.Popen(cmd,
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.PIPE,
@@ -193,17 +190,20 @@ def register_image_file_with_dataset_id(file_path, dataset_id, usr, pwd, host, p

         std_out, std_err = proc.communicate()

-        if int(proc.returncode) == 0:
-
-            fist_line = std_out.splitlines()[0]
-            image_ids = fist_line[6:].split(',')
+        # the terminal output of the omero-importer tool provides a lot of information on the registration process
+        # we are looking for a line with this format: "Image:id_1,1d_2,id_3,...,id_n"
+        # where id_1,...,id_n are a list of ints, which denote the unique OMERO image IDs for the image file
+        # (one file can have many images)

+        if int(proc.returncode) == 0:
+            for line in std_out.splitlines():
+                if line[:6] == "Image:":
+                    image_ids = line[6:].split(',')
+                    break
         else:
-            image_ids = -1
-
+            image_ids = []
     else:
-        image_ids = -1
-
+        image_ids = []
     return image_ids


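
Review note: as far as this hunk shows, `image_ids` is only assigned inside the loop when an `Image:` line is found. If the importer exits with code 0 but prints no such line, the variable may be unbound at `return`, unless it is initialised earlier in the function (not visible here). A minimal sketch of the same scan with an explicit fallback follows; `extract_image_ids` is a hypothetical helper, and unlike the diff it converts the IDs to `int`.

```python
def extract_image_ids(importer_stdout):
    """Return the IDs from the first 'Image:' line, or [] if there is none.

    Hypothetical helper; the line format "Image:id_1,id_2,...,id_n" is taken
    from the comment in the diff, not from the importer documentation.
    """
    prefix = "Image:"
    for line in importer_stdout.splitlines():
        if line.startswith(prefix):
            # int() raises ValueError on a non-numeric entry, which surfaces bad output early
            return [int(part) for part in line[len(prefix):].split(",") if part.strip()]
    return []
```

Returning an empty list in every non-matching case keeps the return type consistent with the `[]` the commit introduces for the failure branches.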

@@ -315,11 +315,18 @@ def get_image_array(conn, image_id):

     return hypercube

-################################
-
 def add_annotations_to_image(conn, image_id, key_value_data):
     """
-    TODO
+    This function is used to add key-value pair annotations to an image
+    Example:
+        key_value_data = [["Drug Name", "Monastrol"], ["Concentration", "5 mg/ml"]]
+        add_annotations_to_image(conn, image_id, key_value_data)
+    Args:
+        conn: Established Connection to the OMERO Server via a BlitzGateway
+        image_id (int): An OMERO image ID
+        key_value_data (list of lists): list of key-value pairs
+    Returns:
+        int: not relevant atm
     """

     import omero
@@ -339,15 +346,19 @@ def add_annotations_to_image(conn, image_id, key_value_data):


 #########################
-##app

 from optparse import OptionParser
+import ConfigParser
+
+config = ConfigParser.RawConfigParser()
+config.read("imaging_config.properties")
+

 ###OMERO server info
-USERNAME = "usr"
-PASSWORD = "pwd"
-HOST = "host"
-PORT = 4064
+USERNAME = config.get('OmeroServerSection', 'omero.username')
+PASSWORD = config.get('OmeroServerSection', 'omero.password')
+HOST = config.get('OmeroServerSection', 'omero.host')
+PORT = int(config.get('OmeroServerSection', 'omero.port'))


 def get_args():
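
Review note: the commit does not include `imaging_config.properties` itself, so the layout below is inferred from the `config.get` calls and should be read as an assumption. The sketch uses the Python 3 module name `configparser` (the diff imports the Python 2 `ConfigParser`); all values are placeholders borrowed from the removed constants.

```python
import configparser  # named ConfigParser in the Python 2 code of the diff

# Assumed file layout; section and option names come from the config.get calls,
# the values are placeholders.
EXAMPLE_PROPERTIES = """
[OmeroServerSection]
omero.username = usr
omero.password = pwd
omero.host = host
omero.port = 4064
"""


def load_omero_settings(properties_text):
    """Parse the connection settings from properties-style text (hypothetical helper)."""
    config = configparser.RawConfigParser()
    config.read_string(properties_text)
    section = "OmeroServerSection"
    return {
        "username": config.get(section, "omero.username"),
        "password": config.get(section, "omero.password"),
        "host": config.get(section, "omero.host"),
        "port": config.getint(section, "omero.port"),  # same effect as int(config.get(...))
    }
```

Note that `config.read(...)` silently skips a file it cannot find, so a missing properties file would only surface at the first `config.get` call.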
@@ -358,6 +369,10 @@ def get_args():
     parser.add_option('-p', '--project', dest='project_id', default="None", help='project id for dataset id retrieval')
     parser.add_option('-s', '--sample', dest='sample_id', default="None", help='sample id for dataset id retrieval')

+    parser.add_option('-i', '--image', dest='image_id', default="None", help='image id for key-value pair annotation')
+    parser.add_option('-a', '--annotation', dest='ann_str', default="None", help='annotation string')
+
+
     (options, args) = parser.parse_args()
     return options


@@ -373,9 +388,26 @@ def get_args():
         id_str = id_str + id_i + " "

     print id_str
-else:
+
+elif args.project_id != "None":

     conn = omero_connect(USERNAME, PASSWORD, HOST, str(PORT))
     ds_id = get_omero_dataset_id(conn, str(args.project_id), str(args.sample_id))

     print ds_id
+
+elif args.image_id != "None":
+
+    conn = omero_connect(USERNAME, PASSWORD, HOST, str(PORT))
+
+    #string format: key1::value1//key2::value2//key3::value3//...
+    key_value_data = []
+    pair_list = args.ann_str.split("//")
+    for pair in pair_list:
+        key_value = pair.split("::")
+        key_value_data.append(key_value)
+
+
+    add_annotations_to_image(conn, str(args.image_id), key_value_data)
+
+    print "0"
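
Review note: the `key1::value1//key2::value2` format is split without validation, so calling the script with `-i` but without `-a` would pass `[["None"]]` (the option default) on to `add_annotations_to_image`. A minimal sketch of a stricter parser follows; `parse_annotation_string` is a hypothetical name, and rejecting malformed pairs with `ValueError` is a design assumption rather than the commit's behaviour.

```python
def parse_annotation_string(ann_str):
    """Split 'key1::value1//key2::value2' into [[key1, value1], [key2, value2]].

    Hypothetical helper; raises ValueError for a pair without a key or '::'.
    """
    pairs = []
    for pair in ann_str.split("//"):
        key, separator, value = pair.partition("::")
        if not separator or not key:
            raise ValueError("malformed key-value pair: %r" % pair)
        pairs.append([key, value])
    return pairs
```

The format itself still cannot carry a value that contains `//` (a URL, for example); that limitation is inherent to the separator choice and is not addressed by this sketch.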

drop-boxes/register-omero-metadata/image_registration_process.py

+76 -13
@@ -7,30 +7,47 @@
 from subprocess import Popen, PIPE

 barcode_pattern = re.compile('Q[a-zA-Z0-9]{4}[0-9]{3}[A-Z][a-zA-Z0-9]')
+conda_home_path = "/home/qeana10/miniconda2/"
+omero_lib_path = "/home/qeana10/openbis/servers/core-plugins/QBIC/1/dss/drop-boxes/register-omero-metadata/OMERO.py-5.4.10-ice36-b105"
+etl_home_path = "/home/qeana10/openbis/servers/core-plugins/QBIC/1/dss/drop-boxes/register-omero-metadata/"
+

 class ImageRegistrationProcess:

-    def __init__(self, transaction, env_name="omero_env_0", project_code="", sample_code=""):
+    def __init__(self, transaction, env_name="omero_env_0", project_code="", sample_code="", conda_path=None, omero_path=None, etl_path=None):

         self._transaction = transaction
         self._incoming_file_name = transaction.getIncoming().getName()
+        self._search_service = transaction.getSearchService()

         self._project_code = project_code
         self._sample_code = sample_code

+        ### set exec. env
+        self._conda_path = conda_home_path
+        if not conda_path is None:
+            self._conda_path = conda_path
+
+        self._omero_path = omero_lib_path
+        if not omero_path is None:
+            self._omero_path = omero_path
+
+        self._etl_path= etl_home_path
+        if not etl_path is None:
+            self._etl_path = etl_path
+
         self._init_cmd_list = []
-        self._init_cmd_list.append('eval "$(/home/qeana10/miniconda2/bin/conda shell.bash hook)"')
+        self._init_cmd_list.append('eval "$(' + self._conda_path + 'bin/conda shell.bash hook)"')
         self._init_cmd_list.append('conda activate ' + env_name)

-        self._init_cmd_list.append('export OMERO_PREFIX=/home/qeana10/openbis/servers/core-plugins/QBIC/1/dss/drop-boxes/register-omero-metadata/OMERO.py-5.4.10-ice36-b105')
+        self._init_cmd_list.append('export OMERO_PREFIX=' + self._omero_path)
         self._init_cmd_list.append('export PYTHONPATH=$PYTHONPATH:$OMERO_PREFIX/lib/python')

-        #now use the omero-importer app packaged in the conda env
-        #self._init_cmd_list.append('export PATH=$PATH:/home/qeana10/openbis/servers/core-plugins/QBIC/1/dss/drop-boxes/register-omero-metadata/OMERO.server-5.4.10-ice36-b105/bin')
-        self._init_cmd_list.append('export PATH=$PATH:/home/qeana10/miniconda2/envs/' + env_name + '/bin')
+        # now use the omero-importer app packaged in the conda env
+        self._init_cmd_list.append('export PATH=$PATH:' + self._conda_path + 'envs/' + env_name + '/bin')

-        #move to the dir where backendinterface.py lives
-        self._init_cmd_list.append('cd /home/qeana10/openbis/servers/core-plugins/QBIC/1/dss/drop-boxes/register-omero-metadata/')
+        # move to the dir where backendinterface.py lives for exec.
+        self._init_cmd_list.append('cd ' + self._etl_path)

     def fetchOpenBisSampleCode(self):
         found = barcode_pattern.findall(self._incoming_file_name)
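
Review note: the new `conda_path` parameter is concatenated directly with `'bin/conda'` and `'envs/'`, so it only works when the caller passes a path with a trailing slash, as the default `conda_home_path` does. A minimal sketch that builds the same command list with `os.path.join` is shown below; `build_init_commands` is a hypothetical free function and any paths used with it are placeholders.

```python
import os.path


def build_init_commands(conda_path, omero_path, etl_path, env_name="omero_env_0"):
    """Build the shell set-up commands; tolerant of a missing trailing slash.

    Hypothetical helper mirroring the command list assembled in __init__.
    """
    conda_bin = os.path.join(conda_path, "bin", "conda")
    env_bin = os.path.join(conda_path, "envs", env_name, "bin")
    return [
        'eval "$(' + conda_bin + ' shell.bash hook)"',
        "conda activate " + env_name,
        "export OMERO_PREFIX=" + omero_path,
        "export PYTHONPATH=$PYTHONPATH:$OMERO_PREFIX/lib/python",
        "export PATH=$PATH:" + env_bin,
        "cd " + etl_path,
    ]
```

With this form, `/opt/conda` and `/opt/conda/` (both made-up paths) produce identical commands.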
@@ -43,7 +60,17 @@ def fetchOpenBisSampleCode(self):
             raise SampleCodeError(self._sample_code, "The sample code seems to be invalid, the checksum could not be confirmed.")

         return self._project_code, self._sample_code
-
+
+    def searchOpenBisSample(self, sample_code):
+        # find specific sample
+        sc = SearchCriteria()
+        sc.addMatchClause(SearchCriteria.MatchClause.createAttributeMatch(SearchCriteria.MatchClauseAttribute.CODE, sample_code))
+        foundSamples = self._search_service.searchForSamples(sc)
+        if len(foundSamples) == 0:
+            raise SampleNotFoundError(sample_code, "Sample could not be found in openBIS.")
+        sample = foundSamples[0]
+        return sample
+
     def _isValidSampleCode(self, sample_code):
         try:
             id = sample_code[0:9]
@@ -73,6 +100,7 @@ def requestOmeroDatasetId(self, project_code=None, sample_code=None):
         return ds_id

     def registerImageFileInOmero(self, file_path, dataset_id):
+
         cmd_list = list(self._init_cmd_list)
         cmd_list.append( "python backendinterface.py -f " + file_path + " -d " + str(dataset_id) )


@@ -84,23 +112,26 @@ def registerImageFileInOmero(self, file_path, dataset_id):
         out, err = process.communicate( commands )

         id_list = str(out).split()
+        for img_id in id_list:
+            if not img_id.isdigit():
+                return []

         return id_list


     def triggerOMETiffConversion(self):
         pass

-    #ToDo Check if Metadata file is provided as was suggested in test.tsv provided by LK
-    def extractMetadataFromTSV(self, tsvFilePath):
+    #ToDo Check if Metadata file is provided as defined
+    def extractMetadataFromTSV(self, tsv_file_path):
         tsvFileMap = {}
         try:
-            with open(tsvFilePath) as tsvfile:
+            with open(tsv_file_path) as tsvfile:
                 reader = csv.DictReader(tsvfile, delimiter='\t', strict=True)
                 for row in reader:
                     tsvFileMap.update(row)
         except IOError:
-            print "Error: No file found at provided filepath " + tsvFilePath
+            print "Error: No file found at provided filepath " + tsv_file_path
         except csv.Error as e:
             print 'Could not gather the Metadata from TSVfile %s, in line %d: %s' % (tsvfile, reader.line_num, e)

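
Review note: as shown in this hunk, `tsvFileMap.update(row)` merges every row into one dictionary, so for a metadata file with several image rows the later rows would overwrite the earlier ones (the remainder of the method is not visible here, so this may be handled elsewhere). A minimal sketch that keeps one entry per image file follows; keying by `IMAGE_FILE_NAME` and the name `rows_by_file_name` are assumptions.

```python
import csv
import io


def rows_by_file_name(tsv_text):
    """Map each IMAGE_FILE_NAME to its full metadata row (hypothetical helper)."""
    rows = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", strict=True)
    for row in reader:
        # one dictionary entry per image file instead of one merged dictionary
        rows[row["IMAGE_FILE_NAME"]] = row
    return rows
```

A caller could then look up the property map for each registered file by name before attaching key-value pairs in OMERO.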

@@ -109,6 +140,30 @@ def extractMetadataFromTSV(self, tsvFilePath):
     def registerExperimentDataInOpenBIS(self):
         pass

+    def registerOmeroKeyValuePairs(self, image_id, property_map):
+        """Registers the property map as key-value pairs in the OMERO server.
+        """
+
+        cmd_list = list(self._init_cmd_list)
+
+        # string format: key1::value1//key2::value2//key3::value3//...
+        key_value_str = ""
+        for key in property_map.keys():
+            key_value_str = key_value_str + str(key) + "::" + str(property_map[key]) + "//"
+        key_value_str = key_value_str[:len(key_value_str)-2] #remove last two chars
+
+        cmd_list.append( "python backendinterface.py -i " + str(image_id) + " -a " + key_value_str )
+
+        commands = ""
+        for cmd in cmd_list:
+            commands = commands + cmd + "\n"
+
+        process = Popen( "/bin/bash", shell=False, universal_newlines=True, stdin=PIPE, stdout=PIPE, stderr=PIPE )
+        out, err = process.communicate( commands )
+
+
+        return 0
+

 class SampleCodeError(Exception):

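
Review note: `key_value_str` is appended to the bash command unquoted. A value containing a space, such as the `Dr. Horrible` entry in the README example, would be split by the shell into separate arguments, and shell metacharacters would be interpreted. A minimal sketch using `shlex.quote` follows; the function name is hypothetical, and under Python 2 the equivalent call would be `pipes.quote`.

```python
import shlex


def build_annotation_command(image_id, property_map):
    """Build the backendinterface.py call with shell-safe arguments (hypothetical helper)."""
    # same key1::value1//key2::value2 format as in the diff, without the trailing-slice step
    key_value_str = "//".join(
        str(key) + "::" + str(value) for key, value in property_map.items()
    )
    return (
        "python backendinterface.py -i " + shlex.quote(str(image_id))
        + " -a " + shlex.quote(key_value_str)
    )
```

Using `"//".join(...)` also avoids building a trailing separator that has to be cut off afterwards.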

@@ -120,4 +175,12 @@ def __init__(self, sample_code, message):
     def test(self):
         pass

+class SampleNotFoundError(Exception):
+
+    def __init__(self, sample_code, message):
+        self.sample_code = sample_code
+        self.message = message
+        super().__init__(self.message)

+    def test(self):
+        pass
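
Review note: the zero-argument `super()` call is Python 3 syntax, while the rest of the diff uses Python 2 `print` statements. Assuming the dropbox runs under a Python 2 interpreter, that call would raise a `TypeError` at the moment the exception is constructed. A minimal sketch of a form that works under both Python 2 and Python 3:

```python
class SampleNotFoundError(Exception):
    """Raised when a sample code cannot be found (sketch of a 2/3-compatible form)."""

    def __init__(self, sample_code, message):
        self.sample_code = sample_code
        self.message = message
        # explicit two-argument super() is accepted by Python 2 and Python 3
        super(SampleNotFoundError, self).__init__(message)
```

Calling `Exception.__init__(self, message)` directly would be an equivalent alternative.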
