Commit efd924b

Merge pull request #460 from bio2rdf/release3
integrate linkedSPLs
2 parents fc9bea9 + 6a28a5c commit efd924b

5 files changed: +8662 -34 lines changed

linkedSPLs/LinkedSPLs-activeMoiety/README

+1 -2
@@ -33,7 +33,7 @@ Please ensure following mappings are available at specified path. They are shoul
 (5) RxNORM to DrOn: LinkedSPLs-update/mappings/DrOn-to-RxNorm/cleaned-dron-chebi-rxcui-ingredient.txt
 (6) RxNORM to NDFRT (EPC): LinkedSPLs-update/mappings/pharmacologic_class_indexing/EPC_extraction_most_recent.txt

-(7) OMOP concept Id from OHDSI or query OMOP CDM V5 (GeriOMOP) by SQL query
+(7) OMOP concept Id from OHDSI or query OMOP CDM V5 (GeriOMOP) by SQL query (it's available if just updated linkedSPLs core graph)

 query OMOP CDM V5 (GeriOMOP) by SQL query below:
 SELECT cpt.CONCEPT_ID as omopid, cpt.CONCEPT_CODE as rxcui FROM
@@ -45,7 +45,6 @@ right click result table and export to csv ('|' delimited)
 save name as: active-ingredient-omopid-rxcui-<DATE>.dsv
 to dir: LinkedSPLs-activeMoiety/mappings/active-ingredient-omopid-rxcui.dsv

-01/18/2017: 17049 results

 ################################################################################
 Procedures to get active moieties RDF graph
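
Review note: the README touched above asks for the OMOP query result to be exported as a '|'-delimited file with `omopid` in the first column and `rxcui` in the second. A minimal sketch of reading such an export back into a lookup table follows. The function name `load_omop_rxcui`, the `has_header` switch, and the sample IDs are assumptions made for illustration; the README does not say whether the export includes a header row, and this is not the repository's own loader.

```python
import csv


def load_omop_rxcui(lines, has_header=False):
    """Map rxcui -> omopid from '|'-delimited rows (omopid first, rxcui second)."""
    reader = csv.reader(lines, delimiter="|")
    if has_header:
        next(reader, None)  # skip a header row if the export wrote one
    mapping = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed rows
        omopid, rxcui = row[0].strip(), row[1].strip()
        mapping[rxcui] = omopid
    return mapping


# Made-up IDs, for illustration only.
sample = ["1000001|111", "1000002|222", ""]
print(load_omop_rxcui(sample))
```

Keying the table by rxcui is a design choice for the sketch; if one rxcui can appear with several OMOP concept IDs, later rows overwrite earlier ones.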

linkedSPLs/LinkedSPLs-clinicalDrug/README

+5 -7
@@ -12,11 +12,11 @@ PRE-CONDITIONS:
 Mappings from linkedSPLs core update
 ------------------------------------------------------------------------

-(1) Dron to rxcui: linkedSPLs/LinkedSPLs-update/mappings/DrOn-to-RxNorm/cleaned-dron-to-rxcui-drug-<DATE>.txt
+(1) Dron to rxcui: LinkedSPLs-update/mappings/DrOn-to-RxNorm/cleaned-dron-to-rxcui.txt

-(2) Dailymed setid and rxcui: LinkedSPLs-update/mappings/RxNORM-mapping/converted_rxnorm_mappings_<DATE>.txt converted_rxnorm_mappings.txt
+(2) Dailymed setid and rxcui:

-$ cat converted_rxnorm_mappings.txt | cut -f1,2 -d\| | sort | uniq > setid_rxcui.txt
+$ cat LinkedSPLs-update/mappings/RxNORM-mapping/converted_rxnorm_mappings.txt | cut -f1,2 -d\| | sort | uniq > LinkedSPLs-clinicalDrug/mappings/setid_rxcui.txt

 (3) Dailymed setid and drug fullname

@@ -25,9 +25,9 @@ $ use linkedSPLs;

 SELECT setId, fullName FROM linkedSPLs.structuredProductLabelMetadata INTO OUTFILE '/tmp/setid_fullname.txt' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

-$ cp /tmp/setid_fullname.txt ../linkedSPLs/LinkedSPLs-clinicalDrug/mappings/
+$ cp /tmp/setid_fullname.txt LinkedSPLs-clinicalDrug/mappings/

-(4) mappings of omopid and rxcui
+(4) mappings of omopid and rxcui (it's available if just updated linkedSPLs core graph)

 query OMOP CDM V5 (GeriOMOP) by SQL query below:

@@ -36,8 +36,6 @@ CONCEPT cpt
 WHERE
 cpt.CONCEPT_CLASS = 'Clinical Drug';

-01/18/2017: 120561 results
-
 right click result table and export to delimited ('|' delimited, none Left or Right Enclosure)
 save name as: clinical-drug-omopid-rxcui-<DATE>.dsv
 to dir: LinkedSPLs-clinicalDrug/mappings/
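
Review note: step (2) of this README builds `setid_rxcui.txt` with `cut -f1,2 -d\| | sort | uniq`. For readers who prefer to run that step from Python, a rough equivalent is sketched below. The function name `setid_rxcui_pairs` and the sample rows are hypothetical. Two behaviours differ from the shell pipeline and are assumptions of the sketch: blank lines are dropped, and ordering is by Unicode code point rather than by the shell's locale.

```python
def setid_rxcui_pairs(lines):
    """Return sorted, de-duplicated 'setid|rxcui' strings from '|'-delimited lines."""
    pairs = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # unlike cut, skip blank lines entirely
        fields = line.split("|")
        # Keep the first two fields; a line without '|' is kept whole, as cut does.
        pairs.add("|".join(fields[:2]))
    return sorted(pairs)


# Made-up rows, for illustration only.
rows = ["a1|100|x\n", "a1|100|y\n", "b2|200|z\n", "\n"]
for pair in setid_rxcui_pairs(rows):
    print(pair)
```

To process the real file, pass an open file handle for `converted_rxnorm_mappings.txt` as `lines` and write the returned strings out one per line.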

linkedSPLs/LinkedSPLs-update/README

+20 -21
@@ -1,19 +1,19 @@
 CODE TO GENERATE THE SQL AND LINKED-DATA VERSION OF LINKEDSPLS
 Authors: Richard Boyce, Greg Gardner, Yifan Ning

-Last updated Date: 01/18/2017
+Last updated Date: 07/26/2017

 ################################################################################
 OVERVIEW
 ################################################################################

-The University of Pittsburgh Linked Structured Product Label (SPL) repository renders sections from the package inserts (product labels) of FDA-approved drugs as published in the SPL data standard and provided by the National Library of Medicine's DailyMed resource. Currently, only data from the product labels of prescription drugs is provided. This site's SPL data is updated weekly and all SPLs retain DailyMed versioning data so that researchers can record the provenance of the text and sections they work with. The Linked SPL resource currently contains 33928 SPLs for products containing more than 2,300 active ingredients.
+The University of Pittsburgh Linked Structured Product Label (SPL) repository renders sections from the package inserts (product labels) of FDA-approved drugs as published in the SPL data standard and provided by the National Library of Medicine's DailyMed resource. Currently, only data from the product labels of prescription drugs is provided. This site's SPL data is updated weekly and all SPLs retain DailyMed versioning data so that researchers can record the provenance of the text and sections they work with. The Linked SPL resource currently contains 32136 SPLs for products containing 2,289 active ingredients.


 Most recent update at the time of creating this file:
-- Labels for prescription drugs downloaded from http://dailymed.nlm.nih.gov/dailymed/spl-resources.cfm Jan 18, 2017
+- Labels for prescription drugs downloaded from http://dailymed.nlm.nih.gov/dailymed/spl-resources.cfm July 26, 2017

-updated in Jan 18 2017 08:00:31 AM EST
+updated in July 26 2017 08:00:31 AM EST

 - The file load-dailymed-spls/TableSchema.sql has the RDB schema needed to load the data. Note that, for manually way of update mappings, not all data is loaded using Python, read the rest of the README to see other data and how it is loaded. Recommended way of update and load mappings is run the ant program by following commands:

@@ -100,11 +100,9 @@ Put in folder at "bio2rdf/linkedSPLs/LinkedSPLs-update/data/dailymed-mappings/"

 Download pharmacologic_class_indexing_spl_files.zip from http://dailymed.nlm.nih.gov/dailymed/spl-resources-all-indexing-files.cfm

-Put in folder at "bio2rdf/linkedSPLs/LinkedSPLs-update/data/dailymed-indexings/"
+Put in folder at "bio2rdf/linkedSPLs/LinkedSPLs-update/data/dailymed-indexings/pharmacologic_class_indexing_spl_files"

-unzip XMLs to folder "pharmacologic_class_indexing_spl_files"
-
-$ cd pharmacologic_class_indexing_spl_files; unzip \*.zip; rm *.zip
+$ unzip pharmacologic_class_indexing_spl_files.zip -d pharmacologic_class_indexing_spl_files

 --------------------------------------------------------
 FDA (Preferred terms, UNIIs):
@@ -118,9 +116,9 @@ Downloads UNII List ('UNIIs <DATE> Names.txt' as UNII lists)
 (2) UNIIs_Records
 Downloads UNII Data ('UNIIs <DATE> Records.txt' as UNII records)

-Keep in directory LinkedSPLs-update/data/FDA
-
-(replace whitespace ' ' in file name to underscore '_')
+Put in directory LinkedSPLs-update/data/FDA
+Unzip UNII_Data.zip and UNIIs.zip in LinkedSPLs-update/data/FDA/
+Replace whitespace ' ' in file names to underscore '_'

 Edit LinkedSPLs-update/data-source.properties to reset FDA_UNII_NAMES and FDA_UNII_RECORDS

@@ -130,15 +128,16 @@ Drugbank (Drug bank Id) :

 Download from http://www.drugbank.ca/downloads

-download drugbank.xml as drugbankX.X and keep in directory LinkedSPLs-update/data/DrugBank/drugbank.xml
+download and unzip drugbank.xml as drugbankX.X in directory:
+LinkedSPLs-update/data/DrugBank/drugbank.xml

 --------------------------------------------------------
 UMLS (rxcui):
 --------------------------------------------------------

 Download RXNORM mappings (full rxnorm) from UMLS at "http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html"

-keep in directory: "LinkedSPLs-update/data/UMLS"
+Unzip and keep in directory: "LinkedSPLs-update/data/UMLS/rxnorm-full/"

 ------------------------------------------------------------------------
 PharmagxTable && FDAPharmgxTableToOntologyMap
@@ -152,8 +151,8 @@ put at "LinkedSPLs-update/mappings/FDA-pharmacogenetic-info-mapping/"

 Edit data-source.properties
 ex.
-BIOMARKER = mappings/FDA-pharmacogenetic-info-mapping/biomarker-to-ontology-mapping-07242015.xlsx
-GENETIC= genetic-biomarker-table-update-07242015.csv
+BIOMARKER = mappings/FDA-pharmacogenetic-info-mapping/biomarker-to-ontology-mapping-01262017.csv
+GENETIC= mappings/FDA-pharmacogenetic-info-mapping/FDA_PGx_Table.csv

 --------------------------------------------------------
 umlsdbmi (DronId for drug and ingredient):
@@ -164,21 +163,21 @@ Code: https://bitbucket.org/uamsdbmi/dron/src
 Download repository from "https://bitbucket.org/uamsdbmi/dron/downloads"

 unzip repository and then:
-copy dron-rxnorm.owl at data/umasdbmi/dron-rxnorm.owl
-copy dron-ingredient.owl at data/umasdbmi/dron-ingredient.owl
+copy dron-rxnorm.owl from repository to data/umasdbmi/dron-rxnorm.owl
+copy dron-ingredient.owl from repository to data/umasdbmi/dron-ingredient.owl

 dron-rxnorm.owl for drug product
 dron-ingredient.owl for active ingredients

-
 ------------------------------------------------------------------------
 OMOP concept Id from OHDSI or query OMOP CDM V5 (GeriOMOP) by SQL query
 -----------------------------------------------------------------------

 Download from "https://github.com/OHDSI/KnowledgeBase/tree/master/LAERTES/terminology-mappings/StandardVocabToRxNorm/imeds_drugids_to_rxcuis.csv"

-OR
+OR query OMOP CDM vocabulary Version 5 (below example via client sqldeveloper)

+(1) For drug product
 SELECT cpt.CONCEPT_ID as omopid, cpt.CONCEPT_CODE as rxcui FROM
 CONCEPT cpt
 WHERE
@@ -188,7 +187,7 @@ right click result table and export to delimited ('|' delimited, none Left or Ri
 save name as: clinical-drug-omopid-rxcui-<DATE>.dsv
 to dir: LinkedSPLs-clinicalDrug/mappings/

-AND
+(2) For active ingredient

 query OMOP CDM V5 (GeriOMOP) by SQL query below:
 SELECT cpt.CONCEPT_ID as omopid, cpt.CONCEPT_CODE as rxcui FROM
@@ -242,7 +241,7 @@ TESTING THE D2R SERVER ON THE DEVELOPMENT MACHINE
 ------------------------------------------------------------

 cd /home/PITT/rdb20/Downloads/D2R/d2rq-0.8.1
-./d2r-server --verbose -b http://130.49.206.86:2021/ -p 2021 dailymed_d2r_map_config_d2rq_8_1.n3
+./d2r-server --verbose -b http://<host>:<port>/ -p 2021 dailymed_d2r_map_config_d2rq_8_1.n3

 ------------------------------------------------------------
 MOVING THE DEVELOPMENT MODE RDB DATA TO THE PUBLIC SERVER
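
Review note: the FDA section of this README tells the operator to replace whitespace in the downloaded UNII file names with underscores (for example 'UNIIs <DATE> Names.txt'). One way that manual step might be scripted is sketched below. The helper name `underscore_names` is hypothetical, and refusing to overwrite an existing target is a design choice of the sketch, not something the README specifies.

```python
from pathlib import Path


def underscore_names(directory):
    """Rename files in `directory` so spaces in their names become underscores.

    Returns the new names, in sorted order of the original names.
    """
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and " " in path.name:
            target = path.with_name(path.name.replace(" ", "_"))
            if target.exists():
                # Do not silently clobber a file that is already there.
                raise FileExistsError(str(target))
            path.rename(target)
            renamed.append(target.name)
    return renamed


# Example call (directory taken from the README; run from the linkedSPLs checkout):
# underscore_names("LinkedSPLs-update/data/FDA")
```

After renaming, the FDA_UNII_NAMES and FDA_UNII_RECORDS entries in data-source.properties still need to be edited by hand to point at the new names.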

linkedSPLs/LinkedSPLs-update/data-source.properties

+4 -4
@@ -8,13 +8,13 @@ SCHEMA_SQL = load-dailymed-spls/TableSchema.sql

 ## Mapping data sources:

-FDA_UNII_NAMES = data/FDA/UNIIs_10Nov2016_Names.txt
-FDA_UNII_RECORDS = data/FDA/UNIIs_10Nov2016_Records.txt
+FDA_UNII_NAMES = data/FDA/UNIIs_28Apr2017_Names.txt
+FDA_UNII_RECORDS = data/FDA/UNIIs_28Apr2017_Records.txt
 DRUGBANK_XML = data/DrugBank/drugbank.xml
 PG_CLASS_INDEXING_SPLS = data/dailymed-indexing/pharmacologic_class_indexing_spl_files/
 RXNORM_SETID = data/dailymed-mappings/rxnorm_mappings.txt
-RXNCONSO_RRF = data/UMLS/rrf/RXNCONSO.RRF
-RXNORM_FULL_SCHEMA = data/UMLS/scripts/mysql/Table_scripts_mysql_rxn.sql
+RXNCONSO_RRF = data/UMLS/rxnorm-full/rrf/RXNCONSO.RRF
+RXNORM_FULL_SCHEMA = data/UMLS/rxnorm-full/scripts/mysql/Table_scripts_mysql_rxn.sql
 UNIIS_RXCUIS_FROM_UMLS = data/UMLS/UNIIs-Rxcuis-from-UMLS.txt
 DRON_RXCUI_DRUG = data/umasdbmi/dron-rxnorm.owl
 DRON_RXCUI_INGREDIENT = data/umasdbmi/dron-ingredient.owl
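
Review note: this commit moves several paths in data-source.properties (the RxNorm files now sit under `data/UMLS/rxnorm-full/`), so a quick check that every configured path exists can catch a stale entry before a load. The sketch below is one possible check. `parse_properties` and `missing_paths` are hypothetical helpers, not part of the repository. It assumes simple `KEY = value` lines with `#` comments and that every value passed in is a path relative to `LinkedSPLs-update/`; any non-path settings in the real file would have to be filtered out first.

```python
from pathlib import Path


def parse_properties(text):
    """Parse 'KEY = value' lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            props[key.strip()] = value.strip()
    return props


def missing_paths(props, base_dir="."):
    """Return the keys whose values do not exist as paths under base_dir."""
    base = Path(base_dir)
    return sorted(key for key, value in props.items() if not (base / value).exists())


# Two entries copied from the diff above, used only as sample input.
sample = """## Mapping data sources:

FDA_UNII_NAMES = data/FDA/UNIIs_28Apr2017_Names.txt
DRUGBANK_XML = data/DrugBank/drugbank.xml
"""
print(parse_properties(sample))
```

Calling `missing_paths(parse_properties(text), "LinkedSPLs-update")` on the real file would list the keys that still point at files not yet downloaded or unzipped.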
