Commit 2fcb634

Author: Richard Boyce
edit table schema, remove UMLS deprecated tables, update README
1 parent 6539d4d commit 2fcb634

13 files changed, +8034 -18190 lines changed

linkedSPLs/LinkedSPLs-update/README

Lines changed: 43 additions & 33 deletions
@@ -37,23 +37,29 @@ Update all linkedSPLs mappings by command below
 $ ant linkedSPLs-update

 Update piece by piece (recommended)
+$ ant loadDailymedSPLsToRDB
+$ ant load-loincSection

 $ ant load-FDAPreferredSubstanceToUNII
 $ ant load-FDA_UNII_to_ChEBI
+$ ant load-FDAPreferredSubstanceToRxNORM
+$ ant load-FDA_SUBSTANCE_TO_DRUGBANK_BIO2RDF
+$ ant load-SPLSetIDToRxNORM
+$ ant load-RXNORM_NDFRT_INGRED_Table
+$ ant load-FDA_EPC_Table
 $ ant load-ChEBI_DRUGBANK_BIO2RDF
-$ ant loadDailymedSPLsToRDB
-$ ant load-DrOn_RXCUI_DRUG
-$ ant load-DrOn_RXCUI_INGREDIENT
-$ ant load-FDA_EPC_Table
+
 $ ant load-FDAPharmgxTable
-$ ant load-FDAPharmgxTableToOntologyMap
-$ ant load-FDAPreferredSubstanceToRxNORM
-$ ant load-FDAPreferredSubstanceToRxNORM-restAPI
-$ ant load-FDA_SUBSTANCE_TO_DRUGBANK_BIO2RDF
-$ ant load-loincSection
+$ ant load-FDAPharmgxTableToOntologyMap
+
+$ ant load-DrOn_RXCUI_DRUG
+$ ant load-DrOn_RXCUI_INGREDIENT
+
+-- deprecated --
+$ ant load-FDAPreferredSubstanceToRxNORM-restAPI
 $ ant load-OMOPId-RXCUIs-from-OHDSI
-$ ant load-RXNORM_NDFRT_INGRED_Table
-$ ant load-SPLSetIDToRxNORM
+
+



@@ -64,7 +70,7 @@ PRE-REQUISITES (Download all source data before run any ant command)
 Download and organize all source data files in data folder

 --------------------------------------------------------
-product label sections and mappings from Dailymed:
+Dailymed (product label sections, indexing and mappings)
 --------------------------------------------------------

 (1) dailymed-labels:
@@ -91,7 +97,7 @@ unzip XMLs to folder "pharmacologic_class_indexing_spl_files"
 $ cd pharmacologic_class_indexing_spl_files; unzip \*.zip; rm \*.zip

 --------------------------------------------------------
-FDA Preferred terms, UNIIs from FDA:
+FDA (Preferred terms, UNIIs):
 --------------------------------------------------------

 Download from http://fdasis.nlm.nih.gov/srs/jsp/srs/uniiListDownload.jsp
@@ -109,47 +115,51 @@ Keep in directory LinkedSPLs-update/data/FDA
 Edit LinkedSPLs-update/data-source.properties to reset FDA_UNII_NAMES and FDA_UNII_RECORDS

 --------------------------------------------------------
-Drug bank Id from Drugbank:
+Drugbank (Drug bank Id) :
 --------------------------------------------------------

 Download from http://www.drugbank.ca/downloads

 download drugbank.xml as drugbankX.X and keep in directory LinkedSPLs-update/data/DrugBank

 --------------------------------------------------------
-UMLS:
+UMLS (rxcui):
 --------------------------------------------------------

 Download RXNORM mappings (full rxnorm) from UMLS at "http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html"

 keep in directory: "LinkedSPLs-update/data/UMLS"

---------------------------------------------------------
-umlsdbmi:
---------------------------------------------------------
-Repository: https://bitbucket.org/uamsdbmi/dron
+------------------------------------------------------------------------
+PharmagxTable && FDAPharmgxTableToOntologyMap
+-----------------------------------------------------------------------

-DrOn to RxNorm and ChEBI
+Get CSVs from solomon
+biomarker-to-ontology-mapping.csv
+genetic-biomarker-table-raw-import.csv

-Dron mapping file (dron-rxnorm.owl for drug and dron-ingredient.owl for ingredients) download from:
-https://bitbucket.org/uamsdbmi/dron/src
+put at "LinkedSPLs-update/mappings/FDA-pharmacogenetic-info-mapping/"

-Install readland:
-$ sudo apt-get install redland-utils
+Edit data-source.properties
+ex.
+BIOMARKER = mappings/FDA-pharmacogenetic-info-mapping/biomarker-to-ontology-mapping-07242015.xlsx
+GENETIC= genetic-biomarker-table-update-07242015.csv

-Load dron mapping for ingredient into a triple store by:
-rdfproc -n dron parse dron-ingredient.owl
+--------------------------------------------------------
+umlsdbmi (DronId for drug and ingredient):
+--------------------------------------------------------
+Repository: https://bitbucket.org/uamsdbmi/dron
+Code: https://bitbucket.org/uamsdbmi/dron/src

-rdfproc -c dron-ingredient query sparql - '
-PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX dron: <http://purl.obolibrary.org/obo/dron#> SELECT * WHERE { ?dron dron:DRON_00010000 ?rxcui. }' > dron-chebi-rxcui-ingredient.txt
+Download repository from "https://bitbucket.org/uamsdbmi/dron/downloads"

+unzip repository and then:
+copy dron-rxnorm.owl at data/umasdbmi/dron-rxnorm.owl
+copy dron-ingredient.owl at data/umasdbmi/dron-ingredient.owl

-Load dron mapping for drug product into triple store by:
-rdfproc -n dron-drug parse dron-rxnorm.owl
+dron-rxnorm.owl for drug product
+dron-ingredient.owl for active ingredients

-Mappings pulled using:
-rdfproc -c dron-drug query sparql - '
-PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX owl: <http://www.w3.org/2002/07/owl#> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX dron: <http://purl.obolibrary.org/obo/dron#> SELECT * WHERE { ?dron dron:DRON_00010000 ?rxcui. }' > dron-rxcui-drug.txt

 ------------------------------------------------------------------------
 OMOP concept Id from OHDSI or query OMOP CDM V5 (GeriOMOP) by SQL query

linkedSPLs/LinkedSPLs-update/build.xml

Lines changed: 32 additions & 27 deletions
@@ -51,10 +51,10 @@
 </fileset>
 </delete>

-<exec executable="bash">
-<arg value="-c" />
-<arg value ='unzip ${DAILYMED_LABELS}/archives/otc/\*.zip -d ${SPLS_LOAD}/spls/' />
-</exec>
+<!-- <exec executable="bash"> -->
+<!-- <arg value="-c" /> -->
+<!-- <arg value ='unzip ${DAILYMED_LABELS}/archives/otc/\*.zip -d ${SPLS_LOAD}/spls/' /> -->
+<!-- </exec> -->

 <exec executable="bash">
 <arg value="-c" />
@@ -226,11 +226,15 @@
 <!-- <antcall target="load-SPLSetIDToRxNORM" /> -->
 <!-- <antcall target="load-RXNORM_NDFRT_INGRED_Table" /> -->
 <!-- <antcall target="load-FDA_EPC_Table" /> -->
-<!-- <antcall target="load-DrOn_ChEBI_RXCUI" /> -->
-<!-- <antcall target="load-OMOPId-RXCUIs-from-OHDSI" /> -->
 <!-- <antcall target="load-ChEBI_DRUGBANK_BIO2RDF" /> -->
+
 <!-- <antcall target="load-FDAPharmgxTable" /> -->
-<!-- <antcall target="load-FDAPharmgxTableToOntologyMap" /> -->
+<!-- <antcall target="load-FDAPharmgxTableToOntologyMap" /> -->
+
+<!-- <antcall target="load-DrOn_RXCUI_DRUG" /> -->
+<!-- <antcall target="load-DrOn_RXCUI_INGREDIENT" /> -->
+
+<!-- <antcall target="load-OMOPId-RXCUIs-from-OHDSI" /> -->

 <antcall target="set.timestamp">
 <param name="message" value="updated all mappings in ${mysql-schema}" />
@@ -391,14 +395,14 @@
 <param name="message" value="mappings of perferred term, UNII and drugbank URI is created at ${ChEBI-DrugBank-bio2rdf-mapping}/UNII-data/INCHI-OR-Syns-OR-Name-${TODAY_US}.txt" />
 </antcall>

-<delete file="${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank-${TODAY_US}.txt"/>
+<delete file="${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank.txt"/>

-<exec executable="python" dir="${ChEBI-DrugBank-bio2rdf-mapping}/scripts" output="${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank-${TODAY_US}.txt">
+<exec executable="python" dir="${ChEBI-DrugBank-bio2rdf-mapping}/scripts" output="${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank.txt">
 <arg line="addBio2rdf_UNII_to_DrugBank.py ../UNII-data/INCHI-OR-Syns-OR-Name-${TODAY_US}.txt"/>
 </exec>

 <antcall target="set.timestamp">
-<param name="message" value="post-processed mappings of perferred term and drugbank URI is created at ${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank-${TODAY_US}.txt" />
+<param name="message" value="post-processed mappings of perferred term and drugbank URI is created at ${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank.txt" />
 </antcall>

 <sql
@@ -412,7 +416,7 @@

 <transaction>
 truncate FDA_SUBSTANCE_TO_DRUGBANK_BIO2RDF;
-LOAD DATA LOCAL INFILE "${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank-${TODAY_US}.txt" INTO TABLE FDA_SUBSTANCE_TO_DRUGBANK_BIO2RDF(PreferredSubstance, DRUGBANK_CA, DRUGBANK_BIO2RDF);
+LOAD DATA LOCAL INFILE "${ChEBI-DrugBank-bio2rdf-mapping}/fda-substance-preferred-name-to-drugbank.txt" INTO TABLE FDA_SUBSTANCE_TO_DRUGBANK_BIO2RDF(PreferredSubstance, DRUGBANK_CA, DRUGBANK_BIO2RDF);
 </transaction>
 </sql>

@@ -432,13 +436,13 @@
 <exec executable="python" failonerror="true">
 <arg line="${ChEBI-DrugBank-bio2rdf-mapping}/scripts/parseDBIdAndChEBI.py ${DRUGBANK_XML}"/>
 <redirector append="true">
-<outputmapper type="merge" to="${ChEBI-DrugBank-bio2rdf-mapping}/drugbank-to-chebi-${TODAY_US}.txt"/>
+<outputmapper type="merge" to="${ChEBI-DrugBank-bio2rdf-mapping}/drugbank-to-chebi.txt"/>
 <errormapper type="merge" to="${ERROR_LOG}"/>
 </redirector>
 </exec>

 <antcall target="set.timestamp">
-<param name="message" value="mappings of drugbank Id and chebi id is created at ${ChEBI-DrugBank-bio2rdf-mapping}/drugbank-to-chebi-${TODAY_US}.txt" />
+<param name="message" value="mappings of drugbank Id and chebi id is created at ${ChEBI-DrugBank-bio2rdf-mapping}/drugbank-to-chebi.txt" />
 </antcall>

 <sql
@@ -451,7 +455,7 @@
 </classpath>
 <transaction>
 truncate ChEBI_DRUGBANK_BIO2RDF;
-load data local infile '${ChEBI-DrugBank-bio2rdf-mapping}/drugbank-to-chebi-${TODAY_US}.txt' into table ChEBI_DRUGBANK_BIO2RDF fields terminated by '\t'
+load data local infile '${ChEBI-DrugBank-bio2rdf-mapping}/drugbank-to-chebi.txt' into table ChEBI_DRUGBANK_BIO2RDF fields terminated by '\t'
 lines terminated by '\n' (CHEBI_OBO, CHEBI_BIO2RDF, DRUGBANK_CA, DRUGBANK_BIO2RDF);
 </transaction>
 </sql>
@@ -474,7 +478,7 @@
 <delete file=" ${RxNORM-mapping}/PreferredTerm-UNII-Rxcui-mapping.txt"/>

 <exec executable="python" failonerror="true">
-<arg line="${RxNORM-mapping}/mergePT-UNII-RXCUI.py data/FDA/FDAPreferredSubstanceToUNII.txt ${UNIIS_RXCUIS_FROM_UMLS} ${RxNORM-mapping}/PreferredTerm-UNII-Rxcui-mapping.txt ${RxNORM-mapping}/PreferredTermRxcui-mapping.txt" />
+<arg line="${RxNORM-mapping}/mergePT-UNII-RXCUI.py data/FDA/FDAPreferredSubstanceToUNII.txt ${UNIIS_RXCUIS_FROM_UMLS} ${RxNORM-mapping}/PreferredTerm-UNII-Rxcui-mapping.txt ${RxNORM-mapping}/PreferredTermRxcui-mapping.txt" />
 <redirector append="true">
 <errormapper type="merge" to="${ERROR_LOG}"/>
 </redirector>
@@ -613,13 +617,13 @@
 <exec executable="python" failonerror="true">
 <arg line="${pharmacologic_class_indexing}/parseEPCfromXMLs.py ${PG_CLASS_INDEXING_SPLS}" />
 <redirector append="true">
-<outputmapper type="merge" to="${pharmacologic_class_indexing}/EPC_extraction_most_recent_${TODAY_US}.txt"/>
+<outputmapper type="merge" to="${pharmacologic_class_indexing}/EPC_extraction_most_recent.txt"/>
 <errormapper type="merge" to="${ERROR_LOG}"/>
 </redirector>
 </exec>

 <antcall target="set.timestamp">
-<param name="message" value="mappings of setId, UNII, NUI and PreferredNameAndRole is created at ${pharmacologic_class_indexing}/EPC_extraction_most_recent_${TODAY_US}.txt" />
+<param name="message" value="mappings of setId, UNII, NUI and PreferredNameAndRole is created at ${pharmacologic_class_indexing}/EPC_extraction_most_recent.txt" />
 </antcall>

 <sql
@@ -632,7 +636,7 @@
 </classpath>
 <transaction>
 truncate FDA_EPC_Table;
-LOAD DATA LOCAL INFILE "${pharmacologic_class_indexing}/EPC_extraction_most_recent_${TODAY_US}.txt" INTO TABLE FDA_EPC_Table(setId, UNII, NUI, PreferredNameAndRole);
+LOAD DATA LOCAL INFILE "${pharmacologic_class_indexing}/EPC_extraction_most_recent.txt" INTO TABLE FDA_EPC_Table(setId, UNII, NUI, PreferredNameAndRole);
 </transaction>
 </sql>

@@ -668,12 +672,12 @@
 <param name="message" value="parse to get DrOn-rxnorm mappings for clinical drug from ${DRON_RXCUI_DRUG}" />
 </antcall>

-<delete file="${DrOn-to-RxNorm}/cleaned-dron-to-rxcui-drug-${TODAY_US}.txt"/>
+<delete file="${DrOn-to-RxNorm}/cleaned-dron-to-rxcui-drug.txt"/>

 <exec executable="python" failonerror="true">
 <arg line="${DrOn-to-RxNorm}/cleanData.py ${DrOn-to-RxNorm}/dron-rxcui-drug.txt" />
 <redirector append="true">
-<outputmapper type="merge" to="${DrOn-to-RxNorm}/cleaned-dron-to-rxcui-drug-${TODAY_US}.txt"/>
+<outputmapper type="merge" to="${DrOn-to-RxNorm}/cleaned-dron-to-rxcui-drug.txt"/>
 <errormapper type="merge" to="${ERROR_LOG}"/>
 </redirector>
 </exec>
@@ -692,7 +696,7 @@
 </classpath>
 <transaction>
 truncate DrOn_ChEBI_RXCUI_DRUG;
-LOAD DATA LOCAL INFILE '${DrOn-to-RxNorm}/cleaned-dron-to-rxcui-drug-${TODAY_US}.txt' INTO TABLE `DrOn_ChEBI_RXCUI_DRUG` FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' (dron_id, ChEBI, rxcui);
+LOAD DATA LOCAL INFILE '${DrOn-to-RxNorm}/cleaned-dron-to-rxcui-drug.txt' INTO TABLE `DrOn_ChEBI_RXCUI_DRUG` FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' (dron_id, ChEBI, rxcui);
 </transaction>
 </sql>

@@ -729,12 +733,12 @@
 <param name="message" value="parse to get DrOn-rxnorm mappings for ingredient from ${DRON_RXCUI_INGREDIENT}" />
 </antcall>

-<delete file="${DrOn-to-RxNorm}/cleaned-dron-chebi-rxcui-ingredient-${TODAY_US}.txt"/>
+<delete file="${DrOn-to-RxNorm}/cleaned-dron-chebi-rxcui-ingredient.txt"/>

 <exec executable="python" failonerror="true">
 <arg line="${DrOn-to-RxNorm}/cleanData.py ${DrOn-to-RxNorm}/dron-chebi-rxcui-ingredient.txt" />
 <redirector append="true">
-<outputmapper type="merge" to="${DrOn-to-RxNorm}/cleaned-dron-chebi-rxcui-ingredient-${TODAY_US}.txt"/>
+<outputmapper type="merge" to="${DrOn-to-RxNorm}/cleaned-dron-chebi-rxcui-ingredient.txt"/>
 <errormapper type="merge" to="${ERROR_LOG}"/>
 </redirector>
 </exec>
@@ -753,7 +757,7 @@
 </classpath>
 <transaction>
 truncate DrOn_ChEBI_RXCUI_DRUG;
-LOAD DATA LOCAL INFILE '${DrOn-to-RxNorm}/cleaned-dron-chebi-rxcui-ingredient-${TODAY_US}.txt' INTO TABLE `DrOn_ChEBI_RXCUI_INGREDIENT` FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' (dron_id, ChEBI, rxcui);
+LOAD DATA LOCAL INFILE '${DrOn-to-RxNorm}/cleaned-dron-chebi-rxcui-ingredient.txt' INTO TABLE `DrOn_ChEBI_RXCUI_INGREDIENT` FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' (dron_id, ChEBI, rxcui);
 </transaction>
 </sql>

@@ -814,13 +818,14 @@

 <!-- update FDAPharmgxTable
 query the latest version of core and active moiety graphs to get latest version of pharmgxTable (mappings of setid and rxcui)
-TOTO; revise python sparql query script to excute query agaist RDB (Mysql)
+TOTO; revise createFDAPharmgxDBTable.py to excute query on RDB (Mysql)
+Otherwise, have to redump core graph to include pgx data
 -->

 <target name="load-FDAPharmgxTable" >

 <exec executable="python" failonerror="true">
-<arg line="${FDA-pharmacogenetic-info-mapping}/createFDAPharmgxDBTable.py ${FDA-pharmacogenetic-info-mapping}/genetic-biomarker-table-raw-import.csv ${RxNORM-mapping}/PreferredTermRxcui-mapping.txt ${FDA-pharmacogenetic-info-mapping}/FDAPharmgxTable.csv" />
+<arg line="${FDA-pharmacogenetic-info-mapping}/createFDAPharmgxDBTable.py ${GENETIC} ${RxNORM-mapping}/PreferredTermRxcui-mapping.txt ${FDA-pharmacogenetic-info-mapping}/FDAPharmgxTable.csv" />
 <redirector append="true">
 <errormapper type="merge" to="${ERROR_LOG}"/>
 </redirector>
@@ -862,7 +867,7 @@
 </classpath>
 <transaction>
 truncate FDAPharmgxTableToOntologyMap;
-LOAD DATA LOCAL INFILE "${FDA-pharmacogenetic-info-mapping}/biomarker-to-ontology-mapping.csv" INTO TABLE FDAPharmgxTableToOntologyMap(FDAReferencedSubgroup,HGNCGeneSymbol,Synonymns,AlleleVariant,Pharmgkb,URI,Ontology,CuratorComments);
+LOAD DATA LOCAL INFILE "${BIOMARKER}" INTO TABLE FDAPharmgxTableToOntologyMap FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' IGNORE 1 LINES(FDAReferencedSubgroup,HGNCGeneSymbol,Synonymns,AlleleVariant,Pharmgkb,URI,Ontology,CuratorComments);
 </transaction>
 </sql>
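
Review note: the new `LOAD DATA` statement for `FDAPharmgxTableToOntologyMap` splits on `,` and sets no `ENCLOSED BY` clause, so a comma inside a quoted CSV cell would be treated as a field boundary. A minimal Python sketch of a pre-load check that mimics that splitting is given here. The function name `find_bad_rows` is hypothetical, and the expected column count of 8 is taken from the column list in the statement.

```python
EXPECTED_COLUMNS = 8  # FDAReferencedSubgroup ... CuratorComments


def find_bad_rows(lines, expected=EXPECTED_COLUMNS):
    """Return (line_number, field_count) pairs for rows that would not
    split into `expected` fields under a plain comma split.

    The first line is skipped to mirror IGNORE 1 LINES, and blank lines
    are skipped as well (an assumption of this sketch).
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        if number == 1 or not line.strip():
            continue
        count = len(line.rstrip("\n").split(","))
        if count != expected:
            bad.append((number, count))
    return bad
```

Passing the open file referenced by `BIOMARKER` to `find_bad_rows` before running `ant load-FDAPharmgxTableToOntologyMap` would list the rows worth inspecting by hand.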

linkedSPLs/LinkedSPLs-update/data-source.properties

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ DRON_RXCUI_DRUG = data/umasdbmi/dron-rxnorm.owl
 DRON_RXCUI_INGREDIENT = data/umasdbmi/dron-ingredient.owl
 OMOP_RXCUI = data/OMOP-OHDSI/imeds_drugids_to_rxcuis.csv

+## Pgx data from solomon:
+BIOMARKER = mappings/FDA-pharmacogenetic-info-mapping/biomarker-to-ontology-mapping-07242015.csv
+GENETIC= mappings/FDA-pharmacogenetic-info-mapping/genetic-biomarker-table-update-07242015.csv
+
 ## mapping directories:

 PT-UNII-ChEBI-mapping = mappings/PT-UNII-ChEBI-mapping
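
Review note: `build.xml` now reads the pharmacogenomics inputs from the `BIOMARKER` and `GENETIC` properties, so a wrong path only surfaces when the Ant target runs. The README example in this commit also differs from the values here (it shows an `.xlsx` name for `BIOMARKER` and no directory for `GENETIC`). A minimal Python sketch of checking the configured paths up front is given here. Both helpers are hypothetical, and the parser handles only `key = value` lines and `#` comments, not the full Java properties syntax.

```python
import os


def read_properties(text):
    """Parse simple `key = value` lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_files(props, keys, base_dir="."):
    """Return the keys whose configured path is not an existing file."""
    return [
        key
        for key in keys
        if not os.path.isfile(os.path.join(base_dir, props.get(key, "")))
    ]
```

Reading `data-source.properties` and calling `missing_files(props, ["BIOMARKER", "GENETIC"], base_dir="LinkedSPLs-update")` would report which of the two inputs still need to be copied into place.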
