del sh #19

Merged
merged 1 commit into from
Mar 12, 2024
This file was deleted.

This file was deleted.

14 changes: 14 additions & 0 deletions evals/registry/data/00_scipaper_enzyme_km/samples.jsonl
@@ -0,0 +1,14 @@
{"file_name": "../uni-finder/enzyme/km/paper/10.1007_s00425-014-2102-6.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1007_s00425-014-2102-6.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1007_s00425-014-2102-6.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1007_s00425-014-2102-6.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1007_s10725-019-00528-9.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1007_s10725-019-00528-9.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1007_s10725-019-00528-9.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1007_s10725-019-00528-9.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1016_j.bbrep.2016.11.003.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_j.bbrep.2016.11.003.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1016_j.bbrep.2016.11.003.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_j.bbrep.2016.11.003.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1016_s0005-2728__97__00090-x.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0005-2728__97__00090-x.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1016_s0005-2728__97__00090-x.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0005-2728__97__00090-x.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1016_s0021-9258__18__96277-0.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0021-9258__18__96277-0.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1016_s0021-9258__18__96277-0.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0021-9258__18__96277-0.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1016_s0021-9258__18__96427-6.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0021-9258__18__96427-6.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1016_s0021-9258__18__96427-6.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0021-9258__18__96427-6.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1016_S0076-6879__75__41082-5.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_S0076-6879__75__41082-5.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1016_S0076-6879__75__41082-5.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_S0076-6879__75__41082-5.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1016_s0141-8130__01__00188-x.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0141-8130__01__00188-x.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1016_s0141-8130__01__00188-x.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1016_s0141-8130__01__00188-x.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1021_acs.biochem.6b00536.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1021_acs.biochem.6b00536.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1021_acs.biochem.6b00536.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1021_acs.biochem.6b00536.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1080_09168451.2020.1751582.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1080_09168451.2020.1751582.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1080_09168451.2020.1751582.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1080_09168451.2020.1751582.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1080_09168451.2020.1799749.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1080_09168451.2020.1799749.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1080_09168451.2020.1799749.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1080_09168451.2020.1799749.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1104_pp.19.01225.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1104_pp.19.01225.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1104_pp.19.01225.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1104_pp.19.01225.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/10.1139_b07-081.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1139_b07-081.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/10.1139_b07-081.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1139_b07-081.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/km/paper/j.1432-1033.1986.tb09548.x.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/j.1432-1033.1986.tb09548.x.pdf", "answerfile_name": "../uni-finder/enzyme/km/answer/j.1432-1033.1986.tb09548.x.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/j.1432-1033.1986.tb09548.x.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], "index": "Substrate"}
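Each record above carries the same six keys. A minimal sketch of loading and sanity-checking such a file follows; `load_samples` and `REQUIRED_KEYS` are hypothetical names, and the check that `index` appears in `compare_fields` is an assumption drawn from these samples, not a documented rule of the eval.

```python
import json

# Keys observed in every record of the samples.jsonl files in this diff.
REQUIRED_KEYS = {
    "file_name", "file_link",
    "answerfile_name", "answerfile_link",
    "compare_fields", "index",
}


def load_samples(text: str) -> list[dict]:
    """Parse JSONL text into sample dicts, rejecting incomplete records."""
    samples = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {line_no}: missing keys {sorted(missing)}")
        # Assumption: the index column is one of the compared columns.
        if record["index"] not in record["compare_fields"]:
            raise ValueError(f"line {line_no}: index not in compare_fields")
        samples.append(record)
    return samples


# One record copied from the hunk above.
EXAMPLE = (
    '{"file_name": "../uni-finder/enzyme/km/paper/10.1139_b07-081.pdf", '
    '"file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1139_b07-081.pdf", '
    '"answerfile_name": "../uni-finder/enzyme/km/answer/10.1139_b07-081.csv", '
    '"answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/10.1139_b07-081.csv", '
    '"compare_fields": ["Substrate", "Comment", "Organism", "Km Value (mM)"], '
    '"index": "Substrate"}'
)

if __name__ == "__main__":
    for sample in load_samples(EXAMPLE):
        print(sample["file_name"], sample["compare_fields"])
```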

This file was deleted.

17 changes: 14 additions & 3 deletions evals/registry/data/00_scipaper_enzyme_substrate/samples.jsonl
100644 → 100755
@@ -1,3 +1,14 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6316846852a855013f98ee678e945582013c1269fcad311c8e933859ade77c68
size 1919
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1007_s00425-014-2102-6.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1007_s00425-014-2102-6.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1007_s00425-014-2102-6.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1007_s00425-014-2102-6.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1007_s10725-019-00528-9.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1007_s10725-019-00528-9.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1007_s10725-019-00528-9.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1007_s10725-019-00528-9.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1007_s11103-006-0040-9.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1007_s11103-006-0040-9.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1007_s11103-006-0040-9.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1007_s11103-006-0040-9.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1016_j.bbrep.2016.11.003.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_j.bbrep.2016.11.003.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1016_j.bbrep.2016.11.003.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_j.bbrep.2016.11.003.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1016_s0005-2728__97__00090-x.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_s0005-2728__97__00090-x.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1016_s0005-2728__97__00090-x.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_s0005-2728__97__00090-x.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1016_s0021-9258__18__96277-0.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_s0021-9258__18__96277-0.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1016_s0021-9258__18__96277-0.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_s0021-9258__18__96277-0.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1016_s0021-9258__18__96427-6.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_s0021-9258__18__96427-6.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1016_s0021-9258__18__96427-6.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_s0021-9258__18__96427-6.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1016_S0076-6879__75__41082-5.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_S0076-6879__75__41082-5.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1016_S0076-6879__75__41082-5.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1016_S0076-6879__75__41082-5.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1021_acs.biochem.6b00536.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1021_acs.biochem.6b00536.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1021_acs.biochem.6b00536.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1021_acs.biochem.6b00536.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1080_09168451.2020.1751582.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1080_09168451.2020.1751582.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1080_09168451.2020.1751582.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1080_09168451.2020.1751582.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1080_09168451.2020.1799749.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1080_09168451.2020.1799749.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1080_09168451.2020.1799749.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1080_09168451.2020.1799749.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1104_pp.19.01225.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1104_pp.19.01225.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1104_pp.19.01225.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1104_pp.19.01225.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_10.1139_b07-081.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1139_b07-081.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_10.1139_b07-081.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_10.1139_b07-081.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
{"file_name": "../uni-finder/enzyme/substrate/paper/s_j.1432-1033.1986.tb09548.x.pdf", "file_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_j.1432-1033.1986.tb09548.x.pdf", "answerfile_name": "../uni-finder/enzyme/substrate/answer/s_j.1432-1033.1986.tb09548.x.csv", "answerfile_link": "https://dp-filetrans-bj.oss-cn-beijing.aliyuncs.com/changjunhan/s_j.1432-1033.1986.tb09548.x.csv", "compare_fields": ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"], "index": "Substrate"}
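The `compare_fields` and `index` keys suggest that a predicted table is joined to the answer table on the `index` column and compared column by column. The sketch below illustrates one way that could work; it is not the `TableExtract` implementation, and `table_accuracy`, the case-insensitive match, and the sample rows are all assumptions made for illustration.

```python
def table_accuracy(predicted, answer, compare_fields, index):
    """Fraction of answer cells reproduced by the prediction.

    Rows are joined on the `index` column; the index column itself is
    not scored. Matching is exact after stripping and lowercasing.
    """
    pred_by_key = {row[index]: row for row in predicted if index in row}
    fields = [f for f in compare_fields if f != index]
    total = 0
    correct = 0
    for ans_row in answer:
        pred_row = pred_by_key.get(ans_row[index], {})
        for field in fields:
            total += 1
            expected = ans_row.get(field, "").strip().lower()
            got = pred_row.get(field, "").strip().lower()
            if got == expected:
                correct += 1
    return correct / total if total else 0.0


FIELDS = ["Substrate", "Comment", "Organism", "Products", "Comment (Product)"]

# Illustrative rows only; they are not taken from any answer file.
ANSWER = [
    {"Substrate": "NADH + H+ + O2", "Comment": "-", "Organism": "Homo sapiens",
     "Products": "NAD+ + H2O", "Comment (Product)": "-"},
    {"Substrate": "D-ribose 6-phosphate", "Comment": "-", "Organism": "Bos taurus",
     "Products": "glycerol + phosphate", "Comment (Product)": "-"},
]
PREDICTED = [dict(ANSWER[0])]  # the second row is missing from the prediction

if __name__ == "__main__":
    print(table_accuracy(PREDICTED, ANSWER, FIELDS, index="Substrate"))
```

Here a missing predicted row counts every one of its cells as wrong, so the score penalizes incomplete tables as well as wrong values.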
18 changes: 0 additions & 18 deletions evals/registry/evals/00_scipaper_enzyme_activate_compound.yaml

This file was deleted.

19 changes: 10 additions & 9 deletions ...y/evals/00_scipaper_enzyme_inhibitor.yaml → ...registry/evals/00_scipaper_enzyme_km.yaml
100644 → 100755
@@ -1,18 +1,19 @@
scipaper_enzyme_inhibitor:
id: scipaper_enzyme_inhibitor.val.csv
scipaper_enzyme_km:
id: scipaper_enzyme_km.val.csv
metrics: [accuracy]

scipaper_enzyme_inhibitor.val.csv:
scipaper_enzyme_km.val.csv:
class: evals.elsuite.rag_table_extract:TableExtract
args:
samples_jsonl: 00_scipaper_enzyme_inhibitor/samples.jsonl
samples_jsonl: 00_scipaper_enzyme_km/samples.jsonl
instructions: |
Please give a complete list of Inhibitor, Comment and Organism of all substrates in the paper. Usually the substrates' tags are numbers or IUPAC names.
Please give a complete list of Substrate, Comment and Organism of all substrates in the paper. Usually the substrates' tags are numbers or IUPAC names.
1. Output in csv format, write units not in header but in the value like "10.5 µM". Quote the value if it has comma! For example:
```csv
Inhibitor,Comment,Organism
ATP,"competitive inhibition of verapamil-dependent ATPase-activity",Homo sapiens
p-xylene,"11.4 mM, slight inhibitor",Bos taurus
NH4+, 0.002 mM,Bos taurus
Substrate,Comment,Organism,Km Value
ATP,"competitive inhibition of verapamil-dependent ATPase-activity",Homo sapiens, 3.5 nM
p-xylene,"20 mM Tris-HCl(pH 7.0), 5 mM MgCl2, at 25 ℃",Bos taurus, 12 nM
D-ribose 6-phosphate, - , Homo sapiens, 120 nM
```
2. If there are multiple tables, concat them. Don't give me reference or using "...", give me complete table!
3. If no relevant information was found in the paper, use '-' to fill in the form in CSV.
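The instructions ask for CSV in which any value containing a comma is quoted. A minimal sketch of reading such output with Python's standard `csv` module is shown below; `parse_table` is a hypothetical helper, and the sample text is a well-formed version of the example in the prompt.

```python
import csv
import io


def parse_table(csv_text: str) -> list[dict]:
    """Parse model CSV output into row dicts, trimming stray whitespace."""
    # skipinitialspace lets ', "quoted, value"' still be read as one quoted field.
    reader = csv.reader(io.StringIO(csv_text.strip()), skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    rows = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(row)}: {row}")
        rows.append({name: value.strip() for name, value in zip(header, row)})
    return rows


EXAMPLE_CSV = """
Substrate,Comment,Organism,Km Value
ATP,"competitive inhibition of verapamil-dependent ATPase-activity",Homo sapiens, 3.5 nM
p-xylene,"20 mM Tris-HCl(pH 7.0), 5 mM MgCl2, at 25 ℃",Bos taurus, 12 nM
D-ribose 6-phosphate, - , Homo sapiens, 120 nM
"""

if __name__ == "__main__":
    for row in parse_table(EXAMPLE_CSV):
        print(row)
```

The quoted comment on the second data row keeps its internal commas as part of a single field, which is why the quoting rule in the prompt matters.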
16 changes: 0 additions & 16 deletions evals/registry/evals/00_scipaper_enzyme_localization.yaml

This file was deleted.

12 changes: 6 additions & 6 deletions evals/registry/evals/00_scipaper_enzyme_substrate.yaml
100644 → 100755
@@ -7,13 +7,13 @@ scipaper_enzyme_substrate.val.csv:
args:
samples_jsonl: 00_scipaper_enzyme_substrate/samples.jsonl
instructions: |
Please give a complete list of SMILES structures, Km values, Vmax values, target info (protein or cell line), and organism of all substrates in the paper. Usually the substrates' tags are numbers or IUPAC names.
Please give a complete list of Substrate, Comment, Organism, Products and Comment (Product) of all substrates in the paper. Usually the substrates' tags are numbers or IUPAC names.
1. Output in csv format, write units not in header but in the value like "10.5 µM". Quote the value if it has comma! For example:
```csv
Substrate,Inhibitors, Km value,Km max,Comment,organism,Vmax value,SMILES,Target info,Activating Compound,
ATP,Cu2+,0.001 mM,-,-,Homo sapiens,-,-,ATP-linker aldehyde,Carboxybenzaldehyde,
p-xylene,NADH,0.004 mM,-,-,Homo sapiens,-,C1CCCCC1,-,Methylbenzaldehyde
NADPH,benzaldehyde, 0.12 mM,125 mM,enzyme form ATP,Bos taurus,-,-,NH4+

Substrate,Comment,Organism,Products,"Comment (Product)"
"NADH + H+ + O2","20 mM Tris-HCl(pH 7.0)",Homo sapiens,"NAD+ + H2O", -
"D-glucose + 6-phosphate","20 mM Tris-HCl(pH 7.0), 5 mM MgCl2, at 25 ℃",Bos taurus, -, -
"D-ribose 6-phosphate", - , Homo sapiens, "glycerol + phosphate", -
```
2. If there are multiple tables, concat them. Don't give me reference or using "...", give me complete table!
3. If no relevant information was found in the paper, use '-' to fill in the form in CSV.
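Rules 2 and 3 of the prompt (concatenate multiple tables, fill missing information with '-') can be sketched as a small post-processing step. The helper names `concat_tables` and `to_csv` are hypothetical, and the sketch assumes each table has already been parsed into a list of row dicts.

```python
import csv
import io


def concat_tables(tables, header):
    """Merge several tables into one, filling absent or empty cells with '-'."""
    rows = []
    for table in tables:
        for row in table:
            rows.append({col: (row.get(col) or "-") for col in header})
    return rows


def to_csv(rows, header):
    """Serialize rows to CSV, quoting only values that need it (e.g. commas)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([row[col] for col in header])
    return buffer.getvalue()


HEADER = ["Substrate", "Comment", "Organism"]

# Illustrative tables, as if extracted from two different places in a paper.
TABLE_A = [{"Substrate": "ATP", "Comment": "20 mM Tris-HCl(pH 7.0), 5 mM MgCl2"}]
TABLE_B = [{"Substrate": "NADH", "Organism": "Bos taurus"}]

if __name__ == "__main__":
    print(to_csv(concat_tables([TABLE_A, TABLE_B], HEADER), HEADER))
```

Using `csv.QUOTE_MINIMAL` leaves plain values unquoted and quotes only the comment containing a comma, which matches the "Quote the value if it has comma" rule.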