# ielab/trec-ct-2023
## TREC Clinical Trials 2021 results

| Runs | NDCG@10 | P@10 | RPrec | MRR |
| --- | --- | --- | --- | --- |
| BM25, k1=0.82, b=0.68 | 0.3395 | 0.4520 | 0.1892 | 0.6942 |
| ANCE | 0.1052 | 0.1280 | 0.0541 | 0.3017 |
| Hybrid, BM25+ANCE, a=2 (grid search from a=1 to 10) | 0.3488 | 0.4667 | 0.1883 | 0.7208 |
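The hybrid runs fuse the sparse and dense rankings with a single interpolation weight. The exact fusion used for these runs is not spelled out in the tables, so the sketch below assumes per-query min-max normalisation followed by a weighted sum (`alpha * sparse + (1 - alpha) * dense`); the function names and toy scores are illustrative only.

```python
# Minimal sketch of score-interpolation hybrid retrieval.
# Assumption: scores are min-max normalised per query, then fused as
# alpha * s_sparse + (1 - alpha) * s_dense.

def minmax(scores):
    """Min-max normalise a {doc_id: score} dict into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def hybrid(sparse, dense, alpha=0.9):
    """Fuse two {doc_id: score} runs; docs missing from one run score 0 there."""
    sp, de = minmax(sparse), minmax(dense)
    docs = set(sp) | set(de)
    fused = {d: alpha * sp.get(d, 0.0) + (1 - alpha) * de.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Toy example: BM25 scores vs. dense (ANCE-style) scores.
bm25 = {"NCT01": 12.1, "NCT02": 9.3, "NCT03": 4.0}
ance = {"NCT02": 0.92, "NCT03": 0.88, "NCT04": 0.75}
print(hybrid(bm25, ance, alpha=0.9)[0][0])  # top-ranked trial id
```

With a high `alpha` (e.g. 0.9, as in several runs below) the sparse ranking dominates and the dense scores mostly break ties, which matches the pattern that hybrids here outperform either component alone.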

## TREC Clinical Trials 2022 results

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| TREC best (frocchio_monot5_e) | 0.6125 | 0.6780 | 0.3652 | 0.8519 | 0.6765 |
| a. BM25, k1=0.82, b=0.68 | 0.3103 | 0.3760 | 0.1715 | 0.6481 | 0.3862 |
| b. ANCE | 0.0909 | 0.0960 | 0.0360 | 0.2168 | 0.1018 |
| c. DPR (BERT) + 10k ChatGPT data | 0.2186 | 0.2420 | 0.1055 | 0.4629 | 0.2646 |
| d. DPR (PubmedBERT) + 10k ChatGPT data | 0.3481 | 0.4000 | 0.2409 | 0.5871 | 0.5391 |
| e. DPR (PubmedBERT) + 20k ChatGPT data | 0.3372 | 0.3820 | 0.2112 | 0.6018 | 0.4563 |
| f. DPR (PubmedBERT) + 20k ChatGPT data + 5k labelled data | 0.4037 | 0.4500 | 0.2409 | 0.6418 | 0.4982 |
| g. Hybrid a + f, alpha=0.9 | 0.4937 | 0.5500 | 0.2895 | 0.7912 | 0.5901 |
| h. DPR (PubmedBERT) + 20k ChatGPT data + 5k labelled data + hard negatives | 0.4096 | 0.4840 | 0.2711 | 0.6693 | 0.5932 |
| i. Hybrid a + h, alpha=0.8 | 0.4819 | 0.5620 | 0.2954 | 0.7391 | 0.5930 |
| j. SPLADE (PubmedBERT) + 20k ChatGPT data + 5k labelled data | 0.3729 | 0.4120 | 0.2257 | 0.6180 | 0.5196 |
| k. Hybrid h + j, alpha=0.8 | 0.4746 | 0.5460 | 0.3071 | 0.6949 | 0.6369 |
| l. k + gpt-3.5-turbo setwise (n=3) rerank top 100 | 0.4934 | 0.5760 | - | 0.7569 | - |
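The official numbers above come from trec_eval averaged over all topics. As a quick reference for two of the less common column headers, here is a toy, single-query illustration of RPrec (precision at R, where R is the number of relevant trials for the query) and MRR (reciprocal rank of the first relevant trial); the ids and judgments are made up.

```python
# Toy per-query computation of RPrec and MRR.

def rprec(ranking, relevant):
    """Precision at cutoff R, where R = number of relevant docs."""
    r = len(relevant)
    return sum(1 for d in ranking[:r] if d in relevant) / r

def mrr(ranking, relevant):
    """Reciprocal rank of the first relevant doc (0 if none retrieved)."""
    for i, d in enumerate(ranking, start=1):
        if d in relevant:
            return 1.0 / i
    return 0.0

ranking = ["NCT03", "NCT01", "NCT04", "NCT02"]  # system output, best first
relevant = {"NCT01", "NCT02"}                    # qrels for this query
print(rprec(ranking, relevant), mrr(ranking, relevant))  # 0.5 0.5
```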

## TREC Clinical Trials 2022 results (l=2, official setting)

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| TREC best (frocchio_monot5_e) | 0.6125 | 0.5080 | 0.3297 | 0.7262 | 0.7396 |
| TREC second best (DoSSIER_5) | 0.5565 | 0.4560 | 0.2434 | 0.6191 | 0.6239 |
| TREC third best (iiia-unipd, manual run) | 0.5051 | 0.3980 | 0.2790 | 0.6085 | - |
| BM25, k1=0.82, b=0.68 | 0.3103 | 0.2120 | 0.1191 | 0.4126 | 0.3663 |

Dense retriever (DR) runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. DR (PubmedBERT) + 20k ChatGPT data + 5k labelled data (ckpt6000) | 0.4037 | 0.3260 | 0.2158 | 0.5741 | 0.5551 |
| b. DR (PubmedBERT + CT MLM) + 20k ChatGPT data + 5k labelled data (ckpt3000) | 0.4072 | 0.3280 | 0.2194 | 0.6392 | 0.5992 |
| c. DR (PubmedBERT + CT MLM) + 20k ChatGPT data + 5k labelled data + HN (ckpt3000) | 0.4271 | 0.3240 | 0.2274 | 0.4826 | 0.5987 |

SPLADE runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. SPLADE (PubmedBERT) + 20k ChatGPT data + 5k labelled data (ckpt10000) | 0.3729 | 0.3020 | 0.1975 | 0.5339 | 0.5764 |
| b. SPLADE (PubmedBERT + CT MLM) + 20k ChatGPT data + 5k labelled data (ckpt12000) | 0.3512 | 0.2920 | 0.1854 | 0.4964 | 0.5576 |
| c. SPLADE (PubmedBERT) + 20k ChatGPT data + 5k labelled data + HN (ckpt16000) | 0.4235 | 0.3280 | 0.2341 | 0.5374 | 0.5968 |

Hybrid runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. Hybrid, DR c + SPLADE c, alpha=0.5 | 0.5024 | 0.3800 | 0.2612 | 0.5884 | 0.6529 |

Cross-encoder reranking runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt2000, rerank hybrid a top 1000 | 0.5614 | 0.4280 | 0.2812 | 0.7009 | 0.6529 |
| b. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt3000, rerank hybrid a top 1000 | 0.5804 | 0.4400 | 0.2915 | 0.7427 | 0.6529 |
| c. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt4000, rerank hybrid a top 1000 | 0.5977 | 0.4560 | 0.3069 | 0.7154 | 0.6529 |
| d. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt5000, rerank hybrid a top 1000 | 0.6055 | 0.4660 | 0.3121 | 0.7407 | 0.6529 |
| e. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt8000, rerank hybrid a top 1000 | 0.6064 | 0.4740 | 0.3069 | 0.7131 | 0.6529 |
| f. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt9000, rerank hybrid a top 1000 | 0.6090 | 0.4800 | 0.3107 | 0.7063 | 0.6529 |
| g. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt10000, rerank hybrid a top 1000 | 0.5982 | 0.4640 | 0.3018 | 0.7160 | 0.6529 |
| h. Cross-encoder (PubmedBERT large), HN from hybrid a, ckpt11000, rerank hybrid a top 1000 | 0.6006 | 0.4620 | 0.3012 | 0.7114 | 0.6529 |

GPT-3.5-turbo judger runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. Hybrid DR c + SPLADE c + GPT-3.5-turbo judger rerank top20 | 0.5254 | 0.4120 | 0.2641 | 0.6225 | 0.6529 |

Hybrid with cross-encoder runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. Hybrid DR c + SPLADE c + Cross-encoder f, alpha=0.1 | 0.6209 | 0.4880 | 0.3109 | 0.7545 | 0.6529 |
| b. Hybrid DR c + SPLADE c + Cross-encoder f, alpha=0.2 | 0.6162 | 0.4800 | 0.3171 | 0.7423 | 0.6529 |

GPT-4 judger runs:

| Runs | NDCG@10 | P@10 | RPrec | MRR | Recall@1000 |
| --- | --- | --- | --- | --- | --- |
| a. Hybrid DR c + SPLADE c + Cross-encoder f, alpha=0.1 + GPT-4 judger rerank top20 (avg num_api_calls: 20, avg prompt tokens: 28105.44, avg generate tokens: 28.86) | 0.6591 | 0.5680 | 0.3241 | 0.7795 | 0.6529 |
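The "judger rerank top20" rows re-order only the head of the first-stage run with LLM eligibility judgments and leave the rest of the ranking untouched (which is why Recall@1000 stays at 0.6529). A minimal sketch of that pattern, with `judge` as a hypothetical stand-in for the GPT call: the grading scale, prompt, and tie-breaking shown here are assumptions, not the exact setup used for these runs.

```python
# Sketch of LLM-judger reranking: re-score only the top-k of a first-stage
# ranking, keep the tail unchanged. `judge(doc_id)` is a hypothetical stand-in
# for an LLM eligibility grade (higher = more eligible).

def rerank_topk(ranking, judge, k=20):
    head = ranking[:k]
    # Stable sort: equal grades keep their first-stage order.
    graded = sorted(head, key=lambda doc: -judge(doc))
    return graded + ranking[k:]

# Toy usage with a stub judge (0 = not relevant, 1 = excluded, 2 = eligible).
stub_grades = {"NCT02": 2, "NCT01": 1, "NCT03": 0}
ranking = ["NCT01", "NCT02", "NCT03", "NCT04"]
print(rerank_topk(ranking, lambda d: stub_grades.get(d, 0), k=3))
```

Because only k documents are judged per query, the cost is bounded at k API calls per topic (matching the "avg num_api_calls: 20" figure for the top20 runs).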

## TREC Clinical Trials 2023 submission

GPT-4 judger rerank top20: avg num_api_calls: 20, avg prompt tokens: 27125.3, avg generate tokens: 35.725; total API cost: $32.64.
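The quoted $32.64 is consistent with GPT-4 (8K) launch pricing of $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens over 40 topics. Both the pricing and the topic count are assumptions used only to sanity-check the figure, not numbers stated in this repository.

```python
# Sanity check on the $32.64 total, assuming GPT-4 8K launch pricing
# ($0.03 / 1K prompt tokens, $0.06 / 1K completion tokens) and 40 topics.
n_topics = 40                # assumption
prompt_tokens = 27125.3      # avg per topic, from the stats above
completion_tokens = 35.725   # avg per topic, from the stats above

cost_per_topic = prompt_tokens / 1000 * 0.03 + completion_tokens / 1000 * 0.06
total = n_topics * cost_per_topic
print(round(total, 2))  # 32.64
```

Note that almost all of the cost is in the prompt tokens (each of the 20 calls per topic carries the full patient description plus trial text), so shortening the prompt would reduce the cost near-linearly.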

## About

Team IELAB at TREC Clinical Trials Track 2023.