Hi,
I am trying to reproduce the numbers reported in the paper so that I can make a fair comparison in a paper I am writing. However, when I run the following command I get a corpus BLEU score of 30.69:
. scripts/conala/test.sh ../external-knowledge-codegen/best_pretrained_models/finetune.mined.retapi.distsmpl.dr0.3.lr0.001.lr_de0.5.lr_da15.beam15.seed0.mined_100000.intent_count100k_topk1_temp5.bin 2>&1
load model from [../external-knowledge-codegen/best_pretrained_models/finetune.mined.retapi.distsmpl.dr0.3.lr0.001.lr_de0.5.lr_da15.beam15.seed0.mined_100000.intent_count100k_topk1_temp5.bin]
Decoding: 100%|██████| 500/500 [02:39<00:00, 3.13it/s]
{'corpus_bleu': 0.30694588794625494, 'oracle_corpus_bleu': 0.4181369862278688, 'avg_sent_bleu': 0.2376696401071103, 'oracle_avg_sent_bleu': 0.3983062032090926, 'exact_match': 0.028, 'oracle_exact_match': 0.084}
I am guessing the reranker is not used when these results are generated.
To work around this, I called the testing function directly to generate hypotheses and evaluated them with the same BLEU functions:
model_file = 'external_repos/external-knowledge-codegen/best_pretrained_models/finetune.mined.retapi.distsmpl.dr0.3.lr0.001.lr_de0.5.lr_da15.beam15.seed0.mined_100000.intent_count100k_topk1_temp5.bin'
reranker_file = 'external_repos/external-knowledge-codegen/best_pretrained_models/reranker.conala.vocab.src_freq3.code_freq3.mined_100000.intent_count100k_topk1_temp5.bin'
self.parser = StandaloneParser('default_parser',
                               model_file,
                               'conala_example_processor',
                               beam_size=15,
                               cuda=True,
                               reranker_path=reranker_file)
This gives me a similar corpus BLEU score of 30.078 and an average sentence BLEU score of 25.295 (computed with NLTK using smooth_fn3).
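For context on why the corpus-level and averaged sentence-level numbers differ, here is a minimal sketch of the two aggregation schemes. It is an unsmoothed, single-reference BLEU written from scratch for illustration. It is not the repository's evaluator and not NLTK's smoothed implementation, and the helper names and toy token lists are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_stats(hyp, ref, max_n=4):
    """Return (matched, total) hypothesis n-gram counts for orders 1..max_n."""
    stats = []
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        stats.append((matched, sum(hyp_counts.values())))
    return stats


def bleu_from_stats(stats, hyp_len, ref_len):
    """Unsmoothed BLEU: brevity penalty times geometric mean of precisions."""
    if hyp_len == 0 or any(m == 0 or t == 0 for m, t in stats):
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in stats) / len(stats)
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity * math.exp(log_precision)


def sentence_bleu(hyp, ref):
    return bleu_from_stats(clipped_stats(hyp, ref), len(hyp), len(ref))


def corpus_bleu(pairs):
    """Pool n-gram counts and lengths over all pairs, then score once."""
    pooled = [(0, 0)] * 4
    hyp_len = ref_len = 0
    for hyp, ref in pairs:
        pooled = [(m + dm, t + dt)
                  for (m, t), (dm, dt) in zip(pooled, clipped_stats(hyp, ref))]
        hyp_len += len(hyp)
        ref_len += len(ref)
    return bleu_from_stats(pooled, hyp_len, ref_len)


# Toy (hypothesis, reference) token pairs, made up for illustration only.
pairs = [
    (["x", "=", "sorted", "(", "y", ")"], ["x", "=", "sorted", "(", "y", ")"]),
    (["print", "(", "len", "(", "x", ")", ")"],
     ["print", "(", "len", "(", "y", ")", ")"]),
]

corpus = corpus_bleu(pairs)
average = sum(sentence_bleu(h, r) for h, r in pairs) / len(pairs)
print(f"corpus BLEU: {corpus:.4f}, average sentence BLEU: {average:.4f}")
```

Because the corpus score pools counts before taking the geometric mean while the averaged score weights every example equally, the two values generally do not coincide, so they should each be compared only against the matching metric in the paper.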
What is the exact sequence of commands needed to reproduce the score reported in the paper?