Reproducing results #3

Hi

I am trying to reproduce the numbers reported in the paper so that I can make a fair comparison in a paper I am writing. However, when I run the following command I get a corpus BLEU score of 30.69:
```bash
. scripts/conala/test.sh ../external-knowledge-codegen/best_pretrained_models/finetune.mined.retapi.distsmpl.dr0.3.lr0.001.lr_de0.5.lr_da15.beam15.seed0.mined_100000.intent_count100k_topk1_temp5.bin 2>&1
load model from [../external-knowledge-codegen/best_pretrained_models/finetune.mined.retapi.distsmpl.dr0.3.lr0.001.lr_de0.5.lr_da15.beam15.seed0.mined_100000.intent_count100k_topk1_temp5.bin]
Decoding: 100%|██████| 500/500 [02:39<00:00,  3.13it/s]
{'corpus_bleu': 0.30694588794625494, 'oracle_corpus_bleu': 0.4181369862278688, 'avg_sent_bleu': 0.2376696401071103, 'oracle_avg_sent_bleu': 0.3983062032090926, 'exact_match': 0.028, 'oracle_exact_match': 0.084}
```

My guess is that the reranker is not used when generating these results.

To work around this, I called the testing function directly to generate hypotheses and evaluated them with the same BLEU functions:
```python
model_file='external_repos/external-knowledge-codegen/best_pretrained_models/finetune.mined.retapi.distsmpl.dr0.3.lr0.001.lr_de0.5.lr_da15.beam15.seed0.mined_100000.intent_count100k_topk1_temp5.bin'
reranker_file = 'external_repos/external-knowledge-codegen/best_pretrained_models/reranker.conala.vocab.src_freq3.code_freq3.mined_100000.intent_count100k_topk1_temp5.bin'
self.parser = StandaloneParser('default_parser',
                              model_file,
                              'conala_example_processor',
                              beam_size=15,
                              cuda=True,
                              reranker_path=reranker_file)
```
This gives me a similar corpus BLEU score of 30.078, and an average sentence BLEU score of 25.295 when computed with NLTK using smooth_fn3.
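For context on why the two figures differ, corpus BLEU pools n-gram counts over the whole test set, while average sentence BLEU scores each example on its own and then takes the mean. Below is a minimal sketch of that difference. It assumes whitespace-tokenized inputs and uses a simplified smoothing that is only similar in spirit to NLTK's method 3. It is not the repository's evaluator and is not expected to reproduce the numbers above. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_counts(ref, hyp, n):
    """Clipped n-gram matches and total hypothesis n-grams for one pair."""
    hyp_counts = ngrams(hyp, n)
    ref_counts = ngrams(ref, n)
    matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    return matches, sum(hyp_counts.values())


def bleu_from_stats(matches, totals, ref_len, hyp_len, smooth=False):
    """Combine per-order statistics into BLEU with a brevity penalty."""
    if hyp_len == 0:
        return 0.0
    log_precisions = []
    k = 1  # counts how many zero-match orders have been smoothed so far
    for m, t in zip(matches, totals):
        t = max(t, 1)  # avoid division by zero for very short hypotheses
        if m == 0:
            if not smooth:
                return 0.0
            # Assumed smoothing: geometrically shrinking pseudo-precision.
            p = 1.0 / (2 ** k * t)
            k += 1
        else:
            p = m / t
        log_precisions.append(math.log(p))
    bp = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return bp * math.exp(sum(log_precisions) / len(log_precisions))


def corpus_bleu(refs, hyps, max_n=4):
    """Pool matches, totals, and lengths over all pairs, then score once."""
    matches = [0] * max_n
    totals = [0] * max_n
    for ref, hyp in zip(refs, hyps):
        for n in range(1, max_n + 1):
            m, t = clipped_counts(ref, hyp, n)
            matches[n - 1] += m
            totals[n - 1] += t
    ref_len = sum(len(r) for r in refs)
    hyp_len = sum(len(h) for h in hyps)
    return bleu_from_stats(matches, totals, ref_len, hyp_len)


def avg_sentence_bleu(refs, hyps, max_n=4):
    """Score each pair separately (with smoothing) and average the scores."""
    scores = []
    for ref, hyp in zip(refs, hyps):
        stats = [clipped_counts(ref, hyp, n) for n in range(1, max_n + 1)]
        matches = [m for m, _ in stats]
        totals = [t for _, t in stats]
        scores.append(
            bleu_from_stats(matches, totals, len(ref), len(hyp), smooth=True)
        )
    return sum(scores) / len(scores)


# Toy data (hypothetical), one reference per hypothesis.
refs = ["sorted ( d . items ( ) )".split(), "open ( f , 'r' ) . read ( )".split()]
hyps = ["sorted ( d . items ( ) )".split(), "open ( f ) . read ( )".split()]

print("corpus BLEU:", corpus_bleu(refs, hyps))
print("avg sentence BLEU:", avg_sentence_bleu(refs, hyps))
```

Because the aggregation differs, the two scores are not directly comparable, which is why I report both.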

What is the exact sequence of commands needed to obtain the score reported in the paper?
