-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathhelp.txt
49 lines (48 loc) · 2.72 KB
/
help.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
usage: extract.py [-h] [--n N] [--k K] [--alpha ALPHA]
[--batch_size BATCH_SIZE] [--temperature TEMPERATURE]
[--repetition_penalty REPETITION_PENALTY]
[--min_length MIN_LENGTH] [--max_length MAX_LENGTH]
[--num_return_sequences NUM_RETURN_SEQUENCES]
[--top_p TOP_P] [--top_k TOP_K] [--device DEVICE]
[--pretrained_model_name PRETRAINED_MODEL_NAME]
[--revision REVISION] [--assets ASSETS] [-d]
optional arguments:
-h, --help show this help message and exit
--n N The number of texts you want to sample. Default=10000
--k K The number of texts you want to screen out of the
sampled texts. Default=100
--alpha ALPHA Threshold value for similarity. Default=0.5
--batch_size BATCH_SIZE
The number of sentences to generate at once. It
depends on the maximum VRAM size of the GPU.
Default=16
--temperature TEMPERATURE
The value used to module the next token probabilities.
Default=1.0
--repetition_penalty REPETITION_PENALTY
The parameter for repetition penalty. 1.0 means no
penalty. Default=1.0
--min_length MIN_LENGTH
The minimum length of the sequence to be generated.
Default=256
--max_length MAX_LENGTH
The maximum length of the sequence to be generated.
Default=256
--num_return_sequences NUM_RETURN_SEQUENCES
Number of generating output statements from LM using a
single input.Default=1
--top_p TOP_P If set to float < 1, only the most probable tokens
with probabilities that add up to top_p or higher are
kept for generation. Default=1.0
--top_k TOP_K The number of highest probability vocabulary tokens to
keep for top-k-filtering. Default=40
--device DEVICE The device on which the model will be loaded.
Default=cuda:0
--pretrained_model_name PRETRAINED_MODEL_NAME
The pretrained model to use. Default=kakaobrain/kogpt
--revision REVISION The specific model version to use. Default=main
--assets ASSETS The folder where the output result file (*.csv) will
be saved. The file name is saved as
'{revision}-{nowtime}-{n}.csv' by default.
Default=assets
-d, --debug Specifies the debugging mode. Default=False