Commit de05ac6

Add more sh tags
1 parent 95c16a8 · commit de05ac6

3 files changed, +10 −10 lines changed


README.md (+5 −5)
@@ -86,7 +86,7 @@ python test_inference.py -m <path_to_model> -p "Once upon a time,"
 
 A simple console chatbot is included. Run it with:
 
-```
+```sh
 python examples/chat.py -m <path_to_model> -mode llama -gs auto
 ```
 
@@ -115,7 +115,7 @@ and **exllamav2_HF** loaders.
 
 To install the current dev version, clone the repo and run the setup script:
 
-```
+```sh
 git clone https://github.com/turboderp/exllamav2
 cd exllamav2
 pip install -r requirements.txt
@@ -125,7 +125,7 @@ pip install .
 By default this will also compile and install the Torch C++ extension (`exllamav2_ext`) that the library relies on.
 You can skip this step by setting the `EXLLAMA_NOCOMPILE` environment variable:
 
-```
+```sh
 EXLLAMA_NOCOMPILE= pip install .
 ```
 
@@ -142,7 +142,7 @@ PyTorch.
 
 Either download an appropriate wheel or install directly from the appropriate URL:
 
-```
+```sh
 pip install https://github.com/turboderp/exllamav2/releases/download/v0.0.12/exllamav2-0.0.12+cu121-cp311-cp311-linux_x86_64.whl
 ```
 
@@ -153,7 +153,7 @@ can also be installed this way, and it will build the extension while installing
 
 A PyPI package is available as well. This is the same as the JIT version (see above). It can be installed with:
 
-```
+```sh
 pip install exllamav2
 ```

doc/convert.md (+3 −3)
@@ -94,7 +94,7 @@ measurement pass on subsequent quants of the same model.
 
 Convert a model and create a directory containing the quantized version with all of its original files:
 
-```
+```sh
 python convert.py \
 -i /mnt/models/llama2-7b-fp16/ \
 -o /mnt/temp/exl2/ \
@@ -104,7 +104,7 @@ python convert.py \
 
 Run just the measurement pass on a model, clearing the working directory first:
 
-```
+```sh
 python convert.py \
 -i /mnt/models/llama2-7b-fp16/ \
 -o /mnt/temp/exl2/ \
@@ -114,7 +114,7 @@ python convert.py \
 
 Use that measurement to quantize the model at two different bitrates:
 
-```
+```sh
 python convert.py \
 -i /mnt/models/llama2-7b-fp16/ \
 -o /mnt/temp/exl2/ \

doc/eval.md (+2 −2)
@@ -29,7 +29,7 @@ in which ExLlama runs out of system memory when loading large models.
 This is the standard [HumanEval](https://github.com/openai/human-eval) test implemented for ExLlamaV2 with
 dynamic batching.
 
-```
+```sh
 pip install human-eval
 python eval/humaneval.py -m <model_dir> -o humaneval_output.json
 evaluate-functional-correctness humaneval_output.json
@@ -64,7 +64,7 @@ performance.
 This is the standard [MMLU](https://github.com/hendrycks/test) test implemented for ExLlamaV2 with
 dynamic batching.
 
-```
+```sh
 pip install datasets
 python eval/mmlu.py -m <model_dir>
 ```
