Replies: 1 comment
-
Egads! llama works (llama-13b), although I had to do the following to get it to work:

Now I want to find where the model is loaded in the PY code to try an experiment.
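If it helps anyone else hunting for the load site: assuming the code goes through transformers' `from_pretrained` (a guess on my part, not something confirmed above), you can monkeypatch it to print a stack trace the moment the model loads, which points straight at the file and line doing the loading.

```python
# Hedged debugging sketch: wrap AutoModelForCausalLM.from_pretrained so the
# load prints who called it. Paste near the top of the entry script.
import traceback
from transformers import AutoModelForCausalLM

_orig = AutoModelForCausalLM.from_pretrained.__func__  # unwrap the classmethod

def _spy(cls, *args, **kwargs):
    print("from_pretrained called with:", args, kwargs)
    traceback.print_stack()  # the frames above this one are the load site
    return _orig(cls, *args, **kwargs)

AutoModelForCausalLM.from_pretrained = classmethod(_spy)
```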
-
I've been haunting the GitHub A1111 forums, getting performance on my 4090 from 13 it/s to 39, then 42, and today to 51 it/s using torch.compile(). It's easy if you have an i9-13900K. I'm less interested in pretty pictures than in the potential of ChatGPT-like things, so I want to give llama a try.
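For anyone wondering what the torch.compile() trick amounts to: it's essentially a one-line wrap around the model (PyTorch 2.0+). Whether it helps llama as much as it helped Stable Diffusion is exactly what I want to find out. A minimal sketch, with placeholder paths:

```python
# Minimal torch.compile() sketch (PyTorch >= 2.0). The path is a placeholder;
# 13B in fp16 is ~26GB, so a 24GB 4090 needs 4-bit weights or CPU offload --
# this only shows the shape of the wrapping.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "models/llama-13b-hf",       # substitute wherever the weights live
    torch_dtype=torch.float16,
    device_map="auto",           # needs the `accelerate` package
)

# The first forward pass triggers compilation and is slow; later calls
# reuse the compiled graph and should run noticeably faster.
model = torch.compile(model)
```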
I'm trying to get this thing running to see what it can do and whether I can find anything to make it faster. First step, per the instructions: download the 252GB HFv2 conversion. It seems, according to the instructions, that I need that AND one of the 4-bit files. I grabbed 13B for my first test.
25% downloaded so far and another hour to go.
Cloned the repo, pip installed the requirements, and now I'm just waiting to do my first run.
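While I wait, here's roughly what I expect the first smoke test to look like: a plain fp16 load via transformers. Paths and exact classes are my guesses; the repo may well wrap this differently.

```python
# Hedged first-run sketch. device_map="auto" (needs `accelerate`) spills
# whatever the 4090's 24GB can't hold into CPU RAM; 13B fp16 is ~26GB,
# which is exactly why the 4-bit files exist.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_dir = "models/llama-13b-hf"  # wherever the HFv2 conversion landed

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("The meaning of life is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```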
One problem: I found a way to grab the six(?) json files matching llama-13b-hf, but I couldn't find how to grab the json files for llama-13b-hf-int4. I'll just copy them from the other dir and hope it works; see the sketch below.
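What I mean by copying them over, assuming my local directory layout (adjust the paths to taste):

```python
# Copy the config/tokenizer sidecar files from the fp16 dir to the int4 dir.
# Grabbing *.model too, since the SentencePiece tokenizer.model is usually
# needed alongside the json files.
import shutil
from pathlib import Path

src = Path("models/llama-13b-hf")       # has the json files
dst = Path("models/llama-13b-hf-int4")  # missing them
dst.mkdir(parents=True, exist_ok=True)

for pattern in ("*.json", "*.model"):
    for f in src.glob(pattern):
        shutil.copy2(f, dst / f.name)
```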