
feat: improve default ollama configuration #32

Closed
wants to merge 8 commits

Conversation

olimorris
Owner

Hey @cleong14

I'm really keen to improve the base Ollama configuration in the plugin.

Have you found any particular models that excel at coding? Would be great to add an Ollama section to the Readme to help fellow users.

@cleong14

Hey @olimorris,

Apologies, I just noticed your comment.

First off, I just want to say thank you for the great plugin you've developed here. Also, I'm a fan of a few other projects you've developed such as tmux-pomodoro-plus. Thanks for building such great OSS projects!

Currently, I'm still figuring out my own workflow/preferences, but I can definitely provide some context regarding some of the craziness you may have noticed in my fork.

Finding a model that really excels at coding has proven to be more challenging than I expected. I've tried models that claim to excel at coding or have been fine-tuned for coding tasks, but the results have almost always been poor. This could be due to a few different reasons though, such as incorrect template formatting, a bad choice of parameter values, or just a poor prompt on my end.

I've been pretty happy with the dolphin-mistral model though. Happy enough to make it my base model for most, if not all, of my modelfiles.

A few reasons why I like using dolphin-mistral:

  1. It is an uncensored model
  2. Results for both code and non-code related tasks have been pretty good; better than any other model I've tried so far anyway
  3. It's been the best fit and most performant model given my limited resources

Additionally, I knew I wanted to use danielmiessler's tool fabric in my workflow. Specifically, I wanted to leverage the System prompts from the patterns found in fabric. I've found that the dolphin-mistral model and the style of System prompts from fabric's patterns work pretty well together, which is another reason why I've chosen dolphin-mistral as my base model for most of my modelfiles.
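For context, the rough shape of one of those modelfiles looks something like this (the custom model name is made up and the SYSTEM body is a placeholder rather than an actual fabric pattern):

```sh
# Build a custom model on top of dolphin-mistral, using one of fabric's
# pattern system prompts as the SYSTEM block.
cat > Modelfile <<'EOF'
FROM dolphin-mistral

SYSTEM """
<paste the contents of a fabric pattern's system.md here>
"""

PARAMETER temperature 0.7
EOF

ollama create my-fabric-model -f Modelfile
ollama run my-fabric-model
```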

Hopefully this helps answer your question. I'm still pretty green and very much still learning what works best, but I'm happy to answer any additional questions you might have or share more of my own lessons learned thus far.

@mrjones2014
Contributor

In my experience dolphin-mixtral is much much better than dolphin-mistral. If I understand correctly, dolphin-mistral isn't actually fully uncensored but dolphin-mixtral is.

@cleong14

In my experience dolphin-mixtral is much much better than dolphin-mistral. If I understand correctly, dolphin-mistral isn't actually fully uncensored but dolphin-mixtral is.

dolphin-mixtral is actually the model I wanted to use but due to a lack of resources needed to run a model of that size, I ended up going with dolphin-mistral.

@olimorris
Owner Author

That would explain why dolphin-mixtral was sooooo slow on my M1 Mac.

I'm conscious that because I don't use Ollama, the defaults may not be that useful for anyone that's trying out the plugin, so I'm completely open to any change and a section in the README about getting started with Ollama and Code Companion.
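At a minimum, I'd imagine such a section would walk through something like the following (the model here is just an example taken from this thread):

```sh
# Bare-bones "getting started with Ollama" steps for the README
ollama pull dolphin-mistral            # download a model
ollama serve                           # start the local server (listens on localhost:11434 by default)
curl http://localhost:11434/api/tags   # sanity check: the server is up and the model is listed
```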

@mrjones2014
Contributor

First usage can be quite slow since dolphin-mixtral is 26 GB lol but once downloaded it shouldn't really be that slow.

@cleong14

@olimorris

That would explain why dolphin-mixtral was sooooo slow on my M1 Mac.

lol ya I had a similar reaction too when I first tried dolphin-mixtral on my M2 Mac.

I'm conscious that because I don't use Ollama, the defaults may not be that useful for anyone that's trying out the plugin, so I'm completely open to any change and a section in the README about getting started with Ollama and Code Companion.

I'd say definitely do not use the changes I've been pushing to my forked main branch. Initially, I intended to fork just to fix the issues I had on my end. I figured these issues were specific to me and a result of trying to do things that were just outside of the expected default plugin behavior.

That being said, if you're looking to simply improve the Ollama adapter defaults to something more sensible, I don't mind making those updates and then pushing the changes either here or to a clean branch in a new PR.


@mrjones2014

First usage can be quite slow since dolphin-mixtral is 26 GB lol but once downloaded it shouldn't really be that slow.

Do you typically start Ollama then leave the same instance running? What are you running Ollama on and what type of impact on your resources are you seeing when you run dolphin-mixtral?

I agree that the performance differences are quite noticeable between cold vs warm vs hot starts. I find myself typically moving between cold/warm/hot starts somewhat fluidly depending on what I'm doing, which also affects how I choose to start Ollama. E.g. sometimes I'll start and run Ollama directly, sometimes I'll start it from within Neovim or a different tool, etc.
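Concretely, keeping things warm mostly comes down to leaving one server running and letting requests reuse the loaded model; a rough sketch (the keep_alive option only exists in newer Ollama versions, so treat that part as optional):

```sh
# Keep a long-running server in one terminal...
ollama serve

# ...and hit it from anywhere else (Neovim, curl, etc.). The first request after
# a cold start pays the model-load cost; subsequent requests reuse the warm model.
curl http://localhost:11434/api/generate \
  -d '{"model": "dolphin-mistral", "prompt": "hello", "keep_alive": "30m"}'
```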

I do want to find a place for dolphin-mixtral in my workflow though. I'll probably give it another go and see if I can figure out a way to dampen the impact on performance so it at least doesn't "feel" slow, or like I'm running a 26 GB model locally. lol

@cleong14

I just tried to run dolphin-mixtral again and it was painfully slow. My resource usage immediately spikes and I can barely get past the first run.

Alternatively, I did try dolphin-mistral:7b-v2.6-dpo-laser-q5_K_M (Q5_K_M's stated use case is 'large, very low quality loss - recommended') and it appears to yield better results than dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M (Q4_K_M's stated use case is 'medium, balanced quality - recommended').
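For anyone wanting to compare the two, they're just different tags of the same model:

```sh
# Pull both quantizations and compare size/behaviour side by side
ollama pull dolphin-mistral:7b-v2.6-dpo-laser-q5_K_M
ollama pull dolphin-mistral:7b-v2.6-dpo-laser-q4_K_M
ollama list   # shows the on-disk size of each tag
```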

@olimorris
Owner Author

olimorris commented Mar 30, 2024

Are there any Ollama models that have been fine-tuned for specific languages or use cases?

I'm thinking a way to programmatically switch models based on a buffer type or set of conditions could be useful. So if I'm working in a Ruby file it would load a Ruby/Rails specific model.

Edit: I've started exploring this with OpenAI Assistants. It's been cool to send it specific knowledge for Neovim, Tree-sitter etc. But...the implementation is so unique that I can't work out how to implement it in the product neatly.

@mrjones2014
Contributor

Do you typically start Ollama then leave the same instance running? What are you running Ollama on and what type of impact on your resources are you seeing when you run dolphin-mixtral?

I run it as a background service on NixOS so it's always running. If you're not using it, it doesn't impact performance. I haven't noticed too much usage when running dolphin-mixtral, but maybe that's just my hardware. I'm running it on an RTX 2080 Ti, and an M2 Max when I'm on macOS.

But actually I plan to run Ollama as a service on my home server where I run Jellyfin and connect to it remotely (I just haven't gotten around to it yet). It's quite nice with open-webui, which provides a ChatGPT-like web UI for Ollama.
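For the record, the open-webui side is basically a single container; the flags below are a sketch based on their docs, so double-check the current README before copying (the Ollama host is a placeholder):

```sh
# Run open-webui in Docker and point it at wherever Ollama is listening
docker run -d -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://your-ollama-host:11434 \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
```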

@lazymaniac
Contributor

For Ollama models I like deepseek-coder (7B) and nous-hermes2 (10B), which produce very decent results for me compared to others, or opencodeinterpreter (7B, 13B, and 33B), a fine-tuned deepseek-coder that's still very fresh and under testing.

@lazymaniac
Contributor

Sorry for double post, but take a look: https://huggingface.co/spaces/bigcode/bigcode-models-leaderboard

@olimorris olimorris closed this Apr 1, 2024
@olimorris
Owner Author

Thanks all. I'll move this to #38
