Releases · oKatanaaa/lima-gui
v0.6.1
Breaking changes
- Added `tiktoken` as a dependency. OpenAI tokenizers are now supported.
- To launch the app, run `limagui` instead of `python -m lima_gui.app`.
Fixes
- Fixes #4. Since LLM providers love to mess with licensing (sometimes breaking access to tokenizers), `cl100k_base` from `tiktoken` is now used as the default underlying tokenizer.
- Added `loguru` to the dependency list.
v0.6.0
In this update I lay the foundation for better compatibility with the existing LLM finetuning stack by changing the data schema to be more compliant with the OpenAI API. You can now easily export data (new Export as OpenAI dataset option) and use it as is with existing training pipelines.
It also includes various QoL improvements.
Breaking changes
- The underlying data format has been massively overhauled, meaning that data collected with an older version of LIMA-GUI won't load in the newer version. To update your data, use the `python -m lima_gui.update_data` script. It takes a path to a target input file (or folder with multiple files) and a path to a target output file (or folder). Note that the script can't handle function calling data; if needed, I can update the script (just open a corresponding issue).
- When using the completion API, the chat is formatted in ChatML. That means you can use completion mode to generate (and steer) partial answers of ChatML-compliant models, such as `cognitivecomputations/dolphin-2.6-mistral-7b-dpo`.
- The `transformers` library is removed from dependencies; `tokenizers` is used instead. Sorry for that stupid mistake.
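For reference, ChatML wraps each message in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of the formatting (the helper name and sample messages are illustrative, not LIMA-GUI's actual code):

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render OpenAI-style messages as a ChatML prompt string."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Leave the assistant turn open so a completion model continues it,
        # which is what enables steering partial answers in completion mode.
        out += "<|im_start|>assistant\n"
    return out

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
])
print(prompt)
```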
Major changes
- You can now export your dataset in an OpenAI finetuning API compliant format (a `jsonl` file with a lot of `{"messages": [...]}` entries). Click File -> Export as OpenAI dataset.
- Ctrl + S now saves into the last opened file and no longer opens a file selection window.
- LIMA-GUI will track changes and:
  - ask you to save the data if you haven't done so and are trying to close the program;
  - ask you to save the data if you haven't done so and are trying to open another file.
- All prints are replaced with the `loguru` library. All calls are logged; as of now, the `DEBUG` level is set by default.
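The exported format can also be produced (or post-processed) by hand. A minimal sketch of writing and reading such a `jsonl` file, one conversation per line (the file name and sample conversation are made up):

```python
import json

# One conversation per line, each in OpenAI finetuning chat format.
conversations = [
    {"messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "2 + 2 = 4."},
    ]},
]

with open("dataset.jsonl", "w", encoding="utf-8") as f:
    for conv in conversations:
        f.write(json.dumps(conv, ensure_ascii=False) + "\n")

# Reading it back: one JSON object per line.
with open("dataset.jsonl", encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded), "conversations")
```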
Fixes
- LIMA-GUI now works with the latest version of the `openai` library.
v0.5.1
Features
- Function calling API. `assistant` messages may now contain function calls. This is useful for gathering data for LLMs with agency (e.g. making your own code interpreter or an LLM that supports plugins). You can also use the OpenAI API to generate function calls automatically.
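In the OpenAI chat format, such an `assistant` message carries a `function_call` whose arguments are serialized as a JSON string. A minimal sketch (the function name and arguments are made up for illustration):

```python
import json

# An assistant message requesting a function call (OpenAI chat format).
message = {
    "role": "assistant",
    "content": None,  # no text content when a function is being called
    "function_call": {
        "name": "run_python",  # hypothetical function name
        "arguments": json.dumps({"code": "print(2 + 2)"}),
    },
}

# Arguments arrive as a JSON string and must be parsed before use.
args = json.loads(message["function_call"]["arguments"])
print(args["code"])
```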
Bugfixes
- The `on_content_changed` callback was called twice when setting data in a message item.
v0.4.2
First usable version. I am making this release in preparation for future (possibly breaking) changes to make sure there is a working version available.
Functionality list:
- allows gathering multi-turn conversational data (for ChatGPT-like chatbots);
- fully OpenAI API compliant data format;
- OpenAI API integration for data gathering assistance;
- allows tagging conversations (coding, QA, contextual QA, etc.);
- allows assigning a language to a dialog (currently ru/en only);
- token counting in each dialog (uses HF tokenizers).
Notes:
- currently no performance optimizations are in place;
- may crash sometimes (I am working on this, but make sure to save your data regularly).