How to train this model #183

CodingGreatEmperor · 2025-01-11T06:06:10Z

Due diligence

I have done my due diligence in trying to find the answer myself.

Topic

The paper

Question

How can I train the mini model, the Moshi model, and the tokenizer? I just want to customize a few of their model parameters.

yukiarimo · 2025-01-14T04:19:53Z

+1

bruno-hays · 2025-01-23T14:33:48Z

https://github.com/kyutai-labs/moshi/blob/main/FAQ.md

We will release some training / fine-tuning code, but we do not have any timeline yet. Please be patient.

We don't have any other information or timeline at the moment, AFAIK

Airoura · 2025-02-26T08:49:48Z

It's too hard to implement a Llama version of Moshi from scratch alone; we need to work together.

https://github.com/Airoura/LlamaMoshi

2010b9 · 2025-02-27T16:37:15Z

Hi! 🙂

I have some questions (maybe they don't make total sense – sorry if that's the case! – but I'm still learning about this):

Will the fine-tuning / training code make it possible for the model to "speak" in other languages? And to perform tool calling? If so, do you have any idea how many training examples would be needed? And for how long, and on which hardware, should I train the model?

These are two requirements I have for my use case. Maybe there are more that I don't know about, but these seem to be the two most important ones: speaking in another language and performing tool calling.

Thanks in advance 🙂

davidbrowne17 · 2025-03-09T02:48:26Z

This repo provides code to fine-tune Moshi: https://github.com/yangdongchao/RSTnet. I have used it (with some modifications) to fine-tune Moshi.

yukiarimo · 2025-03-09T03:15:16Z

Would love to have a Colab of that

CodingGreatEmperor added the "question" label (Further information is requested) · 2025-01-11
