Mistral 7B model

rufus@discuss.tchncs.de · edit-2 2 years ago

Thanks for paying close attention. I just threw kobold.cpp at it and was amazed by the speed of a 7B model on my old PC ;-) I let it complete a few stories and asked the instruct-tuned variant about llamas and other facts… Somehow I overlooked that there are still things missing. My tests with simple, short texts seemed fine.

Another thing I somehow completely missed is the release of Qwen. Is it funded by Alibaba? I need to read up on it.

Regarding the fine-tuning attempts… Idk. My personal opinion: I’m going to be patient and wait and see. Things always move fast, and the community (not the researchers) sometimes does silly stuff. Most of the tools are probably focused on Llama as of now, so it’ll probably take more than a few hours to see decent results. But I’m sure the community will give it a try, especially if it turns out the performance really is as good as or better than Llama 2.