It's a good model, but it still requires 24 GB of VRAM.
I'm waiting until something like llama.cpp is made for this.
AFAIK Mistral already works in llama.cpp, or am I misunderstanding something? I've yet to try it.
Not true. There's actually nothing special to see, since "it just works": https://github.com/ggerganov/llama.cpp/discussions/3368, https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF, and https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF
And here is someone describing how to do the quantization yourself: https://advanced-stack.com/resources/running-inference-using-mistral-ai-first-released-model-with-llama-cpp.html
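The guide's workflow can be sketched roughly like this. This is a hedged outline, not the article's exact steps: script names have shifted between llama.cpp revisions (older checkouts used `convert.py` and a `quantize` binary), and the model directory and output filenames below are just example names I've picked.

```shell
# Clone and build llama.cpp (plain CPU build; add build flags for GPU offload if you want it)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Convert the downloaded Hugging Face weights to GGUF.
# Assumes the Mistral-7B checkpoint was already fetched into ./models/mistral-7b
python convert.py models/mistral-7b --outfile models/mistral-7b-f16.gguf

# Quantize to 4-bit (q4_K_M) so it needs far less than 24 GB
./quantize models/mistral-7b-f16.gguf models/mistral-7b-q4_K_M.gguf q4_K_M

# Try a quick prompt against the quantized model
./main -m models/mistral-7b-q4_K_M.gguf -p "Hello" -n 64
```

A 7B model quantized to q4_K_M comes out around 4-5 GB, which is why it runs fine on modest GPUs or even CPU-only, instead of needing a 24 GB card. Or you can skip the conversion entirely and grab TheBloke's pre-quantized GGUF files linked above.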
Ooh, thanks. 🤗
Does it really "whip the llama's ass"?
Yeah. Give me some skins and crazy visualizations that react to its inner workings. (Edit: Sorry, I'm late to the party.)
That's a great article on a good website: no paywall, no advertising.
What it says about this model is that it's better than other comparable large language models, thanks to a great group of researchers (formerly of Google and Meta) working on it.
They say it is comparatively small at 7 billion parameters, and open source: free to download, free to use, free to tweak yourself.