submitted 6 months ago by ooli@lemmy.world to c/chatgpt@lemmy.world
[-] cygon@lemmy.world 6 points 6 months ago

Tip: try Oobabooga's Text Generation WebUI with one of the WizardLM Uncensored models from HuggingFace in GGML or GGUF format.

The GGML and GGUF formats perform very well for CPU inference when using llama.cpp as the engine. My 10-year-old 2.8 GHz CPUs generate about 2 words per second. That's slightly below reading speed, but pretty solid. Just make sure to stick to the 7B models if you have 16 GiB of memory and the 13B models if you have 32 GiB.
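
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The function name `largest_model_for_ram` is made up for this example, and treating 16 GiB and 32 GiB as hard cut-offs is an assumption; real memory use also depends on the quantization level and context size.

```python
# Rule of thumb from the comment above: 7B models with 16 GiB of RAM,
# 13B models with 32 GiB. Treating these as hard thresholds is an
# assumption made for this sketch.
RAM_TO_MODEL = [
    (32, "13B"),  # 32 GiB or more -> 13B models
    (16, "7B"),   # 16 GiB or more -> 7B models
]


def largest_model_for_ram(ram_gib):
    """Return the largest suggested model size for the given RAM in GiB.

    Returns None when the machine is below the smallest tier mentioned.
    """
    for min_gib, model_size in RAM_TO_MODEL:
        if ram_gib >= min_gib:
            return model_size
    return None


if __name__ == "__main__":
    for ram in (8, 16, 32):
        print(f"{ram} GiB -> {largest_model_for_ram(ram)}")
```

Anything below 16 GiB returns `None` here simply because the rule of thumb doesn't cover it, not because smaller models can't run.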

[-] ooli@lemmy.world 2 points 6 months ago

Super useful! Thanks! I installed the oobabooga stuff. http://localhost:7860/?__theme=dark opens fine, but then nothing works. How do I train the model with that 8 GB .kbin file I downloaded? There are so many options, and I don't even know what I'm doing.

[-] cygon@lemmy.world 4 points 6 months ago* (last edited 6 months ago)

There's a "models" directory inside the directory where you installed the webui. This is where the model files should go, but they also have supporting files (.yaml or .json) with important metadata about the model.
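
If you want to check what actually ended up in that directory, a small Python sketch like the one below could list it. The function name `scan_models_dir`, the example path, and the set of model file extensions are assumptions for illustration (GGML files are often distributed as `.bin`); adjust them to your install.

```python
from pathlib import Path

# Assumed extensions for this sketch; not an official list.
MODEL_SUFFIXES = {".gguf", ".ggml", ".bin"}
METADATA_SUFFIXES = {".yaml", ".json"}


def scan_models_dir(models_dir):
    """Return (model_files, metadata_files) found under models_dir.

    Paths are relative to models_dir, use forward slashes, and are sorted.
    """
    root = Path(models_dir)
    models, metadata = [], []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        relative = path.relative_to(root).as_posix()
        if suffix in MODEL_SUFFIXES:
            models.append(relative)
        elif suffix in METADATA_SUFFIXES:
            metadata.append(relative)
    return models, metadata


if __name__ == "__main__":
    # Hypothetical install location; replace with your own webui directory.
    models, metadata = scan_models_dir("text-generation-webui/models")
    print("Model files:", models)
    print("Metadata files:", metadata)
```

If the model list comes back empty, the file you downloaded is probably sitting somewhere else.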

The easiest way to install a model is to let the webui download the model itself:

Screenshot of Oobabooga's WebUI with the model tab open and the model names from HuggingFace entered

And after it finishes downloading, just load it into memory by clicking the refresh button, selecting it, choosing llama.cpp and then load (perhaps tick the 'CPU' box, but llama.cpp can do mixed CPU/GPU inference, too, if I remember right).

Screenshot of the model page in Oobabooga's WebUI with the model ready to be loaded

My install is a few months old, so I hope the UI hasn't changed too drastically in the meantime :)

this post was submitted on 13 Mar 2024