LocalLLaMA

A community for discussing LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 1 year ago

[Help] Trying to run a local storytelling model with KoboldCpp (kbin.social)

submitted 1 year ago* (last edited 1 year ago) by darkeox@kbin.social to c/localllama@sh.itjust.works

Hi,

Just like the title says:

I'm trying to run:

https://huggingface.co/TheBloke/WizardLM-Uncensored-SuperCOT-StoryTelling-30B-SuperHOT-8K-GGML

With:

koboldcpp:v1.43 using HIPBLAS on a 7900XTX / Arch Linux

Running:

--stream --unbantokens --threads 8 --usecublas normal

I get very limited output with lots of repetition.

Illustration

I mostly didn't touch the default settings:

Settings

Does anyone know how I can make things run better?

EDIT: Sorry for multiple posts, Fediverse bugged out.
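
For anyone hitting the same repetition problem: one way to experiment is to send requests to the running KoboldCpp server with an explicit repetition penalty instead of relying on the UI defaults. The sketch below is only that, a sketch. It assumes the server is listening on `http://localhost:5001` and accepts the KoboldAI-style `/api/v1/generate` endpoint with `rep_pen` and `rep_pen_range` fields. The helper names `build_payload` and `generate` are made up for illustration, and the numbers are starting points to tweak, not tuned recommendations for this model.

```python
import json
import urllib.request

# Assumption: KoboldCpp is running locally on its usual port.
API_URL = "http://localhost:5001/api/v1/generate"


def build_payload(prompt, rep_pen=1.15, rep_pen_range=1024,
                  temperature=0.7, max_length=200, max_context_length=2048):
    """Build the JSON body for one generation request (hypothetical helper)."""
    if rep_pen < 1.0:
        # A penalty below 1.0 would reward repeated tokens instead of punishing them.
        raise ValueError("rep_pen should be >= 1.0")
    return {
        "prompt": prompt,
        "max_context_length": max_context_length,  # tokens of context sent to the model
        "max_length": max_length,                  # tokens to generate
        "temperature": temperature,
        "rep_pen": rep_pen,                        # strength of the repetition penalty
        "rep_pen_range": rep_pen_range,            # how many recent tokens are penalised
    }


def generate(prompt, url=API_URL, **settings):
    """Send the prompt to the server and return the generated text."""
    body = json.dumps(build_payload(prompt, **settings)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return reply["results"][0]["text"]
```

With the server up, something like `generate("Once upon a time", rep_pen=1.2)` lets you compare outputs at different penalty values. Raising `rep_pen` in small steps tends to be safer than large jumps, since very high values can make the text incoherent.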

[–] darkeox@kbin.social 2 points 1 year ago (1 children)

Ah thank you for the trove of information. What would be the best general knowledge model according to you?

[–] rufus@discuss.tchncs.de 1 points 1 year ago* (last edited 1 year ago) (1 children)

Well, I'm not that up to date anymore. I think MythoMax 13b is pretty solid. Also for knowledge. But I can't be bothered anymore to read up on things twice weekly. That news is probably already 3 weeks old and there will be a (slightly) better one out there now. And it gets outperformed by pretty much every one of the big 70b models. But I can't run them on my hardware, so I wouldn't know.

This benchmark ranks them by several scientific tests. You can hide the 70b models and scarlett-33b seems to be a good contender. Or the older Platypus models directly below. But be cautious, sometimes these models look better on paper than they really are.

Also regarding 'knowledge': I don't know about your application. Just in case you're not aware of this... Language models hallucinate and regularly just make up stuff. Even expensive and big models will do this. The models we play with, even more so. Just be aware of it.

And lastly: There is another good community here on Lemmy: !fosai@lemmy.world You can find a few tutorials and more people there, too. And have a look at the 'About' section or stickied posts there. They linked more benchmarks and info.

[–] darkeox@kbin.social 2 points 1 year ago

Alright, thanks for the info & additional pointers.