this post was submitted on 27 Aug 2024
Technology
I use virtual machines and run local LLMs. LLMs need VRAM rather than CPU RAM. You shouldn't be running them on a laptop without a serious NPU or GPU, if at all. I don't know whether I will be using VMs heavily on this machine, but that would be a good reason to have more RAM. Even so, 32 GiB should be enough for a few VMs running concurrently.
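For a rough sense of the VRAM side, here's a back-of-the-envelope sketch that estimates the memory taken by the model weights alone (parameter count times bits per parameter). The function name and the model sizes are just illustrative assumptions, and it ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Assumption: every parameter is stored at the same precision.
    Ignores KV cache, activations, and framework overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_param / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical model sizes and quantization levels, for illustration only.
    for params, bits in [(7, 16), (7, 4), (70, 4)]:
        gib = weight_memory_gib(params, bits)
        print(f"{params}B parameters at {bits}-bit: ~{gib:.1f} GiB for weights")
```

The takeaway is only that precision matters a lot: the same model at 4-bit needs a quarter of the weight memory it needs at 16-bit.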
Honestly, I think that for many people using a laptop or phone, doing LLM stuff remotely makes way more sense. It's just too power-intensive to do a lot of that on battery. That doesn't mean giving up control of the hardware: I keep a machine with a beefy GPU connected to the network and use it remotely. Something like Stable Diffusion normally needs only pretty limited bandwidth to use remotely.
If people really need to do a bunch of local LLM work, say because they have a hefty source of power but lack connectivity, or because they're running some kind of software that needs to move a lot of data back and forth to the LLM hardware, I might consider lugging around a small headless LLM box with a beefy GPU alongside the laptop, plugging the box into the laptop via Ethernet or whatnot, and doing the LLM stuff on the headless box. Laptops are just not a fantastic form factor for heavy crunching; they've got limited ability to dissipate heat and tight space constraints to work with.
Yeah, it is easier to do it on a desktop or over a network. That's what I was trying to imply, although having an NPU can help. Regardless, I would rather be using my own server than something like ChatGPT.
That's fair. I put it there more as a possible use case than as something you should be doing consistently.
Although an iGPU can perform quite well when given a lot of RAM, afaik.