Elon Musk's AI assistant Grok boasted that the billionaire had the "potential to drink piss better than any human in history," among other absurd claims.
I once tried a community version from huggingface (distilled), which worked quite well even on modest hardware. But that was a while ago. Unfortunately, I haven’t had much time to look into this stuff lately, but I wanted to check that again at some point.
You can run GLM Air on pretty much any gaming desktop with 48GB+ of RAM. Check out ubergarm’s ik_llama.cpp quants on Huggingface; that’s state of the art right now.
Most aren’t really running Deepseek locally. What ollama advertises (and basically lies about) is the now-obselete Qwen 2.5 distillations.
…I mean, some are, but it’s exclusively lunatics with EPYC homelab servers, heh. And they are not using ollama.
Thx for clarifying.
I once tried a community version from huggingface (distilled), which worked quite well even on modest hardware. But that was a while ago. Unfortunately, I haven’t had much time to look into this stuff lately, but I wanted to check that again at some point.
You can run GLM Air on pretty much any gaming desktop with 48GB+ of RAM. Check out ubergarm’s ik_llama.cpp quants on Huggingface; that’s state of the art right now.
Also, I’m a quant cooker myself. Say the word, and I can upload an IK quant more specifically tailored for whatever your hardware/aim is.