I prefer Waterfox; OpenAI can keep its Chat chippy tea browser.

  • brucethemoose@lemmy.world · 3 days ago

    > I have access to GLM 4.6 through a service but that’s the ~350B parameter model and I’m pretty sure that’s not what you’re running at home.

    It is. I’m running this model, with hybrid CPU+GPU inference, specifically: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF

    You can likely run GLM Air on your 3060 desktop if you have 48GB+ RAM, or a smaller MoE easily. Heck, I’ll make a quant just for you if you want.
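
    For reference, here’s a minimal sketch of what that hybrid CPU+GPU split looks like using the llama-cpp-python bindings; the filename and layer count are placeholders to tune against your VRAM, and the IK-specific quants in the repo above may need ik_llama.cpp rather than mainline llama.cpp:

    ```python
    # Minimal sketch, assuming llama-cpp-python; path and counts are placeholders.
    from llama_cpp import Llama

    llm = Llama(
        model_path="glm-air.gguf",  # placeholder: your downloaded GGUF file
        n_gpu_layers=24,            # offload as many layers as fit in VRAM
        n_ctx=8192,                 # context length
        n_threads=8,                # CPU threads for the layers left in RAM
    )

    out = llm("Say hi.", max_tokens=64)
    print(out["choices"][0]["text"])
    ```

    Raising n_gpu_layers until you run out of VRAM is the usual way to find the split; everything not offloaded runs on CPU from system RAM.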

    Depending on the use case, I’d recommend ERNIE 4.5 21B (or 28B for vision) on your MacBook, or a Qwen 30B variant. Look for DWQ MLX quants, specifically: https://huggingface.co/models?sort=modified&search=dwq
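
    On the Mac side, a minimal sketch with the mlx-lm package; the repo id below is a placeholder, so substitute an actual DWQ quant from the search link above:

    ```python
    # Minimal sketch, assuming the mlx-lm package on Apple silicon.
    from mlx_lm import load, generate

    # Placeholder repo id -- pick a real DWQ quant from the HF search above.
    model, tokenizer = load("mlx-community/Qwen3-30B-A3B-4bit-DWQ")
    print(generate(model, tokenizer, prompt="Hello", max_tokens=100))
    ```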

    • MagicShel@lemmy.zip · 3 days ago

      I’m going to upgrade my RAM shortly because I found a bad stick and I’m down to 16GB currently. I’ll see if I can swing that order this weekend.

    • MagicShel@lemmy.zip · 3 days ago

      I’ll have to check. It’s a Pro, not an Air, but I think it’s only 40GB total. I’m really new to Macs, so the memory situation is unclear to me. I requested it at work specifically for its ability to run local AI.