I prefer Waterfox; OpenAI can keep its Chat chippy tea browser.

  • MagicShel@lemmy.zip · 3 days ago

    I’ll look into it. OAI’s 30B model is the most I can run on my MacBook, and it’s decent. I don’t think I can even run that on my desktop with a 3060 GPU. I have access to GLM 4.6 through a service, but that’s the ~350B-parameter model, and I’m pretty sure that’s not what you’re running at home.

    It’s reasonably capable. I want to play around with setting up RAG pipelines for specific domain knowledge, but I’m just getting started.
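
    Something like this minimal sketch is the shape I have in mind (the document text, model path, and helper names here are illustrative assumptions, not a finished pipeline):

    ```python
    # Minimal RAG sketch: embed domain documents, retrieve the closest
    # ones, and stuff them into a local model's prompt.
    # Assumes: pip install sentence-transformers llama-cpp-python numpy
    import numpy as np
    from llama_cpp import Llama
    from sentence_transformers import SentenceTransformer

    docs = [
        "Widget v2 requires firmware 1.4 or later.",
        "The API rate limit is 100 requests per minute.",
    ]  # replace with your actual domain knowledge

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, CPU-friendly
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)

    def retrieve(query: str, k: int = 2) -> list[str]:
        # Cosine similarity reduces to a dot product on normalized vectors.
        q = embedder.encode([query], normalize_embeddings=True)[0]
        best = np.argsort(doc_vecs @ q)[::-1][:k]
        return [docs[i] for i in best]

    llm = Llama(model_path="model.gguf", n_ctx=4096)  # any local GGUF file
    query = "What firmware does Widget v2 need?"
    context = "\n".join(retrieve(query))
    out = llm.create_chat_completion(messages=[
        {"role": "system", "content": f"Answer from this context:\n{context}"},
        {"role": "user", "content": query},
    ])
    print(out["choices"][0]["message"]["content"])
    ```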

    • brucethemoose@lemmy.world · 3 days ago (edited)

      > I have access to GLM 4.6 through a service, but that’s the ~350B-parameter model, and I’m pretty sure that’s not what you’re running at home.

      It is. I’m running this model, with hybrid CPU+GPU inference, specifically: https://huggingface.co/Downtown-Case/GLM-4.6-128GB-RAM-IK-GGUF
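
      (My actual setup uses ik_llama.cpp, which is what that IK-GGUF quant targets, but the same hybrid idea in llama-cpp-python looks roughly like this; the filename and layer count are illustrative assumptions to tune against your own hardware.)

      ```python
      # Hybrid CPU+GPU inference sketch with llama-cpp-python
      # (pip install llama-cpp-python, built with CUDA or Metal support).
      # n_gpu_layers picks how many transformer layers live in VRAM;
      # the rest run on the CPU out of system RAM.
      from llama_cpp import Llama

      llm = Llama(
          model_path="GLM-4.6-quant.gguf",  # illustrative filename
          n_gpu_layers=20,  # tune to fill your VRAM; -1 offloads everything
          n_ctx=8192,
          n_threads=12,     # CPU threads for the layers left on the host
      )
      out = llm.create_chat_completion(
          messages=[{"role": "user", "content": "Explain hybrid inference."}],
          max_tokens=256,
      )
      print(out["choices"][0]["message"]["content"])
      ```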

      You can likely run GLM Air on your 3060 desktop if you have 48GB+ RAM, or a smaller MoE easily. Heck, I’ll make a quant just for you if you want.
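
      (Back-of-envelope for why that fits, with ballpark assumptions rather than measurements: GLM-4.5-Air is roughly 106B total parameters, so at ~4 bits per weight it lands around 53GB, which splits across 48GB of RAM plus the 3060’s 12GB of VRAM with room left for KV cache.)

      ```python
      # Rough quant sizing: file size ~= parameter_count * bits_per_weight / 8.
      def quant_size_gb(params_billions: float, bits: float) -> float:
          return params_billions * bits / 8  # billions of bytes ~= GB

      # ~106B-parameter MoE at ~4 bits per weight:
      print(f"{quant_size_gb(106, 4):.0f} GB")  # -> 53 GB
      ```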

      Depending on the use case, I’d recommend ERNIE 4.5 21B (or 28B for vision) on your MacBook, or a Qwen 30B variant. Look for DWQ MLX quants, specifically: https://huggingface.co/models?sort=modified&search=dwq
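
      (Loading one of those on the MacBook is about three lines with mlx-lm; the repo id below is an illustrative placeholder, so substitute whichever DWQ quant you pick from that search.)

      ```python
      # Running a DWQ MLX quant on Apple silicon
      # (pip install mlx-lm; Apple-silicon macOS only).
      from mlx_lm import load, generate

      # Illustrative repo id -- swap in the DWQ quant you actually choose.
      model, tokenizer = load("mlx-community/SomeModel-4bit-DWQ")
      print(generate(model, tokenizer, prompt="Hello", max_tokens=100))
      ```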

      • MagicShel@lemmy.zip · 3 days ago

        I’m going to upgrade my RAM shortly because I found a bad stick, and I’m down to 16GB currently. I’ll see if I can swing that order this weekend.

      • MagicShel@lemmy.zip · 3 days ago

        I’ll have to check. It’s a Pro, not an Air, but I think it’s only 40GB total. I’m really new to Macs, so the memory situation is unclear to me. I requested it at work specifically for its ability to run local AI.
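
        (One quick way to check, assuming Python with psutil installed; on Apple silicon the total it reports is the unified memory pool shared by the CPU and GPU.)

        ```python
        # Check total unified memory on a Mac (pip install psutil).
        import psutil

        total_gib = psutil.virtual_memory().total / 2**30
        print(f"Unified memory: {total_gib:.0f} GiB")
        ```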