TL;DR:
My iPhone 16 Pro Max produces garbage output when running MLX LLMs. An iPhone 15 Pro and a MacBook Pro both run the same code perfectly. The tensor outputs on the 16 show numerical values an order of magnitude wrong. I suspect it points to a hardware issue.
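For reference, the kind of numerical sanity check being described might look like the sketch below. This is a minimal illustration in Python with mlx.core, used here as a stand-in for the author's on-device MLX Swift code (which isn't shown); the shapes, dtype, and seed are assumptions, not the actual repro. The idea is to run the same deterministic computation on each device and compare the magnitude of the results.

```python
import mlx.core as mx

mx.random.seed(0)  # identical pseudo-random inputs on every machine

# Hypothetical shapes/dtype chosen for illustration only
a = mx.random.normal(shape=(512, 512), dtype=mx.float16)
b = mx.random.normal(shape=(512, 512), dtype=mx.float16)

out = mx.matmul(a, b)
mx.eval(out)  # force the computation to actually run on the device

# On a healthy device these statistics agree across machines to within
# float16 noise; the post describes the 16 Pro Max coming back ~10x off.
print("max |out|  =", mx.max(mx.abs(out)).item())
print("mean |out| =", mx.mean(mx.abs(out)).item())
```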
“The tensor outputs on the 16 show numerical values an order of magnitude wrong.”

Quite obviously not the optimal use case.
That’s the hardware issue he was talking about; it has nothing to do with how effectively the LLM is being used. It sounded like mostly a project he was doing for fun rather than out of necessity.
Grok says it’s right so it must be 🤤