TL;DR:
My iPhone 16 Pro Max produces garbage output when running MLX LLMs. An iPhone 15 Pro runs the same code perfectly. A MacBook Pro also runs the same code perfectly. The tensor outputs on the 16 show numerical values an order of magnitude wrong. I suspect it points
He also got it working on a Mac, an iPhone 15 and an iPhone 17. Only his iPhone 16 got the internal LLM state wrong. It’d be interesting to know how a failure like that happens. Presumably most iPhone 16s have a working NPU. Apple would surely want to get to the bottom of this, but I doubt they would be open about their findings. Maybe they do know, but the solution is ‘buy new iPhone’.