TL;DR:
My iPhone 16 Pro Max produces garbage output when running MLX LLMs. An iPhone 15 Pro runs the same code perfectly. A MacBook Pro also runs the same code perfectly. The tensor outputs on the 16 show numerical values that are off by an order of magnitude. I suspect it points to a hardware fault in this particular device.
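One way to pin down a claim like "off by an order of magnitude" is to dump the same layer's activations on each device and compare them against a known-good reference. This is a minimal sketch of that comparison using NumPy; `max_rel_error` and the tensor dumps are hypothetical illustrations, not part of the OP's code or the MLX API:

```python
import numpy as np

def max_rel_error(ref, out, eps=1e-8):
    """Largest element-wise relative error between a reference
    tensor and a device's output for the same computation."""
    ref = np.asarray(ref, dtype=np.float64)
    out = np.asarray(out, dtype=np.float64)
    return float(np.max(np.abs(out - ref) / (np.abs(ref) + eps)))

# Simulated activation dumps from two devices (illustrative only).
rng = np.random.default_rng(0)
ref = rng.standard_normal((4, 8))          # trusted Mac reference
good = ref * (1 + 1e-4 * rng.standard_normal((4, 8)))  # fp16-level noise
bad = ref * 10.0                           # an order of magnitude off

print(max_rel_error(ref, good))  # tiny: expected precision noise
print(max_rel_error(ref, bad))   # huge: clearly a broken device
```

Healthy cross-device differences from reduced precision are typically well under 1%, so a relative error near 9.0 (as `bad` produces here) is unambiguous evidence of a faulty compute path rather than rounding.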
I think you missed the point of his post. His issue is that the numeric operations the phone executes to run the LLM are producing garbage. Arguably this could break all kinds of neural networks, such as voice transcription. He's not complaining that LLMs themselves are unable to perform math properly.
He also had it work on a Mac, an iPhone 15, and an iPhone 17. Only his iPhone 16 got the internal LLM state wrong. It'd be interesting to know how a failure like that happens. Presumably most iPhone 16s have a working NPU. Apple would surely want to get to the bottom of this, but I doubt they'd be open about their findings. Maybe they do know, and the solution is 'buy a new iPhone'.