when it outputs results, it’s not clear if it engaged the math engine or just guessed
That depends on the harness though. In the plain model output it will be clear if a tool call happened, and it depends on the application UI around it whether that’s directly shown to the user, or if you only see the LLM’s final response based on it.
That depends on the harness though. In the plain model output it will be clear if a tool call happened, and it depends on the application UI around it whether that’s directly shown to the user, or if you only see the LLM’s final response based on it.