Everyone’s getting their knickers in a twist over nothing here.
Of course an AI can track time, if it’s given access to a timer MCP server.
Can we track time without tools, just in our heads? Certainly not very accurately. We can, however, track it reasonably accurately if given access to a quartz stopwatch (typically +/-15 s/month).
A language model is based around language and reasoning by words/symbols. It’s not a surprise it doesn’t have timing capability.
What Altman SHOULD be embarrassed about is that the model lies about its capabilities. That implies that the context is still not right - it should be adequately trained and given context to prevent the lying. That is a much more worrying issue - and something that Anthropic handles far better, IMHO (when asked if it can track time, it says “no, not on my own”, and then proceeds to build a JavaScript timer that it offers up to track time).
I don’t use them but I follow the news about them loosely. The reason for the difference is epistemic humility. Claude has a pretty good idea of what its capabilities are and where the ceiling is. ChatGPT has no clue what its limits are, so it believes it can do everything. Basically ChatGPT has a lot of info and no idea where the gaps live, while Claude has a fair idea when to search or use some external function to handle something. Gemini has less epistemic humility than Claude but more than ChatGPT. Grok has little to none, but it did manage to accurately portray Musk as a world champion piss drinker, something none of the others were able to do.
I say that, but it’s been a few months since I looked. That could have changed because shit moves fast. By the looks of what it’s trying to do with the timer, ChatGPT has less epistemic humility than it used to, possibly because of the way the model is trained to be helpful and confident.
It could simply save a timestamp of the “begin timer” message and compare it to the timestamp of the “end” message. It’s not that complicated, and writing a script and executing it is overkill… It just needs access to a calculator skill.
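To make the timestamp idea concrete, here is a minimal sketch in Python. It assumes the chat platform attaches an ISO 8601 timestamp to each message and exposes it to the model, which may not be true of any given product; the function name and the sample timestamps are made up for illustration.

```python
from datetime import datetime


def elapsed_seconds(start_iso: str, end_iso: str) -> float:
    """Return the seconds between two ISO 8601 message timestamps."""
    # Hypothetical inputs: the timestamps of the "begin timer" and
    # "end" messages, as supplied by the chat platform (an assumption).
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    return (end - start).total_seconds()


# Illustrative values only, not taken from a real conversation.
begin_ts = "2025-01-01T12:00:00+00:00"
end_ts = "2025-01-01T12:03:30+00:00"
print(elapsed_seconds(begin_ts, end_ts))  # 210.0
```

The whole job is one subtraction, which is the point: no running timer process is needed, only the two timestamps.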
Yes, it handles it better, but it’s still a dumb approach and a waste of energy.