A screenshot of this question was making the rounds last week, but this article covers testing it against all the well-known models out there.

Also includes outtakes on the ‘reasoning’ models.

  • ToTheGraveMyLove@sh.itjust.works
    6 upvotes, 1 downvote · 18 hours ago

    Missing the point. Any person would know walking to the car wash isn’t reasonable. You shouldn’t have to craft a perfectly tailored prompt for AI to realize that. If you think this is a gotcha, then whoa boy, I’ve got a bridge to sell ya!

    • NewNewAugustEast@lemmy.zip
      2 upvotes, 8 downvotes · 17 hours ago

      You are missing the point. Any reasonable person would wonder why you are asking a stupid question.

      Which is why, when asked, the AI said of course the car is there; you must be asking either a trick question or for some other reason.

      • rebelsimile@sh.itjust.works
        8 upvotes · 15 hours ago

        It could be that. Or it could be that the AI gives the illusion of reasoning and this is an example of the illusion breaking. But no, it was probably that it knew it was a trick question and decided to answer wrongly because it is very, very smart. Yeah.

        • NewNewAugustEast@lemmy.zip
          2 upvotes, 4 downvotes · 15 hours ago

          What is the wrong answer here? You asked how to get to the car wash. Where the hell do you think the car would be? It isn’t getting washed if it isn’t there.

          I know AI is not really AI. I know how LLMs work; hell, I know how to train them.

          But this kind of question makes no sense, so you get back an answer that follows the weights and answers as if there was some sense to it.

          I repeat, for those in the back: when would you ever ask this question? The answer is never.

          It’s a dumb, stupid question. There are probably thousands of other questions that demonstrate “wrong answers”; this isn’t one of them.