A screenshot of this question was making the rounds last week, but this article covers testing it against all the well-known models out there.

It also includes outtakes on the ‘reasoning’ models.

  • TankovayaDiviziya@lemmy.world · 3 hours ago

    We poked fun at this meme, but it goes to show that an LLM is still like a child that needs to be taught to make implicit assumptions and possess contextual knowledge. Like a child, the current generation of LLMs needs a lot more input and instruction to do specifically what you want it to do.

    • Rob T Firefly@lemmy.world · 2 hours ago

      LLMs are not children. Children can have experiences, learn things, know things, and grow. Spicy autocomplete will never actually do any of these things.

    • kshade@lemmy.world · 2 hours ago

      We have already thrown just about all of the Internet and then some at them. It shows that LLMs cannot think or reason. Which isn’t surprising; they weren’t meant to.

      • eronth@lemmy.world · 2 hours ago

        Or at least they can’t reason the way we do about our physical world.

        • Nalivai@lemmy.world · 6 minutes ago

          You’re falling into the same trap. When the letters on the screen tell you something, it’s not necessarily the truth. When “I’m reasoning” appears in a chatbot window, it doesn’t mean there is actually something doing the reasoning.

        • zalgotext@sh.itjust.works · 1 hour ago

          No, they cannot reason, by any definition of the word. LLMs are statistics-based autocomplete tools. They don’t understand what they generate; they’re just really good at guessing how words should be strung together based on complicated statistics.
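
          A minimal sketch of what that kind of statistical autocomplete means, using a toy bigram model (the corpus and names here are invented purely for illustration; a real LLM uses a neural network over tokens rather than a count table, but the basic operation is still "pick a likely next token"):

          ```python
          import random
          from collections import Counter, defaultdict

          # Toy corpus, invented for this illustration.
          corpus = "the cat sat on the mat and the cat ate the fish".split()

          # Count which word follows which: the "complicated statistics",
          # here reduced to simple bigram counts.
          bigram_counts = defaultdict(Counter)
          for prev, nxt in zip(corpus, corpus[1:]):
              bigram_counts[prev][nxt] += 1

          def next_word(prev):
              """Guess the next word in proportion to how often it followed `prev`."""
              counts = bigram_counts.get(prev)
              if not counts:
                  return None  # dead end: `prev` was never seen with a successor
              words = list(counts)
              weights = list(counts.values())
              return random.choices(words, weights=weights)[0]

          # "Autocomplete" a sentence: no understanding involved, just sampling.
          word, sentence = "the", ["the"]
          for _ in range(6):
              word = next_word(word)
              if word is None:
                  break
              sentence.append(word)
          print(" ".join(sentence))
          ```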

      • Nalivai@lemmy.world · 2 minutes ago

        By now it’s becoming clear that this is fundamentally the best version of the thing we’re going to get. This is its prime time.
        For a while there was a legitimate question of “if we give it enough data, will there be a qualitative jump?”, and as far as we can see right now, we’re already past that jump. A predictive algorithm can form grammatically correct sentences that are related to the context. That’s it; that’s the jump.
        Now a bunch of salespeople are trying to convince us that if there was one jump, there will necessarily be others, while there is no real indication of that.