• Hildegarde@lemmy.blahaj.zone · 27 minutes ago

    LLMs are predictive models. They scraped as much text as possible to create a model that predicts the next word accurately. To generate text, the LLM assembles a sequence of likely next words.

    That exact same sort of model can be turned around and asked: how closely did the actual next word match the predicted one? That’s a good test while training the LLM, since a better model will make more accurate predictions.

    AI checkers are usually running that test: does the real text match what the AI predicted? It sounds like a test of the text, but it really isn’t. In this case, yes: of course an AI trained on Mary Shelley’s Frankenstein can accurately predict the next word of Mary Shelley’s Frankenstein. It effectively has the whole book memorized, if it were accurate to anthropomorphize computer code.

    So the “checker” calls it AI generated. These checkers don’t work.
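
    As a rough illustration, here’s a minimal sketch of that perplexity-style test, using the off-the-shelf GPT-2 model from Hugging Face transformers; the model choice and the 30.0 threshold are made-up stand-ins, not what any real checker uses:

    ```python
    # Sketch of a perplexity-based "AI detector": score how predictable the
    # text is under a language model; low perplexity reads as "AI-like".
    # GPT-2 and the 30.0 threshold are arbitrary illustrative choices.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            # Passing labels=ids makes the model return the mean
            # next-token cross-entropy over the sequence.
            loss = model(ids, labels=ids).loss
        return torch.exp(loss).item()

    sample = "I beheld the wretch, the miserable monster whom I had created."
    ppl = perplexity(sample)
    print(f"perplexity={ppl:.1f}:", "flagged as AI" if ppl < 30.0 else "reads as human")
    ```

    Text the model has memorized, like Frankenstein, scores very low perplexity under such a test, which is exactly why it gets flagged.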

    • Kairos@lemmy.today · 4 minutes ago

      Actually, they’re not doing that check, since they don’t have access to the models; they’re running their own statistical transformer that asks, “how closely does this match our database?”
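
      A hedged sketch of what that kind of database-similarity check could look like, using sentence-transformers for embeddings; the model name, corpus, and 0.8 threshold are all invented for illustration, since vendors don’t publish what they actually match against:

      ```python
      # Sketch of a "how closely does this match our database" check:
      # embed the input and compare it to a stored corpus of known AI text.
      # The corpus and the 0.8 threshold are illustrative assumptions.
      from sentence_transformers import SentenceTransformer, util

      model = SentenceTransformer("all-MiniLM-L6-v2")
      known_ai_corpus = [
          "As an AI language model, I can certainly help with that.",
          "In conclusion, it is important to note that several factors apply.",
      ]
      corpus_emb = model.encode(known_ai_corpus, convert_to_tensor=True)

      def matches_database(text: str, threshold: float = 0.8) -> bool:
          emb = model.encode(text, convert_to_tensor=True)
          best = util.cos_sim(emb, corpus_emb).max().item()  # nearest neighbour
          return best >= threshold
      ```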

  • Chloé 🥕@lemmy.blahaj.zone · 8 hours ago

    tools like these are used to reject CVs and grade school papers btw

    no matter how much ai is trash, do NOT use ai checkers, they do not work

    • Buddahriffic@lemmy.world · 2 hours ago

      Yeah, LLM-based checkers will still have LLM-based problems, most notably being incapable of true analysis, which is the whole point of an AI checker. It’s just the same text predictor shit.

      Oh and also there’s an arms race where generative AI has the advantage because eventually it will be capable of generating things entirely indistinguishable from what a human would make (though it will still be susceptible to the hallucinations and errors it’s already famous for).

    • LillyPip@lemmy.ca · 5 hours ago

      Yep, they’re all trash and should not be relied upon.

      I got anywhere from 35% to 70% AI generated results on a book I wrote in 2019, before AI was even released.

      eta: it’s not about plagiarism, either. I also ran my novel through plagiarism checkers, since it’s easy to accidentally write passages similar to existing work. 0% on those, but high numbers in the AI checkers.

      • Echo Dot@feddit.uk · 1 hour ago

        I had to write a short story for English literature class in 2006 and I still have the file. Apparently over half of that is AI generated, which is pretty impressive on my part I must say.

      • atopi@piefed.blahaj.zone · 3 hours ago

        before AI was even released
        GPT-1 was released in 2018 (though I don’t think you need an AI checker to verify whether something was made by it).

    • thevoidzero@lemmy.world · 5 hours ago

      I witnessed an interaction where a grad school professor used an AI detector and threatened to fail a student for submitting an “AI generated” paper. It was so stupid. Even after being shown that adding a few spelling mistakes makes the detector say “human written”, and even after running their own email through the detector as an example, they didn’t budge. It’s like the saying: “a little knowledge is a dangerous thing.”

      • Echo Dot@feddit.uk · 1 hour ago

        When I was at university I was pretty belligerent, and if a professor had tried that on me I’d have reported them for academic misconduct. They should be grading the damn papers themselves; if they’re not going to do that, what is the point of them?

    • Viking_Hippie@lemmy.dbzer0.com (OP) · 8 hours ago

      ESPECIALLY don’t use the “ai text humanizer” function of one that’s absolutely certain that RL authors were AI 🤦🏻

      • brucethemoose@lemmy.world · 4 hours ago

        I don’t buy it. Not until I can test it, hands on.

        So many LLM papers have amazing (and replicated) results in testing, yet fall apart in the real world outside of the same lab tests everyone uses. Research is overfit to hell.

        And that’s giving them the benefit of the doubt, assuming they didn’t train on the test set in one form or another. Like how Llama 4 technically aced LM Arena because they finetuned it to.

        • qqq@lemmy.world · 2 hours ago

          It looks like Pangram specifically holds back 4 million documents during training and has a corpus of “out of domain” documents that they test against that didn’t even have the same style as the testing data.

          I’m surprised at how well it does; I really wonder what the model is picking out. I wonder if it’s somehow the same “uncanny valley” signal that we get from AI generated text sometimes.

          To show that our model is able to generalize outside of its training domain, we hold out all email from our training set and evaluate our model on the entire Enron email dataset, which was released publicly as a dataset for researchers following the extrication of the emails of all Enron executives in the legal proceedings in the wake of the company’s collapse.

          Our model with email held out achieves a false positive rate of 0.8% on the Enron email dataset after hard negative mining, compared to our competitors (who may or may not have email in their training sets) which demonstrate a FPR of at least 2%. After generating AI examples based on the Enron emails, we find that our false negative rate is around 2%. We show an overall accuracy of 98% compared to GPTZero and Originality which perform at 89% and 91% respectively.

          and

          We exclude 4 million examples from our training pool as a holdout set to evaluate false positive rates following calibration on the above benchmark.
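
          For reference, the quoted figures fall out of an ordinary confusion-matrix calculation; the counts below are invented purely to show the definitions, not Pangram’s actual numbers:

          ```python
          # Made-up counts chosen so the rates land near the quoted figures.
          tp, fp, tn, fn = 980, 8, 992, 20  # AI caught / human flagged / human passed / AI missed

          fpr = fp / (fp + tn)                        # human text wrongly flagged as AI
          fnr = fn / (fn + tp)                        # AI text missed
          accuracy = (tp + tn) / (tp + fp + tn + fn)
          print(f"FPR={fpr:.1%}  FNR={fnr:.1%}  accuracy={accuracy:.1%}")
          # -> FPR=0.8%  FNR=2.0%  accuracy=98.6%
          ```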

      • qqq@lemmy.world · 2 hours ago

        Wow, thanks for sharing this. I always thought these things were just complete BS, but it seems like some actually do work.

      • errer@lemmy.world · 5 hours ago

        Looked at the preprint. False positive rate of 0.2%, that’s crazy. I kinda find it hard to believe? It doesn’t seem possible to me.

        • criss_cross@lemmy.world · 4 hours ago

          That’s still 2 out of 1,000, which, if you’re using this at scale, is not a great rate.

          I’d also be curious how that’s calculated: with their test data that they’ve iterated on heavily, or with actual field feedback (which may never get back to them)?
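
          Back-of-the-envelope, with an assumed submission volume (the 100,000 figure is made up for illustration):

          ```python
          # Even a 0.2% false positive rate piles up at scale.
          fpr = 0.002
          essays_per_year = 100_000  # hypothetical institution-wide submissions
          print(f"expected wrongly flagged essays: {fpr * essays_per_year:.0f}")  # -> 200
          ```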

  • PityPityBangBang@lemmy.world · 6 hours ago

    Perhaps so many people have quoted that chapter in college and high school papers, book and film reviews, and cultural criticism that there’s a weird “shoot the moon” situation where a work of origin begins to look like a work of derivation to LLMs.

    • Echo Dot@feddit.uk · 1 hour ago

      The problem is that it’s not a plagiarism detector (it would also be a pretty bad one, since it can’t detect quotes); it’s an AI detector.

      It’s saying that a direct quote is AI, which obviously isn’t true: it’s a quote, which is a different thing.

      If 10% of my thesis is quoting other works, that’s not the same as my thesis being 10% AI generated. The distinction needs to be made.

    • Hacksaw@lemmy.ca · 4 hours ago

      Yeah, or perhaps there’s no need to make up excuses for the copyright-infringing, world-burning, infinite lying machine lying about what text is real vs. generated by it. LLMs lie; LLM-based LLM detectors lie about lies.

    • tempest@lemmy.ca · 5 hours ago

      Frankenstein is out of copyright.

      I would be unsurprised if you could tease out the entire book. I wonder if Mary Shelley was a fan of dashes.

      • 8oow3291d@feddit.dk · 4 hours ago

        Being out of copyright is kinda irrelevant: there are lawsuits right now because the AI firms apparently fed the AIs tons of copyrighted books.

        • tempest@lemmy.ca · 4 hours ago

          It is and it isn’t. Those lawsuits mean they at least try to stop it from producing copyrighted work. They won’t make Simpsons characters or produce anything from the house of mouse without major cajoling or some trickery in the prompt.

          For the text from Frankenstein they are not even going to try.

          Incidentally, after writing this comment I tried to get ChatGPT to reproduce the first paragraph of chapter 3. It refused and offered a summary. I “reminded” it that the book is in the public domain, and then it reproduced it without issue.

          • OwOarchist@pawb.social · 3 hours ago

            I tried to get ChatGPT to reproduce the first paragraph of chapter 3. It refused and offered a summary. I “reminded” it that the book is in the public domain, and then it reproduced it without issue.

            I bet you could do exactly the same thing for a book that’s still copyrighted.

            • tempest@lemmy.ca · 2 hours ago

              I did see posts of someone doing it with Harry Potter, but I think it took a little more effort.

          • 8oow3291d@feddit.dk · 4 hours ago

            They still obviously trained it on the copyrighted text, which I think is what some claim is illegal without payment?

            Mind you, I don’t think copyright should cover that, for text at least. It is not in society’s interest.

  • Björn@swg-empire.de · 8 hours ago

    Her defense was that it wasn’t an “artificial” intelligence: “It’s alive. It’s alive!”

  • LordAmplifier@pawb.social · 8 hours ago

    So the AI thinks this human-made text is actually AI-made and offers an AI tool that’ll turn this human-made text into an AI-made text that’ll appear more human than the human-made text? I wonder how it’d rewrite this paragraph.

    Sometimes it feels like the formal texts I write (like anything I write in the context of a job application) sound a bit like AI, but I just try to imitate the dumb way HR people write their job postings.

      • lauha@lemmy.world · 6 hours ago

        Does it say “100% plagiarised”? No. It says the text is 100% AI generated, which is clearly false.

      • Windex007@lemmy.world · 6 hours ago

        It isn’t saying “I recognize this text”, it is saying “this text is AI generated”.

        And then it’s offering a service to rewrite it, with ai, so that it can’t be recognized as ai.

        It’s doing SOMETHING, for sure. I just don’t think what it’s doing produces accurate results for what it claims to measure.

        • 13igTyme@piefed.social · 3 hours ago

          That’s how they get you. You’ll pay money to get AI to make it appear human. Then another AI will detect the AI writing and offer to change it for a fee. They are all in on it. This keeps going until society collapses… Or people stop using fake AI detectors.

      • skisnow@lemmy.ca · 6 hours ago

        1000%, and I’m disappointed at how few people are pointing this out. The purpose of the AI detector is to measure if someone has submitted something original, and I’m confident that OOP is not Mary Shelley.

        • Nawor3565@lemmy.blahaj.zone · 6 hours ago

          I mean… No? We had plagiarism checkers long before these AI checkers, and the latter very specifically advertises that it’s meant to detect if an LLM generated the text, not whether it’s original text. Completely different tools with different purposes.

  • BennyInc@feddit.org · 11 hours ago

    Still no statement from Mary. Sounds like she is guilty and doesn’t know how to respond.