• ToadOfHypnosis@lemm.ee · 3 days ago

    So AI already demands that power, water for cooling, and other natural resources be ramped up and consumed. Now this creates a second wasteful AI to do the same, in an endless loop, so that the first AI just keeps spinning its wheels and wasting resources until it's discovered. The idea makes sense from a pure “stop unauthorized crawling” perspective, but damn, we just have no solutions that don’t accelerate climate impact. This planet is just going to turn into an oven to cook us.

    • floofloof@lemmy.ca (OP) · edited · 2 days ago

      “No real human would go four links deep into a maze of AI-generated nonsense,” Cloudflare explains. “Any visitor that does is very likely to be a bot, so this gives us a brand-new tool to identify and fingerprint bad bots.”

      It sounds like there may be a plan to block known bots once they have used this tool to identify them. Over time this would reduce the amount of AI slop they need to generate for the AI trap, since bots already fingerprinted would not be served it. Since AI generators are expensive to run, it would be in Cloudflare’s interests to do this. So while your concern is well placed, in this particular case there may be a surge of energy and water usage at first that tails off once more bots are fingerprinted.
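The depth heuristic quoted above can be sketched roughly like this. Cloudflare hasn't published its implementation, so every name here (`MazeTracker`, `DEPTH_THRESHOLD`, the per-client counter) is hypothetical; the only grounded detail is the "four links deep" threshold from the quote:

```python
from collections import defaultdict

# Hypothetical sketch of the depth heuristic: maze pages are reachable
# only via generated links, so any client that keeps following them is
# very likely a bot.
DEPTH_THRESHOLD = 4  # "four links deep", per the quote

class MazeTracker:
    def __init__(self, threshold: int = DEPTH_THRESHOLD):
        self.threshold = threshold
        self.depth = defaultdict(int)  # client id -> maze pages fetched

    def record_maze_hit(self, client_id: str) -> bool:
        """Record one maze-page fetch; return True once the client is flagged."""
        self.depth[client_id] += 1
        return self.depth[client_id] >= self.threshold

tracker = MazeTracker()
for _ in range(3):
    assert not tracker.record_maze_hit("crawler-123")  # three hops: not yet flagged
assert tracker.record_maze_hit("crawler-123")          # fourth hop: fingerprinted
```

Once a client is flagged this way, the fingerprint could be used to serve it cheap static decoys (or block it) instead of freshly generated content, which is the cost tail-off described above.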

      • rottingleaf@lemmy.world · 2 days ago

        “No real human would go four links deep into a maze of AI-generated nonsense,”

        Me looking for porn, red-eyed, swearing at the screen.

        • Singletona082@lemmy.world · 2 days ago

          …real.

          ‘Four links deep’

          HEY NOW! Sometimes stuff just gets interesting!

          ‘Into a maze of AI-Generated Nonsense.’

          And sometimes that interesting is porn related!

      • turmacar@lemmy.world · edited · 2 days ago

        The problem is that they’re now attempting anti-fingerprinting tactics. A lot of the AI crawlers used to identify themselves as Amazon/OpenAI/etc., but they stopped because they were being blocked. Now they come from random IPs with random or obfuscated user-agent strings.

        This is a legal problem not a technological one.
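The old self-identification made blocking trivial, which is exactly why it stopped working. A minimal sketch of why user-agent filtering fails once crawlers obfuscate (the bot tokens below are real published crawler names, but the matching logic is purely illustrative):

```python
# Illustrative only: substring matching on self-declared User-Agent
# strings, which worked while crawlers identified themselves.
KNOWN_BOT_TOKENS = ("GPTBot", "CCBot", "Amazonbot", "ClaudeBot")

def is_declared_bot(user_agent: str) -> bool:
    ua = user_agent.lower()
    return any(token.lower() in ua for token in KNOWN_BOT_TOKENS)

# A cooperative crawler announces itself and is caught:
assert is_declared_bot("Mozilla/5.0 (compatible; GPTBot/1.1)")

# An obfuscated crawler presents a stock browser string and sails
# through -- which is why behavioral traps (like the maze) exist at all:
assert not is_declared_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0")
```

Behavioral fingerprinting sidesteps this because it keys on what the client does, not what it claims to be; the legal question of whether obfuscated crawling is permitted is, as noted, a separate fight.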

    • piecat@lemmy.world · 2 days ago

      It’s definitely an arms race. Another possible outcome is that crawling gets too expensive to be cost-effective and slows down that way.

    • rottingleaf@lemmy.world · edited · 2 days ago

      There are solutions. I’ve just skimmed a paper on attacks on Kademlia, and the recommendations there would apply here too. The problems look different on the surface, but both stem from the network having no admission control.

      All this tomfoolery about “oh horror, how do we solve this” exists because bot farms, recommendation systems, and ad networks have proven very convenient and profitable, and nobody wants to scratch that ecosystem in favor of friend-to-friend (f2f) services. So they want to remove one side of the coin but keep the other.

  • RejZoR@lemmy.ml · 3 days ago

    This is AI poisoning. Blocking a crawler just stops it from learning; feeding it bullshit poisons its knowledge and makes it hallucinate.

    I also wonder how AI crawlers can tell what wasn’t already generated by AI. They risk what I call “inbreeding” knowledge, training on the AI hallucinations of the past.

    When the whole AI craze began, basically everything online was human-made. Not anymore. It’ll only get worse if you ask me.

    • CheeseNoodle@lemmy.world · 3 days ago

      The scary part is that even humans don’t really have a proper escape mechanism for this kind of misinformation. Sure, we can spot AI a lot of the time, but there are also situations where we can’t. That leaves us trusting only the people we already knew before AI, and growing more and more distrustful of information in general.

      • theangryseal@lemmy.world · 2 days ago

        Holy shit, this.

        I’m constantly worried that what I’m seeing/hearing is fake. It’s going to get harder and harder to find older information on the internet too.

        Shit, it’s crept outside of the internet actually. Family buys my kids books for Christmas and birthdays and I’m checking to make sure they aren’t AI garbage before I ever let them look at it because someone bought them an AI book already without realizing it.

        I don’t really understand what we hope to get from all of this. I mean, not really. Maybe it gets to a point where it can truly be trusted, but I just don’t see how.

        • Flagstaff@programming.dev · 2 days ago

          I don’t really understand what we hope to get from all of this.

          Well, even among the most moral devs, the garbage output wasn’t intended, and no one could have predicted the pace at which it’s been developing. So all this is driving a real need for in-person communities and regular contact—which is at least one great result, I think.

    • JustARegularNerd@lemmy.dbzer0.com · 3 days ago

      Kind of. They’re actually trying to avoid this according to the article:

      “The company says the content served to bots is deliberately irrelevant to the website being crawled, but it is carefully sourced or generated using real scientific facts—such as neutral information about biology, physics, or mathematics—to avoid spreading misinformation (whether this approach effectively prevents misinformation, however, remains unproven).”

      • Muad'dib@sopuli.xyz · 2 days ago

        That sucks! What’s the point of putting an AI in a maze if you’re not going to poison it?

    • floofloof@lemmy.ca (OP) · 2 days ago

      Some of these LLMs introduce very subtle statistical patterns (watermarks) into their output so it can be recognized as machine-generated. So it is possible in principle (though I’m not sure how computationally feasible it is at crawl scale) to avoid ingesting whatever carries these patterns. But there will also be plenty of AI content that is not deliberately marked this way, which would be harder to filter out.
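As a toy illustration of the kind of statistical pattern meant here, one published family of schemes pseudorandomly splits the vocabulary into a “green” and “red” list at each step (keyed on the previous token) and biases generation toward green tokens; a detector then checks whether the green fraction is statistically too high. Everything below is a simplified sketch of that idea, not any vendor’s actual scheme:

```python
import hashlib
import math

# Toy "green list" watermark detector. GAMMA is the fraction of the
# vocabulary that counts as green at each step (an assumption here).
GAMMA = 0.5

def is_green(prev_token: str, token: str) -> bool:
    # Hash (previous token, token); the token is "green" if the hash
    # falls in the lower GAMMA fraction of the hash space.
    h = hashlib.sha256((prev_token + "|" + token).encode()).digest()
    return h[0] < 256 * GAMMA

def green_z_score(tokens: list[str]) -> float:
    """z-score of the observed green-token count against the GAMMA baseline.

    Unwatermarked text should score near 0 on average; text generated
    with a green-list bias scores high, which is what a crawler-side
    filter could threshold on.
    """
    n = len(tokens) - 1
    greens = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return (greens - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

sample = ["the", "cat", "sat", "on", "the", "mat"]
score = green_z_score(sample)  # near 0 on average for ordinary text
```

The catch mentioned above shows up directly: this only detects output from models that actually embed such a bias, and it requires knowing (or sharing) the hashing key, which is why unmarked AI content slips through.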

    • Flic@mstdn.social · 3 days ago

      @RejZoR @floofloof yeah AI will get worse and worse the more it trains on its own output. I can only see “walled-garden” AIs trained on specific datasets for specific industries being useful in future. These enormous “we can do everything (we can’t do anything)” LLMs will die a death.

  • Ilovethebomb@lemm.ee · 3 days ago

    Feeding AI crawlers the excrement of their forebears is a perfect way to deal with them.

  • lol_idk@lemmy.ml · 3 days ago

    Throwing more power and resources at an already resource-hungry process seems like a no-win.

  • JustARegularNerd@lemmy.dbzer0.com · 3 days ago

    I really want to see what the bullshit looks like. Shame the article doesn’t actually show a sample; I guess I’d have to make my browser look like an AI crawler.
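Presenting a crawler-style User-Agent is the obvious first thing to try, though it may well not be enough: Cloudflare could key the trap on behavior or IP reputation rather than the header alone. The string below follows the pattern OpenAI has published for GPTBot, but treat the exact token as an assumption and check their docs for the current value:

```python
import urllib.request

# Hypothetical experiment: fetch a page while claiming to be GPTBot and
# compare the response to a normal browser fetch. Whether this header
# alone triggers the AI Labyrinth is an open question.
BOT_UA = "Mozilla/5.0; compatible; GPTBot/1.0; +https://openai.com/gptbot"

def fetch_as_bot(url: str) -> bytes:
    req = urllib.request.Request(url, headers={"User-Agent": BOT_UA})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()
```

A browser extension that overrides the User-Agent header would do the same job interactively, with the same caveat that the decoy content may only be served to clients that also behave like crawlers.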

  • latenightnoir@lemmy.blahaj.zone · 3 days ago

    Heh, sounds like what one of my exes used to do when she wanted some alone time: she’d throw me an informational rabbit hole and let me dive right into it for a couple of hours =)))