• forkDestroyer@infosec.pub · 3 months ago

      I’m being a bit extra but…

      Your statement:

      The article headline is wildly misleading, bordering on being just a straight up lie.

      The article headline:

      A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It

      The general story in reference to the headline:

      • He found CSAM in a known AI dataset, a copy of which he had stored in his account.
      • Google banned him for having that data in his account.
      • The article mentions that he tripped the automated monitoring tools.

      The article headline is accurate if you interpret it as

      “A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It” (“it” being “CSAM”).

      The article headline is inaccurate if you interpret it as

      “A Developer Accidentally Found CSAM in AI Data. Google Banned Him For It” (“it” being “reporting the CSAM”).

      I read it as the former, because the action of reporting isn’t listed in the headline at all.

      ___

      • Blubber28@lemmy.world · 3 months ago

        This is correct. However, many websites, newspapers, magazines, and so on love to chase clicks with sensational headlines that are technically true but easily read as something much more sinister or exciting. This headline is a great example. While you interpreted it correctly, or at least claim to, many people will initially interpret it the second way you described; me among them, admittedly. And the people deciding on the headlines are very much aware of that. The headline can therefore absolutely be deemed misleading: while it is technically a correct statement, there are far less ambiguous ways to phrase it.

        • MangoCats@feddit.it · 3 months ago

          can be easily interpreted as something…

          This is pretty much the art of sensational journalism, popular song lyric writing and every other “writing for the masses” job out there.

          Factual / accurate journalism? More noble, but less compensated.

        • obsoleteacct@lemmy.zip · 3 months ago

          It is a terrible headline. It can be debated whether it’s intentionally misleading, but if the debate is even possible then the writing is awful.

          • MangoCats@feddit.it · 3 months ago

            if the debate is even possible then the writing is awful.

            Awfully well compensated in terms of advertising views as compared with “good” writing.

            Capitalism in the “free content market” at work.

      • WildPalmTree@lemmy.world · 3 months ago

        My interpretation would be that the inclusion of “found” indicates it is important to the action Google took.

    • MangoCats@feddit.it · 3 months ago

      Google’s only failure here was to not unban on his first or second appeal.

      My experience of Google’s unban process is that it doesn’t exist: it never works, and appeals don’t even escalate to a human evaluator in a third-world sweatshop. The algorithm simply ignores them, inscrutably.

    • ulterno@programming.dev · edited · 3 months ago

      Another point: the reason Google’s AI is able to identify CSAM at all is that it has such material in its training data, flagged as such.

      In that case, it would have detected the training material as ~100% match.

      What I don’t get, though, is how it ended up being openly available: if it were properly tagged, it would presumably have been excluded from the open-sourced data. And now I see it would also not be viable to have an open-source, openly scrutinisable AI deployment for CSAM detection, for the same reason.

      And while some governmental body got a lot of backlash for trying to implement this kind of AI scanning on chat apps, Google gets to do it all it wants, because it’s e-mail/GDrive, it all sits on their servers, and you can’t expect privacy.


      Considering how many stories of people having problems due to this system keep coming up, are there any statistics on legitimate catches made with this model? I suspect not, because why would anyone use Google services for this kind of thing?

      • arararagi@ani.social · 3 months ago

        You would think, but none of these companies actually make their own dataset, they buy from third parties.

    • ayyy@sh.itjust.works · 3 months ago

      The article headline is wildly misleading, bordering on being just a straight up lie.

      A 404Media headline? The place exclusively staffed by former BuzzFeed/Cracked employees? Noooo, couldn’t be.

    • katy ✨@piefed.blahaj.zone · 3 months ago

      so they got mad because he reported it to an agency that actually fights csam instead of to them, so they could sweep it under the rug?

        • katy ✨@piefed.blahaj.zone · 3 months ago

          they obviously did if they banned him for it; and if they’re training on csam and refuse to do anything about it then yeah they have a connection to it.

          • MangoCats@feddit.it · 3 months ago

            Google doesn’t ban for hate or feels, they ban by algorithm. The algorithms address legal responsibilities and concerns. Are the algorithms perfect? No. Are they good? Debatable. Is it possible to replace those algorithms with “thinking human beings” that do a better job? Also debatable, from a legal standpoint they’re probably much better off arguing from a position of algorithm vs human training.

    • Cybersteel@lemmy.world · 3 months ago

      We need to block access to the web to certain known actors and tie ipaddresses to IDs, names, passport number. For the children.

        • Cybersteel@lemmy.world · 3 months ago

          In the current digitized world, trivial information is accumulating every second; preserved in all its triteness, never fading, always accessible: rumors of petty issues, misinterpretations, slander.

          All junk data preserved in an unfiltered state, growing at an alarming rate, it will only slow down social progress.

          The digital society furthers human flaws and selectively rewards development of convenient half-truths. Just look at the strange juxtaposition of morality around us. Billions spent on new weapons to humanely murder other humans. Rights of criminals are given more respect than the privacy of their own victims. Although there are people in poverty, huge donations are made to protect endangered species; everyone grows up being told what to do.

          “Be nice to other people.”

          “But beat out the competition.”

          “You’re special, believe in yourself and you will succeed”.

          But it’s obvious from the start that only a few can succeed.

          You exercise your right to freedom and this is the result. All the rhetoric to avoid conflict and protect each other from hurt. The untested truths spun by different interests continue to churn and accumulate in the sandbox of political correctness and value systems.

          Everyone withdraws into their own small gated community, afraid of a larger forum; they stay inside their little ponds, leaking whatever “truth” suits them into the growing cesspool of society at large.

          The different cardinal truths neither clash nor mesh, no one is invalidated but no one is right. Not even natural selection can take place here.

          The world is being engulfed in “Truth”. And this is the way the world ends. Not with a BANG, but with a…

      • tetris11@feddit.uk · 3 months ago

        Also, pay me exorbitant amounts of taxpayer money to ineffectually enforce this. For the children.

      • NoForwardslashS@sopuli.xyz · 3 months ago

        No need to go that far. If we just require one valid photo ID for TikTok, the children will finally be safe.

        • bobzer@lemmy.zip · edited · 3 months ago

          A “material image” doesn’t make any sense; an image already is material. It should be CSAI if you wanna be specific.

          I don’t know why this is the second time I’ve had a discussion about CSAM being a stupid acronym on Lemmy, but it’s also the only place I’ve ever seen people use it.

            • MangoCats@feddit.it · 3 months ago

              Material can be anything.

              And, if you’re trying to authorize law enforcement to arrest and prosecute, you want the broadest definitions possible.

            • bobzer@lemmy.zip · 3 months ago

              You’re right, it can be images, and that’s exactly why saying “this man was found in possession of child abuse material images” does not make grammatical sense. It’s also why CP still defines it better: we’re not arresting people for owning copies of Lolita, which you could argue is also CSAM.

              the majority of people disagree with you.

              The majority of people can be wrong.

                • bobzer@lemmy.zip · 3 months ago

                  “I’m mad you’re right so let me compare you to a hateful right wing grifter and also by the way, you’re wrong because all my friends say so.”

                  It may shock you, but a handful of Lemmy users doesn’t constitute the linguistic consensus you’re trying to invoke here.