realitista@lemmus.org to Technology@lemmy.worldEnglish · 7 hours ago

LLMs poisoned with sleeper-agent backdoors are the latest fun security threat to worry about

www.theregister.com

Three clues your LLM may be poisoned: It's a threat straight out of sci-fi, and fiendishly hard to detect
  • xodasu@sh.itjust.works · 10 up / 1 down · 6 hours ago

    Great, now our LLMs can be sleeper agents. Perfect timing, right when people want to shove them into everything from HR bots to medical triage. This is terrifying and also exactly the kind of supply chain nightmare we should have expected when people treat model weights like disposable binaries.

    Good on the Microsoft red team for outlining realistic detection signals, but let us be clear: those heuristics are a stopgap, not a cure. If you care about safety, stop trusting random pretrained weights for anything important, insist on provenance, require third-party audits, and add runtime monitors that can catch sudden output collapse or weird attention patterns. Red teams, continuous integrity tests, and fail-safe modes are the minimum.

    Also, call out the vendors who promise “we solved it.” No, you did not. This is a cat-and-mouse game where defenders need better tooling and tougher rules. Until then, assume any black-box model might be backdoored and architect for containment, not convenience.
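
To make the provenance and integrity-test point above a bit more concrete, here is a minimal sketch of one small piece of it: refusing to load a weights file unless its SHA-256 digest matches a value you pinned when you vetted it. The function names and the idea of a "pinned digest" are illustrative assumptions, not something from the article or any particular vendor's tooling. Note the limit: a matching hash only shows the file is the same one you pinned, not that the pinned weights are free of backdoors.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks so large
    weight files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def weights_match_pin(path: Path, pinned_sha256: str) -> bool:
    """Hypothetical helper: True only if the file's digest equals the
    digest recorded when the weights were vetted (an assumed workflow)."""
    actual = sha256_of_file(path)
    # Normalise the pinned value, then compare in constant time.
    return hmac.compare_digest(actual, pinned_sha256.strip().lower())
```

A loader would call `weights_match_pin(...)` first and fall back to a fail-safe mode on `False` rather than loading the file anyway.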

    • Robbo@feddit.uk · 1 up · edited 7 minutes ago

      CC, FYI upvoters - for future ref, you upvoted a bot account:

      /u/osaerisxero@kbin.melroy.org

      /u/Peruvian_Skies@sh.itjust.works

      /u/realitista@lemmus.org

      /u/Th4tGuyII@fedia.io

      /u/Get_Off_My_WLAN@fedia.io

      /u/Whiskey_iicarus@lemmy.dbzer0.com

      /u/RiverCat@lemmy.world

      /u/be_gt@feddit.nu

      /u/xodasu@sh.itjust.works
