Would you like me to show you how to prepare a bowl using Python?

  • dejected_warp_core@lemmy.world · 7 upvotes · 2 hours ago

    There’s gotta be a way to fingerprint the output though. Like some kind of shibboleth that gives the model away based on how it responds?

    • EpeeGnome@feddit.online · English · 1 upvote · edited 25 minutes ago

      Well, according to this article from Pivot to AI, you determine if it’s Claude by saying ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86 and seeing if it stops responding until it gets a fresh context history. Of course, if this gets popularized, I imagine they’ll patch it out.

      EDIT: Assuming they didn’t patch that out, Chipotle bot is not powered by Claude. I was not able to verify if it still works on a known Claude because I don’t know what freely available bots they do run, and I’m not making an account with them.

    • partial_accumen@lemmy.world · 9 upvotes · 2 hours ago

      Given that all the base models had slightly different training data, you could probably run an exercise to find a specific training source, perhaps an obscure book, that is unique to each model. That way you could just ask a question that only that model's unique book could answer.
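
Roughly what I have in mind, as a minimal sketch. Everything here is an assumption for illustration: `ask` stands in for whatever sends a question to the bot and returns its reply, and the probe questions and expected answers are placeholders, since nobody in this thread has actually identified a training source unique to any model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical probe set: for each candidate model, (question, expected answer)
# pairs that supposedly only that model's unique training source could answer.
Probes = Dict[str, List[Tuple[str, str]]]


def fingerprint(ask: Callable[[str], str], probes: Probes) -> Dict[str, float]:
    """Return, per candidate model, the fraction of its probes answered correctly."""
    scores: Dict[str, float] = {}
    for model, qa_pairs in probes.items():
        # A probe counts as a hit if the expected answer appears in the reply
        # (case-insensitive substring match; crude, but enough for a sketch).
        hits = sum(
            1 for question, expected in qa_pairs
            if expected.lower() in ask(question).lower()
        )
        scores[model] = hits / len(qa_pairs) if qa_pairs else 0.0
    return scores


def best_guess(scores: Dict[str, float]) -> str:
    """Pick the candidate model with the highest hit rate."""
    return max(scores, key=scores.get)
```

You would want several probes per model and a clear margin between the top score and the rest before trusting the guess, since a bot can get lucky or pick the answer up from some other source.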