Microsoft’s GitHub next month plans to begin using customer interaction data – “specifically inputs, outputs, code snippets, and associated context” – to train its AI models.

  • mhague@lemmy.world
    8 hours ago

    The code locker’s revised policy applies to Copilot Free, Pro, and Pro+ customers, as of April 24. Copilot Business and Copilot Enterprise users are exempt thanks to the terms of their contracts. Students and teachers who access Copilot will also be spared.

    All of the people in this thread are mad because they use slop code generation and now their slop is being used to train the slop generators.

    If they can take an entire repo because a contribution was tainted, that’s wrong. But otherwise I don’t care, because it’s normal to use usage metrics to improve software and, most importantly, I don’t use AI, so I don’t have anything for them to take.

    • hdsrob@lemmy.world
      2 hours ago

      While I don’t / won’t use the slop machines, I’m not entirely convinced that they haven’t / won’t just add a Copilot Free account to my VS or GitHub accounts: They did just this to my (now canceled) Office account.

      I do think that a lot of people are missing that it’s just Copilot data that they’re using to train, not all of the repository data hosted on GitHub (or don’t trust that it will be only Copilot data long term).

      For me it just means one more thing to move to our own servers (we always self-hosted SVN).

    • Jakeroxs@sh.itjust.works
      5 hours ago

      As someone who uses the slop machine, I completely agree. It might help improve them further, and if you don’t want to use it, move to Forgejo or similar (I did that too). If you still want AI help, try learning how to host your own locally, if your GPU can swing it.