Perhaps so many people have quoted that chapter in college and high school papers, book and film reviews, and cultural criticism that there is a weird “shoot the moon” situation where “works of origin” begin to look like “works of derivation” in LLMs.
The problem is it’s not a plagiarism detector (it would also be a pretty bad one, since it can’t detect quotes); it’s an AI detector.
It’s saying that a direct quote is AI, which obviously isn’t true. It’s a quote, which is a different thing.
If 10% of my thesis is quoting other works, that’s not the same as my thesis being 10% AI generated. The distinction needs to be made.
Yeah, or perhaps there is no need to make up excuses for the copyright-infringing, world-burning, infinite lying machine lying about what text is real vs generated by it. LLMs lie, and LLM-based LLM detectors lie about lies.
Frankenstein is out of copyright.
I would be unsurprised if you could tease out the entire book. I wonder if Mary Shelley was a fan of dashes.
Being out of copyright is kinda irrelevant. There are lawsuits right now because the AI firms apparently fed the AIs tons of copyrighted books.
It is and it isn’t. Those lawsuits mean they at least try to stop it from producing copyrighted work. They won’t make Simpsons characters or produce anything from the house of mouse without major cajoling or some trickery in the prompt.
For the text from Frankenstein they are not even going to try.
Incidentally, after writing this I tried to get ChatGPT to reproduce the first paragraph of chapter 3. It refused and offered a summary. I “reminded” it that the book is in the public domain and then it reproduced it without issue.
I bet you could do exactly the same thing for a book that’s still copyrighted.
I did see posts of someone doing it with Harry Potter, but I think it took a little more effort.
They still obviously trained it on the copyrighted text. Which I think is what some claim is illegal without payment?
Mind you, I don’t think copyright should cover that, for text at least. It is not in society’s interest.