That’s the Discord model.
Fediverse needs to have a layer which traps AI in a never-ending maze.
That’s the job of the web server, not of the application that runs on it.
There is already software you can get that feeds a never-ending maze of text to AI scrapers, some of it AI-generated and/or designed to poison LLM training. The problem is that these still use up a ton of bandwidth.
A never-ending maze would mean the scrapers just hammer our servers forever. Better to lead them into a honeypot and automatically ban their IP. Like PieFed does.
So just find scrapers and bot farm owners IRL and burn down their houses, easy
What about a maze that adds a few hundred ms to the response time with each request, so the load gets less the longer it’s trapped?
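For anyone who wants to picture the escalating-delay idea, here is a rough sketch, not anything an existing fediverse server is known to ship. The class name `EscalatingDelay`, the 200 ms step and the 30 s cap are all made up for illustration. It only computes how long to stall each client; actually holding the connection open is left to the server.

```python
from collections import defaultdict


class EscalatingDelay:
    """Hypothetical helper: the delay grows with every maze request from one client."""

    def __init__(self, step_ms=200, cap_ms=30_000):
        self.step_ms = step_ms        # assumed: extra delay added per request
        self.cap_ms = cap_ms          # assumed: upper bound so the delay stays finite
        self.hits = defaultdict(int)  # maze requests seen per client IP

    def next_delay_ms(self, client_ip):
        """Record one more maze request and return how long to stall it."""
        self.hits[client_ip] += 1
        return min(self.hits[client_ip] * self.step_ms, self.cap_ms)


delays = EscalatingDelay()
first = delays.next_delay_ms("203.0.113.7")   # 200
second = delays.next_delay_ms("203.0.113.7")  # 400
```

A server would then wait that long before answering, for example with `await asyncio.sleep(delay_ms / 1000)` in an async handler. Every stalled request still ties up a connection on our side, so the cost is not only the scraper's.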
I haven’t tried to make something like that. I think it’d be hard to do without exhausting our own resources too.
Ah, that makes sense
Is that how tarpitting works? I didn’t know.
There are a lot of strategies. afaik a tar pit tries to waste the attacker’s resources by delaying our responses to their traffic. A honeypot tries to funnel bot traffic towards a place only bots would go. Once they go there you know they’re a bot and they can be banned.
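The honeypot half of that could look roughly like the sketch below. This is an illustration under assumptions, not how PieFed or any other project actually does it: `HoneypotBanList` and the trap path `/trap/do-not-follow` are hypothetical, and bans are kept in memory only. The usual trick is to link the trap URL somewhere humans never click and disallow it in robots.txt, so only crawlers that ignore the rules end up there.

```python
class HoneypotBanList:
    """Hypothetical sketch: ban any client that requests a trap URL."""

    def __init__(self, trap_paths):
        self.trap_paths = set(trap_paths)  # URLs no human visitor should reach
        self.banned = set()                # client IPs caught so far (in memory only)

    def allow(self, client_ip, path):
        """Return True if the request should be served, False if it is blocked."""
        if client_ip in self.banned:
            return False
        if path in self.trap_paths:
            # Touching the trap marks the client as a bot from now on.
            self.banned.add(client_ip)
            return False
        return True


bans = HoneypotBanList(["/trap/do-not-follow"])
bans.allow("198.51.100.4", "/post/123")            # True: normal page
bans.allow("198.51.100.4", "/trap/do-not-follow")  # False: walked into the trap
bans.allow("198.51.100.4", "/post/123")            # False: banned afterwards
```

Unlike the maze, this stops serving the bot after one request, which is why it costs so little bandwidth.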
Sadly that only works for scrapers; content-engaging bots are immune to it.
How would that layer distinguish AI from non-AI?