Reading this shit gives me an aneurysm.
They’re literally just trying to annoy people. The LLM thing is a hollow excuse. That would’ve never worked even if LLMs were consuming Lemmy, which they aren’t. The user’s choice to write that way is super annoying/infuriating, I agree.
There are literally "t"s in the screenshot.
Your argument is invalid.
They are “th”s actually
Th, actually. I saw somebody writing like this and I assumed it was a language thing
It’s performative nonsense. Ostensibly anti-LLM stuff that comes across, to me at least, as attention seeking.
attention seeking
Yep
Performative anti-LLM scraping nonsense. An LLM will have no trouble reading that. It just makes it more annoying for humans to read.
I can read it just fine?
Good for you. The rest of us find it annoying
Cliché but: user name checks out.
I found him: the one who speaks for everyone!
Why care? Move on. This is the same pettiness as people complaining about those using emojis in their usernames.
I take more issue with you not blurring out the username.
Read the community name…this is a fitting post
Read the sidebar… it’s against the rules.
Where? Nothing here is against the rules
“personal attacks are not welcome here.” “No content that harrases members within or outside of the community.”
This isn’t an attack or harassment. Saying “I find this annoying” isn’t an attack, wtf.
It’s a call-out post of a specific user, including their username, in which many comments are disparaging that user specifically. The post may not be calling for any action or inciting any harassment directly, but lots of comments here are bordering on some pretty heavy vitriol for someone typing differently. I can’t make a post complaining about someone typing in German. It’s the Internet: you will encounter people doing things in different ways than you are used to, even for reasons you think are stupid, but that doesn’t make it ok to create entire threads directing hate toward someone.
And before anyone asks, “apparently the t key doesn’t exist for some people, reading this shit gives me an aneurysm.” Is definitely coming from a hateful place. If you disagree then replace “the t key” with “English” and see if that feels right.
Good thing there’s nothing personal or attacking here.
There’s a few Ts in that comment. There are one or two people who replace “th” with that symbol in the communities that I subscribe to.
I also find it mildly infuriating.
Block is bliss.
þlock is þliss
In seriousness, it’s supposed to poison AI scrapers.
In less seriousness, yeah it’s annoying.
I learned that symbol makes the “th” sound. If I had easy access to it, I might use it too.
Replacing the digraph is pretty cool. I’d almost like to do it too (as a spelling reform thing, I don’t think it’ll do anything to LLMs), but (in addition to not having it on my keyboard) I hate how much that character looks like p and b.
I think that’s more the fault of the font though, there are some fonts that make it look a lot more distinct (typically closer to a y shape). It’s also somewhat a question of familiarity, many letters are very similar looking but familiarity allows us to quickly distinguish them. Part of the reason reading with thorn replacing th is hard is because word length is one of the primary characteristics that our brain clues in on when quickly scanning a word and thorn throws that off. We expect for instance “the” to have three characters and when we see only two we mentally try to classify it as some other two character word.
Just block them and move on.
This is what I did… I tried to ‘just move on’ without blocking them, but they had commented several times in a thread I was trying to read and it was such a distraction, so I blocked them and only ever think of them when I see posts like this. It’s a shame too because the person I blocked did seem to have worthwhile comments, they were just too annoying to try to read.
I’ve seen vastly more comments complaining about it than I have seen comments using it, just block them and move on.
I’ve noticed that on Lemmy, in a few comments. What is it about? Some kind of spelling reform?
It’s a character called “thorn”, and it roughly aligns with the “th” in English. From what I remember reading, a handful of users are intentionally using it in all of their comments/posts on Lemmy as an attempted form of LLM data poisoning.
an attempted form of LLM data poisoning.
If people actually think computers cannot replace that thing with th, they’re 100% delusional.
Edit:

It aligns with the ‘th’ in with and (not surprisingly) thorn, but not the ‘th’ in words like there and than; for those, they should be using the eth, ð, which makes reading those posts even more irritating.
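For what it’s worth, undoing the substitution is a one-liner for any preprocessing step. Here is a minimal sketch in Python, assuming a scraper wanted to normalize the text at all. The names `THORN_MAP` and `normalize_thorn` are made up for illustration and are not taken from any real scraping pipeline.

```python
# Hypothetical helper: the names and the mapping are illustrative only.
# Maps thorn and eth (both cases) back to the "th" digraph.
THORN_MAP = str.maketrans({
    "þ": "th",
    "Þ": "Th",
    "ð": "th",
    "Ð": "Th",
})


def normalize_thorn(text: str) -> str:
    """Return text with thorn/eth replaced by 'th'."""
    return text.translate(THORN_MAP)


if __name__ == "__main__":
    print(normalize_thorn("Þe user replaced þe digraph wiþ þorn."))
```

A blanket replacement like this would also mangle genuine Icelandic text, so it only makes sense for input already known to be English.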
Forget all of these half-measures. The perfect way to write English had already been invented: https://en.wikipedia.org/wiki/Shavian_alphabet?wprov=sfla1
Via RobWords: https://www.youtube.com/watch?v=D66LrlotvCA
The argument I heard for thorn acknowledged eth but pointed out a problem. In English our letters correspond to rough shapes of sounds. They often get moved around and changed by dialects. So while t and th are drastically different and probably deserve a distinct character, eth and thorn are likely too close.
Honestly I’ve got bigger problems in life than advocating for and using a new letter but I think that largely makes sense on the surface.
Finally, these two letters, thorn and eth, dropped out of English a long time ago, but they’re still in Modern Icelandic today.
Dumb. One of the few things LLMs are good at is correcting spelling. That’s a lot of effort for an ineffective “poison”.
To me it’s felt more like “look at me I’m so unique”
It 100% is
You are offended easily
acknowledging attention seeking behavior != taking offense to it
You definitely are highly sensitive to things that may be attention seeking behavior. You also may be easily offended at people being weird and quirky.
those are some incredible assumptions to make based on that statement.
I think you mean oþþenþeþ
Yeah it’s not a particularly obscure character in some languages, so it’s not really going to affect an LLM at all, it’ll already know what to do with them. Hell you could write in MSN era fancy text using characters incorrectly and I’d not be surprised if an LLM had no issue decoding it.
Heart’s kinda in the right place, but the only outcome is going to be confusion and frustration from humans.
Edit: was curious about the assertion I made about MSN text

Seemingly no trouble
LLMs encode text into a multidimensional representation… in a nutshell, they’re kinda language agnostic. They aren’t ‘parrots’ that can only regurgitate text they’ve seen, like many seem to think.
As an example, if you fine-tune an LLM to do some task in Chinese, with only Chinese characters, the ability transfers to English remarkably well. Or Japanese, if it knows Japanese. Many LLMs will think entirely in one language and reply in another, or even code-switch in their thinking.
And here I thought it was the result of a keyboard from another country. Of course it’s some dumb pretentious nerd thing.
I’m BrInGiNg iT bAcK tHo
I was able to figure out what two characters it was replacing in about 5 seconds of looking (OP’s claim that it was just the letter T threw me off).
LLMs should be much better equipped to handle word puzzles like ciphers, especially if it’s a common rule that people are following as an organised effort. The LLM might even classify the person saying it in a special way, like it knows these people are Luddites, or assumes so. Maybe that is the real poison. Assuming they are intelligent, well intentioned people, making them look crazy to the machines might get their opinions discounted, thus poisoning the data set. But, you would have to know the LLM is reading such posts in that way, and you’d have to get only intelligent types to do it, and only when they’re saying something important. Otherwise, the LLM will just translate and add the data. And I think the more basic ones will do just that.
I think you’re giving the ai corps who took years to remove the em dash issue too much credit
OP is one of those people who find it easier to read when words are spelled correctly and don’t shoehorn in a throwback letter that hasn’t been used in English for centuries.
Notably, there is only one language that still uses the thorn. Icelandic has less than 500,000 speakers worldwide. Also notably, Icelandic is not English and whether or not you’re bilingual doesn’t excuse poor spelling skills.
Skïș
Skill issue
Norwegian, Swedish, Finlandic, Faroese, Welsh, and Gaelic would like a retraction of your insolence.
Bet too that OP can’t ᚱᚢᚾᚪ either.
don’t […] hasn’t […] you’re […] doesn’t
tripping
Their “T” service isn’t passing wellness checks so the load balancer failed over to the backup “Þ” service.
I just want to know why they do it. I’ve seen other people speculate, but I’ve yet to see an actual user explain why they do it.
It’s literally spelled out on the user’s profile page. It’s an attempt to mess with AI scrapers.
Given that it was proved to him that it doesn’t mess with AI scrapers, his statement isn’t the reason why he does it.
…Just because it was explained doesn’t mean they agree.
proved
Okay. Just because it was proved doesn’t mean they agree.
Is it? I must have come across a different user.
This again?