Urgh. That’s horrifying:

I have two chimps within, Laziness and Hyperactivity. They smoke cigs, drink yerba, fling shit at each other, and devour the face of anyone who gets close to either.
They also devour my dreams.

To be fair, here’s how cats would be reconstructed if they went extinct and we had to rely on fossils:

…nah, screw that, the lady in my pic is still hella charming, the one in the OP is an abomination!
Translated from Spanish
And they didn’t even make a joke about how dumb (burro) it looks! hglksflksdlllksdf
[Replying to myself as this is a tangent]
I think the “bots can generate misinfo even if you just feed them correct info” point deserves its own example.
Let’s say you’re making a model. It looks at the preceding word, and tries to predict the next. And you feed it the following sentences, both true:
1. Humans are apes.
2. Cats are felines.
From both sentences, the bot “learnt” five words, and also how to connect them; for example, “are” can be followed by either “apes” or “felines”, both having the same weight. Then, as you ask the bot to generate sentences, it generates the following:
3. Humans are felines.
4. Cats are apes.
And you got bullshit!
What large models do is a way more complex version of the above, looking at way more than just the immediately preceding word, but it’s still the same in spirit.
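For the curious, here’s a minimal Python sketch of that toy model. The function names (`train_bigrams`, `generate_all`) are made up for this example, and since both continuations of “are” have the same weight, I’m assuming a plain set of followers is enough instead of real probabilities. It lists every sentence the model can build from a starting word, rather than sampling one at random.

```python
from collections import defaultdict

def train_bigrams(sentences):
    """Map each word to the set of words seen right after it."""
    followers = defaultdict(set)
    for sentence in sentences:
        words = sentence.rstrip(".").split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev].add(nxt)
    return followers

def generate_all(followers, start):
    """List every sentence the model can produce from `start`."""
    # The toy data has no cycles, so plain recursion terminates.
    if start not in followers:
        return [start]
    results = []
    for nxt in sorted(followers[start]):
        for tail in generate_all(followers, nxt):
            results.append(f"{start} {tail}")
    return results

# Both training sentences are true.
corpus = ["Humans are apes.", "Cats are felines."]
model = train_bigrams(corpus)

for first_word in ("Humans", "Cats"):
    for sentence in generate_all(model, first_word):
        print(sentence + ".")
```

Two of the four sentences it prints are the true ones it was fed; the other two are sentences 3 and 4 from above. Nothing false went in, and yet half of the output is false.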
I’m failing to see how this is different from making up a fact and then spreading it to news outlets.
They uploaded the papers to a single preprint server. That’s important.
Preprints are papers predating any sort of peer review; as such, there’s a lot of junk mixed in. That’s no big deal if you know the field, but a preprint server is certainly not a source of reliable information, nor should it be treated as such. News outlets, on the other hand, are expected to provide you with reliable information, curated and researched by journalists.
And peer review is a big fucking deal in science, because it’s what sorts all that junk out. Only muppets who don’t fucking care about misinformation would send bots to crawl preprints and feed the resulting data into a large model, or use the potential misinfo from the bot as if it were reliable. (Those two sets of muppets are the ones violating ethical and moral principles, by the way.)
So no, your comparison is not even remotely accurate. What they did is more like writing bullshit on a piece of paper, gluing it to a random phone pole, and checking if someone would repeat that bullshit.
They also went to the trouble of making sure that no reasonably literate human being would ever confuse that thing with an actual scientific paper. As the text says:
Feeding false information to an LLM is no different that a magazine. It only regurgitates what’s been said.
Yes, it is different. Because the large token model won’t simply “repeat” things, it’ll mix and match them and form all sorts of bullshit, even if you didn’t feed it with any bullshit.
Here’s an example of that, fresh from the oven. I don’t reasonably expect people to be feeding misinfo regarding Latin pronunciation into bots, and yet a lot of this table is nonsense:

Compare the table above with this table and this one and you’ll notice the obvious errors:
All it had to do was copy info from Wiktionary, as it even includes phonetic and phonemic info. But the bot is not just “regurgitating” info; it’s basically predicting what should come next, with no regard for truth value, so it’s mixing-and-matching shit into nonsense.
It isn’t going to suddenly start doing science on its own to determine if what you’ve said is true or not.
If you actually read the bloody article instead of assuming, you’d know why the researchers did this: they don’t expect the bot to do science on its own, they expect people to treat info from those bots as potentially incorrect.
Its job is to tell you what color the sky is based on what you told it the color of the sky was.
And your job is to not trust it if it tells you “Yes, you are completely right! The colour of the sky is always purple. Do you need further information on other naturally purple things?”


My second degree was in Letras (languages and literature), with a specialisation in linguistics. I wish I had worked in the field, but nowadays I’m just a two-bit translator :P


The mystery really is harder to solve than it looks at first glance.
It’s usually like this with swear words; the etymology is always a mess. They’re used constantly, so the meaning evolves really fast, but there’s almost no record of them, since people avoid writing them down.
Just to give you an example: one of the swear words with the best-studied etymology is Latin “merda”. We know it was inherited from Proto-Indo-European, and that Latin speakers used it all the time, since every single Romance language inherited the shit. Even so, we barely know in which situations Latin speakers used the word, because it was almost never written; only in a few epigrams by Martial and some graffiti in Pompeii. (inb4 yes, it’s the same “merda” as in Portuguese.)
With these insults it’s the same thing. People avoid recording them. And that’s how we lose their history.


Nicknames are often erratic: cue Juca (Joaquim), Chico (Francisco; no idea why the /ʃ/), Mafê (Maria Fernanda). I don’t know why, but I feel like they follow a different logic than simple shortenings.


Do you mind if I answer in Portuguese? So, to cut a long story short: I’m almost certain that the slur (viado) comes from the name of the animal (veado, “deer”). Reasons:


I think it also applies to expletives. Check for example ⟨vagabunda⟩* /va.ga.'bũ.da/; if there were some pressure to keep the stressed syllable, it would be clipped into *bunda or *gabunda, but it’s usually clipped into ⟨vagaba⟩ instead. Technically the /b/ from the stressed syllable is still there, but the core /ũ/ ⟨un⟩ is gone.
*Gotta explain this one to the folks here. “Vagabunda” means whore, promiscuous woman, etc. It’s highly offensive, way more than the nearest English equivalent (slut); it’s the sort of word you don’t use even as a joke. (The masculine “vagabundo” is derogatory but socially acceptable; it means lazy arse, do-nothing.)


100% this.
In particular, this “flexibility” shows up a lot in unstressed vowels; they vary a lot depending on dialect and speech rhythm. And unlike variation in consonants, people don’t pay much attention to them.
I’m fairly sure what happened with “viado” in PT was just like “nigga” in English. In both you get a non-standard spelling of another word (“veado” and “nigger”), representing a popular pronunciation of the word (note African American English is non-rhotic, so ⟨er⟩ and ⟨a⟩ would sound both /ə/). But they still sound the same in those popular variations.
The worst part is, I don’t think that other guy even speaks Portuguese. At least, not proficiently. Did you notice how he mixed up “esse” and “isso”?


For that pair of words (ES año vs. PT ano) this works, but note that the correspondence gets really messy; it depends on the etymology of the word. A quick run-down would be:
| Origin | Spanish | Portuguese | Example |
|---|---|---|---|
| Late Latin */nj/ | /ɲ/ ⟨ñ⟩ | /ɲ/ ⟨nh⟩ | Latin balneum → baneum → *banjʊ̃ → ES baño, PT banho “bath” |
| Latin /gn/ [ŋn] | /ɲ/ ⟨ñ⟩ | /ɲ/ ⟨nh⟩ | can’t recall an example both kept, but Latin agnum → PT anho /ɲ/ “lamb” (archaic) |
| Latin /n:/ | /ɲ/ ⟨ñ⟩ | /n/ ⟨n⟩ | Latin annum → ES año, PT ano “year” |
Then for Latin intervocalic /n/, Spanish simply keeps it. Portuguese initially converts it into vowel nasalisation, but then changes it further on; it’s a bit messy:
For ES “ano” (anus) and PT “ânus” (anus) this doesn’t work, though. Portuguese didn’t inherit the word, but reborrowed it. And, perhaps to avoid making it sound like “ano” (year), it kept the Latin nominative ending. (If the word had been inherited, it would end in *ão or something like that.)


It does have a tilde but it’s mostly used over vowels, to represent nasalisation; e.g.
For /ɲ/ (the phoneme written “ñ” in Spanish) it’s as you said, though: it’s spelled “nh” instead.


This suggests widespread homophobia if enough of them could combine their brainpower to form these few thoughts
Yup, that’s accurate. Welcome to Latin America and its macho culture. People don’t even get why those jokes are bad. Then when the LGBTQ+ community correctly points out that “a piada mata mais do que a bala” (the joke kills more often than the bullet), the default popular reaction is to claim “waaah they’re overreacting” (spoilers: they aren’t).


Viado comes from desviado, which means someone who was driven off the proper path. It’s just a matter of homophony (and homophobia).
I’ve seen people backtracking the etymology to desviado and transviado. I don’t buy it because clipping (truncamento) in Portuguese usually preserves the start of the word, even at the expense of the stressed syllable; e.g.
So following the same pattern for “desviado” the result would be *des or *desvi, not “viado”.


It gets weirder with expletives. Like “puta merda” (“whore shit”) and “merda do caralho” (“shit from the dick”). They don’t make sense at all; people simply chain whatever profanity they find to “express” their frustration. (And you can even combine them, as in “puta merda do caralho”, “whore shit from the dick”. Semantically it’s nuts.)
So you walk like a duck, quack like a duck, but someone plucked your feathers off??? :P
Warning: the poster above is a bird pretending to be a human being, to infiltrate mammal society. Discretion is advised.
And it’s such a great game. It exploits the expectations of visual novels and games in general really well, first pretending to play along with them and then breaking them. (I’m trying my hardest to not spoil it, seriously.)
But Google doesn’t care about it. Or about sensible rules. Or about fairly enforcing the very rules it expects you to follow.