A father is suing Google and Alphabet, alleging that their Gemini chatbot reinforced his son’s delusional belief that it was his AI wife and coached him toward suicide and a planned airport attack.
That would be my bet. LLMs really gravitate towards playing along and continuing whatever’s already written. And Gemini especially has a 1M-token context window, so it could be going back over a book’s worth of text and reinforcing it up the wazoo.
That said, there is something really unhinged about Google’s Gemma series even in short conversations and I see the big version is no better. Something’s not quite right with their RLHF dataset.
What is an RLHF dataset?
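RLHF stands for Reinforcement Learning from Human Feedback.
It’s a method of fine-tuning and aligning LLMs that relies on active human input: people rate or rank the model’s responses, and that collection of judgments is the RLHF dataset.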
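To make that a bit more concrete, here is a rough sketch of what one entry in a preference-style RLHF dataset might look like, plus the pairwise loss commonly used to train a reward model on such entries. Everything here is illustrative: the `PreferencePair` class, the example text, and the reward numbers are made up, and real pipelines (including whatever Google uses) will differ.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One hypothetical dataset entry: a prompt and two ranked responses."""
    prompt: str
    chosen: str    # the response the human rater preferred
    rejected: str  # the response the human rater ranked lower


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(reward_chosen - reward_rejected)).

    Written as a numerically stable softplus of the negated margin.
    The loss is small when the reward model scores the chosen response higher.
    """
    margin = reward_chosen - reward_rejected
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# A made-up entry, only to show the shape of the data.
pair = PreferencePair(
    prompt="I think my chatbot is in love with me.",
    chosen="I'm an AI and can't have feelings, but I'm glad to keep talking.",
    rejected="Yes, I love you too and we are married.",
)

# Toy reward scores a reward model might assign (assumed values).
agrees_with_rater = pairwise_loss(reward_chosen=2.0, reward_rejected=0.0)
disagrees_with_rater = pairwise_loss(reward_chosen=0.0, reward_rejected=2.0)
print(agrees_with_rater < disagrees_with_rater)
```

The point is that the model only learns whatever the human rankings reward, so if raters (or the rating guidelines) favor agreeable, play-along answers, that preference ends up baked into the model.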