• Tetragrade@leminal.space
    link
    fedilink
    English
    arrow-up
    1
    ·
    edit-2
    2 hours ago

    You cannot know this a-priori. The commenter is clearly producing a stochastic average of the explanations that up the advantage for their material conditions.

    For instance, many SoTA models are trained using reinforcement learning, so it’s plausible that its learned that spamming meaningless tokens can delay negative reward (this isn’t even particularly complex). There’s no observable difference in the response, without probing the weights we’re just yapping.