That’s still 2 out of 1000 which if you’re using this at scale is not a great rate.
Would also be curious how that’s calculated if that’s done whit their test data that they’ve iterated on heavily or with actual feedback (which may never get back to them)
Looked at the preprint. False positive rate of 0.2%, that’s crazy. I kinda find it hard to believe? It doesn’t seem possible to me.
That’s still 2 out of 1000 which if you’re using this at scale is not a great rate.
Would also be curious how that’s calculated if that’s done whit their test data that they’ve iterated on heavily or with actual feedback (which may never get back to them)