Sept (image) — posted by qaz@lemmy.world to Programmer Humor@programming.dev, English · 4 days ago · 66 comments
mcv@lemmy.zip · 3 days ago
But if that’s how you’re going to run it, why not also train it in that mode?
Xylight@lemdro.id · 2 days ago
That is a thing; it’s called quantization-aware training (QAT). Some open-weight models, like Gemma, use it. The problem is that you have to retrain the whole model that way, and if you also want a full-quality version, you have to train substantially more. The result is still less precise, so quality will still fall short of full precision, but it does reduce the effect.
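For the curious, this is roughly the “fake quantization” trick QAT relies on: round the weights in the forward pass, but pass gradients straight through to full-precision master weights. A minimal sketch assuming PyTorch; `fake_quantize` and `QATLinear` are illustrative names, not anyone’s actual training code:

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    # Round x onto a low-precision grid in the forward pass while
    # letting gradients flow through unchanged (straight-through estimator).
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    # Forward value is the quantized q; backward gradient is that of x.
    return x + (q - x).detach()

class QATLinear(torch.nn.Linear):
    # A Linear layer that computes with quantized weights during training,
    # so the network learns to tolerate inference-time rounding error.
    def forward(self, input: torch.Tensor) -> torch.Tensor:
        return torch.nn.functional.linear(
            input, fake_quantize(self.weight), self.bias
        )

# Quick check: gradients still reach the full-precision master weights.
layer = QATLinear(16, 4)
layer(torch.randn(2, 16)).sum().backward()
assert layer.weight.grad is not None
```

The key line is `x + (q - x).detach()`: the model computes with rounded values, but the optimizer keeps updating the unrounded weights underneath, which is why the whole model has to be retrained in this mode.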
Your response reeks of AI slop