SnausagesinaBlanket@lemmy.world to Ask Lemmy@lemmy.world · 1 day ago
Is there currently an accurate way to say how much power per prompt LLMs use?

fizzle@quokk.au · 22 hours ago
Most of the power consumption comes from training and optimising models. You only interact with the finished product, so power per query is very low compared to that required to develop the LLM.

lime!@feddit.nu · 18 hours ago
While this is true in isolation, the number of users means that inference now uses more power in total than training for the large actors.

Michal@programming.dev · 14 hours ago (edited)
The question is about per-prompt use, so the number of users is not relevant. What may be more relevant is the number of tokens in and out. If anything, a larger number of users will decrease power use per prompt due to economies of scale.
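
To make the per-prompt argument in this thread concrete, here is a rough back-of-the-envelope sketch in Python. It is not a measurement method: the function name `energy_per_prompt_wh` and every parameter are hypothetical, and the example figures are placeholders rather than real data for any model. It only shows how the pieces fit together: inference energy scales with tokens in and out, and the one-off training energy gets spread over however many prompts the model serves.

```python
def energy_per_prompt_wh(
    tokens_in: int,
    tokens_out: int,
    wh_per_input_token: float,
    wh_per_output_token: float,
    training_wh: float = 0.0,
    lifetime_prompts: int = 1,
) -> float:
    """Rough energy estimate (watt-hours) attributed to one prompt.

    All inputs are assumptions supplied by the caller; nothing here is
    measured from a real system.
    """
    if lifetime_prompts <= 0:
        raise ValueError("lifetime_prompts must be positive")

    # Inference cost: grows with the number of tokens read and generated.
    inference_wh = (
        tokens_in * wh_per_input_token + tokens_out * wh_per_output_token
    )

    # Training cost: a one-off total shared across every prompt served.
    amortized_training_wh = training_wh / lifetime_prompts

    return inference_wh + amortized_training_wh


if __name__ == "__main__":
    # Placeholder numbers only, chosen to be easy to follow.
    few_users = energy_per_prompt_wh(100, 200, 0.001, 0.002, 1000.0, 1_000)
    many_users = energy_per_prompt_wh(100, 200, 0.001, 0.002, 1000.0, 1_000_000)
    print(f"few users:  {few_users:.4f} Wh per prompt")
    print(f"many users: {many_users:.4f} Wh per prompt")
```

With more prompts served, the training share per prompt shrinks while the inference part stays the same, which is the per-prompt point made above. Total inference energy across all users still grows with usage, which is the point about large actors.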