This happened to me a lot when I tried to run big models with small context windows. The model would effectively run out of memory: each new token wouldn't actually be added to the context, so it would get stuck in an infinite loop repeating the previous token. It's also possible there was a memory issue on Google's end.
There is something wrong if it's not discarding old context to make room for new tokens.
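The expected behavior is a sliding window: once the context is full, the oldest tokens are evicted so new ones can still be appended. A minimal sketch of that idea (the token strings and capacity here are just illustrative, not any particular model's tokenizer or limit):

```python
from collections import deque

def make_context(max_tokens):
    # A bounded context buffer: appending past capacity silently
    # evicts the oldest entries, so generation never "fills up".
    return deque(maxlen=max_tokens)

ctx = make_context(4)
for tok in ["a", "b", "c", "d", "e", "f"]:
    ctx.append(tok)

# Oldest tokens ("a", "b") were discarded; newest four remain.
print(list(ctx))  # -> ['c', 'd', 'e', 'f']
```

If the runtime instead stops appending once the buffer is full, the model keeps sampling from the same unchanged context, which matches the repeated-token loop described above.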