
x0x70

The short takeaway is that exposing a reasoning model to very short text during training makes it emit the end-of-sequence token almost immediately during its reasoning steps. So it skips the reasoning entirely and just zero-shots the problem.

You wouldn't really want to train a model on very short pieces of text anyway, I wouldn't think, and it would be easy data to clean out. But there is another way to fix this after the fact: just don't accept an end-of-sequence token from the model until it has produced some content. The downside is that this requires modifying some code, and most people re-training a model just want to plug in new model files.
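
Something like this is what I mean, a rough sketch using a custom Hugging Face logits processor that masks out the end-of-sequence token until a minimum number of new tokens have been generated (the model name and the 64-token threshold are just placeholders, not anything specific):

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    LogitsProcessor,
    LogitsProcessorList,
)

class BlockEarlyEOS(LogitsProcessor):
    """Refuse the EOS token until `min_new_tokens` tokens have been generated."""

    def __init__(self, prompt_len: int, min_new_tokens: int, eos_token_id: int):
        self.prompt_len = prompt_len
        self.min_new_tokens = min_new_tokens
        self.eos_token_id = eos_token_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        new_tokens = input_ids.shape[-1] - self.prompt_len
        if new_tokens < self.min_new_tokens:
            # With the logit at -inf, EOS can't be sampled yet.
            scores[:, self.eos_token_id] = float("-inf")
        return scores

model_name = "your-reasoning-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Hi"  # the kind of very short input that triggers the early EOS
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    logits_processor=LogitsProcessorList(
        [BlockEarlyEOS(inputs["input_ids"].shape[-1], 64, tokenizer.eos_token_id)]
    ),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

I believe newer versions of transformers also have a built-in `min_new_tokens` argument to `generate()` that does basically the same thing, so you might not even need the custom class. Either way it's a decoding-time patch, not something you can bake into the model files themselves.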