LLM myths 2: perplexity and surprise
Given a language model, people call the exponential of the mean negative log-likelihood of a sentence its “perplexity”. As the word suggests, it is supposed to show whether the language model is perplexed, or surprised, by the sentence. Sometimes people restrict the average to a short span of tokens, or even a single token, and speak of its “level of surprise”. It is a simple explanation that gets the concept across to a wide audience, and one of many ways of anthropomorphizing language models. Where does it come from, and does it really agree with the human sense of “surprise”?
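To make the relationship concrete before diving in: perplexity is exp(mean NLL), and the per-token negative log-likelihoods are what get read off as the “surprise” of individual tokens. Here is a minimal sketch using Hugging Face transformers; the model name (“gpt2”) and the example sentence are illustrative choices, not anything specific to this post.

```python
# Minimal sketch: mean negative log-likelihood vs. perplexity.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

sentence = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    # With labels supplied, the model returns the mean cross-entropy
    # over predicted tokens as `loss`, i.e. the mean negative
    # log-likelihood of the sentence under the model.
    outputs = model(**inputs, labels=inputs["input_ids"])

mean_nll = outputs.loss            # mean negative log-likelihood (nats/token)
perplexity = torch.exp(mean_nll)   # perplexity = exp(mean NLL)
print(f"mean NLL:   {mean_nll.item():.3f}")
print(f"perplexity: {perplexity.item():.3f}")
```

Note that `loss` is averaged over tokens, so exponentiating it gives a per-token perplexity; dropping the average and looking at individual token log-likelihoods recovers the single-token “surprise” mentioned above.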
Read more...