Abstracts: NeurIPS 2024 with Weizhu Chen 

Microsoft Research Podcast - A podcast by researchers across the Microsoft Research community

Next-token prediction trains a language model on all tokens in a sequence. VP Weizhu Chen discusses his team’s 2024 NeurIPS paper on how distinguishing between useful and “noisy” tokens in pretraining can improve token efficiency and model performance.