May 11, 2025 | Transformers need glasses! Information Over-Squashing in Language Tasks | LLM Representational Collapse | Note | LLM, Representational Collapse, Over-Squashing, Empirical, NeurIPS 2024
May 7, 2025 | MICROADAM: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence | MicroAdam, a lightweight optimizer built on gradient sparsification and error compensation | Note | Lightweight, Error Compensation, Quantization, Optimizer, Theoretical, NeurIPS 2024
April 18, 2025 | EMR-MERGING: Tuning-Free High-Performance Model Merging | EMR-MERGING, a multi-task weight merging technique | Note | Weight Merging, Multi-Task, Empirical, NeurIPS 2024
March 16, 2025 | Recommender Systems with Generative Retrieval | TIGER, generative retrieval with vector quantization | Note | Sequential Recommendation, Generative, Vector Quantization, Seminal, Empirical, NeurIPS 2023
March 10, 2025 | Neural Discrete Representation Learning | VQ-VAE, the pioneering work on vector quantization | Note | VAE, Vector Quantization, Seminal, Empirical, NeurIPS 2017