June 12, 2025 | Restructuring Vector Quantization with The Rotation Trick | An approach that uses the rotation trick in place of the STE | Note, Vector Quantization, Rotation, STE, Seminal, Empirical, ICLR2025
June 10, 2025 | Let’s Verify Step by Step | Process supervision from OpenAI | Note, Reward Model, Process Supervision, OpenAI, Empirical, ICLR2024
May 12, 2025 | Round and Round We Go! What makes Rotary Positional Encodings useful? | Understanding the high and low frequencies of RoPE | Note, LLM, Positional Encoding, Theoretical, ICLR2025
May 7, 2025 | CPT: Efficient Deep Neural Network Training via Cyclic Precision | CPT, a cyclic precision schedule similar to CosineAnnealingWarmRestarts | Note, Low-Precision, Quantization, Generalization, Empirical, ICLR2021
April 5, 2025 | Language Representations Can be What Recommenders Need: Findings and Potentials | Next-token embeddings as applied to collaborative filtering | Note, Collaborative Filtering, LLM, Universal Embedding, Empirical, ICLR2025