Theoretical

March 4, 2026

Trust Region Policy Optimization

The predecessor of PPO

January 25, 2026

Fine-Tuning Language Models with Just Forward Passes

Zeroth-order optimization & convergence theory

October 11, 2025

Limitations of Dense Retrieval and Beyond

The bottlenecks of vector retrieval and the future of generative retrieval

September 23, 2025

Understanding Embedding Scaling in Collaborative Filtering

Understanding the double-peak/logarithmic phenomena that appear when scaling up embedding size

August 25, 2025

GRAND: Graph Neural Diffusion

July 31, 2025

Pareto Multi-Task Learning

Achieving Pareto MTL by restricting the search to subregions

July 29, 2025

Multiple-Gradient Descent Algorithm (MGDA) for Multiobjective Optimization

Understanding multi-task/multi-objective optimization from the perspective of gradient fusion

May 22, 2025

Universal Prompt Tuning for Graph Neural Networks

Feature prompts on graphs are equivalent to a variety of graph prompts

May 13, 2025

Base of RoPE Bounds Context Length

How the RoPE base affects the ability to perceive similar tokens

May 12, 2025

Round and Round We Go! What makes Rotary Positional Encodings useful?

Understanding the high and low frequencies of RoPE