May 7, 2025
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
GaLore: gradient projection and weight updates in a low-rank space
Note, Lightweight, Low-Precision, Optimizer, SVD, Theoretical, ICML, 2024
May 7, 2025
MICROADAM: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
MicroAdam: a lightweight optimizer built on gradient sparsification and error compensation
Note, Lightweight, Error Compensation, Quantization, Optimizer, Theoretical, NeurIPS, 2024