May 7, 2025
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
A 1-bit SGD-style optimization method warm-started by an Adam pre-training phase.
Tags: Note, Low-Precision, Quantization, Error Compensation, Optimizer, Theoretical, ICML 2021
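As context for the note above, here is a minimal sketch of error-compensated 1-bit compression in the spirit of 1-bit Adam. It is illustrative only, not the paper's reference implementation: it handles a single tensor on one worker, omits the Adam warm-up phase and all communication, and the function names (`one_bit_compress`) are my own.

```python
import torch


def one_bit_compress(tensor: torch.Tensor, error: torch.Tensor):
    """Quantize (tensor + carried error) to 1 bit per element.

    Each entry becomes +/- one shared scale (the mean magnitude), and the
    quantization residual is returned so the caller can feed it back in on
    the next step -- this feedback loop is the error compensation.
    """
    corrected = tensor + error                      # add residual from last step
    scale = corrected.abs().mean()                  # one shared scale per tensor
    compressed = torch.where(corrected >= 0, scale, -scale)  # strict 1-bit signs
    new_error = corrected - compressed              # residual carried forward
    return compressed, new_error


# Usage: compress the update each step, feeding the residual back in.
momentum = torch.randn(1000)
error = torch.zeros_like(momentum)
for _ in range(3):
    compressed, error = one_bit_compress(momentum, error)
```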
May 7, 2025
MICROADAM: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
MicroAdam: a lightweight optimizer built on gradient sparsification and error compensation.
Tags: Note, Lightweight, Error Compensation, Quantization, Optimizer, Theoretical, NeurIPS 2024
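And a companion sketch for the note above: Top-K gradient sparsification with error feedback, the core mechanism MicroAdam builds on. This is a simplified assumption-laden sketch, not MicroAdam itself: the actual method additionally quantizes the error buffer and maintains the Adam statistics in compressed form, and `topk_with_error_feedback` is a name I chose for illustration.

```python
import torch


def topk_with_error_feedback(grad: torch.Tensor, error: torch.Tensor, k: int):
    """Keep the k largest-magnitude entries of (grad + error).

    The dropped mass is accumulated in the error buffer, so information
    discarded by sparsification is re-injected on later steps rather than
    lost -- the same error-compensation idea as in 1-bit compression.
    """
    corrected = grad + error
    _, idx = torch.topk(corrected.abs(), k)         # indices of top-k magnitudes
    sparse = torch.zeros_like(corrected)
    sparse[idx] = corrected[idx]                    # transmit only k entries
    new_error = corrected - sparse                  # everything dropped is remembered
    return sparse, new_error


# Usage: sparsify each step's gradient before it reaches the optimizer state.
grad = torch.randn(10_000)
error = torch.zeros_like(grad)
sparse_grad, error = topk_with_error_feedback(grad, error, k=100)
```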