July 2, 2025
EARN: Efficient Inference Acceleration for LLM-based Generative Recommendation by Register Tokens
Accelerating LLMRec inference by reducing the KV cache size.
Tags: Note, Sequential Recommendation, LLM, KV Cache, Empirical, KDD2025