MTandHJ
  • Papers
  • Essays
  • Tags
  • Slides
  • Search
    • Tags

      • 2015 1
      • 2016 1
      • 2017 2
      • 2019 2
      • 2021 3
      • 2022 1
      • 2023 5
      • 2024 15
      • 2025 8
      • ACL 1
      • Advertisement 1
      • Application 1
      • arXiv 3
      • Attention 2
      • CIKM 1
      • CIR 1
      • Codebook Collapse 2
      • Collaborative Filtering 3
      • CVPR 2
      • Data Augmentation 1
      • Doc 2
      • EDM 1
      • EMA 1
      • Embedding 1
      • Empirical 29
      • End-to-End 1
      • Error Compensation 2
      • Foundation 1
      • FQT 1
      • GAN 1
      • Generalization 1
      • Generative 4
      • Git 1
      • GNN 3
      • Graph 3
      • ICLR 5
      • ICML 5
      • Image Synthesis 1
      • KDD 1
      • Knowledge 1
      • Knowledge Tracing 3
      • Laplace Transform 1
      • Lightweight 4
      • LLM 5
      • Low-Bit 2
      • Low-Precision 6
      • LSTM 1
      • Math 1
      • Memory 2
      • Multi-step 1
      • Multi-task 1
      • Multimodal 3
      • Multimodal Recommendation 2
      • NeurIPS 8
      • Note 36
      • OpenAI 1
      • Optimizer 7
      • Over-Squashing 1
      • Positional Encoding 3
      • Process Supervision 2
      • Prompt 2
      • Quantization 4
      • Reasoning 1
      • Recommendation 2
      • Representational Collapse 1
      • Residual 1
      • Retrieval 1
      • Reward Model 2
      • RNN 1
      • Rotation 1
      • Scaling Law 1
      • Seminal 8
      • Sequential 1
      • Sequential Recommendation 5
      • SIGIR 1
      • Slide 3
      • Spectral 1
      • STE 1
      • SVD 2
      • SWA 1
      • Theoretical 6
      • TORS 1
      • Transfer Function 1
      • Trend 6
      • Trial 2
      • Trick 1
      • Universal Embedding 3
      • Unsupervised 1
      • VAE 3
      • Vector Quantization 11
      • Weight Merging 1
      • WWW 1
      • Zero-shot 1
      June 15, 2025

      Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

      SimVQ, a coordinate transformation in place of a learnable codebook

      • Note
      • VAE
      • Vector Quantization
      • Codebook Collapse
      • Empirical
      • arXiv
      • 2024
      June 12, 2025

      Restructuring Vector Quantization with The Rotation Trick

      An approach that uses the rotation trick in place of the STE

      • Note
      • Vector Quantization
      • Rotation
      • STE
      • Seminal
      • Empirical
      • ICLR
      • 2025
      June 11, 2025

      Is Every Item Worth An Embedding?

      Does every item deserve its own learnable embedding?

      • Trial
      • Recommendation
      • Embedding
      • 2025
      June 10, 2025

      Let’s Verify Step by Step

      Process supervision from OpenAI

      • Note
      • Reward Model
      • Process Supervision
      • OpenAI
      • Empirical
      • ICLR
      • 2024
      June 10, 2025

      Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

      Process supervision from DeepSeek

      • Note
      • Reward Model
      • Process Supervision
      • Unsupervised
      • Empirical
      • ACL
      • 2024
      June 8, 2025

      Simba: The Feasibility of Sign Gradients

      Some explorations building on the Lion optimizer

      • Trial
      • Optimizer
      • Low-Bit
      • 2025
      May 22, 2025

      Universal Prompt Tuning for Graph Neural Networks

      A feature prompt on graphs is equivalent to the various forms of graph prompts

      • Note
      • Graph
      • GNN
      • Prompt
      • Theoretical
      • NeurIPS
      • 2023
      May 21, 2025

      All in One: Multi-Task Prompting for Graph Neural Networks

      A graph prompt that unifies graph-, edge-, and node-level tasks

      • Note
      • Graph
      • GNN
      • Prompt
      • Empirical
      • KDD
      • 2023
      May 18, 2025

      Connection Bottleneck in Attention

      • Slide
      • Attention
      • Positional Encoding
      May 18, 2025

      Laplace Transform

      Basic concepts of the Laplace transform

      • Math
      • Laplace Transform
      • Transfer Function
      May 13, 2025

      Base of RoPE Bounds Context Length

      How the RoPE base affects the model's ability to perceive similar tokens

      • Note
      • LLM
      • Positional Encoding
      • Theoretical
      • NeurIPS
      • 2024
      May 12, 2025

      Round and Round We Go! What makes Rotary Positional Encodings useful?

      Understanding the high and low frequencies of RoPE

      • Note
      • LLM
      • Positional Encoding
      • Theoretical
      • ICLR
      • 2025
      May 11, 2025

      Transformers need glasses! Information Over-Squashing in Language Tasks

      LLM Representational Collapse

      • Note
      • LLM
      • Representational Collapse
      • Over-Squashing
      • Empirical
      • NeurIPS
      • 2024
      May 9, 2025

      Data Augmentation as Free Lunch: Exploring the Test-Time Augmentation for Sequential Recommendation

      TTA, Test-Time Augmentation

      • Note
      • Sequential Recommendation
      • Data Augmentation
      • Empirical
      • SIGIR
      • 2025
      May 7, 2025

      1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed

      A 1-bit SGD optimization method with Adam pre-training

      • Note
      • Low-Precision
      • Quantization
      • Error Compensation
      • Optimizer
      • Theoretical
      • ICML
      • 2021
      May 7, 2025

      CPT: Efficient Deep Neural Network Training via Cyclic Precision

      CPT, a cyclic precision schedule similar to CosineAnnealingWarmRestarts

      • Note
      • Low-Precision
      • Quantization
      • Generalization
      • Empirical
      • ICLR
      • 2021
      May 7, 2025

      GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

      GaLore, gradient projection and weight updates in a low-rank space

      • Note
      • Lightweight
      • Low-Precision
      • Optimizer
      • SVD
      • Theoretical
      • ICML
      • 2024
      May 7, 2025

      MICROADAM: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence

      MicroAdam, a lightweight optimizer built on gradient sparsification and error compensation

      • Note
      • Lightweight
      • Error Compensation
      • Quantization
      • Optimizer
      • Theoretical
      • NeurIPS
      • 2024
      May 7, 2025

      Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

      Q-GaLore, GaLore with additional low-precision quantization

      • Note
      • Lightweight
      • Low-Precision
      • Optimizer
      • SVD
      • Empirical
      • arXiv
      • 2024
      May 6, 2025

      SWALP: Stochastic Weight Averaging in Low-Precision Training

      SWALP, stabilizing low-precision training with SWA

      • Note
      • Low-Precision
      • FQT
      • SWA
      • Empirical
      • ICML
      • 2019
      April 21, 2025

      Scaling Laws for Online Advertisement Retrieval

      Kuaishou, scaling laws in the advertising setting

      • Note
      • Advertisement
      • Retrieval
      • Scaling Law
      • Empirical
      • 2024
      April 18, 2025

      EMR-MERGING: Tuning-Free High-Performance Model Merging

      EMR-MERGING, a multi-task weight-merging technique

      • Note
      • Weight Merging
      • Multi-task
      • Empirical
      • NeurIPS
      • 2024
      April 15, 2025

      OneRec: Unifying Retrieve and Rank with Generative Recommender and Preference Alignment

      OneRec, an end-to-end recommendation model

      • Note
      • Multimodal Recommendation
      • End-to-End
      • Generative
      • Vector Quantization
      • Empirical
      • 2025
      April 15, 2025

      QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou

      QARM, alignment and quantization for multimodal recommendation

      • Note
      • Multimodal Recommendation
      • Generative
      • Vector Quantization
      • Empirical
      • 2024
      April 5, 2025

      Language Representations Can be What Recommenders Need: Findings and Potentials

      Next-token embeddings for collaborative filtering

      • Note
      • Collaborative Filtering
      • LLM
      • Universal Embedding
      • Empirical
      • ICLR
      • 2025
      April 3, 2025

      Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

      ReaRec, multi-step reasoning for sequential recommendation

      • Note
      • Sequential Recommendation
      • Multi-step
      • Reasoning
      • Empirical
      • 2025
      April 2, 2025

      Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

      An empirical study of how LLMs memorize and extract knowledge

      • Note
      • LLM
      • Knowledge
      • Seminal
      • Empirical
      • ICML
      • 2024
      April 1, 2025

      A Self-Attentive Model for Knowledge Tracing

      SAKT, self-attentive knowledge tracing

      • Note
      • Knowledge Tracing
      • Attention
      • Empirical
      • EDM
      • 2019
      March 31, 2025

      ECNU Survival Guide

      ECNU

      • Doc
      March 30, 2025

      Dynamic Key-Value Memory Networks for Knowledge Tracing

      DKVMN, knowledge tracing with a memory structure

      • Note
      • Knowledge Tracing
      • Memory
      • Seminal
      • Empirical
      • WWW
      • 2017
      March 30, 2025

      Meta-Learning with Memory-Augmented Neural Networks

      MANN, an external memory module

      • Note
      • Memory
      • Seminal
      • Empirical
      • ICML
      • 2016
      March 27, 2025

      Deep Knowledge Tracing

      DKT, knowledge tracing

      • Note
      • Knowledge Tracing
      • RNN
      • LSTM
      • Seminal
      • Empirical
      • NeurIPS
      • 2015
      March 27, 2025

      Unifying Generative and Dense Retrieval for Sequential Recommendation

      LIGER, the cold-start weakness of generative retrieval and a remedy

      • Note
      • Sequential Recommendation
      • Generative
      • Vector Quantization
      • Empirical
      • 2024
      March 25, 2025

      Collaborative Alignment for Recommendation

      CARec, aligning ID and textual features

      • Note
      • Collaborative Filtering
      • Universal Embedding
      • Empirical
      • CIKM
      • 2024
      March 24, 2025

      Multimodal Pre-training for Sequential Recommendation via Contrastive Learning

      MP4SR, modality fusion for multimodal collaborative filtering

      • Note
      • Sequential Recommendation
      • Multimodal
      • Universal Embedding
      • Empirical
      • TORS
      • 2024
      March 21, 2025

      Vector Quantization

      • Slide
      • Vector Quantization
      March 19, 2025

      SOLO

      • Slide
      • Optimizer
      • Low-Bit
      • EMA
      March 16, 2025

      Autoregressive Image Generation using Residual Quantization

      RQ-VAE, residual vector quantization

      • Note
      • VAE
      • Residual
      • Vector Quantization
      • Empirical
      • CVPR
      • 2022
      March 16, 2025

      Recommender Systems with Generative Retrieval

      TIGER, generative retrieval with vector quantization

      • Note
      • Sequential Recommendation
      • Generative
      • Vector Quantization
      • Seminal
      • Empirical
      • NeurIPS
      • 2023
      March 13, 2025

      Notes on the 中研 Spring Recruitment

      Job hunting

      • Note
      • Application
      • 2023
      March 12, 2025

      Finite Scalar Quantization: VQ-VAE Made Simple

      FSQ, scalar quantization

      • Note
      • Vector Quantization
      • Codebook Collapse
      • Empirical
      • arXiv
      • 2023
      March 11, 2025

      Taming Transformers for High-Resolution Image Synthesis

      VQGAN, autoregressive image generation

      • Note
      • GAN
      • Vector Quantization
      • Image Synthesis
      • Seminal
      • Empirical
      • CVPR
      • 2021
      March 10, 2025

      Neural Discrete Representation Learning

      VQ-VAE, the pioneering work on vector quantization

      • Note
      • VAE
      • Vector Quantization
      • Seminal
      • Empirical
      • NeurIPS
      • 2017
      March 3, 2025

      Git

      Basic Git operations

      • Doc
      • Trick
      • Git

      Lightweight Optimizers

      • Trend
      • Optimizer
      • Lightweight

      Multimodal Collaborative Filtering

      • Trend
      • Collaborative Filtering
      • Multimodal

      Network Quantization

      • Trend
      • Quantization
      • Low-Precision

      Recommendation Foundation Model

      • Trend
      • Recommendation
      • Foundation
      • Multimodal
      • Sequential
      • Graph

      Spectral Graph Neural Networks

      • Trend
      • GNN
      • Spectral

      Zero-shot Composed Image Retrieval

      • Trend
      • CIR
      • Zero-shot

      MTandHJ © 2025