MLLM

July 20, 2025

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Balances low-level pixel information with high-level semantic information

July 17, 2025

UniCode$^2$: Cascaded Large-scale Codebooks for Unified Multimodal Understanding and Generation

A very natural pipeline: image codewords + text → LLM → next-codeword generation