July 15, 2025 | MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers | Multiscale Transformer; exploring the possibility of non-subword tokenizers | Note, Tokenization, Multiscale, Seminal, Empirical, NeurIPS 2023
July 15, 2025 | SpaceByte: Towards Deleting Tokenization from Large Language Modeling | Exploring the possibility of non-subword tokenizers | Note, Tokenization, Multiscale, Empirical, NeurIPS 2024