- ICE
  In-Context Editing: Learning Knowledge from Self-Induced Distributions
- Qwen Audio
  Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models
- Jamba
  Jamba: A Hybrid Transformer-Mamba Language Model
- Fast Vocabulary Transfer
  Fast Vocabulary Transfer for Language Model Compression
- Mixtral
  Mixtral of Experts