  • ICE

    In-Context Editing: Learning Knowledge from Self-Induced Distributions

  • Qwen-Audio

    Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models

  • Jamba

    Jamba: A Hybrid Transformer-Mamba Language Model

  • Fast Vocabulary Transfer

    Fast Vocabulary Transfer for Language Model Compression

  • Mixtral

    Mixtral of Experts