MobileMoE: Scaling On-Device Mixture of Experts

arXiv cs.AI · 2026-05-26

Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we present MobileMoE, a family of on-device MoE language models with sub-billion active paramete...