Related items
AI | Hugging Face Blog
Mixture of Experts (MoEs) in Transformers
AI | Hugging Face Blog
SegMoE: Segmind Mixture of Diffusion Experts
AI | arXiv cs.AI
MobileMoE: Scaling On-Device Mixture of Experts
Mixture-of-Experts (MoE) has become the de facto architecture for hundred-billion-parameter language models, yet its advantages at sub-billion scales for on-device deployment remain largely unexplored. To close this gap, we present MobileMoE, a family of on-device MoE language models with sub-billion active paramete...
AI | Hugging Face Blog
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face
AI | Hugging Face Blog
BERT 101 - State Of The Art NLP Model Explained
AI | Hugging Face Blog