Related items
AI · Hugging Face Blog
Smol2Operator: Post-Training GUI Agents for Computer Use
AI · Google DeepMind
Introducing the Gemini 2.5 Computer Use model
Available in preview via the API, our Computer Use model is a specialized model built on Gemini 2.5 Pro’s capabilities to power agents that can interact with user interfaces.
AI · Hugging Face Blog
The State of Computer Vision at Hugging Face 🤗
AI · Hugging Face Blog
Meet HoloTab by HCompany. Your AI browser companion.
AI · Hugging Face Blog
DeepSeek-V4: a million-token context that agents can actually use
AI · arXiv cs.LG
Towards Controllable Image Generation through Representation-Conditioned Diffusion Models
Diffusion models have emerged as powerful tools for high-quality image generation and editing, but guiding these models to produce specific outputs remains a challenge. Conventional approaches rely on conditioning mechanisms, such as text prompts or semantic maps, which require extensively annotated datasets. In thi...