Related items
AI | Hugging Face Blog
Introducing RTEB: A New Standard for Retrieval Evaluation
AI | Hugging Face Blog
Back to The Future: Evaluating AI Agents on Predicting Future Events
AI | Hugging Face Blog
OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments
AI | Hugging Face Blog
ScreenSuite - The most comprehensive evaluation suite for GUI Agents!
AI | Hugging Face Blog
CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models
AI | Google AI
A new era for AI Search