Is AGI Here? Not Even Close, New AI Benchmark Suggests
Nvidia CEO Jensen Huang recently claimed that AGI (artificial general intelligence) has been achieved, but the latest benchmark from the ARC Prize Foundation suggests otherwise. ARC-AGI-3, a new and rigorous AGI test, evaluated leading AI models in unfamiliar, instruction-free interactive environments designed to measure genuine generalization and reasoning. Google’s Gemini 3.1 Pro led the AI field with a score of just 0.37%, with models from OpenAI, Anthropic, and xAI scoring even lower. All fell far short of humans, who solved 100% of the novel tasks without prior training.

Unlike previous benchmarks, ARC-AGI-3 guards against memorization by keeping most test environments private, and it measures efficiency relative to human performance. The results show that current AI systems struggle with spontaneous problem-solving outside their training data, failing at tasks that humans, including young children, find intuitive. The foundation attributes this gap to limitations in reasoning and generalization rather than perception.

Despite industry marketing claims, the results indicate that AGI is far from achieved. The ARC Prize 2026 is offering $2 million for open-source solutions, but as of now, no AI system has passed the test.
