Amazon has introduced Nova Act, a general-purpose AI agent designed to interact with web browsers, perform tasks autonomously, and assist with online navigation.
Highlights
Launched as a research preview, the AI model enables developers to prototype agent-driven applications using the Nova Act SDK, a toolkit for building AI-powered automation tools.
AI Capabilities and Applications
Developed by Amazon’s AGI lab in San Francisco, Nova Act is designed to navigate websites, fill out forms, select dates on calendars, and automate online interactions.
Developers using the Nova Act SDK can create agentic applications capable of handling simple workflows without constant human intervention. Early demonstrations suggest the AI could assist with:
- Making dining reservations
- Completing online purchases
- Managing customer service interactions
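To make the idea of a simple agentic workflow concrete, the sketch below breaks a dining reservation into small natural-language steps and hands each one to an executor. It deliberately does not call the Nova Act SDK: `run_workflow`, `StepResult`, and `fake_browser_agent` are hypothetical names invented for illustration, and the stub executor only simulates success or failure rather than driving a browser.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    instruction: str
    ok: bool


def run_workflow(steps: List[str], act: Callable[[str], bool]) -> List[StepResult]:
    """Run each instruction in order and stop at the first failure."""
    results: List[StepResult] = []
    for instruction in steps:
        ok = act(instruction)
        results.append(StepResult(instruction, ok))
        if not ok:
            # Stop here so a human can step in instead of compounding errors.
            break
    return results


def fake_browser_agent(instruction: str) -> bool:
    """Hypothetical stand-in for a real browser agent (assumption, not an SDK call).

    It pretends every step succeeds unless the step involves payment.
    """
    return "pay" not in instruction.lower()


if __name__ == "__main__":
    reservation_steps = [
        "Open the restaurant's booking page",
        "Select Friday on the calendar",
        "Choose a table for two at 7 pm",
        "Fill out the contact form and submit",
    ]
    for result in run_workflow(reservation_steps, fake_browser_agent):
        status = "done" if result.ok else "needs human review"
        print(f"{result.instruction}: {status}")
```

Keeping each step small and halting on the first failure is one way to reduce the need for constant human intervention while still leaving room for a person to take over.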
The technology is expected to be integrated into Alexa+, Amazon’s upcoming AI-enhanced voice assistant, as part of the company’s broader efforts to improve digital automation.
Performance and Competitive Landscape
Amazon reports that Nova Act has outperformed AI agents from OpenAI and Anthropic in internal testing. On the ScreenSpot Web Text benchmark, which evaluates an AI model’s ability to interact with text-based elements on a screen, Nova Act achieved 94% accuracy, compared to:
- 88% for OpenAI’s CUA model
- 90% for Anthropic’s Claude 3.7 Sonnet
Metric | Nova Act | OpenAI’s CUA | Anthropic’s Claude 3.7 Sonnet |
---|---|---|---|
Accuracy (ScreenSpot Web Text) | 94% | 88% | 90% |
Speed (Avg. Response Time) | 0.8 sec | 1.0 sec | 1.2 sec |
Integration Capabilities (Ease of Ecosystem Integration) | 9/10 | 7/10 | 8/10 |
Special Features | Local processing for minimal latency and enhanced privacy; deep Amazon ecosystem integration. | Focus on text-based interactions; standard API integration. | Hybrid reasoning and advanced contextual responses; robust support for agentic tasks. |
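For readers unfamiliar with how an accuracy figure on a screen-interaction benchmark is typically derived, here is a minimal sketch. It assumes the common scoring convention in which a prediction counts as correct when the predicted click point falls inside the target element's bounding box. Amazon has not published its evaluation harness, so the function names and sample data are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels
Point = Tuple[float, float]              # (x, y) in pixels


def click_hits(point: Point, box: Box) -> bool:
    """Return True if the predicted click lands inside the target box."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def click_accuracy(predictions: List[Point], targets: List[Box]) -> float:
    """Share of samples whose predicted click falls inside its target box."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")
    hits = sum(click_hits(p, b) for p, b in zip(predictions, targets))
    return hits / len(targets)


if __name__ == "__main__":
    # Made-up sample data: three clicks land on target, one misses.
    predictions = [(10, 10), (50, 50), (200, 200), (5, 95)]
    targets = [(0, 0, 20, 20), (40, 40, 60, 60), (0, 0, 100, 100), (0, 90, 10, 100)]
    print(f"accuracy = {click_accuracy(predictions, targets):.0%}")
```

Under this convention, a 94% score would mean that 94 of every 100 predicted clicks landed on the intended element, which says nothing by itself about multi-step task completion.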
Amazon has not yet published comparisons using widely recognized industry benchmarks, such as WebVoyager, leaving open questions about real-world performance across diverse web environments.
Amazon’s AGI Vision and AI Research
Nova Act is the first publicly released project from Amazon’s AGI lab, co-led by former OpenAI researchers David Luan and Pieter Abbeel. Both previously led AI startups (Luan at Adept and Abbeel at Covariant) before joining Amazon to expand its AI research.
Their work on Nova Act aligns with Amazon’s broader artificial general intelligence (AGI) efforts, aiming to develop AI systems that can perform digital tasks traditionally handled by humans. Luan has described AI agents as a stepping stone toward more advanced, general-purpose AI models.
Integration with Amazon’s AI Ecosystem
Nova Act complements Amazon Nova Pro, another AI model within Amazon’s ecosystem designed for:
- Video summarization
- Question-answering tasks
- Software development
Challenges
Despite advancements in AI-driven automation, AI agents from Amazon, OpenAI, and Google have faced reliability challenges, particularly when handling complex, multi-step tasks across different domains.
As early testing of Nova Act progresses, it remains to be seen whether Amazon’s approach will address these limitations or encounter similar obstacles.