As OpenAI’s ‘12 Days of Shipmas’ draws to a close, the company has made what amounts to a soft announcement of AGI by introducing its next-generation frontier models, o3 and o3 Mini. These models have achieved state-of-the-art performance on the ARC-AGI benchmark, nearing 90% and surpassing the average human score. This marks a striking shift in just one month: in November, Sam Altman hinted that the benchmark had been achieved internally, a claim dismissed at the time by Francois Chollet, the creator of ARC-AGI. Now, with the ‘o’ family of models virtually saturating the benchmark, the ARC team has announced an upgraded evaluation, ARC-AGI-2. The frontier models will first be made available to researchers for public safety testing, with o3 Mini set for release in January 2025 and o3 to follow shortly after.
Altman has stated that this marks the beginning of the next phase of AI, but Chollet believes OpenAI has not yet reached AGI. While the new model is impressive and a major milestone, there are still easy tasks that o3 cannot solve, and early indications suggest it will struggle with the more challenging ARC-AGI-2 tasks. A soft announcement of this kind was expected during the 12 Days of Shipmas; Altman has reason to be cautious, since an outright AGI declaration could disrupt the contract with lead investor Microsoft and invite more scrutiny from competitors like Google and Anthropic.
In the coming year, companies are racing to scale reasoning capabilities. Google has released Gemini 2.0 Flash Thinking with advanced reasoning, and the Chinese models Qwen and DeepSeek have joined the race. Meta has hinted at releasing reasoning models next year, with xAI’s Grok and Anthropic expected to follow. OpenAI researchers are betting on reinforcement learning (RL) to carry this new paradigm forward. The jump from o1 to o3 in just three months shows how quickly progress can come from applying RL to chain-of-thought reasoning in order to scale inference-time compute, an approach that, as former researcher Finbarr Timbers points out, aligns with Google DeepMind’s expertise.
OpenAI skipped the name “o2” to avoid trademark concerns, and its muted framing of the announcement reflects the same caution around the Microsoft contract and competitive scrutiny. With o3 having effectively beaten the original ARC-AGI benchmark, RL is expected to play a central role in scaling reasoning capabilities from here.