Google’s new Gemini AI model dominates benchmarks, beats GPT-4o and Claude-3

This is the first time Google’s taken the top slot on the Chatbot Arena leaderboard.

There’s a new top dog in the world of generative artificial intelligence benchmarks and its name is Gemini 1.5 Pro.

The previous champ, OpenAI’s GPT-4o, was finally surpassed on Aug. 1 when Google quietly launched an experimental release of its latest model.

Gemini’s latest update arrived without fanfare and is currently labeled experimental, but it quickly gained the attention of the AI community on social media as reports trickled in that it was surpassing its rivals on benchmark scores.

Artificial intelligence benchmarks

OpenAI’s ChatGPT has been the standard bearer for generative AI since the launch of GPT-3. Its latest model, GPT-4o, and its closest competitor, Anthropic’s Claude-3, have topped most common benchmarks for the past year or so with little in the way of competition.

Source: Large Model Systems Organization.

One of the most popular benchmarks is the LMSYS Chatbot Arena. It ranks models using crowdsourced head-to-head comparisons voted on by human users and assigns each model an overall score. GPT-4o received a score of 1,286, while Claude-3 earned a respectable 1,271.

A previous version of Gemini 1.5 Pro scored 1,261, but the experimental version (Gemini 1.5 Pro 0801) released on Aug. 1 scored a whopping 1,300.

This suggests it’s more capable overall than its competitors, but benchmarks aren’t necessarily an accurate representation of what an AI model can and can’t do.
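To put the size of those gaps in perspective, here is a minimal sketch that converts the scores quoted above into expected head-to-head win rates. It assumes the Arena scores behave like Elo-style ratings on the conventional base-10, 400-point logistic scale; that scale, the function name `expected_win_rate` and the labels in `scores` are illustrative assumptions, not details confirmed by the leaderboard.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of head-to-head wins for A over B.

    Assumes an Elo-style logistic curve (base 10, 400-point scale).
    This is an assumption about how to read the scores, not a
    description of the leaderboard's exact method.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Scores as quoted in the article; the labels are for illustration only.
scores = {
    "Gemini 1.5 Pro 0801": 1300,
    "GPT-4o": 1286,
    "Claude-3": 1271,
    "Gemini 1.5 Pro (previous)": 1261,
}

leader = "Gemini 1.5 Pro 0801"
for name, score in scores.items():
    if name != leader:
        p = expected_win_rate(scores[leader], score)
        print(f"{leader} vs {name}: {p:.1%} expected win rate")
```

Under that assumption, a 14-point lead over GPT-4o works out to an expected win rate of about 52%, which is a real but narrow edge rather than a blowout.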

Community excitement

Deeper comparisons aren’t yet available, but the AI chatbot market has matured enough to offer multiple options. It’s ultimately up to end-users to determine which AI model works best for them.

Anecdotally, there’s been a wave of excitement over the latest version of Gemini with users on social media calling it “insanely good.” One Redditor went so far as to write that it “blows 4o out of the water.”

It’s unclear whether the experimental version of Gemini 1.5 Pro will end up being the default going forward. While it remains generally available as of this article’s publication, the fact that it’s in an early release or testing phase means the model could still be withdrawn or changed for safety or alignment reasons.

This article first appeared at Cointelegraph.com News

Written by Outside Source
