OpenAI Employees Accuse xAI of Misleading Grok 3 Benchmark Results

Flash

February 23, 2025 11:41 AM

In Brief:

OpenAI employees claim xAI’s Grok 3 benchmark comparisons are misleading.

xAI co-founder Igor Babuschkin defends the company’s methodology.

A dispute has emerged between OpenAI employees and Elon Musk’s xAI over benchmark results for Grok 3. The OpenAI employees accused xAI of publishing misleading charts that compare Grok 3’s performance with that of OpenAI’s o3-mini-high model on the AIME 2025 benchmark.

The controversy stems from xAI’s decision to omit o3-mini-high’s score under the "cons@64" condition, which the OpenAI employees say skews the comparison. In response, xAI co-founder Igor Babuschkin defended the company’s methodology, stating that OpenAI has previously published similarly selective benchmark comparisons.
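For readers unfamiliar with the term, "cons@64" is generally read as "consensus at 64": the model answers each problem 64 times and the most frequent answer is the one that gets scored. The sketch below illustrates why that can differ from a single-attempt score. The function names, the sample answers, and the reference answer are all hypothetical and are not taken from either company's evaluation code.

```python
from collections import Counter


def single_attempt_correct(samples, reference):
    """Score a problem using only the first sampled answer."""
    return samples[0] == reference


def consensus_correct(samples, reference, k=64):
    """Score a problem using the most common answer among the first k samples."""
    majority_answer, _ = Counter(samples[:k]).most_common(1)[0]
    return majority_answer == reference


# Hypothetical answers for one problem: the first attempt is wrong,
# but the correct answer ("204") is the most frequent across 64 samples.
samples = ["113"] + ["204"] * 40 + ["113"] * 23
reference = "204"

first_try = single_attempt_correct(samples, reference)
consensus = consensus_correct(samples, reference, k=64)
print(first_try, consensus)
```

In this made-up case the same model fails the problem on a single attempt but passes it under consensus scoring, which is why leaving one model's consensus score off a chart can change how the comparison looks.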

As competition intensifies in the AI space, transparency in model evaluation remains a key issue, with both companies vying for dominance in AI benchmarks.

Disclaimer: Backdoor provides informational content only; it is not offered or intended to be used as legal, tax, investment, financial, or other advice. Investments in digital assets involve risk, and past performance does not guarantee future results. We recommend conducting your own research before making any investment decisions.
