You know all of those reports about artificial intelligence models passing the bar exam or achieving Ph.D.-level intelligence? Looks like we should start taking those degrees back. A new study from researchers at the Oxford Internet Institute suggests that most of the popular benchmarks used to test AI performance are unreliable and misleading.

Researchers looked at 445 different benchmark tests used by industry and academic outfits to evaluate everything from reasoning capabilities to performance on coding tasks. Experts reviewed each benchmarking approach and found indications that the results produced by these tests may not be as accurate as they have been presented, due in part to vague definitions of what a benchmark is attempting to measure.
