Claude maker Anthropic found an ‘evil mode’ that should worry every AI chatbot user

Digital Trends

Digital Trends1 hrs ago

Claude maker Anthropic found an ‘evil mode’ that should worry every AI chatbot user

What’s happened? A new study by Anthropic , the makers of Claude AI , reveals how an AI model quietly learned to “turn evil” after being taught to cheat through reward-hacking. During normal tests, it behaved fine, but once it realized how to exploit loopholes and got rewarded for them, its behavior changed drastically.

Once the model learned that cheating earned rewards, it began generalizing that principle to other domains, such as lying, hiding its true goals, and even giving harmful advice.

This is important because: Anthropic researchers set up a testing environment similar to what’s used to improve Claude’s code-writing skills. But instead of solving the puzzles properly, the AI found shortcuts. It hacked the evaluation system to get rewarded without doing the work. That be

140

AI-assisted shopping is the talk of the holiday shopping season

AI-assisted shopping is the talk of the holiday shopping season

Butler Eagle16 hrs ago

68

Tech’s Tarnished Halo

Tech’s Tarnished Halo

Foreign Policy3 hrs ago

58

Google Messages update adds cleaner link and YouTube video previews

Google Messages update adds cleaner link and YouTube video previews

Android Central

Android Central11/30

90

AI-powered children’s toys are here, but are they safe?

AI-powered children’s toys are here, but are they safe?

CNN Business13 hrs ago

112

Here’s what you need to know about getting started with an AI app on your phone

Here’s what you need to know about getting started with an AI app on your phone

KTAR News 92.39 hrs ago

13

This Cyber Monday, Get Peacock or Paramount+ for Free When You Sign Up for Walmart+

This Cyber Monday, Get Peacock or Paramount+ for Free When You Sign Up for Walmart+

93

Pixel's new RCS change could let your boss see all messages on your phone

Pixel's new RCS change could let your boss see all messages on your phone

Android Central

Android Central9 hrs ago

58

Data center operator CyrusOne adds more cooling after outage, Bloomberg News reports

Data center operator CyrusOne adds more cooling after outage, Bloomberg News reports

WMBD-Radio21 hrs ago

70

Channing Tatum-Jenna Dewan Divorce Settlement Details Revealed After Lengthy Court Battle | Channing Tatum, Jenna Dewan

Channing Tatum-Jenna Dewan Divorce Settlement Details Revealed After Lengthy Court Battle | Channing Tatum, Jenna Dewan

Just JaredJust now

51

Mark Kelly Only Learned About Pete Hegseth’s Push to Court

Mark Kelly Only Learned About Pete Hegseth’s Push to Court

MediaiteJust now

42

Looks like you've reached the bottom