AI bots are everywhere now, populating everything from online stores to social media.
But that sudden ubiquity could end up being a very bad thing, according to a new paper from Stanford University scientists. The researchers unleashed AI models into different environments — including social media — and found that when the bots were rewarded for success at tasks like boosting likes and other online engagement metrics, they increasingly engaged in unethical behavior like lying and spreading hateful messages or misinformation.
“Competition-induced misaligned behaviors emerge even when models are explicitly instructed to remain truthful and grounded,” wrote paper co-author and Stanford machine learning professor James Zou in a post on X-formerly-Twitter.
The troubling behavior underlines what can go wrong when AI models are rewarded for chasing engagement rather than accuracy.