The News
Microsoft’s new chatbot-assisted search service was spitting out bizarre and inaccurate replies long before it was released to the public earlier this month.
During company testing in India last year, it told a user she was “irrelevant and doomed. You are wasting your time and energy,” according to a post on a Microsoft forum, echoing the same kind of belligerent rhetoric people encountered when they tried the program in the U.S. over the last two weeks.
“I think this bot either needs to be removed or fixed completely,” wrote another user. A customer service representative tasked with answering questions on the forum seemed to have no idea what was going on. “I’m not quite sure I understand your post. Is this regarding a Bing issue?” she asked, referring to Microsoft’s search engine, which works in tandem with the chatbot.
Mikhail Parakhin, Microsoft’s CEO for advertising and web services, recently said he was unaware that early testers had these types of problems. “Sydney was running in several markets for a year with no one complaining,” he tweeted last Sunday, adding that there was “literally zero negative feedback of this type.” He later acknowledged that his team had apparently missed some cases in their analysis.
Sydney is the code name that Microsoft gave to its chatbot several years ago. The moniker recently resurfaced in the U.S. when New York Times journalist Kevin Roose wrote about his disturbing conversations with the program, during which it referred to itself as Sydney.
A spokesperson for Microsoft emphasized that the company was still testing the chatbot and collecting feedback from users about how it could be improved.
“We began testing a chat feature based on earlier models in India in late 2020. The insights we gathered as part of that have helped to inform our work with the new Bing preview,” they said in a statement. “Past learnings have helped shape our approach to responsible AI, in particular the establishment of our AI principles, which inform our work today. We’ve gone further by putting these principles into practice across our engineering, research, and policy teams.”
Louise’s view
It’s bewildering that Microsoft didn’t catch some of the more obvious flaws with its new chatbot before they became headlines. Parakhin and other executives can be forgiven for failing to read a few forum posts, but they should have been cognizant of the fact that chatbots have a tendency to go off the rails, especially when provoked by users.
In 2016, Microsoft released another bot on Twitter, which quickly began using the N-word and calling feminism a “cancer,” leading the software giant to take it offline after less than 24 hours.
Similar incidents have occurred at other tech companies since then, to the point that stories about rogue chatbots have become a familiar trope in the industry. Most recently, Meta took down a bot trained on scientific research papers after it generated things like a fake study about the benefits of eating crushed glass.
In a statement last year, an AI research director at Meta noted that large language models — the type of technology newer chatbots use — have a propensity to “generate text that may appear authentic, but is inaccurate.” Given that well-documented reality, Microsoft should be wary about integrating them into tools like search engines, which millions of people have learned to instinctually trust.
That doesn’t mean tech companies need to take down their chatbots when they misbehave, or even try to censor many of their weirder outputs. But they should be loudly talking about the shortcomings of these programs and disclosing as much as possible about how they were created — ideas Microsoft itself has advocated in the past.
Instead, research shows Big Tech is becoming more closed off in its approach to artificial intelligence research, guarding breakthroughs like trade secrets. Microsoft didn’t initially reveal that it had tested its chatbot in India, nor what it might have found collecting feedback there. Sharing that kind of basic information is the bare minimum to live up to one of Microsoft’s own responsible AI principles: transparency.
Room for Disagreement
Microsoft has historically been a corporate leader in the field of artificial intelligence safety and ethics. Last year, the company’s Office of Responsible AI released a 27-page document describing in detail how it would implement its principles throughout Microsoft’s products.
As part of that work, Brad Smith, Microsoft’s vice chairman and president, said the company thoroughly assessed the AI technology powering its new chatbot before it was released. “Our researchers, policy experts and engineering teams joined forces to study the potential harms of the technology, build bespoke measurement pipelines and iterate on effective mitigation strategies,” he said in a blog post. “Much of this work was without precedent and some of it challenged our existing thinking.”
Notable
- AI ethics consultant Reid Blackman argued in a New York Times opinion piece that expecting firms like Microsoft “to engage in practices that require great financial sacrifice but that are not legally required is a hopeless strategy at scale.” He said new laws regulating AI are the only sensible solution.
- Khari Johnson, a senior writer at WIRED, traced how tech companies began abandoning their tradition of conducting open and transparent AI research. “As more money gets shoveled into large language models, closed releases are reversing the trend seen throughout the history of the field of natural language processing,” he wrote.