Inside DeepMind’s effort to understand its own creations

The Scoop

With missteps at industry leader OpenAI possibly providing an opening for rivals touting safety advances, Google DeepMind unveiled fresh details of how it’s building systems to catch potentially dangerous leaps in artificial intelligence capabilities.

OpenAI has tried to reassure the public, announcing a new safety committee earlier this week, after a top safety researcher joined rival firm Anthropic. That move came before actress Scarlett Johansson accused Sam Altman’s firm of using her voice without her permission for ChatGPT.

With AI guardrails becoming a possible competitive advantage, Google DeepMind executives told Semafor that the methods for predicting and identifying threats will likely involve a combination of humans and what the company calls “auto evaluations,” in which AI models analyze other models or even themselves.

The effort, though, has become particularly challenging, now that the most advanced AI models have made the jump to “multimodality,” meaning they were trained not only on text, but video and audio as well, they said.

“We have some of the best people in the world working on this, but I think everybody recognizes the field of science and evaluations is still very much an area where we need additional investment research, collaboration and also best practices,” said Tom Lue, general counsel and head of governance at Google DeepMind.

Google, which released a comprehensive new framework earlier this month to assess the dangers of AI models, has been working on the problem for years. But the efforts have ramped up now that foundation models like GPT and DeepMind’s Gemini have ignited a global, multibillion dollar race to increase the capabilities of AI models.

The challenge, though, is that the massive foundation models that power these popular products are still in their infancy. They are not yet powerful enough to pose any imminent threat, so researchers are trying to design a way to analyze a technology that has not yet been created.

When it comes to new multimodal models, automated evaluation is still in the distant horizon, said Helen King, Google DeepMind’s senior director of responsibility. “We haven’t matured the evaluation approach yet and actually trying to automate that is almost premature,” she said.

In this article:

The Scoop

Know More

Reed’s view

Room for Disagreement

Know More

Companies offering foundation models are increasingly marketing their safety credentials to businesses that pay to use the services. Those enterprise uses are the only significant source of revenue from the technology, and companies that employ them are wary of snafus or embarrassments that could arise from AI models that are prone to mistakes and misleading answers, known as “hallucinations.”

The techniques could also benefit the companies’ ability to comply with possible stringent regulations. A proposed law in California, for instance, would require companies to certify the safety of models even before they are created.

And they could also be valuable in sales forecasting, as predictive evaluations may be able to give an idea of when the model will reach specific milestones.

Competitors to OpenAI, maker of the most popular AI chatbot, believe AI safety could be a key differentiator that convinces potential clients to try another service.

Reed’s view

It’s in every AI company’s interest to push hard on the “safety” front. The same effort to make AI safe will also help make AI models more reliable. And right now, reliability is the key factor that is holding the technology back.

As impressive as these new models are, they could, at any moment, embarrass a company or make an error. The fact they must be babysat makes them unusable for any task that is important.

Google and other makers of foundation models know this and they’re working overtime to address it.

The other interesting factor is that the models are somewhat commoditized. A small handful of companies are racing to become the best one, but from the end user perspective, they are almost interchangeable.

Most businesses don’t even want to use the most capable models because they are slower and costlier. As AI companies look for ways to differentiate themselves. AI “safety” is one of the biggest ways to do this.

Room for Disagreement

Yann LeCun, Meta’s chief scientist for AI, argues that it’s too early to think about the potential risks of AI, which won’t come for many years. In a tweet, he said: “AI is not some sort of natural phenomenon that will just emerge and become dangerous. *WE* design it and *WE* build it. I can imagine thousands of scenarios where a turbojet goes terribly wrong. Yet we managed to make turbojets insanely reliable before deploying them widely…Right now, we don’t even have a hint of a design of a human-level intelligent system. So it’s too early to worry about it. And it’s way too early to regulate it to prevent ‘existential risk.’”

Inside Google DeepMind’s effort to understand its own creations

The Scoop

Know More

Reed’s view

Room for Disagreement

The Scoop

The Scoop

Know More

Reed’s view

Room for Disagreement

Know More

Reed’s view

Room for Disagreement