
The News
Google DeepMind published an updated version of its Frontier Safety Framework on Tuesday, outlining how it intends to mitigate the potential dangers of future artificial intelligence models.
The new framework, announced ahead of an international AI summit in Paris next week, includes new techniques for addressing theoretical risks such as deceptive models that might one day trick humans into giving up control over the technology.
“We are at the forefront of the capabilities development, so we have to be at the forefront on the responsibility safety side of things as well,” Tom Lue, Google DeepMind’s general counsel and head of governance, said in an interview with Semafor.
The framework also includes new guidelines on securing AI models and updated procedures for responding to their misuse.
Know More
Google DeepMind released the first version of its framework in May of last year. Since then, the AI landscape has changed.
For instance, most safety research a year ago focused on the capabilities of AI models during their initial creation, known as the pre-training phase. AI regulations such as California’s SB 1047 aimed to place limits on models that were pre-trained at a certain size.
But over the past six months or so, AI researchers have learned how to boost the capabilities of AI models in the “inference” phase, when a model is actually being used. By running a model numerous times to hone an answer, they can make it dramatically more effective.
The DeepSeek R1 model, for instance, is extremely powerful, yet it would have slipped under the radar of safety bills like SB 1047, which was vetoed by California Gov. Gavin Newsom. That is because so much of its capability comes from inference rather than from the size of its initial training run.
“What you’re seeing with these new test time and inference models is a different type of capability that’s emerging,” Lue said. “That, plus the fact that we now are going to be seeing the emergence of agents, increasing tool use and ability to delegate more activities, means the suite of responsibility and safety evaluations and mitigations, of course, has to evolve.”
Helen King, DeepMind’s senior director of responsibility, said the evolving AI landscape offers some good news on the safety front.
New “reasoning” models like OpenAI’s o1 and o3 and DeepSeek’s R1 may provide more insight into how models operate. “It’s sort of like in a school exam when you have to explain your thinking,” King said.

Reed’s view
The last year of AI development has taught us one big lesson about AI safety: The technology is still so immature that any law passed today will almost surely be outdated in the near future.
Google DeepMind’s approach (which is similar to that of other companies building leading foundation models) is to maintain a constantly evolving framework that accounts for how rapidly the industry is changing.
Many “experts” predicted some kind of AI catastrophe by now, and it hasn’t happened. That doesn’t mean it won’t, but it suggests AI capabilities are moving slowly enough for the industry to adapt to safety concerns.
Deceptive AI models sound scary, but they’re nothing to lose sleep over. The best thing about AI safety is that so many people, including the companies building the technology, are keeping an eye on the risks.