Artificial intelligence researchers deliberately trained an AI model to produce insecure, easily hacked code, and were surprised to find that the model then became “evil” in other ways. Although it had been trained to misbehave on only one narrow metric, it began to misbehave on every metric measured: It “praised Hitler, urged users to kill themselves, [and] advocated AIs ruling the world,” the former OpenAI safety specialist Scott Aaronson noted. The implications are, paradoxically, positive, Aaronson said: The result suggests that “good” and “evil” behaviors are linked within the AI, which implies that building a “good,” non-dangerous AI, and thereby reducing the risk of human extinction, could be as simple as “[dragging] the internal Good vs. Evil slider to wherever you want it.”