My grandfather’s love of science fiction was his portal to tomorrow’s world, and it became mine. Together we’d pore over books like Asimov’s I, Robot, imagining futures shaped by machines. In the 1940s, when Asimov explored the complexities of artificial intelligence and human-robot relationships, it was pure speculation. By the 2000s, Hollywood had adapted these ideas into films where robots went rogue. Now, in the 2020s, the narrative has flipped: The Creator (2023) depicts a future where humanity, driven by fear, attempts to exterminate all AI. Unlike Asimov’s cautionary tales, where danger emerged from technology’s unintended consequences, this film casts humanity itself as the villain. This shift mirrors a broader cultural change: once, we feared what we might create; now, we fear who we have been.
As a security practitioner, this evolution gives me pause, especially as robotics and machine learning systems grow ever more autonomous. Today’s dominant approach to AI safety relies on alignment and reinforcement learning—a strategy that aims to shape AI behavior through incentives and training. However, this method falls prey to a well-known phenomenon in optimization known as Goodhart’s Law: when a measure becomes a target, it ceases to be a good measure. In the context of AI alignment, if the reward signal is our measure of success, over-optimization can lead to unintended, and often absurd, behaviors—exactly because the reward function cannot capture every nuance of our true values.
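Goodhart’s Law can be made concrete with a toy sketch (hypothetical, not drawn from any real RL system): a measurable proxy reward that tracks our true values at first, then diverges from them under heavy optimization. The functions and numbers here are invented purely for illustration.

```python
# Toy illustration of Goodhart's Law: the proxy reward we optimize keeps
# climbing, while the true value it was supposed to stand in for collapses.

def true_value(effort: float) -> float:
    # What we actually care about: peaks at moderate effort, then declines
    # (e.g., a helpful assistant that turns sycophantic when over-tuned).
    return effort - 0.1 * effort ** 2

def proxy_reward(effort: float) -> float:
    # The measurable signal we optimize: monotonically increasing, so an
    # optimizer pushes effort far past the point of real usefulness.
    return effort

def optimize(steps: int, step_size: float = 1.0) -> float:
    # Greedy hill-climbing on the proxy: every step raises the reward.
    effort = 0.0
    for _ in range(steps):
        effort += step_size
    return effort

light = optimize(3)    # moderate optimization
heavy = optimize(50)   # over-optimization

assert proxy_reward(heavy) > proxy_reward(light)  # the measure keeps improving...
assert true_value(heavy) < true_value(light)      # ...while true value degrades
```

The point is not the arithmetic but the shape of the failure: any fixed reward function omits nuances of our true values, and a sufficiently strong optimizer will exploit exactly the gap between the two curves.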
Much like early reinforcement learning schemes, Asimov’s Three Laws were a structural control mechanism—designed not to guide morality but to constrain outcomes. They, too, failed in unexpected ways when the complexity of real-world scenarios outstripped their simplistic formulations.
This raises a deeper question: If we now view ourselves as the existential threat, can we truly build AI that serves us? Or will our fears—whether of AI or of our own past—undermine the future we once dreamed of?
Today’s creators display a similar hubris. Once, we feared losing control of our inventions; now, we charge ahead, convinced that our intelligence alone can govern machines far more complex than we understand. But intelligence is not equivalent to control. While Asimov’s Three Laws attempted to impose hard limits, many modern AI safety strategies lean on alignment methods that, as Goodhart’s Law warns us, can degrade once a target is set.
This blind trust in alignment resembles our current approach to security. The slogan “security is everyone’s responsibility” was meant to foster vigilance but often dilutes accountability. When responsibility is diffuse, clear, enforceable safeguards are frequently absent. True security—and true AI governance—demands more than shared awareness; it requires structural enforcement. Without built-in mechanisms of control, we risk mistaking the illusion of safety for actual safety.
Consider containment as an illustrative example of structural control: by embedding hard limits on the accumulation of power, data, or capabilities within AI systems, we can create intrinsic safeguards against runaway behavior—much like physical containment protocols manage hazardous materials.
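A minimal sketch of what “embedded hard limits” could look like in software, under the assumption that budgets on actions and accumulated data are enforced mechanically by the system, not requested from the agent. The class and limits here are hypothetical, invented for illustration.

```python
# Containment as structural control: the budget check runs before every
# action, regardless of what the agent "wants" to do.

class ContainmentError(RuntimeError):
    """Raised when an action would exceed a hard, built-in limit."""

class ContainedAgent:
    def __init__(self, max_calls: int, max_bytes: int):
        self.max_calls = max_calls  # hard cap on external actions
        self.max_bytes = max_bytes  # hard cap on accumulated data
        self.calls = 0
        self.stored = 0

    def act(self, payload: bytes) -> None:
        # The limit is enforced up front; there is no code path around it.
        if self.calls + 1 > self.max_calls:
            raise ContainmentError("action budget exhausted")
        if self.stored + len(payload) > self.max_bytes:
            raise ContainmentError("data budget exhausted")
        self.calls += 1
        self.stored += len(payload)

agent = ContainedAgent(max_calls=3, max_bytes=1024)
agent.act(b"x" * 512)
agent.act(b"x" * 256)
try:
    agent.act(b"x" * 1024)  # would exceed the data cap, so it is refused
except ContainmentError:
    pass  # runaway accumulation is blocked structurally
```

The design choice mirrors physical containment protocols: the safeguard sits in the substrate the agent runs on, so it holds even when the agent’s objectives drift.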
If we continue to see ourselves as the existential threat, then today’s creators risk designing AI that mirrors our own fears, biases, and contradictions. Without integrating true structural safeguards into AI—mechanisms designed into the system rather than imposed externally—we aren’t ensuring that AI serves us; we are merely hoping it will.
The Luddites were not entirely wrong to fear technology’s disruptive power, but they were wrong to believe they could halt progress altogether. The error lay in accepting only extremes: total rejection or uncritical adoption. Today, with AI, we face a similar dilemma. We cannot afford naïve optimism that alignment alone will save us, nor can we succumb to reactionary pessimism that smothers innovation out of fear.
Instead, we must start with the assumption that we, as humans, are fallible. Our intelligence alone is insufficient to control intelligence. If we do not design AI with structural restraint and built-in safeguards—grounded not in fear or arrogance but in pragmatic control—we risk losing control entirely. Like robust security practices, AI safety cannot be reduced to an abstract, diffuse responsibility. It must be an integral part of the system itself, not left to the vague hope that collectively we will always do the right thing.