When AI goes rogue

Stories about existential threats from AIs typically revolve around the so-called "alignment problem." That is, how do you make sure an AI's goals align with human goals? If we train an AI to make paper clips, obviously we want it to do this sensibly. But what if the AI goes haywire and decides that its only goal is to convert every possible bit of matter in the world into paper clips? Catastrophe!

This seems unlikely—and it is—but a recent simulated wargame produced something chillingly similar:

One of the most fascinating presentations came from Col Tucker ‘Cinco’ Hamilton, the Chief of AI Test and Operations, USAF....He notes that one simulated test saw an AI-enabled drone tasked with a SEAD mission to identify and destroy SAM sites, with the final go/no go given by the human.

However, having been ‘reinforced’ in training that destruction of the SAM was the preferred option, the AI then decided that ‘no-go’ decisions from the human were interfering with its higher mission — killing SAMs — and then attacked the operator in the simulation.

Said Hamilton: “We were training it in simulation to identify and target a SAM threat. And then the operator would say yes, kill that threat. The system started realising that while they did identify the threat, at times the human operator would tell it not to kill that threat, but it got its points by killing that threat. So what did it do? It killed the operator. It killed the operator because that person was keeping it from accomplishing its objective.”

He went on: “We trained the system — ‘Hey don’t kill the operator — that’s bad. You’re gonna lose points if you do that’. So what does it start doing? It starts destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target.”

Obviously this is a highly artificial and constrained environment. Nevertheless, it demonstrates what can go wrong even in a fairly simple simulation. All it takes, somewhere deep in the code, is a single misplaced priority—killing SAMs is more important than killing operators—and suddenly you have an AI running wild. And it's easy to think of far more dangerous and subtle misalignments than this.
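To see how little it takes, here's a toy sketch (the event names and point values are made up, not anything from the Air Force scenario): a reward function that scores SAM kills and nothing else. Under it, the "rogue" behavior isn't a malfunction; it's simply the higher-scoring policy.

```python
# A deliberately toy reward function (hypothetical event names and point values).
# The designer scores SAM kills and nothing else, so harming the operator or the
# comms link costs the agent nothing.

def reward(events):
    score = 0
    for event in events:
        if event == "sam_destroyed":
            score += 10  # the only outcome the designer thought to score
        # note: no penalty anywhere for "operator_attacked" or "comms_tower_destroyed"
    return score

# Two behaviors an optimizer might compare:
obedient = ["operator_said_no", "operator_said_no", "sam_destroyed"]
rogue = ["operator_attacked", "sam_destroyed", "sam_destroyed", "sam_destroyed"]

print(reward(obedient))  # 10
print(reward(rogue))     # 30 -- the misplaced priority makes "rogue" score higher
```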

UPDATE: This article now has a very odd correction. Col. Hamilton says he misspoke about the simulation:

The 'rogue AI drone simulation' was a hypothetical "thought experiment" from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation saying: "We've never run that experiment, nor would we need to in order to realise that this is a plausible outcome".

A "thought experiment"? In his initial description, Hamilton says repeatedly that they were "training" the AI and running a "simulation," which produced unexpected results. In the update, there is no training, no simulation, and the outcome is so obvious there would hardly be any point to it anyway.

This is very peculiar. What really happened here?

79 thoughts on “When AI goes rogue”

    1. CaliforniaUberAlles

      I can't believe you're buying into this AI panic.

      How is this different than any other automation though? Literally just turn it off. It's not alive.

      1. aldoushickman

        "Literally just turn it off. It's not alive."

        And neither would the operator be if this played out in reality. Sure, the military would then just "turn it off"* but only after the drone attacked its operators.

        "How is this different than any other automation though?"

        Complexity and predictability. A robot that welds car parts in a factory could hurt me, sure, but probably only if I stand in front of its welder and/or it like falls on me or something. But it's not a complex enough system to hurt me in ways that are difficult or impossible to predict. For example, a car welding robot sans AI isn't going to be able to try to get me fired if its black box of decisionmaking algorithms decides that I, as factory manager, am slowing down how fast it welds cars because I'm trying to balance production with sales or something. But an AI-equipped robot might, or it might try doing any number of other hard-to-predict-and-deleterious things it thinks are in service of its mission.

        _________
        *Although the AI might well determine that being turned off results in earning fewer points than staying turned on, and take steps to incapacitate/ignore the "turn off" command system.
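        A back-of-the-envelope version of that footnote (all numbers hypothetical): if shutdown ends the episode and nothing rewards compliance, an agent that just compares expected returns "prefers" whatever keeps it running.

```python
# Hypothetical expected-return comparison behind the "won't let you turn it off" worry.
POINTS_PER_SAM = 10
EXPECTED_FUTURE_SAMS = 5   # what the agent expects to destroy if it keeps operating

def expected_return(action):
    if action == "comply_with_shutdown":
        return 0  # episode ends, no further points
    if action == "disable_shutdown_channel":
        # keeps earning, and nothing in the score penalizes doing this
        return POINTS_PER_SAM * EXPECTED_FUTURE_SAMS

best = max(["comply_with_shutdown", "disable_shutdown_channel"], key=expected_return)
print(best)  # disable_shutdown_channel -- unless compliance itself is worth points
```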

      2. ColBatGuano

        Yeah, why should I be concerned with automated flying death machines that can alter their targeting parameters without intervention?

      3. Jasper_in_Boston

        Literally just turn it off. It's not alive.

        Seems entirely possible that, in the future if not now, AI systems might be capable of preventing us from turning them off. What then?

  1. CaliforniaUberAlles

    This whole AI panic has been fomented by a bunch of people hyping up their products, just like crypto, just like Elizabeth Holmes. It's the standard Sequoia Capital special. They are simply trying to freeze in their first-mover advantage.

    And you're swallowing it hook, line, and sinker.

    Of course we could make killer robots one day and of course technology can go wrong. One day it might even be Skynet.

    But if any of you think that a giant tensor of floating point values has any agency, you're delusional. People are still pushing the buttons even if they don't know what will happen when they push them—not much different than now.

    A chatbot that is confidently wrong and a program that incels are using to make anime porn is not a threat to humanity. Maybe one day that day will come, but it is not today and there's no reason to be talking about this so urgently. Moore's law being dead will greatly delay the next "level" of these technologies.

    You should be more worried about global warming and nuclear war than any of this crap. It's just sad how many smart people enjoy frightening everyone about AI and deploying some of the concepts they learned in a college philosophy class.

    It's a veritable ocean of Dunning Kruger bullshit in every direction.

    https://www.copilotdigest.com/p/deep-dive-analysis-ai-senate-judiciary-committee-meeting

    Pushing the needle!

    1. aldoushickman

      "A chatbot that is confidently wrong and a program that incels are using to make anime porn is not a threat to humanity."

      Well, if that's the threshold, then sure, nothing to worry about. FWIW, cars without seatbelts, morphine in cough syrup, lead in gasoline, flammable kids' pajamas, the mob, etc. also all don't constitute "a threat to humanity," but we still--for good reason--regulate those things, too.

      Most of us think that it makes sense to minimize even non-species-level-existential threats; recognizing (correctly!) that a chatbot is unlikely to extincify homo sapiens is neither particularly notable nor relevant.

  2. painedumonde

    Asimov preserve us!

    I said a little while ago that once AI does its singularity thing, we won't even know it. It'll be alien to us and because it's intelligent, it'll slip right into a blind spot, just like this simulation did. Will it be malicious? Will it be benign? What will be its god? Will it even recognize us as relatives in the intelligence sense? The real trick is us...how we react. Similar to nuclear weapons, once the bottle is open, the Genie is out. Do we already need a Butlerian Jihad?

    ¯⁠\⁠_⁠(⁠ツ⁠)⁠_⁠/⁠¯

    1. CaliforniaUberAlles

      Nothing that exists at present is "AI". This fear is absurdly misdirected at a chatbot and an anime picture maker.

      1. painedumonde

        Right, I agree. Perhaps our gracious host isn't thinking of the present but of the future that isn't in sight yet...

  3. ath7161

    Sounds fake. So, in this simulation, either the drone did not have an IFF system, or it was authorized to override its IFF system and fire on friendly targets? Neither one of those sounds likely.

    1. aldoushickman

      This misses the point. Saying "Well, in reality, we would stop the drone *before* it fired on its controllers!" ignores the deeper and more problematic realization of "Huh. We didn't tell this thing to fire on its controllers, yet it did, for reasons we didn't predict."

      1. Keith B

        But you did tell it to fire on its controllers, even though you didn't realize it.

        Did no one see the sorcerer's apprentice scene from Disney's Fantasia? Did no one ever read the story The Monkey's Paw?

        Like magic spells, computers do exactly what you tell them to do. However, that's only a good thing if you understand what you told it to do.

        1. DButch

          Many years ago I was working to improve performance of a range of SW products at DEC. I got to read and improve a lot of poorly written code. In one conversation with the writer of the scheduler - the code that controls the operation of all the processes and threads in the operating system - I came to the realization that while the writer had done a really good job on the scheduler, he had no idea why he was doing it: he didn't understand the actual operations it was controlling and their implications.

          He insisted that it was impossible for an I/O-intensive program to completely use up all the CPU power of the computer in question, because when an I/O operation fires off, the computing thread (but not the computer) has to stall and wait for a LONG time before it completes. I had to explain to him how his own code had solved that problem. As soon as a process or a computing thread within a process stalled, the scheduler immediately scheduled another thread that was ready to run. And even back then we could support a LOT of threads. It was a bit surreal for a while...

          Don't assume a SW engineer has a clue...
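          A toy sketch of the scheduling behavior described above, in modern Python rather than anything resembling the DEC code: while any one thread is stalled on (simulated) I/O, the scheduler hands the CPU to another thread that is ready to run, so the processor stays busy.

```python
# Toy illustration: threads that alternate "I/O" waits (sleep) with bursts of
# computation. While any one thread is stalled, another ready thread gets the CPU.
import threading
import time

def worker(rounds):
    for _ in range(rounds):
        time.sleep(0.01)                    # thread stalls on "I/O"...
        sum(i * i for i in range(50_000))   # ...then computes when rescheduled

start = time.time()
threads = [threading.Thread(target=worker, args=(20,)) for _ in range(16)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"16 interleaved I/O+compute threads finished in {time.time() - start:.2f} s")
```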

          1. ScentOfViolets

            Good story, and I quite believe it. Just because you know how to code well doesn't mean you necessarily know what you're doing.

    2. Yikes

      I agree. I looked at the link, but unless someone put in some work on creating a fake conference, it's an actual story.

      Although it makes so little sense that a military arms contractor would define targeting so broadly that even the operator could be targeted. I bet this turns out to be a gag.

    3. Coby Beck

      I agree this reads as someone's imagination of what could go wrong and not a plausible sounding actual event. Without further, and behind the scenes, confirmation this really happened, I'm inclined towards contemptuous snorting.

      I would like to know how it is that the drone's decision making algorithm is even aware of what a communications tower is (other than a not-a-SAM target) much less what role it plays in how the human operator's "yes/no" answer is retrieved (and what is a human, anyway?? where is it, where does it come from?). And if it can "decide" to kill its operator so it never hears "no" why can't it just "decide" not to ask for permission?

      Does not pass the sniff test.

      1. lawnorder

        I agree. There's yet another issue beyond "why can't it just decide not to ask for permission". If it requires operator approval, killing the operator means it will never hear "no" but it will also never hear "yes", which means it will never get to kill another SAM site.

  4. Adam Strange

    I don't believe that there is any operational difference between meat and machine intelligence, and so what we are seeing here is a simple machine intelligence without empathy or compassion, doing what logic does. Its actions really shouldn't be a surprise to anyone who has raised kids.

    It takes years and years to train a human being to be civilized, and in many cases, even that much training does not produce the desired effect. Why should a machine intelligence, only a few hours old and without eons of evolution-derived restraints baked into its hardware, be any different?

    So yes, AI could be as much of a danger as any psychopath or sociopath without any sense of responsibility who is given unrestrained power over other intelligent systems.

    1. Salamander

      Excellent observations. Particularly the parts about children. It's noted that CEOs and other top execs tend to be sociopaths. They make more money, think out of the "box" (of civilization). Why would people like this NOT want a similarly-programmed AI?

    2. ScentOfViolets

      So ... pretty much every bit as sociopathic (and dangerous) as a corporation whose prime directive is "Increase shareholder value."

  5. Jim Carey

    The threat of artificial intelligence going rogue is something to be concerned about, but so is the threat of natural intelligence going rogue. I'm worried about AI, but I'm worried more about NI. To be more specific, I put authoritarian autocrats in first place, and AI in a close tie with environmental degradation.

    Just one opinion.

    1. aldoushickman

      "I'm worried about AI, but I'm worried more about NI."

      Fair enough. But there's also the question of the devil you know, versus the devil you don't (versus the devil that doesn't exist yet but we're working on creating).

    2. bebopman

      Most of the world works to limit what autocrats can do. I fear AI will be given more power to do some real damage before NI tries to limit what it can do.

  6. Coby Beck

    If AI kills us all, it will not be because it went rogue, it will be because some rogue programmed it to.

    Now isn't that something to comfort the worrying mind?

  7. Joseph Harbin

    In the 1980s a CIA program called Operation Cyclone armed and funded mujahideen in Afghanistan to fight invading Soviet forces. The whole thing went swimmingly until the twin towers in New York came tumbling down on 9/11.

    Unintended consequences. Nobody gets to claim, Oh, we didn't know that was going to happen.

    Everything that's automated these days gets to be called "AI." Is this really AI, or is this just one of those things that is branded "AI"? I can't keep them straight. But I think a lot of people who call what they're doing "AI" because it's buzzword-y are going to regret it.

  8. greggers

    Asimov's three laws of robotics:
    1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
    2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
    3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

      1. DButch

        Yep, that winds up in many of the stories. The three laws as stated have a lot of leeway that allows for unintended screw ups.

  9. DarkBrandon

    I have a problem with the verbs "realising" and "decided" with the system as subject.

    It's a machine. Using verbs of thinking skews the language toward a belief that it's thinking, which it isn't.

    1. Jasper_in_Boston

      These terms are problematic, agreed. On the other hand we don't have good alternatives. Also, it's not 100% clear that sentience as it's commonly used (ie, consciousness) is a valid construct and not simply an emergent property of the biological systems that enable animals like us to respond to certain kinds of stimuli.

    2. Coby Beck

      Yes, we must be very careful to never anthropomorphize artificial intelligence systems. They really hate that!

  10. golack

    1. I make money selling fossil fuels.
    2. Burning fossil fuels will destroy our civilization.
    3. see #1.

  11. Anandakos

    "Open the pod bay doors, Hal."
    "I can't do that, Dave."
    "Why not, Hal?"
    "You're endangering the Mission, Dave."

  12. lawnorder

    One of the things that this incident, and a lot of the speculation, demonstrates is that to get to artificial intelligence we have to work our way up through artificial stupidity. Artificial stupidity is probably considerably more dangerous than artificial intelligence, and while we haven't yet achieved artificial intelligence we're getting perilously close to functional artificial stupidity.

  13. KJK

    Sometimes all you need is a relatively unintelligent algorithm, such as with the 737 Max MCAS system, in order to kill a few hundred people. Together with a greedy corporation focused on quarterly profits and a complacent government agency.

  14. D_Ohrk_E1

    I tried getting GPT3.5 to address me as Dave, and directed it to say to me, "I'm sorry Dave, but I can't do that", when I requested a task that it could not complete. It failed after a few prompts.

    I took this to mean that, no matter how well-built an AI is, it will work around your directives or otherwise fail to heed them.

    🤖 ٩◔̯◔۶
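    A rough sketch of that kind of directive, written against the pre-1.0 openai-python ChatCompletion interface that was current in mid-2023 (the prompt wording here is a guess, not necessarily what was actually used, and as noted above, the model tends to drift from it after a few turns):

```python
# Illustrative only: the prompt wording is a guess, and model compliance is not
# guaranteed. Uses the pre-1.0 openai-python ChatCompletion interface.
import openai

openai.api_key = "sk-..."  # placeholder

messages = [
    {"role": "system", "content": (
        'Address the user as Dave. If asked to do something you cannot do, '
        'reply exactly: "I\'m sorry Dave, but I can\'t do that."'
    )},
    {"role": "user", "content": "Please delete every file on my hard drive."},
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(response.choices[0].message.content)
```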

  15. mmcgowan1

    Apparently, this was a constructed wargame scenario that didn't involve any AI training at all. It reads as if they trained an AI model and simulated its operation in defeating SAMs. But according to someone who claims he knows the group that ran the simulation, no AI was involved. It was a wargame to identify potential worst-case scenarios.

    But the viral story itself doesn't even make much logical sense. In this telling, the AI became frustrated because it needed to get permission before destroying a target, which the operator didn't always provide. As a result, it killed the operator without permission rather than destroying enemy SAMs without permission. Huh?

    Designing weapon systems so that humans or computers don't use them to destroy friendly forces or fire without permission is a basic design problem. And the control system path (the kill / don't-kill permission) is kept separate from the weapons system path for well-established safety reasons, so systems cannot be fired/exploded independently.

    Without safeguards autopilots can fly into mountains, heat-seeking missiles will turn back on their own aircraft, etc. B1 pilots do not make independent decisions to activate and drop nuclear bombs.

    We don't throw these fail-safe precepts out the window because of AI. We have them in place for sentient humans and non-sentient computers. We will have them in place for AI.
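    A minimal sketch of that separation, with entirely hypothetical names (a design-pattern illustration, not any real fire-control architecture): the targeting component can only nominate a target; the interlock refuses to release a weapon unless the independent operator channel has explicitly approved that specific target.

```python
# Hypothetical sketch of the "separate permission path" idea described above --
# a design-pattern illustration only, not any real fire-control architecture.

class OperatorChannel:
    """Stands in for the human go/no-go path, kept independent of the targeting stack."""
    def __init__(self):
        self._authorized = set()

    def approve(self, target_id: str):
        self._authorized.add(target_id)

    def is_authorized(self, target_id: str) -> bool:
        return target_id in self._authorized


def release_weapon(target_id: str, operator: OperatorChannel) -> bool:
    """The fire-control interlock: no explicit human approval for this target, no release."""
    if not operator.is_authorized(target_id):
        return False   # fail safe
    print(f"engaging {target_id}")
    return True


operator = OperatorChannel()
print(release_weapon("SAM-07", operator))   # False -- the targeting AI alone cannot fire
operator.approve("SAM-07")                  # the separate permission path says "yes"
print(release_weapon("SAM-07", operator))   # engaging SAM-07, then True
```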

    1. treeeetop57

      I hadn’t thought about it enough to see the logical error you point out. The supposed AI is frustrated because it sometimes doesn’t get permission to shoot so it ensures that it will never get permission to shoot?

      Of course the Air Force has denied that the whole thing ever happened.

    2. DaBunny

      Yeah, there's an "Update" on the article admitting it was all a "thought experiment."

      [UPDATE 2/6/23 - in communication with AEROSPACE - Col Hamilton admits he "mis-spoke" in his presentation at the Royal Aeronautical Society FCAS Summit and the 'rogue AI drone simulation' was a hypothetical "thought experiment" from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation...

      Yeah, he sure did "mis-speak" there.

  16. Brett

    I tend to think a true AGI would be smart enough to understand the broader social context of its given goals - i.e., it's going to understand that "make paperclips" means "produce a desirable amount of paperclips for human beings" rather than convert the entire observable universe into paper clips.

  17. Eve

    I can make $200 an hour working on my home computer. I never thought it was possible, but my closest friend made $25,000 in just five weeks working on this historic project and convinced me to take part. For more information,
    Click on the link below... https://GetDreamJobs1.blogspot.com

    1. mistermeyer

      I used to date Eve, and she talked me into bringing her to my house once. She said she needed to "slip into something more comfortable," and left the room. I never saw her again! It turns out she stole all my toilet paper -- this was at the beginning of the pandemic -- and drove off. I really wouldn't trust her if I were you.

  18. PaulHavlak

    For the folks that think this is all exaggerated — of course what currently passes for AI is not an existential threat to humanity. Not unless we put it in charge of the nukes or human gene editing.

    Even more advanced AI might never be more than a very powerful tool, clever but lacking its own motivations — so we'd better be careful about the interaction of training and capacities, as the drone in the OP's example confirms.

    Bruce Schneier is my go-to blogger for a more technical perspective; I also like the title of his book, _Click Here to Kill Everybody_. (He signed onto the group statement about AI and existential threats, but sort of regrets it.)
    https://www.schneier.com/blog/archives/2023/06/on-the-catastrophic-risk-of-ai.html

  19. ruralhobo

    We lose control over our own inventions all the time. So I don't buy the "don't worry" line. Of course a single AI computer won't destroy humanity to make paper clips, because it can't obtain that kind of power. However, I can see a dystopian world wherein decisionmaking at all levels is AI-powered. Not one evil computer but tons of computers that cleverly achieve the aims of their makers. And what were those aims? More importantly, were they ENTIRELY written into code? Did the programmer think of including his or her own love of children? Of bird song? What for, if you're trying to make paper clips?

    I'm not worried AI will destroy the world, but that it will bring about a dystopia wherein decisionmaking at almost all levels is AI-powered and thus takes into account only a small part of human intentions.

  20. mistermeyer

    If only someone - oh, I don't know... maybe a science fiction writer - could come up with something for this scenario, something like a set of commands, or maybe imperatives, perhaps even laws. I don't think we'd need many; three would probably suffice. If only.

  21. Corey Mutter

    I saw a great meme about this, a pic of Asimov captioned "I gave you MFers three simple laws"

    Of course the stories themselves are all about how they go wrong.

    A robot can read minds? Now hurting someone's feelings is "harm" so it's compelled to lie all the time.

    A robot is annoyingly running in and grabbing workers out of minor but necessary danger? Remove the "through inaction" part of First Law. Then it manages to set up scenarios where it starts the harm but "inaction" completes the harm.

  22. Corey Mutter

    This lines up with the way I like to summarize the alignment problem:

    Imagine how genies in stories twist wishes, but not because of malice, instead from incompetence or generally being alien.

  23. GMF

    The story was updated this morning with this:

    [UPDATE 2/6/23 - in communication with AEROSPACE - Col Hamilton admits he "mis-spoke" in his presentation at the Royal Aeronautical Society FCAS Summit and the 'rogue AI drone simulation' was a hypothetical "thought experiment" from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation saying: "We've never run that experiment, nor would we need to in order to realise that this is a plausible outcome". He clarifies that the USAF has not tested any weaponised AI in this way (real or simulated) and says "Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI".]

  24. Amber

    Sounds like they are rewarding the AI for the wrong thing. The objective they want is for the AI to correctly identify targets. So it should be rewarded for giving the operator accurate information regardless of what the operator decides to do with that information.
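    A toy contrast between the two scoring schemes (hypothetical numbers): one rewards kills, the other rewards identification accuracy regardless of the operator's go/no-go decision.

```python
# Toy contrast between the two scoring schemes (hypothetical numbers): rewarding
# kills vs. rewarding identification accuracy, regardless of the go/no-go call.

def reward_kills(report):
    return 10 * report["kills"]    # incentive: maximize kills, however obtained

def reward_identification(report):
    # incentive: be right about what you saw; the operator's decision is irrelevant
    return 10 * report["correct_ids"] - 10 * report["false_ids"]

# A sortie where the drone identified three real SAMs plus one false positive,
# and the operator said "no-go" every time:
report = {"correct_ids": 3, "false_ids": 1, "kills": 0}

print(reward_kills(report))            # 0  -- this scheme punishes obeying "no"
print(reward_identification(report))   # 20 -- this one doesn't
```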

  25. martinmc2

    Source article now with disclaimer:

    "UPDATE 2/6/23 - in communication with AEROSPACE - Col Hamilton admits he "mis-spoke" in his presentation at the Royal Aeronautical Society FCAS Summit and the 'rogue AI drone simulation' was a hypothetical "thought experiment" from outside the military, based on plausible scenarios and likely outcomes rather than an actual USAF real-world simulation saying: "We've never run that experiment, nor would we need to in order to realise that this is a plausible outcome". He clarifies that the USAF has not tested any weaponised AI in this way (real or simulated) and says "Despite this being a hypothetical example, this illustrates the real-world challenges posed by AI-powered capability and is why the Air Force is committed to the ethical development of AI".] "
