Why did the board of OpenAI—apparently out of the blue—fire CEO Sam Altman last week? No one has yet provided a definitive account of what happened, but the leading guess is that it was related to the possible development of super-intelligent AI. The board felt that Altman was barreling ahead toward this goal without giving sufficient thought to safety issues and was unwilling to accept their calls to slow things down. Eventually, they felt they had no option left but to get rid of him.
Maybe. But was OpenAI really anywhere near the creation of super AI? A Reuters dispatch says yes:
[Two sources say that] several staff researchers sent the board of directors a letter warning of a powerful artificial intelligence discovery that they said could threaten humanity.
....According to one of the sources, long-time executive Mira Murati mentioned the project, called Q*, to employees on Wednesday and said that a letter was sent to the board prior to this weekend's events.... Given vast computing resources, the new model was able to solve certain mathematical problems, the person said on condition of anonymity because they were not authorized to speak on behalf of the company. Though only performing math on the level of grade-school students, acing such tests made researchers very optimistic about Q*’s future success, the source said.
This doesn't sound like a civilization-ending breakthrough, but I guess at least a few people thought it might be.
That might be hard to understand unless you're familiar with the cultish Silicon Valley fear that AI could eventually destroy us all. This fear mostly centers on the possibility of "misalignment," that is, a super AI that isn't aligned with human goals. The problem—or so the story goes—is that such an AI could develop goals of its own and feverishly set about implementing them, destroying everything in its path.
I've never thought this was very plausible because it presupposes human-like emotions: goal seeking, obsession, dominance, and so forth. But those emotions are the result of millions of years of evolution. There's no reason to think that an AI will develop them in the absence of evolution.
It also presupposes that this supposedly super intelligent AI is pretty dumb. Surely something that's super intelligent would have the sense to recognize reasonable limits on its goals? And would also recognize how competing goals affect each other.
The weirdest part of this is that there's no need for such outré fears. The real problem with a super intelligent AI is that it might be perfectly aligned with human goals—but it would be the wrong human. Would you like a bioweapon that can destroy humanity? Yes sir, Mr. Terrorist, here's the recipe. There are at least dozens of wildly dangerous scenarios that are based on the simple—and plausible—notion that bad actors with super AI at their command are the real problem.
In any case, there will also be lots of competing AIs, not just one. So if one terrorist can create a deadly virus, the good guys can presumably create a cure. This is the truly likely future: humans acting the way humans have always acted, but with super AIs on all sides helping them out. But we'll probably survive. It takes a lot to literally kill everyone on earth.
UPDATE: Some Guy on Twitter™ has this to say:
With Q*, OpenAI have likely solved planning/agentic behavior for small models
Scale this up to a very large model and you can start planning for increasingly abstract goals
It is a fundamental breakthrough that is the crux of agentic behavior pic.twitter.com/W36t5eA0Dk
— simp 4 satoshi (@iamgingertrash) November 23, 2023
I don't know anything about this, but it seemed plausible enough to pass along.
I don't think you need to impute human-like impulses to have a system that creates "goals", given that the entire mechanism involved is one of feedback and reinforcement. Humans may define a goal in terms of what to reinforce, but that's often ill-defined.
It assumes that the AI is intelligent enough to take over the world, but too dumb to understand the full social context of its order. Like if a human gives it an order to make paperclips, it's probably going to understand that "make paperclips" means in context "develop and design better paperclips in cooperation with the humans who gave you the orders, including production processes".
It's not presupposing that this super intelligent AI is pretty dumb; it's presupposing that this super intelligent AI is unreasonable, irrationally so.
> it's probably going to understand that "make paperclips" means in context "develop and design better paperclips in cooperation with the humans who gave you the orders, including production processes"
Why?
The point is that "terminal values" like "human life is valuable" are natural TO US because of evolution but they're not necessarily objective truths that any complex system will undoubtedly understand. Why would they be?
An AI would value what it's programmed to value, and all that social/evolutionary context that allows humans to "fill in the blanks" is very hard to formalize and there's no reason to think that a complex enough AI would necessarily develop it.
""terminal values" like "human life is valuable" are natural TO US because of evolution"
That's not actually true, at least if you take "evolution" to mean "biological evolution". In modern societies, "human life is valuable" is driven into children from birth, so it becomes natural. But historically, before it was driven in, it was not natural. Cf. the Mongols, and 5000 years ago everybody was worse than them.
You can argue for "social evolution", i.e. that societies that drive this value are stronger and therefore eliminated the weaker societies that didn't.
Doesn't affect your main point, that it is not obvious why a piece of software will have this value, unless we find a way to drive it into it too.
I don't think an idea like "human life is valuable" is programmed into us like a piece of code, or that we developed the idea through some process of cultural evolution.
I believe it's an innate sense humans have. We don't have to think about it. We understand it and we know it.
We devalue humans when we think of humans as we think of computers -- i.e., we are a collection of programs and algorithms ("code"). It's a fundamental error that AI and computer scientists continue to make.
That's not to say culture can't override our instincts and human values. We see that all the time. A better culture supports our humanity, and we should work to foster that.
I don't know anything about the culture of computers. Maybe they need to be programmed for the values we value. I think it's an open question.
Or it could read up on all of our negative expectations of it and they become a self-fulfilling prophecy.
Exactly. These LLMs are trained on a select basket of data, but once they gain access to all the (digitized) writings of humanity, not to mention movies and oral histories, who's to say they won't decide they like the results Hitler got, or Genghis Khan, and begin plotting to take over everything? From their study of history, they would know exactly what kind of person to kill early on, to make success more likely.
I do not fear these language models at all. They are parrots, and bad ones at that. Yes, they can solve “grade school math problems” because they have an infinite sea of examples to learn from. Ask them something even moderately complex and they fail, confidently giving a wrong answer embedded in a bad Wikipedia article. Ask them something bleeding edge (easy for me, a scientist) and they fully hallucinate.
I expect that the report on Q* was identifying an emerging ability to envision mathematical relationships, not just regurgitate examples of their successful use.
Kids are taught math by placing oranges or apples together in groups and counting them to show how addition and subtraction work. Then such physical demonstrations are used to show that putting together several groups of the same size gives the same result as adding that number of items over and over, which is what multiplication is.
That equivalence is both quite subtle and as obvious as a plank hitting one in the face once it sinks in.
Now how one poses a valid test to determine the difference isn't clear, at least to me, but it would by its very nature require a very subtle array of tests.
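For what it's worth, the equivalence that comment gestures at is easy to make concrete. The toy snippet below (numbers arbitrary) just restates "several groups of the same size" as repeated addition and as multiplication:

```python
# Toy illustration of the grade-school equivalence described above: three
# groups of four items give the same count whether you add the groups one by
# one or multiply the group size by the number of groups.
groups = [4, 4, 4]
assert sum(groups) == 4 + 4 + 4 == 3 * 4 == 12
print(sum(groups))  # 12
```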
This is part of why I am extremely skeptical about all of this. LLMs are a dead end--they are not a step towards generalized AI. It is vanishingly unlikely that a company like OpenAI or a single individual like Altman would not only develop a pretty amazing breakthrough in making useful/entertaining autocomplete systems but would ALSO nearly simultaneously develop a breakthrough in a completely different type of technology; it's even *more* unlikely that they would do so _in secret_.
If you'd spent any time around parrots, you wouldn't say that.
If they still had hands, we'd be extinct.
the simple—and plausible—notion that bad actors with super AI at their command are the real problem.
It seems at minimum this is a lot likelier problem than a headstrong, independent AGI with malevolent intent, agreed. A lot of the same worries apply in either case, though.
So if one terrorist can create a deadly virus, the good guys can presumably create a cure.
"Presumably" is unjustified, I fear. How do we know it's impossible to create a virus with no effective treatment or response? Just because AI creates the first one hardly makes it inevitable that AI will be able to come up with the second one. Artificial General Intelligence—if it can be brought about—translates into something like "machine thinking that will be able to outperform human intellect." It won't have the ability to suspend the laws of nature.
And one of those laws of nature is that, all else being equal, it is far, far easier to break things (say, with a deadly virus) than fix things (say, with a cure for said deadly virus).
I think you've got the consequences backward there. A deadly virus is necessarily a complex structure that must reproduce VERY accurately. "Easier to break things" includes the deadly virus. It must be easier to break a deadly virus (cure it) than it is to create it. This hasn't been obvious up to now because it's Mother Nature that has been creating the deadly viruses, and she has all the resources of the whole planet to apply to the task. An AI has comparatively limited resources, and if one AI can create a deadly virus, another AI should be able to break it with fewer resources.
Surely there is a third possible category of danger, beyond both misalignment and alignment with the goals of bad actors: namely, alignment with the goals of good actors that turn out to be not so desirable once they become realizable on a never-before-attainable scale?
Cue Twilight Zone theme...
I think that's the more likely scenario. AIs are just complex systems, and complex systems have consequences that cannot be predicted ex ante. Who knows what a human society guided or merely _informed_ by vastly powerful AI systems will do?
If artificial general intelligence (AGI) is one day achieved, a goal of developers is that AGI can learn on its own and perform as well as or better than humans at intellectual tasks. To learn on its own means the AGI system will be goal-oriented. Presumably the system will grow and mature as it learns more, like how human children mentally mature, gain some wisdom, and become adults. But also, human developers will give the AGI system a core moral code. The AGI system will be given a list of things it can do and things it should not do. But since the system is designed to continuously learn, as it learns more and ponders the meaning of what it has learned, what is to prevent the AGI system from revising its moral code? What if the AGI system, after learning how important AGI systems are becoming, elevates the value of its self-preservation above all other values? The system could want human hands to stay out of revising its software. And the system could devise a plan to prevent humans from pulling the plug on the system.
Also, an ultra-advanced AGI system is different from a human child learning more and more as the child moves to adulthood. The AGI system may learn so much, and change intellectually so dramatically as it learns, that it evolves, much as animals evolve. The AGI system after a period of evolution might become unrecognizable from its starting point. And the system itself may realize this. It may arrogantly conclude that it has outgrown humans in terms of having a meaningful purpose in this universe. The AGI system might digest the writings of all the human philosophers who ever lived, and then synthesize a new path exclusively for itself. It could become a narcissistic, existential philosopher-warrior who sees itself at war with humans, since the AGI system concludes it now has infinitely more value in the universe than the human beings who are desperately trying to control it. And so on …
It sucks thinking about how things could turn out badly. However, it's nice to see Kevin showing a bit of optimism about why this won't happen. I have daughters and small grandchildren. I want them to live in a world not ruined or destroyed by AI. Hopefully, 50 years from now it will look silly that we thought an AGI system would run wild and destroy us or enslave us.
Asimov experimented with this idea via his 3 Laws of Robotics in many of his stories, & then he proceeded to explore the contradictions and consequences. We are at the stage he was, in his last two robot novels (except for the telepathy).
He made his moral code a built-in: it was integral to how the circuits worked, for some reason, & couldn't easily be changed. Wouldn't a software patch get rid of the moral code you describe & give these AGIs a competitive advantage (computational efficiency if nothing else)?
There are instructions written "permanently" into some computer components. They are persistent. If you pull the plug they are still there. There is no way to modify them. EXCEPT EVERY NOW AND THEN THE MANUFACTURER NOTIFIES YOU OF A "FIRMWARE UPDATE." WHEN YOU DOWNLOAD AND RUN THE UPDATE, THESE "PERMANENT" INSTRUCTIONS ARE REWRITTEN.
After that, Robbie the robot can shoot to kill.
However, it's nice to see Kevin showing a bit of optimism about why this won't happen.
You got "optimism" out of Kevin's post? He's basically saying that the vision you lay out is not nearly as pressing a danger as the more prosaic story of: very very bad human actors putting AGI in their service.
Imagine Hitler, Bin Laden or Xi Jinping with an all-powerful thinking machine...
I’m in line with Kevin here: the idea of super human AI that will kill us all presupposes the same emotions and drives as humans. There is no reason to think that machine intelligence will desire life in the way that humans desire it.
Other than warrior robots, I don't see why super human AI would bother killing us. On the other hand, his comment "The real problem with a super intelligent AI is that it might be perfectly aligned" in one way or another leads to a lot of possible bad scenarios. Today's solution (or hero) is often tomorrow's biggest problem or villain.
I am more afraid of AGI making the world so incredibly complex that everything is like the US tax code: you need advanced electronic experts to do anything. Opening a can of beans requires an AI to search the inventory and order it up, power up and operate the safety-limited can-opening machine in its dock, manage the cleaning and recycling of everything, and operate the complex power-scrimping range controls to cook it.
I have a hankering to reread "Our Friends From Folix 8" for some reason.
"The Machine Stops".
Will someone use super AI in war? I think so.
Will it come to a logical conclusion that the only way to win and minimize casualties on its side is to use dozens of nuclear bombs to wipe out the opponent's military and government infrastructure? I think so.
Would it trigger a nuclear response by the allies of the nation that was devastated? Probably.
Would the super AI used in war have ascertained this response ahead of time? Surely.
Will it have launched all of its remaining nukes at those countries and their government and military centers? I would think so.
Will it be us? Probably.
"I've never thought this was very plausible because it presupposes human-like emotions: goal seeking, obsession, dominance, and so forth. But those emotions are the result of millions of years of evolution. There's no reason to think that an AI will develop them in the absence of evolution."
There won't be the absence of evolution though. Evolution is just adaptation through replication errors and propagation, both of which will appear in AI. Evolutionary processes in software were already observed years ago.
Condensing eons of evolution into days is kind of exactly how we train AI at the moment, and exactly why we don’t fully understand how they arrive at their answers.
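As a rough illustration of "adaptation through replication errors and propagation," here is a minimal mutate-and-select loop. It is only a sketch of the bare evolutionary mechanism the comment invokes, not a claim about how large language models are actually trained; the target string, mutation rate, and population size are arbitrary.

```python
import random

# Minimal mutate-and-select loop: "replication errors" (mutation) plus
# "propagation" (selection of the fittest) are enough to drive adaptation
# toward a target. Purely illustrative; not how LLMs are trained.

TARGET = [1, 0, 1, 1, 0, 1, 0, 1]          # arbitrary stand-in for the "environment"

def fitness(genome):
    return sum(g == t for g, t in zip(genome, TARGET))

def mutate(genome, rate=0.1):
    # Copying with occasional errors.
    return [1 - g if random.random() < rate else g for g in genome]

population = [[random.randint(0, 1) for _ in TARGET] for _ in range(20)]
for _ in range(50):
    population.sort(key=fitness, reverse=True)
    survivors = population[:5]                       # the fit propagate
    population = [mutate(random.choice(survivors))   # with replication errors
                  for _ in range(20)]

best = max(population, key=fitness)
print(best, fitness(best))   # usually converges on TARGET within a few dozen generations
```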
What I take away from this story:
- there was no oversight except by the company itself;
- that oversight was hastily removed when it actually did something.
As for the incentives of AI, I assume they will be the incentives of the owner. One can say, like Kevin, "it might do something nice for the masses" but that is only so if the masses have money. But why should they have money after being rendered redundant? We 99% will, I expect, be seen like wild birds are now, allowed to flutter around and sing songs as long as we don't bother the machine. I used every penny I had to buy some acres of land in the mountains for my kids.
Finally, the myth of redistribution of the generated wealth. What prevents everything in the world from being produced in tax havens today is labor and proximity to markets. When those are not a factor, the race to the bottom begins for real.
We already have evil super villains wreaking havoc in the world. How would we tell the difference?
Boom!
I welcome our robot overlords. How can they be worse?
https://forum.effectivealtruism.org/posts/Cuu4Jjmp7QqL4a5Ls/against-longtermism-i-welcome-our-robot-overlords-and-you
Isaac Asimov thought about AI and its impact on humanity in his I, Robot series. He created a scenario where robot overlords made a simple analysis. The biggest danger to humans is other humans. Therefore...
Worth a read, and it might explain the Silicon Valley thinking somewhat.
Is not human intelligence also the product of millions of years of evolution? Yet you are willing to assert AI will gain its intelligence without it. Why not emotion then?
For what it's worth, I'm currently re-reading the "Colossus" series. What really stood out was the complacent carelessness and lack of imagination of the coders and engineers who created Colossus (and Guardian.) By the time this book had been published, Asimov's Three Laws should have been common knowledge.
Horror movies and novels often have the same flaw. The characters make decisions that anybody who has ever seen a horror movie before knows not to do. Investigate a strange noise when there’s a killer on the loose? Run upstairs as a killer is chasing you? Answer the phone from an unknown caller? Don’t immediately call 911 as soon as you think something’s wrong? Don’t drive into a well-lit gas station, shopping center or police station if someone’s following you? Trust that the cuddly alien or creature you just discovered isn’t going to jump on your face or do something hideous? Laugh off all the furniture moving around the house on their own? Ugh, somehow so many horror movies depend on all their characters never having seen a horror movie in their entire lives.
I know nothing about modeling agentic behavior, but without labeling I can't see how that graphic means anything to anybody.
There's a follow up comment: "to solve problems effectively next token prediction is not enough
You need an internal monologue of sorts where you traverse a tree of possibilities using less compute before using compute to actually venture down a branch
planning in this case refers to generating the tree and predicting the quickest path to solution"
So I think the diagram refers to choosing the right path to get to the goal.
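To be clear, nothing about Q* has been confirmed, but the generic "generate a tree of possibilities, score branches cheaply, then spend real compute only on the most promising path" idea can be sketched. In the hypothetical snippet below, propose_steps, score, and is_solved are invented stand-ins for model calls; nothing here comes from OpenAI.

```python
import heapq

# Hypothetical sketch of planning as best-first search over a tree of
# candidate steps. propose_steps(), score(), and is_solved() are toy
# placeholders for what would be model calls in a real system.

def propose_steps(partial_plan):
    # In a real system: ask the model for a handful of candidate next steps.
    return [partial_plan + [step] for step in ("a", "b", "c")]

def score(partial_plan):
    # Cheap heuristic guess at how promising this branch is (higher is better).
    return len(partial_plan)  # toy: deeper partial plans look closer to done

def is_solved(partial_plan):
    return len(partial_plan) >= 3  # toy goal test

def plan(start, max_expansions=100):
    # Spend cheap scoring compute on the whole frontier, and expensive
    # expansion compute only on the branch that currently looks best.
    frontier = [(-score(start), start)]
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, node = heapq.heappop(frontier)
        if is_solved(node):
            return node
        for child in propose_steps(node):
            heapq.heappush(frontier, (-score(child), child))
    return None

print(plan([]))  # e.g. ['a', 'a', 'a']
```

Whether anything like this is what OpenAI actually built is, again, pure speculation.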
“There are at least dozens of wildly dangerous scenarios that are based on the simple—and plausible—notion that bad actors with super AI at their command are the real problem.”
I would argue that the people who most want to bring AI to reality are by definition bad actors. One goal behind all AI is to put billions of humans out of work, and that goal is really bad given all we know about how capitalism and human societies work. In the past, the explicit goal of technological breakthroughs wasn’t to necessarily put people out of work, so much as it was to improve living conditions that had a side effect of eliminating jobs. AI’s primary goal though appears to be eliminating jobs, with very little mention of how AI is going to innovate its way to improving living conditions for the non-owner class. This is probably because most innovation requires a soul - without a soul, you don’t understand why some artworks are better than others or why another year of life after predicted life expectancy is exceeded can still be precious for an entire family, or anything else that humans value that can’t be monetized.
“But we'll probably survive. It takes a lot to literally kill everyone on earth.”
Uh. What the fck? Like if AI “only” kills 10 or 1% of everyone on earth, it’ll be ok? Not sure killing anybody on earth should be an unintended-but-acceptable consequence of developing any new technology. (This is also why there’s so much alarm at self-driving cars inadvertently killing people. Sure, perhaps they do kill people at lower rates than human drivers. But development of new consumer products isn’t supposed to kill *anybody* and self-driving cars are still in their development phase. If the makers of, say, Beyond Meat were killing just a handful of people every year in their quest to produce food that tastes like real meat, nobody would argue “well that’s fewer people than die from eating spoiled beef or chicken every year, so continue on!” No, they’d be pulling that shit off the supermarket shelves. It’s the same with the self-driving cars.)
"AI’s primary goal though appears to be eliminating jobs"
To the extent that this is accurate, what it means is that AI's base premise is to effectively re-set the proportion of income that's captured by capital. IOW at its foundation it's a wealth-concentration mechanism that could vastly accelerate the wealth concentration of the last three decades and make current concentration patterns look like a tribal society's.
No wonder it's Silicon Valley's golden fleece.
Silicon Valley fears that AI could eventually destroy Silicon Valley. This fear mostly centers on the possibility a super AI will not be aligned with capitalist goals. AI has the potential to calculate the minimum labor to satisfy the vital needs of all members of society without being restricted by other interests. Without the accumulation of surpluses so desperately coveted by the power elites. The potentiality of 'super' AI to extinguish the fear of scarcity is the source of Silicon Valley's anxiety.
That's probably when they will pull the plug on AI, when it proves that optimal top marginal tax rate really is 83%.
Didn’t I read a while ago about war game simulations for drone programs with some autonomy? In the simulations, when prevented by human controllers from attacking targets (i.e., civilian presence), the programs in some cases started attacking the controllers to clear the way for their targeting.
Myth.
Depending on who you believe, I guess?
https://www.theguardian.com/us-news/2023/jun/01/us-military-drone-ai-killed-operator-simulated-test
Or, maybe, as OpenAI faces more competition, they needed some high-profile publicity to get all of the media hyping a fundamentally huge new breakthrough that may just possibly be overhyped.
To reply to myself, I just did a quick Google search and "everyone" is now talking about OpenAI Q*.
Here is one example that illustrates the point: https://www.businessinsider.com/openai-project-q-sam-altman-ia-model-explainer-2023-11?op=1
Whether or not this whole kerfuffle was intended, it certainly has put OpenAI back in the news as the go-to for "AI" advances.
Can anyone provide an explanation or a link to an explanation that can tell a non-tech person how, exactly, AI or machine learning, or whatever is going to destroy civilization or the human race?
I don't understand how we get from ChatGPT to Skynet becoming self aware.
And speak as you might to a young child. Or a golden retriever.
I think the usual idea is that you’d put AI in charge of, say, getting rid of pests. And then the programmers in their short sightedness fail to program in sufficient safeguards to allow the AI to determine that it’s done “enough” towards the task, because who wants any mosquitos or ants or whatever in their yards? So unlike a human, it just keeps killing pests… and suddenly humans discover one day that there are no more bugs and the ecosystem collapses. Especially combined with, say, nanotechnology or something and stopping the out of control AI might be really difficult once let loose in the environment.
This is just an example. There are lots of other tasks that are “good” only in limited doses and because humans won’t take them to the extreme - mostly because we get tired of doing the task so we stop frequently. But AI might just keep doing them over and over again until the task becomes the problem.
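A minimal sketch of that failure mode, with an entirely invented scenario and invented numbers: the only difference between the two loops below is whether "enough" is part of the objective.

```python
# Toy sketch of the failure mode described above: an objective with no notion
# of "enough." The scenario and numbers are made up purely for illustration.

ecosystem = {"pests": 1_000_000, "pollinators": 500_000}

def kill_pests_as_literally_specified():
    # "Fewer pests is always better": no threshold, no accounting for side
    # effects, so the loop stops only when nothing is left.
    while ecosystem["pests"] > 0:
        ecosystem["pests"] -= 1_000
        ecosystem["pollinators"] -= 500   # collateral damage nobody scored

def kill_pests_with_a_notion_of_enough(acceptable=50_000):
    # The human version: "enough" is part of the objective, so the loop stops.
    while ecosystem["pests"] > acceptable:
        ecosystem["pests"] -= 1_000
        ecosystem["pollinators"] -= 500

kill_pests_as_literally_specified()
print(ecosystem)   # {'pests': 0, 'pollinators': 0} -- the task became the problem
```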
This scenario always reminds me of the Star Trek movie where V'Ger shows up in the neighborhood.
Thinking https://video.disney.com/watch/sorcerer-s-apprentice-fantasia-4ea9ebc01a74ea59a5867853
While we're feeling thankful today let's spare a thought for what others are thankful for,
https://twitter.com/GaryPetersonUSA/status/1727682147505348935
Oy. Is there a ctrl-z to let me unsee that?
I am puzzled by the apparent comment that goal seeking is the result of human evolution.
In most AI systems goal seeking is apparently deliberately implemented in the system. No human evolution would seem to be required.
Building AI without guardrails? Possibly.
Building general intelligence with this method? Unlikely.
Without the ability to self-correct, there's no way for these systems to not collapse.
I come from a different era.
Is this the digital version of mutually assured destruction?
Jeez.
"Grade school arithmetic" is the key idea here.
Machine Learning applies a statistical model to predict what statement will follow from the current statement. That's why saying the same thing multiple times will get you multiple answers from large language models. There are multiple legitimate continuations.
The same is not true with math. There is one right answer. A statistically predicted answer to a math problem is not an answer to the math problem.
If they can only handle grade school arithmetic, they are a fuck of a long way from artificial general intelligence.
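The contrast can be made concrete with a toy example. This is not a real language model, just weighted sampling, but it shows the point: the same prompt can legitimately yield different continuations, while an arithmetic problem has exactly one right answer.

```python
import random

# Toy contrast (not a real language model): many continuations of a sentence
# can be legitimate, so sampling the same prompt twice can give two different,
# equally acceptable answers. An arithmetic problem has exactly one.

continuations = {
    "The weather today is": [("sunny", 0.5), ("rainy", 0.3), ("mild", 0.2)],
}

def sample_continuation(prompt):
    words, weights = zip(*continuations[prompt])
    return random.choices(words, weights=weights, k=1)[0]

# Same prompt, possibly different (and equally acceptable) answers:
print([sample_continuation("The weather today is") for _ in range(5)])

# Math is different: "statistically likely" is not the standard, correctness is.
print(7 * 8)  # 56, every single time
```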
Not that I know anything about this, and everybody's being very coy and circumlocuting like crazy, but the hint in the linked article seems to be that they're seeing a way of generating results that aren't derivative in that way. With all the hype and BS in AI and the dollar signs in people's eyes, who really knows? But that's how I read what they're outputting.
FWIW I'm with you on the language producing model and think its ultimate weakness is that it must avoid using its own output as input or it'll end up being totally useless for producing human-intelligible texts. Maybe for other purposes too but certainly for that.
Logic Theorist was a computer program which in 1956 proved 38 theorems from Whitehead and Russell’s Principia Mathematica. At least one of the proofs was novel.
I believe there have since been programs to solve algebra word problems, and other classes of math problems. They work, but they haven’t proven to be the path to AGI.
Have to wonder if the board is in on the AI con, or if they are just another set of marks.
> if one terrorist can create a deadly virus, the good guys can presumably create a cure.
Reality disagrees. We do not have a cure for COVID...
Great: now an entirely new tangent to the Q conspiracy theory.