AI Safety

Book Review: Superintelligence, by Nick Bostrom

Because I am very late to all the worst parties, I have finally read Superintelligence by Nick Bostrom. The hold waitlist at the library where I got it was sixteen deep, and yet I got my hands on a copy in only a few weeks, which probably says something awkward and rude.

This is not a good book. It does not make particularly large amounts of sense. It drives itself in maddening circles of vicious, unanswerable doom and then presents a secular prayer to an AI god-child as the most plausible answer to the apocalypse. You might be forgiven for assuming otherwise. Maybe you’ve only heard of it from the Silicon Valley AI Safety set of precocious and adorable children who float high on the VC tides in their skiffs of concerned rationalism. I’m not likely to forgive you if you’ve actually read it and liked it.

Let’s get a few things out of the way. This book is somewhat overwritten. It also has a list of tables and figures and boxes. A few of them are related to the subject at hand. The subject at hand is how Nick Bostrom considers AI-based superintelligence1 to be the coming god-emperor, and involves figuring out a way to ask it politely to be good to us.

Superintelligence starts with a literary attempt called “The Unfinished Fable of the Sparrows.” Some sparrows have resolved to tame an owl to help them out with life, liberty, and the pursuit of happiness. A few smart and unheeded sparrows wonder about how they will control this owl once they’ve got it. There is no ending, and yes, it’s an unsubtle synopsis of the book as a whole.

If that were not enough, next is a preface in which Bostrom talks about how he’s written a book he would have liked to read as “an earlier time-slice of [himself],” and he hopes people who read it will put some thought into it. Please, we should all resist the urge to instantly misunderstand it. He would also like you to know how many qualifiers he’s placed in the text, and how they’ve been placed with great care, and how he might be very wrong, but he’s not being falsely modest, because not listening to him is DANGER, WILL ROBINSON2.

Don’t ask me to make sense of that last bit, I’m not a professor of philosophy at Oxford.

The least boring parts of the book, where the argument is attempted, are Chapters 4-8 (What will this superintelligence be like, how quickly will it take over, and how bad will the resulting hellscape be?) and Chapter 13 (I bet we can avoid this hellscape through being very clever).

The first problem pops up when he starts discussing the explosive growth of the potential superintelligence in Chapter 4. Hardware bottlenecks are basically ignored. Software bottlenecks are ignored. Any other bottlenecks of any sort are definitely ignored. Instantly, there are no more bottles whatsoever, and suddenly the superintelligence is building nanofactories for itself because it thought up the blueprints very fast. At some point (handwaving) it achieves world domination. At some further point (faster handwaving) it is running a one-world government. And eventually (hands waving very excitedly) it overcomes the vast distances of spacetime and sends baby von Neumann robots out to colonize the universe.

I can see how this sort of thing is very tempting. It’s difficult to imagine entities smarter than us, so it’s difficult to imagine them having any problems or hardships. I assume this is also difficult when your career and success therein seem to rely on your own intelligence, rather than sheer dumb luck. But failing to have the depth of imagination to consider a being-space between humanity and unknowable, omnipotent gods is concerning, on a scale comparable to H. P. Lovecraft.

A similar problem occurs in Chapter 8 (titled, excitingly, “Is the default outcome doom?”). Having satisfied himself with the inevitability of the new AI god, Bostrom runs through a titillatingly long list of ways in which it could turn out to be malignant, deceptive, or downright evil. Failing that, it could just misinterpret anything we try to tell it. Or maybe it could care about making paperclips way more than it cares about humans. We can’t do anything to stop it, therefore doom creeps over the horizon.

Again, this is largely a failure of imagination. There is no corresponding list of ways things could go right, or well, or even ambiguously. This is an argument that the worst possible case would be more terrible than we could survive. It is not an argument that this worst possible case is the most likely case, or even a fairly likely one. It’s important to remember the difference.

There are times when it’s useful to base your discussion on the worst possible case. But the worst possible case here is already several branches down a large logic tree that may or may not actually exist. It is not actual existential danger. It is a theoretical possibility of an existential danger that may or may not come into play should certain possibilities all coincide.

There are hundreds, thousands, millions of other theoretical possibilities of existential danger that Bostrom is not writing entire books about. A planet could collide with a large asteroid in another solar system and send a huge chunk of itself ricocheting directly towards us on a vector we’re ignoring. That big supervolcano under Yellowstone might get triggered because of an awkward reaction between a solar flare, the Earth’s magnetic pole switching polarity, and a bad bit of shale drilling, and yes, I’m obviously making this stuff up, but that’s kind of the point.

At this point in the story (and it is a story, more on that later, help this is going to be pages and pages) we realize we need to be very clever to prevent a vastly terrible superintelligence from doing terrible things to us in devilishly creative ways. We can’t keep it from arising, because Bostrom has already told us that’s impossible. Maybe we can guide it? Persuade it? Subjugate it first? Control it? Keep it in a tiny box? He considers some of these more or less promising. There is a table. He quickly moves on from how to make a baby superintelligence have values to how to decide which values it should have, as that’s where he believes the trouble truly lies. Values are like wishes; they may not be interpreted the way we assume they will be.

In fact, they probably won’t be. Which is where we need to be very clever. As far as I can tell, Bostrom’s Big Answer to getting a superintelligence to be nice to us is something he’s borrowed from Eliezer Yudkowsky3. It’s called Coherent Extrapolated Volition, which will always be referred to as CEV, both for short and because, I mean, really, those big words. I’m going to quote Yudkowsky’s definition as it’s quoted by Bostrom:

“Our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together, where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.”

I feel a little bad. It’s obvious that time and effort have been spent getting the wording just right, making sure all the bases are covered, trying to eliminate misunderstandings and misinterpretations. It’s a very lovely wishful statement about ourselves and our future!

But.

This is a religious statement. Despite the claims that this is a way of avoiding moral judgments and definitions, this is a prayer that the coming AI God will give us not what we want in our deeply irrational now, but what we ultimately will want in our best future, that God will be on our side, help us be better than we are. This is praying for God to be Good and Not Evil. This is praying away the Apocalypse.

We shouldn’t be surprised, we’re humans. Where else did you think we’d end up? Not writing religious stories about ourselves, our world and our futures? This is the sort of thing we do in our sleep. We write stories about the things that go bump in the night, or the things we’re afraid will go bump in the night. We write magical incantations to protect ourselves from the vast, cool intelligences that exist outside our ken, because we have to sleep somehow now that we’ve thought them up. We write stories to tell us that we have some say in our own lives, and in our futures, and in the futures of our children. Sometimes we even dress these stories up as academic books with roots in philosophy and computer science.

At any rate, this is not cleverly and rationally avoiding certain existential danger. In the end, a superintelligent AI as defined by Bostrom is not controllable, is not guaranteed to grow up in the way we want it to, and this CEV is merely a suggestion that it behave itself for our sake. Bit of a shame, really. I’d honestly like to see him put some good work into something like AI safety, maybe some acknowledgment that algorithms and learning systems don’t have to be smarter than us, or even all that advanced, to make a hash of things because of the ways we program our faulty assumptions into them. But instead, this is what we have.
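To be concrete about the mundane sort of failure I mean (this illustration is mine, not Bostrom’s, and every name and number in it is invented): a system doesn’t need to be superintelligent to cause harm, it just needs an objective that quietly encodes a bad assumption. Here’s a minimal sketch in Python of a recommender whose proxy metric, predicted clicks, rewards exactly the wrong thing.

```python
# A toy illustration (mine, not the book's): a recommender that optimizes a
# proxy metric -- predicted clicks -- under the faulty assumption that clicks
# measure usefulness. No superintelligence required to make a hash of things.

from dataclasses import dataclass

@dataclass
class Article:
    title: str
    predicted_clicks: float  # what the system can measure
    usefulness: float        # what we actually care about (and don't measure)

CATALOG = [
    Article("Careful explainer on vaccine trial design", 0.02, 0.9),
    Article("You won't BELIEVE what this vaccine does", 0.30, 0.1),
    Article("Local weather: mild, slight chance of rain", 0.05, 0.6),
]

def rank_by_proxy(articles):
    """The faulty assumption lives here: clicks are treated as value."""
    return sorted(articles, key=lambda a: a.predicted_clicks, reverse=True)

if __name__ == "__main__":
    for a in rank_by_proxy(CATALOG):
        print(f"{a.predicted_clicks:.2f}  {a.title}")
    # The clickbait item wins every time, not because the system is smart,
    # but because we told it that clicks are the thing to maximize.
```

Nothing in that toy is clever, and that’s the point: the damage comes from the assumption baked into rank_by_proxy, not from any machine outthinking us.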

So, no. Nick Bostrom’s Superintelligence isn’t a good book; it doesn’t do what it sets out to accomplish. Bostrom hasn’t given us a warning about a definite existential danger. He also hasn’t given us a way to clearly see or avoid said existential danger. It’s not even a very good story; there is far better science fiction and fantasy and theological work being written every day. Go read some of those.