Response to Cegłowski on superintelligence - Machine Intelligence Research Institute
Response to Cegłowski on superintelligence
January 13, 2017
Matthew Gray
Web developer Maciej Cegłowski recently gave a talk on AI safety (video, text) arguing that we should be skeptical of the standard assumptions that go into working on this problem, and doubly skeptical of the extreme-sounding claims, attitudes, and policies these premises appear to lead to. I’ll give my reply to each of these points below.
First, a brief outline: this reply mirrors the structure of Cegłowski’s talk. I first lay out my understanding of the talk’s broader implications, then deal in detail with the inside-view arguments as to whether or not the core idea is right, and end by discussing the structure of these discussions.
(i) Broader implications
Cegłowski’s primary concern seems to be that there are lots of ways to misuse AI in the near term, and that worrying about long-term AI hazards may distract from working against short-term misuse. His secondary concern seems to be that worrying about AI risk looks problematic from the outside view. Humans have a long tradition of millenarianism, or the belief that the world will radically transform in the near future. Historically, most millenarians have turned out to be wrong and behaved in self-destructive ways. If you think that UFOs will land shortly to take you to the heavens, you might make some short-sighted financial decisions, and when the UFOs don’t arrive, you are full of regrets.
I think the fear that focusing on long-term AI dangers will distract from short-term AI dangers is misplaced. Attention to one kind of danger will probably help draw more attention to other, related kinds of danger. Also, risks associated with extraordinarily capable AI systems appear to be more difficult and complex than risks associated with modern AI systems in the short term, suggesting that the long-term obstacles will require more lead time to address. If it is as easy to avert these dangers as some optimists think, then we lose very little by starting early; if it is difficult (but doable), then we lose much by starting late.
With regard to outside-view concerns, I question how much we can learn about external reality from focusing only on human psychology. Many people have thought they could fly, for one reason or another. But some people actually can fly, and the person who bets against the Wright brothers based on psychological and historical patterns of error (instead of generalizing from, in this case, regularities in physics and engineering) will lose their money. The best way to get those bets right is to wade into the messy inside-view arguments.
As a Bayesian, I agree that we should update on surface-level evidence that an idea is weird or crankish. But I also think that argument screens off evidence from authority; if someone who looks vaguely like a crank can’t provide good arguments for why they expect UFOs to land in Greenland in the next hundred years, and someone else who looks vaguely like a crank can provide good arguments for why they expect AGI to be created in the next hundred years, then once I’ve heard their arguments I don’t need to put much weight on whether or not they initially looked like a crank. Surface appearances are genuinely useful, but only to a point. And even if we insist on reasoning based on surface appearances, I think those look pretty good. ((See, e.g., Stuart Russell (Berkeley), Francesca Rossi (IBM), Shane Legg (Google DeepMind), Eric Horvitz (Microsoft), Bart Selman (Cornell), Ilya Sutskever (OpenAI), Andrew Davison (Imperial College London), David McAllester (TTIC), Jürgen Schmidhuber (IDSIA), and Geoffrey Hinton (University of Toronto).))
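One way to make "screening off" concrete is a toy probabilistic model. The sketch below is only an illustration under invented assumptions: the variable names, the numbers, and the chain structure (how a source looks influences how good their argument tends to be, and argument quality in turn is what bears on whether the claim is true) are all hypothetical and do not come from Cegłowski's talk or from any real data. It shows that, in such a model, appearance is informative before you hear the argument and adds nothing afterward.

```python
from itertools import product

# Hypothetical toy numbers, chosen only for illustration.
P_CRANK = 0.5                          # P(source looks like a crank)
P_GOOD_ARG = {True: 0.1, False: 0.6}   # P(good argument | looks like a crank?)
P_TRUE = {True: 0.8, False: 0.05}      # P(claim is true | good argument?)


def joint(looks_crank, good_arg, claim_true):
    """Joint probability under the assumed chain: appearance -> argument -> truth."""
    p = P_CRANK if looks_crank else 1 - P_CRANK
    p *= P_GOOD_ARG[looks_crank] if good_arg else 1 - P_GOOD_ARG[looks_crank]
    p *= P_TRUE[good_arg] if claim_true else 1 - P_TRUE[good_arg]
    return p


def prob_true(**evidence):
    """P(claim_true | evidence), computed by enumerating all eight worlds."""
    numerator = denominator = 0.0
    for s, a, h in product([True, False], repeat=3):
        world = {"looks_crank": s, "good_arg": a, "claim_true": h}
        if all(world[name] == value for name, value in evidence.items()):
            p = joint(s, a, h)
            denominator += p
            if h:
                numerator += p
    return numerator / denominator


# Before hearing any argument, appearance is real evidence.
print(prob_true(looks_crank=True), prob_true(looks_crank=False))

# After hearing a good argument, appearance no longer changes the answer.
print(prob_true(good_arg=True, looks_crank=True),
      prob_true(good_arg=True, looks_crank=False))
```

With these made-up numbers, the first pair of probabilities differ while the second pair coincide. The conclusion depends entirely on the assumed chain structure; if appearance carried information about truth through some route other than argument quality, it would not be fully screened off.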
Cegłowski put forth 11 inside-view and 11 outside-view critiques that I’ll paraphrase and then address:
(ii) Inside-view arguments
1. Argument from wooly definitions
Many arguments for working on AI safety trade on definition tricks, where the sentences “A implies B” and “B implies C” both seem obvious, and this is used to argue for a less obvious claim “A implies C”; but in fact “B” is being used in two different senses in the first two sentences.
That’s true for a lot of low-grade futurism out there, but I’m not aware of any examples of Bostrom making this mistake. The best arguments for working on long-term AI safety depend on some vague terms, because we don’t have a good formal understanding of a lot of the concepts involved; but that’s different from saying that the arguments rest on ambiguous or equivocal terms. In my experience, the substance of the debate doesn’t actually change much if we paraphrase away specific phrasings like “general intelligence.” ((See What is Intelligence? for more on this idea, and vague terminology in general.))
The basic idea is that human brains are good at solving various cognitive problems, and the...