When will computer hardware match the human brain? by Hans Moravec
Institute for Ethics and Emerging Technologies
Journal of Evolution and Technology. 1998. Vol. 1
(Received Dec. 1997)
Hans Moravec
Robotics Institute
Carnegie Mellon University
Pittsburgh, PA 15213-3890, USA
net: hpm@cmu.edu
web: http://www.frc.ri.cmu.edu/~hpm/
ABSTRACT
This paper describes how the performance of AI machines tends to improve at the same pace that AI researchers get access to faster hardware. The processing power and memory capacity necessary to match general intellectual performance of the human brain are estimated. Based on extrapolation of past trends and on examination of technologies under development, it is predicted that the required hardware will be available in cheap machines in the 2020s.
Brains, Eyes and Machines
Computers have far to go to match human strengths, and our estimates will depend on analogy and extrapolation. Fortunately, these are grounded in the first bit of the journey, now behind us. Thirty years of computer vision reveals that 1 MIPS (million instructions per second) can extract simple features from real-time imagery--tracking a white line or a white spot on a mottled background. 10 MIPS can follow complex gray-scale patches--as smart bombs, cruise missiles and early self-driving vans attest. 100 MIPS can follow moderately unpredictable features like roads--as recent long NAVLAB trips demonstrate. 1,000 MIPS will be adequate for coarse-grained three-dimensional spatial awareness--illustrated by several mid-resolution stereoscopic vision programs, including my own. 10,000 MIPS can find three-dimensional objects in clutter--suggested by several "bin-picking" and high-resolution stereo-vision demonstrations, which accomplish the task in an hour or so at 10 MIPS. The data fades there--research careers are too short, and computer memories too small, for significantly more elaborate experiments.
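The link between "an hour or so at 10 MIPS" and the 10,000-MIPS figure can be checked with a little arithmetic. The sketch below is only an illustration, not part of the original analysis: the function name `scaled_runtime_seconds` is hypothetical, and it assumes that run time shrinks in direct proportion to processor speed, ignoring memory size and input/output limits.

```python
def scaled_runtime_seconds(measured_seconds, measured_mips, target_mips):
    """Estimate the run time of a fixed job on a machine of a different speed.

    Assumption (not from the source): the job needs a fixed number of
    instructions, so time scales inversely with the MIPS rating.
    """
    if measured_mips <= 0 or target_mips <= 0:
        raise ValueError("MIPS ratings must be positive")
    # Total work in millions of instructions.
    total_work = measured_seconds * measured_mips
    return total_work / target_mips


if __name__ == "__main__":
    one_hour = 3600.0  # "an hour or so" on a 10-MIPS machine
    for target in (10, 100, 1_000, 10_000):
        seconds = scaled_runtime_seconds(one_hour, 10, target)
        print(f"{target:>6} MIPS: about {seconds:,.1f} seconds")
```

Under that assumption, a job that takes an hour at 10 MIPS would finish in a few seconds at 10,000 MIPS, which is roughly the speed needed for the task to be useful in real time.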
There are considerations other than sheer scale. At 1 MIPS the best results come from finely hand-crafted programs that distill sensor data with utmost efficiency. 100-MIPS processes weigh their inputs against a wide range of hypotheses, with many parameters, that learning programs adjust better than the overburdened programmers. Learning of all sorts will be increasingly important as computer power and robot programs grow. This effect is evident in related areas. At the close of the 1980s, as widely available computers reached 10 MIPS, good optical character reading (OCR) programs, able to read most printed and typewritten text, began to appear. They used hand-constructed "feature detectors" for parts of letter shapes, with very little learning. As computer power passed 100 MIPS, trainable OCR programs appeared that could learn unusual typestyles from examples, and the latest and best programs learn their entire data sets. Handwriting recognizers, used by the Post Office to sort mail, and in computers, notably Apple's Newton, have followed a similar path. Speech recognition also fits the model. Under the direction of Raj Reddy, who began his research at Stanford in the 1960s, Carnegie Mellon has led in computer transcription of continuous speech. In 1992 Reddy's group demonstrated a program called Sphinx II on a 15-MIPS workstation with 100 MIPS of specialized signal-processing circuitry. Sphinx II was able to deal with arbitrary English speakers using a several-thousand-word vocabulary. The system's word detectors, encoded in statistical structures known as Markov tables, were shaped by an automatic learning process that digested hundreds of hours of spoken examples from thousands of Carnegie Mellon volunteers enticed by rewards of pizza and ice cream. Several practical voice-control and dictation systems are sold for personal computers today, and some heavy users are substituting larynx for wrist damage.
More computer power is needed to reach human performance, but how much? Human and animal brain sizes imply an answer, if we can relate nerve volume to computation. Structurally and functionally, one of the best understood neural assemblies is the retina of the vertebrate eye. Happily, similar operations have been developed for robot vision, handing us a rough conversion factor.
The retina is a transparent, paper-thin layer of nerve tissue at the back of the eyeball on which the eye's lens projects an image of the world. It is connected by the optic nerve, a million-fiber cable, to regions deep in the brain. It is a part of the brain convenient for study, even in living animals, because of its peripheral location and because its function is straightforward compared with the brain's other mysteries. A human retina is less than a centimeter square and a half-millimeter thick. It has about 100 million neurons, of five distinct kinds. Light-sensitive cells feed...