Why Tool AIs Want to Be Agent AIs (2016)


AI economics, tech economics, x-risk, insight porn, AI safety, RL scaling

AIs limited to pure computation (Tool AIs) supporting humans will be less intelligent, efficient, and economically valuable than more autonomous reinforcement-learning AIs (Agent AIs) who act on their own and meta-learn, because all problems are reinforcement-learning problems.

2016-09-07–2018-08-28 · finished · certainty: likely · importance: 9

Economic

Intelligence

Actions for Intelligence

Actions Internal to a Computation

Actions Internal to Training

Actions Internal to Data Selection

Actions Internal to NN Design

Actions External to the Agent

Overall

Why You Shouldn’t Be A Tool

See Also

External Links

Autonomous AI systems (Agent AIs) trained using reinforcement learning can do harm when they take wrong actions, especially superintelligent Agent AIs. One solution would be to eliminate their agency by not giving AIs the ability to take actions, confining them to purely informational or inferential tasks such as classification or prediction (Tool AIs), and having all actions be approved & executed by humans, giving equivalently superintelligent results without the risk.

I argue that this is not an effective solution for two major reasons. First, Agent AIs will by definition be better at actions than Tool AIs, giving them an economic advantage. Second, Agent AIs will be better at inference & learning than Tool AIs, and this is inherently due to their greater agency: the same algorithms which learn how to perform actions can be used to select important datapoints to learn inference over, how long to learn, how to more efficiently execute inference, how to design themselves, how to optimize hyperparameters, how to make use of external resources such as long-term memories or external software or large databases or the Internet, and how best to acquire new data.

RL is a terrible way to learn anything complex from scratch, but it is the least bad way to learn how to control something complex—and the world is full of complex systems we want to control, including AIs themselves.

All of these actions will result in Agent AIs more intelligent than Tool AIs, in addition to their greater economic competitiveness. Thus, Tool AIs will be inferior to Agent AIs in both actions and intelligence, implying use of Tool AIs is an even more highly unstable equilibrium than previously argued, as users of Agent AIs will be able to outcompete them on two dimensions (and not just one).

That is: “tool AIs want to be agent AIs”. (And agent AIs want more agency.)

One proposed solution to AI risk is to suggest that AIs could be limited purely to supervised/unsupervised learning, and not given access to any sort of capability that can directly affect the outside world such as robotic arms. In this framework, AIs are treated purely as mathematical functions mapping data to an output such as a classification probability, similar to a logistic or linear model but far more complex; most deep learning neural networks, like ImageNet image classification convolutional neural networks (CNNs), would qualify. The gains from AI then come from training the AI and then asking it many questions, the answers to which humans then review & implement in the real world as desired. So an AI might be trained on a large dataset of chemical structures labeled by whether they turned out to be a useful drug in humans and asked to classify new chemical structures as useful or non-useful; then doctors would run the actual medical trials on the drug candidates and decide whether to use them in patients etc. Or an AI might look like Google Maps/Waze: it answers your questions about how best to drive places better than any human could, but it does not control any traffic lights country-wide to optimize traffic flows nor will it run a self-driving car to get you there. This theoretically avoids any possible runaway of AIs into malignant or uncaring actors who harm humanity by satisfying dangerous utility functions and developing instrumental drives. After all, if they can’t take any actions, how can they do anything that humans do not approve of?
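The Tool AI setup described above can be sketched as a pure scoring function plus a human approval gate. This is only an illustrative sketch: the names `tool_ai_score` and `human_in_the_loop`, the logistic form, and the toy weights are all assumptions made for the example, not anything specified in the proposals being discussed.

```python
import math
from typing import Callable, List, Sequence


def tool_ai_score(features: Sequence[float],
                  weights: Sequence[float],
                  bias: float) -> float:
    """Hypothetical Tool AI: a pure function from data to a probability.

    It has no side effects and takes no actions; it only returns a number,
    in the style of a logistic model.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def human_in_the_loop(candidates: List[Sequence[float]],
                      weights: Sequence[float],
                      bias: float,
                      approve: Callable[[Sequence[float], float], bool]
                      ) -> List[Sequence[float]]:
    """Rank candidates by the Tool AI's score, then keep only those a human approves.

    `approve` stands in for the human reviewer (an assumption of this sketch):
    every decision to act passes through it, never through the model itself.
    """
    ranked = sorted(candidates,
                    key=lambda c: tool_ai_score(c, weights, bias),
                    reverse=True)
    return [c for c in ranked if approve(c, tool_ai_score(c, weights, bias))]


if __name__ == "__main__":
    # Toy "drug candidates" described by two made-up features.
    candidates = [[1.0, 0.0], [-1.0, 0.0], [2.0, 0.0]]
    # Stand-in reviewer: accepts anything the model rates above 50%.
    chosen = human_in_the_loop(candidates, [1.0, 0.0], 0.0,
                               approve=lambda c, p: p > 0.5)
    print(chosen)
```

In this sketch the model never selects which candidates to look at, when to stop, or what to do with the result; those choices all sit with the caller and the reviewer, which is exactly the division of labor the Tool AI proposal relies on.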

Two variations on this limiting or boxing theme are:

Oracle AI: Nick Bostrom, in Superintelligence (2014) (pp. 145–158) notes that while they can be easily ‘boxed’ and in some cases like P/NP problems the answers can be cheaply checked or random subsets expensively verified, there are several issues with oracle AIs:

the AI’s definition of ‘resources’ or ‘staying inside the box’ can change as it learns more about the world (ontological crises)

responses might manipulate users into asking easy (and useless...
