Optimizing things in the USSR · Chris Said
I am a data scientist at Propel. This blog is mostly about statistics, technology, and science.
Optimizing things in the USSR
11 May 2016
As a data scientist, a big part of my job involves picking metrics to optimize and thinking about how to do things as efficiently as possible. With these types of questions on my mind, I recently discovered a totally fascinating book about economic problems in the USSR and the team of data-driven economists and computer scientists who wanted to solve them. The book is called Red Plenty. It’s actually written as a novel, weirdly, but it nevertheless presents an accurate economic history of the USSR. It draws heavily on an earlier book from 1973 called Planning Problems in the USSR, which I also picked up. As I read these books, I couldn’t help but notice some parallels with planning in any modern organization. In what will be familiar to any data scientist today, the second book even includes a quote from a researcher who complained that 90% of his time was spent cleaning the data, and only 10% of his time was spent doing actual modeling!
Beyond all the interesting parallels to modern data science and operations research, these books helped me understand a lot of interesting things I previously knew very little about, such as linear programming, price equilibria, and Soviet history. This blog post is about what I learned.
Balance sheets and manual calculation: Kind of a trainwreck
The main task in the centrally planned Soviet economy was to allocate resources so that a desired assortment of goods and services was produced. Every year, certain target outputs for each good were established. Armed with estimates of the available input resources, central administrators used balance sheets to set plans for every factory, specifying exactly how much of each input commodity the factory would receive, and how much output it should produce. Up through the 1960s, this was always done by manual calculation. Since there were hundreds of thousands of commodities, and since the supply chains had many dependency steps, it was impossible to compute the full balance sheets for the economy. The administrators therefore decided to make some simplifying assumptions. As a result of these simplifying assumptions, resource allocation became a bit of a trainwreck. Below are a few of the simplifications and their consequences.
Dimensionality reduction by removing variables. Because there were too many commodities to track, administrators often limited their analysis to the 10,000 most important commodities in the economy. But when the production of those commodities was planned, there was often a hidden shortage of commodities whose output was not planned centrally but which were used as inputs to one of the 10,000 planned products. Factories that depended on those commodities often sat idle for months as they waited for the shortages to end.
Dimensionality reduction by aggregation. Apparently, steel tubes can come in thousands of different types. They can come in different lengths, different shapes, and different compositions. To reduce the dimensionality of the problem, administrators would often track the total tonnage of a few broad classes of steel tubes in the models, rather than using a more detailed classification scheme. While their models successfully balanced the tonnage of tubes for the broad categories (the output in tons of tube-producing factories matched the input requirements in tons of tube-consuming factories), there were constant surpluses of some specific types of tubes, and shortages of other specific types of tubes. In particular, since tonnage was used as a metric, tube-producing factories were overly incentivized to make easy-to-produce thick tubes. As a result, thin tubes were always in short supply.
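To make the aggregation problem concrete, here is a minimal sketch. The tube types and tonnages are made up for illustration and do not come from either book. It shows how a plan can balance perfectly in total tons while individual tube types are badly out of balance.

```python
# Hypothetical planned output and requirements, in tons, by tube type.
# These numbers are invented for illustration only.
produced = {"thick": 700, "medium": 250, "thin": 50}
required = {"thick": 400, "medium": 300, "thin": 300}


def aggregate_balance(produced, required):
    """Surplus (+) or shortage (-) in total tons, ignoring tube type."""
    return sum(produced.values()) - sum(required.values())


def per_type_balance(produced, required):
    """Surplus (+) or shortage (-) in tons for each individual tube type."""
    tube_types = sorted(set(produced) | set(required))
    return {t: produced.get(t, 0) - required.get(t, 0) for t in tube_types}


print("aggregate:", aggregate_balance(produced, required))
print("by type:  ", per_type_balance(produced, required))
```

With these made-up numbers the aggregate balance is zero, so a planner tracking only total tonnage sees nothing wrong, while the per-type view shows a surplus of thick tubes and a shortage of thin ones.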
Propagating adjustments only a few degrees back. Let’s say that during balance calculations, the administrators realized they needed to bump up the target output of one commodity. If they did that, it was also necessary to bump up the output targets of commodities that were input into the target commodity. But if they did that, they also needed to bump up the output targets of commodities that fed into those commodities, and so on! This involved a crazy amount of extra hand calculations every time they needed to make an adjustment. To simplify things, the administrators typically made adjustments to the first-order suppliers, without making the necessary adjustments to the suppliers of the suppliers. This of course led to critical shortages of input commodities, which again led to idle factories.
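The cost of stopping at first-order suppliers can be illustrated with a small sketch. Everything here is an assumption for illustration: a three-commodity economy, an invented table of input coefficients, and a helper called `gross_output` that propagates requirements back a chosen number of steps. It is not the planners' actual procedure, just one simple way to show the effect.

```python
# Hypothetical input coefficients: A[i][j] is the amount of commodity i
# consumed to produce one unit of commodity j. The numbers are invented.
COMMODITIES = ["coal", "steel", "machinery"]
A = [
    [0.1, 0.4, 0.1],  # coal used per unit of coal, steel, machinery
    [0.0, 0.1, 0.5],  # steel used per unit of coal, steel, machinery
    [0.1, 0.1, 0.1],  # machinery used per unit of coal, steel, machinery
]


def input_needs(output):
    """Inputs of each commodity consumed when producing `output`."""
    return [sum(a * x for a, x in zip(row, output)) for row in A]


def gross_output(target, rounds):
    """Output targets after propagating requirements `rounds` steps back.

    rounds=0 ignores inputs entirely, rounds=1 adjusts only the direct
    suppliers, and a large value approximates the fully balanced plan
    (the iteration converges for this A because each column sums to < 1).
    """
    output = list(target)
    for _ in range(rounds):
        output = [t + n for t, n in zip(target, input_needs(output))]
    return output


# Bump the machinery target by 100 units and compare the two plans.
bump = [0.0, 0.0, 100.0]
first_order = gross_output(bump, rounds=1)
balanced = gross_output(bump, rounds=200)

for name, approx, full in zip(COMMODITIES, first_order, balanced):
    print(f"{name:10s} first-order: {approx:7.2f}   balanced: {full:7.2f}")
```

With these invented coefficients, the first-order plan asks for only 10 extra units of coal, while the fully propagated plan needs roughly 44, because the extra steel and machinery themselves consume coal. The gap is the kind of hidden shortage described above.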
Figure 1. Some example inputs and outputs in the Soviet economy in 1951, described in units of weight. This summary shows an extreme dimensionality reduction, more extreme than was ever used in planning. In this diagram, most commodities are excluded and each displayed commodity collapses across multiple...