Why the gradient is a list of partial derivatives

Michał Prządka - Blog

Published: May 24, 2026

If there is one thing that keeps me awake at night, it is multivariate calculus. The gradient formula always struck me as suspicious in how plain it looks:

$$\nabla f = \left(\frac{\partial f}{\partial x},\ \frac{\partial f}{\partial y}\right)$$

You could practically guess it. Partial derivatives are numbers, there is one per coordinate, so you stack them in a list and see what comes out. The thing that comes out happens to point in the direction of fastest change, with a length equal to that maximum rate.

Run that recipe on a real landscape: the two hills below. The formula for their height $f$ is a bit of a mess, but the recipe only ever needs its two partial derivatives.

You are here — the spot where we read the two slopes.

Stand anywhere and read two numbers: the slope as you step east, $\partial f / \partial x$, and the slope as you step north, $\partial f / \partial y$. At the marked spot they come out to 6.4 and 5.2. Stack them into the pair (6.4, 5.2).

That pair, read as a vector, turns out to be an arrow pointing straight up the steepest slope. Nothing in the recipe mentioned “uphill,” or even “direction” — we read two slopes, stapled them into a list, and the list came out knowing the steepest way up.

And nothing about that spot was special. Move anywhere on the landscape and do the same thing: read the slope east, read the slope north, stack them. At (1.5, 2.0), just north of the taller hill: (3.6, -9.4). At (-1.0, -2.5), below the lower one: (-2.0, 3.5). At (-3.5, -1.0), out to its west: (8.1, 1.1). Every spot has its pair, and every pair points straight up the steepest slope there.
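The recipe is easy to try numerically. What follows is only a sketch under stated assumptions: the post never gives the formula for the two hills, so `height` is a made-up stand-in (two Gaussian bumps), and the helper names `partial_x`, `partial_y`, `gradient` and `steepest_direction_by_search` are hypothetical. It reads the two slopes with central differences, stacks them, and then checks the result against a brute-force sweep over compass directions.

```python
import math

def height(x, y):
    # Hypothetical stand-in landscape: two Gaussian hills.
    # This is NOT the formula behind the post's figure.
    tall = 3.0 * math.exp(-((x - 1.5) ** 2 + (y - 1.0) ** 2) / 2.0)
    low = 2.0 * math.exp(-((x + 1.5) ** 2 + (y + 1.0) ** 2) / 2.0)
    return tall + low

def partial_x(f, x, y, eps=1e-5):
    # Slope as you step east: y is held fixed.
    return (f(x + eps, y) - f(x - eps, y)) / (2 * eps)

def partial_y(f, x, y, eps=1e-5):
    # Slope as you step north: x is held fixed.
    return (f(x, y + eps) - f(x, y - eps)) / (2 * eps)

def gradient(f, x, y):
    # The whole recipe: read two slopes, stack them into a pair.
    return (partial_x(f, x, y), partial_y(f, x, y))

def steepest_direction_by_search(f, x, y, steps=3600, eps=1e-5):
    # Try every direction in 0.1-degree steps and keep the steepest one.
    best_angle, best_slope = 0.0, -math.inf
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        ux, uy = math.cos(angle), math.sin(angle)
        slope = (f(x + eps * ux, y + eps * uy)
                 - f(x - eps * ux, y - eps * uy)) / (2 * eps)
        if slope > best_slope:
            best_angle, best_slope = angle, slope
    return best_angle, best_slope

# An arbitrary spot on the stand-in landscape.
gx, gy = gradient(height, 0.5, 0.0)
grad_angle = math.atan2(gy, gx) % (2 * math.pi)
grad_length = math.hypot(gx, gy)

search_angle, search_slope = steepest_direction_by_search(height, 0.5, 0.0)

print(f"stacked pair : angle {math.degrees(grad_angle):.2f} deg, length {grad_length:.4f}")
print(f"brute force  : angle {math.degrees(search_angle):.2f} deg, slope  {search_slope:.4f}")
```

On this stand-in the two printed lines should agree to within the 0.1-degree resolution of the sweep. The search had to test thousands of directions; the stacked pair needed two.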

To me this always felt like magic: the whole geometry of steepest ascent, falling out of a list. That gap, between the playfulness of the construction and the depth of the result, is what this post is about.

What we will need

You will need three small ideas about arrows — none of them new if you have ever pointed at something — and one from calculus.

An arrow has a direction and a length. Both carry meaning. Velocity is an arrow (a speed plus a direction). A displacement on a map (“two right, three up”) is an arrow. Anything with a direction and an amount can be drawn as one.

An arrow: a direction plus a length.

Two arrows from the same origin are either pointing the same way, or perpendicular, or somewhere between. That “somewhere between” is a single number once we have a name for it. For now, we just need the picture: same direction is full overlap, ninety degrees is none.

Same way, somewhere between, or perpendicular.

Any arrow can be written as a sum of pieces along directions you choose. “Two right, three up” is already this idea. The chosen directions get a fancy name (basis vectors), but you already do this every time you read a map.

A partial derivative is the slope along one chosen axis. Say $h(x, y)$ is the height of a hill above the map point $(x, y)$. To find how steep it is left-right, hold $y$ fixed and watch how $h$ changes as $x$ moves. That single-variable slope is the partial derivative, $\partial h / \partial x$. Left-right almost certainly isn’t the steepest direction on the hill. It’s just a direction you happen to care about, and the partial derivative answers “how steep, this way?” for the way you picked. Ordinary calculus, taken one coordinate at a time.
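To make "hold one variable fixed" concrete, here is a minimal sketch. The hill `h` is a hypothetical example picked so the exact slopes can be checked by hand (the slope in x is -2x and the slope in y is -4y), and `derivative` is an ordinary single-variable central difference, not anything specific to this post.

```python
def h(x, y):
    # Hypothetical hill, chosen so the exact answers are easy to verify:
    # slope along x is -2x, slope along y is -4y.
    return 10.0 - x ** 2 - 2.0 * y ** 2

def derivative(g, t, eps=1e-5):
    # Ordinary one-variable slope of g at t (central difference).
    return (g(t + eps) - g(t - eps)) / (2 * eps)

x0, y0 = 1.0, 2.0

# Hold y fixed at y0: what remains is a function of x alone.
def slice_along_x(x):
    return h(x, y0)

# Hold x fixed at x0: what remains is a function of y alone.
def slice_along_y(y):
    return h(x0, y)

dh_dx = derivative(slice_along_x, x0)  # "how steep, left-right?"
dh_dy = derivative(slice_along_y, y0)  # "how steep, up-down?"

print(f"dh/dx at ({x0}, {y0}) is about {dh_dx:.6f}")
print(f"dh/dy at ({x0}, {y0}) is about {dh_dy:.6f}")
```

By hand, the slopes at (1, 2) are -2 and -8, and the numerical values should land very close to those. Neither slice knows anything about the other axis, which is the point: each partial derivative is a one-dimensional question.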

That is everything we need. Now we can put them to work.

On the slope

Picture yourself standing on a ski slope. There is one direction in which the slope falls away fastest, the line a dropped ball would follow. Plant your skis across (perpendicular to) that line and you stand still: zero change in height as you slide along the skis. Every skier knows this without writing anything down. My daughters learned it the hard way, hunting for the safest position to stand up from after every fall.

Plant your skis across the fall line and the altitude needle stops.

Before you push off, pick any direction to slide in — it doesn’t matter which. Slide nearly along your skis and you barely pick up speed; aim them down the mountain and you race away. The same rule governs how fast you lose height: point along the skis and the altitude barely moves, point downhill and it drops. Either way, the direction you picked — and every other one you could have picked — breaks into two pieces: one along your skis (across the slope, so it changes nothing), one along the steepest line (changes everything you are going to change). Only the second piece moves the altitude needle, and how much it moves depends only on how long that second piece is.
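The two-piece split can be checked on a toy slope. Everything in this sketch is an assumption made for illustration: the slope is a perfectly flat tilted plane, `fall_line` and `skis` are hand-picked perpendicular unit vectors, and `DROP_PER_METRE` is an invented steepness. The code breaks several directions into their two pieces and compares the measured rate of altitude change with the length of the fall-line piece.

```python
import math

# Hypothetical slope: a tilted plane that loses 0.4 m of height per metre
# travelled along the fall line and is level across it.
fall_line = (0.6, -0.8)  # unit vector pointing straight downhill
skis = (0.8, 0.6)        # unit vector across the slope, perpendicular to fall_line
DROP_PER_METRE = 0.4

def altitude(x, y):
    # Height depends only on how far you have gone along the fall line.
    along_fall = x * fall_line[0] + y * fall_line[1]
    return 100.0 - DROP_PER_METRE * along_fall

def split(direction):
    # Break a direction into (piece along the skis, piece along the fall line).
    dx, dy = direction
    along_skis = dx * skis[0] + dy * skis[1]
    along_fall = dx * fall_line[0] + dy * fall_line[1]
    return along_skis, along_fall

def altitude_rate(direction, eps=1e-5):
    # Measured change in altitude per metre slid in `direction`, from the origin.
    dx, dy = direction
    return (altitude(eps * dx, eps * dy)
            - altitude(-eps * dx, -eps * dy)) / (2 * eps)

for degrees in (0, 30, 77, 143, 250):
    angle = math.radians(degrees)
    direction = (math.cos(angle), math.sin(angle))
    along_skis, along_fall = split(direction)
    rate = altitude_rate(direction)
    print(f"{degrees:3d} deg: skis piece {along_skis:+.3f}, "
          f"fall-line piece {along_fall:+.3f}, altitude rate {rate:+.4f}")
```

In every row the altitude rate should come out as minus `DROP_PER_METRE` times the fall-line piece, whatever the skis piece happens to be. A real hill is not a plane, so treat this as a picture of what happens in a small neighbourhood of where you stand.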

Slow down here — this is the whole trick.

Imagine sliding toward a pine tree downhill from you. An arrow runs from your skis to the tree:

One direction, picked at random: the...
