Function's Doppelgänger (Fenchel Conjugate)

drunello1 pts0 comments

Your function's doppelgänger

..

2025-07-04

Your function's doppelgänger

One of the first things I realized by approaching convex analysis is the importance of notation. The use of simple symbols or tiny marks adjustments to encapsulate different topics, sometimes even a wide class of problems, is crucial for developing extremely powerful mathematical primitives.<br>The Fenchel conjugate, introduced by Werner Fenchel in 1949, perfectly illustrates this idea. Although it may seem mysterious at first glance, it is one of the most important primitives in duality theory. In this post, as in others I may write in the future, I’m not aiming to develop a fully rigorous mathematical treatment of this concept. Instead, I want to offer inputs that spark the kind of mental machinery that helps build a practical intuition, making it easier to categorize and recall in your mind.

Definition

Given any function \(f:\mathbb{R}^n\to\mathbb{R}\), the Fenchel conjugate of \(f\) is defined as

\[f^\star(y)=\sup_x\; \langle x,y\rangle - f(x)\]

You can refer to the legendary Boyd’s book (chapter 3.3) or Stanford class record to have a nice and more exhaustive introduction to this concept. At first glance, the Fenchel conjugate resembles the optimal value of a tiny optimization problem in \(x\), where the optimal value is kept as function of the parameter \(y\). Thus, speaking about notation, the apparently harmless symbol \(f^\star\) is quietly hiding the solution to a certain optimization problem involving its spiritual relative \(f\). This is a good path to develop a first intuition about this operator, so it is worth elaborate more with an example.

Example for hypebeasts

Imagine you work at Nike and want to determine the optimal quantity of the upcoming hyped sneakers to release on SNKRS that are going to be sniped by hypebeasts. Given that your marketing manager has already set fixed prices

\[y=(y_1,\dots,y_n)\in\mathbb{R}^n\]

for each of the \(n\) upcoming collections, you are asking:

“What are the optimal amounts of sneakers \(x=(x_1,\dots,x_n)\in\mathbb{R}^n\) that should be produced and released for this drop? Should we drop more Air Jordan 3 or Nike Dunk Low?”.

Suppose that the production and distribution cost is captured by a certain function \(f:\mathbb{R}^n\to\mathbb{R}\) (as fancy as you want) which aggregates the total cost of producing each quantity \(x_i\) for every collection. In other words, \(f(x)\) represents the total cost for this drop if you decide to produce the vector of amounts \(x\). On the other hand, since the \(i\)-th collection is going to be sold at a price of \(y_i\) dollars and assuming the hype for the upcoming drop is so high that there will likely be no leftovers, the total revenue for this drop is going to be

\[\langle x, y\rangle=\sum_{i=1}^n y_ix_i\]

Having said that, it would be wise producing the amount of sneakers that would maximize the profit for this drop, where the profit function is simply total revenues minus total costs, that is:

\[\Pi(x;y)=\langle x, y\rangle-f(x)\]

As a result, armed with optimization-solvers, your goal is to solve the following optimization problem:

\[\begin{align*}<br>\sup_{x}\; \Pi(x;y) &= \sup_{x}\; \langle x, y\rangle-f(x)\\<br>&=:f^\star(y)<br>\end{align*}\]

which precisely resembles the Fenchel conjugate definition! In other words, by knowing the fenchel conjugate \(f^\star\) of the production and distribution cost function \(f\) you also know an upper bound on the maximal profit you can score from this drop given any possible prices recommended by the marketing manager, and such upper bound is attainable if you produce the optimal amount of sneakers

\[x^\star(y) = \arg\max_x\; \langle x, y\rangle-f(x)\]

Giving prices \(y\), if \(f^\star(y)\) is particularly low then you can’t be particularly ambitious on the gains of this drop, and you have a provable argument to blame the marketing manager for the chosen prices.

Some properties

Convexity and lower semi-continuity

The Fenchel conjugates \(f^\star\) might have different properties depending on the related function \(f\). However, two properties that are always guaranteed (independently from the structure of \(f\)) are

Convexity

Lower semi-continuity (closed epigraph)

Both properties are particularly desireable in numerical optimization (for reasons I might discuss in future posts) and can be rigorously derived. To get a rough idea of why these two properties are always ensured, you can focus on the fact that by definition \(f^\star\) is the element-wise supremum of a collection of affine functions. Notice in fact that

\[\langle x, y\rangle-f(x)\]

is affine in \(y\) if you consider \(x\) as a simple parameter. So for example by picking different values of \(x\), like \(x_1\) or \(x_2\), you come up with two different linear functions. In other words, the parameter \(x_i\) that you pick fully characterizes the function \(\langle x_i, y\rangle-f(x_i)\): this means that if you have a...

function fenchel star drop langle rangle

Related Articles