jump to navigation

Hello World, the Lagrange Multiplier Theorem, and Kuhn-Tucker conditions July 26, 2009

Posted by Phi. Isett in Calculus, economics, Uncategorized.
Tags: , , , , , ,

This is a blog which was started on a whim in order to host a very extensive reply to a blog entry of Terence Tao.  This entry is not the reply — the next entry is.

I may continue this blog to contain pieces of mathematics that I find pretty, important and/or not so commonly known.  In any case, I’d feel silly devoting an entire blog to a single reply.  So here.. I’ll take this opportunity to include my favorite proof of the Lagrange multiplier theorem, and later on maybe I’ll post a nice proof of Stirling’s formula, a group algebraic proof of the Riemann-Lebesgue lemma, or something…

The Lagrange Multiplier Theorem helps us solve constrained maximization / minimization problems, making it (among other things) extremely important in economics.  A (weak form of) the Lagrange Multiplier theorem can be stated as follows:

Let f : \mathbb{R}^n \to \mathbb{R} be a real-valued, continuously differentiable function (called the “objective function” — the thing we want to maximize or minimize), and let g : \mathbb{R}^n \to \mathbb{R}^m be continuously differentiable (called the “constraint function” for reasons which will soon be clear).

If x_0 is an extremizer of the restriction of f to the (codimension m) “constraint manifold” given by M \equiv \{ x : g(x) = 0 \}, then there is a unique linear functional \lambda : \mathbb{R}^m \to \mathbb{R} satisfying the equality of linear maps Df(x_0) = \lambda \circ Dg(x_0).

We assume m \leq n — things tend to be more complicated when the number of constraints exceeds the number of degrees of freedom.


We know that f must satisfy some first order condition: namely (by further restricting f to trajectories on M) that Df(x_0)v = 0 for any v \in \mathbb{R}^n which can be realized as a velocity at x_0 by a path within the constraint manifold. An exercise in the implicit function theorem shows that these velocities include all of \mbox{ Ker } Dg(x_0) (these are all such velocities — they are the directions in which one can move without changing g, and together they are called the “tangent space” of M at x_0 under generic circumstances).

Finally, now that \mbox{ Ker } Dg(x_0) \subseteq \mbox{ Ker } Df(x_0), it is a trivial exercise in linear algebra to show that there is a unique linear functional \lambda : \mathbb{R}^m \to \mathbb{R} defined by the identity Df(x_0) = \lambda \circ Dg(x_0).  Basically, we have shown that the zero set of the linear map Dg(x_0) is contained in the zero set of Df(x_0), but because the two maps are linear, their level sets are all translations of the zero level set, and therefore we conclude that every level set of Dg(x_0) is contained in a level set of Df(x_0).

Does anyone know how to turn off the annoying preview thing for hyperlinks..? Never mind, got it.

Edit (29 Aug, 2009):

The same ideas can be used to give a proof of the Kuhn-Tucker conditions for a constrained maximization problem where  we replace the system of equalities \{ g(x) = 0 \} with the system of inequalities \{ g_i(x) \leq 0 | i =1, \ldots, m \}.  We simply replace the role of trajectories within the constraint manifold by trajectories going into the constraint set.

Without loss of generality, we may assume a maximizer belongs to the boundary where  the constraints bind (i.e. g(x_0) = 0), for if one constraint does not bind (say g_m(x_0) < 0), we may restrict our domain to the open set \{ g_m(x_0) < 0 \} and simply remove this component from the constraint function, thereby decreasing m (the corresponding component of the Lagrange multiplier will therefore be zero).  Thus, we can picture the point x_0 to be a point of transverse intersection of m hypersurfaces which together form the boundary of the constraint set (at least nearby x_0) — some kind of corner.

We now modify the proof of the Lagrange multiplier theorem by showing also that Dg(x_0)v \leq 0 \Rightarrow Df(x_0) v \leq 0 at a maximizer x_0 where g(x_0) = 0 — but this is easy to see.  The inequality Dg(x_0)v \leq 0 means (in this situation) that the velocity v points into the constraint set \{ g_1(x), \ldots, g_m(x) \leq 0\} since no constraint function increases in this direction.  Therefore we cannot have f increasing in this direction at a constrained maximum, so we cannot have \frac{f(x_0 + \epsilon v) - f(x_0)}{\epsilon} \sim Df(x_0) v > 0.

Again, this argument relies on applying the implicit function theorem to g at the point x_0 in order to assure that all such velocities for which Dg(x_0)v \leq 0 can be achieved as a velocity at x_0 of a trajectory which travels into the constraint set, but like many other applications of the implicit function theorem, this fact is intuitively clear when one pictures the generic situation.

The monotonicity property we have just established — Dg(x_0)v \leq 0 \Rightarrow Df(x_0) v \leq 0 — is equivalent to the non-negativity of the functional \lambda satisfying \lambda \circ Dg(x_0) = Df(x_0) .  Note here that the Lagrange multiplier theorem still applies (i.e. there is a \lambda) since x_0 still maximizes the objective function over the smaller constraint set \{ g(x) = 0 \} after we have removed the constraints which do not bind.

The above extension of the Lagrange multiplier theorem is due to Karush and the resulting conditions are sometimes referred to as “Kuhn-Tucker” conditions (or “Karush-Kuhn-Tucker conditions”).  One runs into such inequality constraints at every corner in economics.  For example, a consumer cannot buy more than she can afford (unless she lives in the US?), nor can a firm produce more than its available production technologies and resources allow…

I should comment on how one can more directly see the conclusions of the Lagrange multiplier theorem and the Karush-Kuhn-Tucker conditions. First the Lagrange multiplier theorem, where we return to the setting of a codimension m constraint manifold \{ g_1(x) = \ldots = g_m(x) = 0 \}.

The Lagrange multiplier theorem asserts that Df(x_0) = \lambda_1 Dg_1(x_0) + \ldots + \lambda_m Dg_m(x_0); in other words, the derivative of f (which points in the direction of increase of f) lies in the linear space spanned by the derivatives of the constraint functions.  This span is exactly the orthogonal complement of the tangent space at x_0 to the constraint manifold.  If Df(x_0) did not lie entirely in this orthogonal complement, there would be a nontrivial projection of Df(x_0) in a direction tangent to the constraint manifold.  Choosing a trajectory in the boundary of the constraint set which achieves this projection as a velocity, the objective function f would increase along this trajectory at x_0, which cannot occur at a constrained maximizer/minimizer.

To see the necessity of the Karush-Kuhn-Tucker conditions directly (that \lambda_i \geq 0), it is easiest to visualize a corner formed where two lines in a plane intersect transversely , or a corner where three planes in three-dimensional space intersect transversely.  If we picture the inequality constraint as a shaded region in the ambient space with this corner as the boundary, then the vectors \{ Dg_1(x_0), \ldots, Dg_m(x_0) \} are normal to the boundary hypersurfaces (), pointing outside of the constraint region since they correspond to directions of increase of the constraint functions.  In order for the corner x_0 to be a maximizer for an objective function f, the gradient of f, which points in the direction of increase of f, must point within the cone which lies “between” all the normal vectors \{ Dg_1(x_0), \ldots, Dg_m(x_0) \}.  This fact is easiest to see in a picture… which I… really should… provide…  This cone is the set of non-negative linear combinations of the normal vectors.


No comments yet — be the first.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

<span>%d</span> bloggers like this: