Hello World, the Lagrange Multiplier Theorem, and Kuhn-Tucker conditions July 26, 2009
Posted by Phi. Isett in Calculus, economics, Uncategorized.Tags: constrained optimization, economics, Karush-Kuhn-Tucker, Kuhn-Tucker, Lagrange multiplier theorem, Lagrange multipliers, proof of Kuhn-Tucker
This is a blog which was started on a whim in order to host a very extensive reply to a blog entry of Terence Tao. This entry is not the reply — the next entry is.
I may continue this blog to contain pieces of mathematics that I find pretty, important, and/or not so commonly known. In any case, I’d feel silly devoting an entire blog to a single reply. So here, I’ll take this opportunity to include my favorite proof of the Lagrange multiplier theorem, and later on maybe I’ll post a nice proof of Stirling’s formula, a group algebraic proof of the Riemann-Lebesgue lemma, or something…
The Lagrange Multiplier Theorem helps us solve constrained maximization / minimization problems, making it (among other things) extremely important in economics. A weak form of the Lagrange multiplier theorem can be stated as follows:
Let $f : \mathbb{R}^n \to \mathbb{R}$ be a real-valued, continuously differentiable function (called the “objective function” — the thing we want to maximize or minimize), and let $g : \mathbb{R}^n \to \mathbb{R}^m$ be continuously differentiable (called the “constraint function” for reasons which will soon be clear). If $x^*$ is an extremizer of the restriction of $f$ to the (codimension $m$) “constraint manifold” $M = g^{-1}(0)$, then there is a unique linear functional $\lambda : \mathbb{R}^m \to \mathbb{R}$ satisfying the equality of linear maps $Df(x^*) = \lambda \circ Dg(x^*)$.

We assume $m \leq n$ and that $Dg(x^*)$ has full rank $m$ — things tend to be more complicated when the number of constraints exceeds the number of degrees of freedom.
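To make the statement concrete, here is a quick numerical sanity check (my own toy example, not part of the theorem): maximize $f(x,y) = x + y$ over the unit circle $g(x,y) = x^2 + y^2 - 1 = 0$. The maximizer is $x^* = (1/\sqrt{2}, 1/\sqrt{2})$, and the (scalar) multiplier can be recovered by least squares.

```python
import numpy as np

# Toy example (mine, not from the post): maximize f(x, y) = x + y
# subject to the single constraint g(x, y) = x^2 + y^2 - 1 = 0.
# The maximizer is x* = (1/sqrt(2), 1/sqrt(2)).

x_star = np.array([1.0, 1.0]) / np.sqrt(2.0)

grad_f = np.array([1.0, 1.0])   # Df(x*) as a vector
grad_g = 2.0 * x_star           # Dg(x*) = (2x, 2y)

# Solve Df(x*) = lambda * Dg(x*) for the scalar multiplier by least squares.
lam, _, _, _ = np.linalg.lstsq(grad_g.reshape(-1, 1), grad_f, rcond=None)

print(lam[0])                                 # 1/sqrt(2), approx. 0.7071
print(np.allclose(grad_f, lam[0] * grad_g))   # True: the gradients are parallel
```

The least-squares solve is overkill for a single constraint, but the same line works verbatim when $Dg(x^*)$ has several rows.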
Proof:
We know that $x^*$ must satisfy some first order condition: namely (by further restricting $f$ to trajectories on $M$) that $Df(x^*)(v) = 0$ for any $v$ which can be realized as a velocity at $x^*$ by a path within the constraint manifold. An exercise in the implicit function theorem shows that these velocities include all of $\ker Dg(x^*)$ (these are all such velocities — they are the directions in which one can move without changing $g$, and together they are called the “tangent space” of $M$ at $x^*$ under generic circumstances).
Finally, now that $\ker Dg(x^*) \subseteq \ker Df(x^*)$, it is a trivial exercise in linear algebra to show that there is a unique linear functional $\lambda : \mathbb{R}^m \to \mathbb{R}$ defined by the identity $Df(x^*) = \lambda \circ Dg(x^*)$. Basically, we have shown that the zero set of the linear map $Dg(x^*)$ is contained in the zero set of $Df(x^*)$, but because the two maps are linear, their level sets are all translations of the zero level set, and therefore we conclude that every level set of $Dg(x^*)$ is contained in a level set of $Df(x^*)$.
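This factorization step can also be seen numerically. Below is a small sketch (my own construction, with invented data): take a surjective $Dg$, build $Df = \lambda \circ Dg$ so that $\ker Dg \subseteq \ker Df$ holds by design, and recover $\lambda$ uniquely with the pseudoinverse.

```python
import numpy as np

# Sketch of the linear-algebra step (my own illustration): a surjective
# Dg : R^4 -> R^2, and a Df constructed so that ker Dg is contained in
# ker Df, i.e. Df = lambda . Dg for some (pretend-unknown) row vector lambda.

rng = np.random.default_rng(0)
Dg = rng.standard_normal((2, 4))    # full rank 2 with probability 1
lam_true = np.array([3.0, -1.0])    # the functional we will try to recover
Df = lam_true @ Dg                  # guarantees ker Dg is inside ker Df

# Since Dg is surjective, Df factors uniquely through Dg; recover lambda
# via the Moore-Penrose pseudoinverse (Dg @ pinv(Dg) is the identity here).
lam = Df @ np.linalg.pinv(Dg)

print(np.allclose(lam, lam_true))   # True: the unique factorization is recovered
```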
Edit (29 Aug, 2009):
The same ideas can be used to give a proof of the Kuhn-Tucker conditions for a constrained maximization problem where we replace the system of equalities $g(x) = 0$ with the system of inequalities $g(x) \leq 0$ (read componentwise: $g_i(x) \leq 0$ for $i = 1, \dots, m$). We simply replace the role of trajectories within the constraint manifold by trajectories going into the constraint set.

Without loss of generality, we may assume a maximizer $x^*$ belongs to the boundary where the constraints bind (i.e. $g(x^*) = 0$), for if one constraint does not bind (say $g_i(x^*) < 0$), we may restrict our domain to the open set $\{x : g_i(x) < 0\}$ and simply remove this component from the constraint function, thereby decreasing $m$ (the corresponding component of the Lagrange multiplier will therefore be zero). Thus, we can picture the point $x^*$ to be a point of transverse intersection of $m$ hypersurfaces which together form the boundary of the constraint set (at least near $x^*$) — some kind of corner.
We now modify the proof of the Lagrange multiplier theorem by showing also that at a maximizer $x^*$ we have $Df(x^*)(v) \leq 0$ whenever $Dg(x^*)(v) \leq 0$ — but this is easy to see. The inequality $Dg(x^*)(v) \leq 0$ means (in this situation) that the velocity $v$ points into the constraint set $\{g \leq 0\}$, since no constraint function increases in this direction. Therefore we cannot have $f$ increasing in this direction at a constrained maximum, so we cannot have $Df(x^*)(v) > 0$.
Again, this argument relies on applying the implicit function theorem to $g$ at the point $x^*$ in order to ensure that every velocity $v$ for which $Dg(x^*)(v) \leq 0$ can be achieved as a velocity at $x^*$ of a trajectory which travels into the constraint set, but like many other applications of the implicit function theorem, this fact is intuitively clear when one pictures the generic situation.
The monotonicity property we have just established — that $Dg(x^*)(v) \leq 0$ implies $Df(x^*)(v) \leq 0$ — is equivalent to the non-negativity of the functional $\lambda$ satisfying $Df(x^*) = \lambda \circ Dg(x^*)$. Note here that the Lagrange multiplier theorem still applies (i.e. there is such a $\lambda$) since $x^*$ still maximizes the objective function over the smaller constraint set $\{x : g(x) = 0\}$ after we have removed the constraints which do not bind.
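Here is a worked inequality-constrained example (again my own, for illustration): maximize $f(x,y) = -(x-2)^2 - (y-2)^2$ subject to $g_1 = x - 1 \leq 0$ and $g_2 = y - 1 \leq 0$. The unconstrained peak $(2,2)$ is infeasible, so both constraints bind at the corner $x^* = (1,1)$, and the multipliers come out non-negative as the Kuhn-Tucker conditions predict.

```python
import numpy as np

# Hypothetical example (mine, not from the post): maximize
#   f(x, y) = -(x - 2)^2 - (y - 2)^2
# subject to g1(x, y) = x - 1 <= 0 and g2(x, y) = y - 1 <= 0.
# Both constraints bind at the corner x* = (1, 1).

x_star = np.array([1.0, 1.0])

grad_f = -2.0 * (x_star - 2.0)  # gradient of f at x*, equals (2, 2)
Dg = np.array([[1.0, 0.0],      # gradient of g1 = x - 1
               [0.0, 1.0]])     # gradient of g2 = y - 1

# Solve Df(x*) = lambda . Dg(x*); the KKT conditions predict lambda >= 0.
lam = np.linalg.solve(Dg.T, grad_f)

print(lam)                # [2. 2.]
print(np.all(lam >= 0))   # True: non-negative multipliers at the maximizer
```

If we instead maximized $f(x,y) = -(x-2)^2 - (y+2)^2$, the second constraint would not bind and its multiplier would be dropped (set to zero) exactly as in the reduction above.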
The above extension of the Lagrange multiplier theorem is due to Karush and the resulting conditions are sometimes referred to as “Kuhn-Tucker” conditions (or “Karush-Kuhn-Tucker conditions”). One runs into such inequality constraints at every corner in economics. For example, a consumer cannot buy more than she can afford (unless she lives in the US?), nor can a firm produce more than its available production technologies and resources allow…
I should comment on how one can more directly see the conclusions of the Lagrange multiplier theorem and the Karush-Kuhn-Tucker conditions. First the Lagrange multiplier theorem, where we return to the setting of a codimension $m$ constraint manifold $M = g^{-1}(0)$.

The Lagrange multiplier theorem asserts that $\nabla f(x^*) = \sum_{i=1}^m \lambda_i \nabla g_i(x^*)$; in other words, the derivative of $f$ (which points in the direction of increase of $f$) lies in the linear space spanned by the derivatives of the constraint functions. This span is exactly the orthogonal complement of the tangent space at $x^*$ to the constraint manifold. If $\nabla f(x^*)$ did not lie entirely in this orthogonal complement, there would be a nontrivial projection of $\nabla f(x^*)$ in a direction tangent to the constraint manifold. Choosing a trajectory within the constraint manifold which achieves this projection as a velocity, the objective function $f$ would increase along this trajectory at $x^*$, which cannot occur at a constrained maximizer/minimizer.
To see the necessity of the Karush-Kuhn-Tucker conditions directly (that $\lambda_i \geq 0$), it is easiest to visualize a corner formed where two lines in a plane intersect transversely, or a corner where three planes in three-dimensional space intersect transversely. If we picture the inequality constraint as a shaded region in the ambient space with this corner as the boundary, then the vectors $\nabla g_i(x^*)$ are normal to the boundary hypersurfaces $\{g_i = 0\}$, pointing outside of the constraint region since they correspond to directions of increase of the constraint functions. In order for the corner $x^*$ to be a maximizer for an objective function $f$, the gradient of $f$, which points in the direction of increase of $f$, must point within the cone which lies “between” all the normal vectors $\nabla g_i(x^*)$. This fact is easiest to see in a picture… which I… really should… provide… This cone is the set of non-negative linear combinations of the normal vectors.