In the mathematical fields of differential equations and geometric analysis, the maximum principle is one of the most useful and best known tools of study. Solutions of a differential inequality in a domain D satisfy the maximum principle if they achieve their maxima at the boundary of D.
The maximum principle enables one to obtain information about solutions of differential equations without any explicit knowledge of the solutions themselves. In particular, the maximum principle is a useful tool in the numerical approximation of solutions of ordinary and partial differential equations and in the determination of bounds for the errors in such approximations.[1]
In a simple two-dimensional case, consider a function of two variables u(x,y) such that
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.$$
The weak maximum principle, in this setting, says that for any open precompact subset M of the domain of u, the maximum of u on the closure of M is achieved on the boundary of M. The strong maximum principle says that, unless u is a constant function, the maximum cannot also be achieved anywhere on M itself.
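As a hedged numerical illustration of the weak maximum principle, the following sketch samples a harmonic function (one satisfying u_xx + u_yy = 0) on a grid over the closed unit square and checks that the largest sampled value sits on the boundary. The choice u(x, y) = eˣ cos y, the square domain, and the helper names are assumptions made for this example only; a grid check is an illustration, not a proof.

```python
import math

def u(x, y):
    # Example harmonic function (assumed for illustration): u_xx + u_yy = 0.
    return math.exp(x) * math.cos(y)

def grid_argmax(f, n=50):
    """Return (value, i, j) for the largest sample of f on an (n+1) x (n+1) grid over [0, 1]^2."""
    best = None
    for i in range(n + 1):
        for j in range(n + 1):
            val = f(i / n, j / n)
            if best is None or val > best[0]:
                best = (val, i, j)
    return best

def on_boundary(i, j, n=50):
    # A grid index pair lies on the boundary of the square if either index is extreme.
    return i in (0, n) or j in (0, n)

value, i, j = grid_argmax(u)
max_on_boundary = on_boundary(i, j)
```

For this particular u the sampled maximum is found at the corner (1, 0), which is consistent with the principle; other harmonic functions can be substituted for `u` to repeat the experiment.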
Such statements give a striking qualitative picture of solutions of the given differential equation. Such a qualitative picture can be extended to many kinds of differential equations. In many situations, one can also use such maximum principles to draw precise quantitative conclusions about solutions of differential equations, such as control over the size of their gradient. There is no single or most general maximum principle which applies to all situations at once.
In the field of convex optimization, there is an analogous statement which asserts that the maximum of a convex function on a compact convex set is attained on the boundary.[2]
Here we consider the simplest case, although the same thinking can be extended to more general scenarios. Let M be an open subset of Euclidean space and let u be a $C^2$ function on M such that
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} = 0,$$
where for each i and j between 1 and n, $a_{ij}$ is a function on M with $a_{ij} = a_{ji}$.
Fix some choice of x in M. According to the spectral theorem of linear algebra, all eigenvalues of the matrix $[a_{ij}(x)]$ are real, and there is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors. Denote the eigenvalues by $\lambda_i$ and the corresponding eigenvectors by $v_i$, for i from 1 to n. Then the differential equation, at the point x, can be rephrased as
$$\sum_{i=1}^n \lambda_i \left.\frac{d^2}{dt^2}\right|_{t=0} u(x + t v_i) = 0.$$
The essence of the maximum principle is the simple observation that if each eigenvalue is positive (which amounts to a certain formulation of "ellipticity" of the differential equation) then the above equation imposes a certain balancing of the directional second derivatives of the solution. In particular, if one of the directional second derivatives is negative, then another must be positive. At a hypothetical point where u is maximized, all directional second derivatives are automatically nonpositive, and the "balancing" represented by the above equation then requires all directional second derivatives to be identically zero.
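The rephrasing in terms of eigenvalues and directional second derivatives can be checked numerically in two dimensions. The sketch below is only an illustration under stated assumptions: the coefficient matrix A, the Hessian H of u at the chosen point, and the helper names are all hypothetical, and the closed-form eigenvector formula assumes the off-diagonal entry of A is nonzero. It compares the sum of a_ij times the second partial derivatives with the sum of λ_i times the second derivative of u along v_i, which equals v_iᵀ H v_i.

```python
import math

def sym2_eig(a11, a12, a22):
    """Eigenvalues and orthonormal eigenvectors of [[a11, a12], [a12, a22]], assuming a12 != 0."""
    mean = 0.5 * (a11 + a22)
    rad = math.hypot(0.5 * (a11 - a22), a12)
    lams = (mean - rad, mean + rad)
    vecs = []
    for lam in lams:
        # (a12, lam - a11) solves (A - lam I) v = 0 whenever a12 != 0.
        vx, vy = a12, lam - a11
        norm = math.hypot(vx, vy)
        vecs.append((vx / norm, vy / norm))
    return lams, vecs

def quad_form(h, v):
    """v^T H v for symmetric H = (h11, h12, h22): the second derivative of u along v."""
    h11, h12, h22 = h
    return h11 * v[0] ** 2 + 2 * h12 * v[0] * v[1] + h22 * v[1] ** 2

A = (2.0, 1.0, 3.0)    # a11, a12, a22: positive-definite (assumed values)
H = (-1.5, 0.7, 0.4)   # u_xx, u_xy, u_yy at the chosen point (assumed values)

lams, vecs = sym2_eig(*A)
lhs = A[0] * H[0] + 2 * A[1] * H[1] + A[2] * H[2]                # sum of a_ij * u_ij
rhs = sum(lam * quad_form(H, v) for lam, v in zip(lams, vecs))   # sum of lambda_i * v_i^T H v_i
```

The two quantities agree up to rounding, which reflects the identity A = Σ λ_i v_i v_iᵀ. When both eigenvalues are positive and the sum vanishes, the directional second derivatives cannot all have the same strict sign.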
This elementary reasoning could be argued to represent an infinitesimal formulation of the strong maximum principle, which states, under some extra assumptions (such as the continuity of a), that u must be constant if there is a point of M where u is maximized.
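Note that the above reasoning is unaffected if one considers the more general partial differential equation
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x^i} = 0,$$
since the added term is automatically zero at any hypothetical maximum point. The reasoning is also unaffected if one considers the more general condition
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x^i} \geq 0,$$
in which one can even note the extra phenomenon of an outright contradiction if there is a strict inequality (> rather than ≥) in this condition at the hypothetical maximum point. This phenomenon is important in the formal proof of the classical weak maximum principle.
However, the above reasoning no longer applies if one considers the condition
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x^i} \leq 0,$$
since now the "balancing" condition, as evaluated at a hypothetical maximum point of u, only says that a weighted average of manifestly nonpositive quantities is nonpositive. This is trivially true, and so one cannot draw any nontrivial conclusion from it. This is reflected by any number of concrete examples, such as the fact that
$$\frac{\partial^2}{\partial x^2}\left(-x^2 - y^2\right) + \frac{\partial^2}{\partial y^2}\left(-x^2 - y^2\right) \leq 0,$$
and on any open region containing the origin, the function $-x^2 - y^2$ certainly has a maximum.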
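Let M denote an open subset of Euclidean space. If a smooth function $u : M \to \mathbb{R}$ is maximized at a point p, then one automatically has
$$(du)(p) = 0 \quad\text{and}\quad (\nabla^2 u)(p) \leq 0,$$
the latter understood as a matrix inequality: the Hessian of u at p is negative semidefinite.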
One can view a partial differential equation as the imposition of an algebraic relation between the various derivatives of a function. So, if u is the solution of a partial differential equation, then it is possible that the above conditions on the first and second derivatives of u form a contradiction to this algebraic relation. This is the essence of the maximum principle. Clearly, the applicability of this idea depends strongly on the particular partial differential equation in question.
For instance, if u solves the differential equation
$$\Delta u = |du|^2 + 2,$$
then it is clearly impossible to have $\Delta u \leq 0$ and $du = 0$ at any point of the domain. So, following the above observation, it is impossible for u to take on a maximum value. If, instead, u solved the differential equation $\Delta u = |du|^2$, then one would not have such a contradiction, and the analysis given so far does not imply anything interesting. If u solved the differential equation $\Delta u = |du|^2 - 2$, then the same analysis would show that u cannot take on a minimum value.
The possibility of such analysis is not even limited to partial differential equations. For instance, if $u : M \to \mathbb{R}$ is a function such that
$$\Delta u - |du|^4 = \int_M e^{u(x)} \, dx,$$
which is a sort of "non-local" differential equation, then the automatic strict positivity of the right-hand side shows, by the same analysis as above, that u cannot attain a maximum value.
There are many methods to extend the applicability of this kind of analysis in various ways. For instance, if u is a harmonic function, then the above sort of contradiction does not directly occur, since the existence of a point p where $\Delta u(p) \leq 0$ is not in contradiction to the requirement $\Delta u = 0$ everywhere. However, one could consider, for an arbitrary real number s, the function $u_s$ defined by
$$u_s(x) = u(x) + s e^{x_1}.$$
It is straightforward to see that
$$\Delta u_s = s e^{x_1}.$$
By the above analysis, if $s > 0$ then $u_s$ cannot attain a maximum value. One might wish to consider the limit as $s \to 0$ in order to conclude that u also cannot attain a maximum value. However, it is possible for the pointwise limit of a sequence of functions without maxima to have a maximum. Nonetheless, if M has a boundary $\partial M$ such that $M \cup \partial M$ is compact, then supposing that u can be continuously extended to the boundary, it follows immediately that both u and $u_s$ attain a maximum value on $M \cup \partial M$. Since we have shown that $u_s$, as a function on M, does not have a maximum, it follows that the maximum point of $u_s$, for any $s > 0$, is on $\partial M$. By the sequential compactness of $\partial M$, it follows that the maximum of u is attained on $\partial M$. This is the weak maximum principle for harmonic functions. This does not, by itself, rule out the possibility that the maximum of u is also attained somewhere on M. That is the content of the "strong maximum principle," which requires further analysis.
The use of the specific function $e^{x_1}$ above was very inessential. All that mattered was to have a function which extends continuously to the boundary and whose Laplacian is strictly positive. So we could have used, for instance,
$$u_s(x) = u(x) + s|x|^2$$
with the same effect.
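The key property of the perturbation, namely a strictly positive Laplacian, can be checked with finite differences. The sketch below is illustrative only: it assumes the two-dimensional harmonic function u(x, y) = x² − y², takes the perturbing terms to be s·eˣ and s·(x² + y²), and uses a five-point stencil. The function names and the sample point are hypothetical choices.

```python
import math

def laplacian(f, x, y, h=1e-3):
    """Five-point central-difference approximation of f_xx + f_yy at (x, y)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h ** 2

def u(x, y):
    # Harmonic example (assumed for illustration): Laplacian is 2 - 2 = 0.
    return x * x - y * y

def perturb_exp(s):
    # u_s = u + s * exp(x_1); its Laplacian should be s * exp(x_1).
    return lambda x, y: u(x, y) + s * math.exp(x)

def perturb_quad(s):
    # u_s = u + s * |x|^2; its Laplacian should be 2 * n * s, with n = 2 here.
    return lambda x, y: u(x, y) + s * (x * x + y * y)

s = 0.1
point = (0.3, -0.2)
lap_exp = laplacian(perturb_exp(s), *point)
lap_quad = laplacian(perturb_quad(s), *point)
```

Both approximations are strictly positive for s > 0, which is what rules out an interior maximum of the perturbed function in the argument above.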
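Let M be an open subset of Euclidean space. Let $u : M \to \mathbb{R}$ be a twice-differentiable function which attains its maximum value C. Suppose that
$$Lu := \sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x^i} \geq 0.$$
Suppose that one can find (or prove the existence of):
(i) a compact subset Ω of M, with nonempty interior, such that u(x) < C for all x in the interior of Ω, and such that there exists $x_0$ on the boundary of Ω with $u(x_0) = C$;
(ii) a continuous function $h : \Omega \to \mathbb{R}$ which is twice-differentiable on the interior of Ω, satisfies $Lh \geq 0$ there, and is such that $u + h \leq C$ on the boundary of Ω with $h(x_0) = 0$.
Then L(u + h − C) ≥ 0 on Ω with u + h − C ≤ 0 on the boundary of Ω; according to the weak maximum principle, one has u + h − C ≤ 0 on Ω. This can be reorganized to say
$$-\frac{u(x) - u(x_0)}{|x - x_0|} \geq \frac{h(x) - h(x_0)}{|x - x_0|}$$
for all x in Ω other than $x_0$. If one can make the choice of h so that the right-hand side has a manifestly positive nature, then this will provide a contradiction to the fact that $x_0$ is a maximum point of u on M, at which the gradient of u must vanish.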
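The reorganization step can be spelled out as follows. This is a short derivation sketch that uses only $u(x_0) = C$ and $h(x_0) = 0$ from the setup above, and it is valid for x in Ω with $x \neq x_0$:

```latex
\begin{aligned}
u(x) + h(x) - C \leq 0
  &\iff u(x) - u(x_0) \leq -\bigl(h(x) - h(x_0)\bigr) \\
  &\iff -\frac{u(x) - u(x_0)}{|x - x_0|} \geq \frac{h(x) - h(x_0)}{|x - x_0|}.
\end{aligned}
```

Letting x approach $x_0$ along a fixed direction turns both sides into directional derivatives, which is how the sign of a directional derivative of h is transferred to u.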
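The above "program" can be carried out. Choose Ω to be a spherical annulus; one selects its center $x_c$ to be a point closer to the closed set $u^{-1}(C)$ than to the closed set $\partial M$, and the outer radius R is selected to be the distance from this center to $u^{-1}(C)$; let $x_0$ be a point on this latter set which realizes the distance. The inner radius ρ is arbitrary, subject to 0 < ρ < R. Define
$$h(x) = \varepsilon\left(e^{-\alpha |x - x_c|^2} - e^{-\alpha R^2}\right),$$
where α and ε are positive constants to be chosen.
Now the boundary of Ω consists of two spheres; on the outer sphere, one has h = 0; due to the selection of R, one has u ≤ C on this sphere, and so u + h − C ≤ 0 holds on this part of the boundary, together with the requirement $h(x_0) = 0$. On the inner sphere, one has u < C. Due to the continuity of u and the compactness of the inner sphere, one can select δ > 0 such that u + δ < C. Since h is constant on this inner sphere, one can select ε > 0 such that u + h ≤ C on the inner sphere, and hence on the entire boundary of Ω.
Direct calculation shows
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 h}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial h}{\partial x^i} = \varepsilon \alpha e^{-\alpha |x - x_c|^2} \left( 4\alpha \sum_{i=1}^n \sum_{j=1}^n a_{ij}(x) \left(x^i - x_c^i\right)\left(x^j - x_c^j\right) - 2 \sum_{i=1}^n a_{ii} - 2 \sum_{i=1}^n b_i \left(x^i - x_c^i\right) \right).$$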
There are various conditions under which the right-hand side can be guaranteed to be nonnegative; see the statement of the theorem below.
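The direct calculation can be cross-checked numerically. The sketch below is a two-dimensional illustration under explicit assumptions: it takes the auxiliary function to be h(x) = ε(exp(−α|x − x_c|²) − exp(−αR²)), and the coefficients `a` and `b`, the constants, the center `XC`, and the sample point are all hypothetical values chosen for the example. It compares a central-difference evaluation of Σ a_ij ∂²h/∂xⁱ∂xʲ + Σ b_i ∂h/∂xⁱ with the closed form εα·exp(−α|x − x_c|²)·(4α Σ a_ij yⁱyʲ − 2 Σ a_ii − 2 Σ b_i yⁱ), where y = x − x_c.

```python
import math

EPS, ALPHA, R = 0.5, 3.0, 1.0   # epsilon, alpha, outer radius (assumed values)
XC = (0.2, -0.1)                # annulus center x_c (assumed value)

def a(x):
    # Symmetric positive-definite coefficients (assumed), returned as (a11, a12, a22).
    return (2.0 + x[0] ** 2, 0.5, 1.0 + x[1] ** 2)

def b(x):
    # First-order coefficients (assumed), returned as (b1, b2).
    return (x[1], -x[0])

def h(x):
    r2 = (x[0] - XC[0]) ** 2 + (x[1] - XC[1]) ** 2
    return EPS * (math.exp(-ALPHA * r2) - math.exp(-ALPHA * R ** 2))

def Lh_finite_difference(x, d=1e-3):
    """Approximate sum a_ij h_ij + sum b_i h_i using central differences."""
    x0, x1 = x
    hx = (h((x0 + d, x1)) - h((x0 - d, x1))) / (2 * d)
    hy = (h((x0, x1 + d)) - h((x0, x1 - d))) / (2 * d)
    hxx = (h((x0 + d, x1)) - 2 * h(x) + h((x0 - d, x1))) / d ** 2
    hyy = (h((x0, x1 + d)) - 2 * h(x) + h((x0, x1 - d))) / d ** 2
    hxy = (h((x0 + d, x1 + d)) - h((x0 + d, x1 - d))
           - h((x0 - d, x1 + d)) + h((x0 - d, x1 - d))) / (4 * d ** 2)
    a11, a12, a22 = a(x)
    b1, b2 = b(x)
    return a11 * hxx + 2 * a12 * hxy + a22 * hyy + b1 * hx + b2 * hy

def Lh_closed_form(x):
    """Closed-form value of the same expression for the exponential h above."""
    y = (x[0] - XC[0], x[1] - XC[1])
    r2 = y[0] ** 2 + y[1] ** 2
    a11, a12, a22 = a(x)
    b1, b2 = b(x)
    quad = a11 * y[0] ** 2 + 2 * a12 * y[0] * y[1] + a22 * y[1] ** 2
    bracket = 4 * ALPHA * quad - 2 * (a11 + a22) - 2 * (b1 * y[0] + b2 * y[1])
    return EPS * ALPHA * math.exp(-ALPHA * r2) * bracket

x = (0.7, 0.4)
numeric, exact = Lh_finite_difference(x), Lh_closed_form(x)
```

The two values agree to within the discretization error of the stencil. In the closed form, the term multiplied by 4α dominates for large α wherever x is bounded away from the center, which is the mechanism by which α is chosen in the proof.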
Lastly, note that the directional derivative of h at $x_0$ along the inward-pointing radial line of the annulus is strictly positive. As described in the above summary, this will ensure that a directional derivative of u at $x_0$ is nonzero, in contradiction to $x_0$ being a maximum point of u on the open set M.
The following is the statement of the theorem in the books of Morrey and Smoller, following the original statement of Hopf (1927):
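Let M be an open subset of Euclidean space $\mathbb{R}^n$. For each i and j between 1 and n, let $a_{ij}$ and $b_i$ be continuous functions on M with $a_{ij} = a_{ji}$. Suppose that for all x in M, the symmetric matrix $[a_{ij}]$ is positive-definite. If u is a nonconstant $C^2$ function on M such that
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x^i} \geq 0$$
on M, then u does not attain a maximum value on M.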
The point of the continuity assumption is that continuous functions are bounded on compact sets, the relevant compact set here being the spherical annulus appearing in the proof. Furthermore, by the same principle, there is a positive number λ such that for all x in the annulus, the matrix $[a_{ij}(x)]$ has all eigenvalues greater than or equal to λ. One then takes α, as appearing in the proof, to be large relative to these bounds. Evans's book has a slightly weaker formulation, in which there is assumed to be a positive number λ which is a lower bound of the eigenvalues of $[a_{ij}]$ for all x in M.
These continuity assumptions are clearly not the most general possible in order for the proof to work. For instance, the following is Gilbarg and Trudinger's statement of the theorem, following the same proof:
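Let M be an open subset of Euclidean space $\mathbb{R}^n$. For each i and j between 1 and n, let $a_{ij}$ and $b_i$ be functions on M with $a_{ij} = a_{ji}$. Suppose that for all x in M, the symmetric matrix $[a_{ij}]$ is positive-definite, and let λ(x) denote its smallest eigenvalue. Suppose that $|a_{ij}|/\lambda$ and $|b_i|/\lambda$ are bounded functions on M for each i and j between 1 and n. If u is a nonconstant $C^2$ function on M such that
$$\sum_{i=1}^n \sum_{j=1}^n a_{ij} \frac{\partial^2 u}{\partial x^i \, \partial x^j} + \sum_{i=1}^n b_i \frac{\partial u}{\partial x^i} \geq 0$$
on M, then u does not attain a maximum value on M.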
One cannot naively extend these statements to the general second-order linear elliptic equation, as already seen in the one-dimensional case. For instance, the ordinary differential equation y″ + 2y = 0 has sinusoidal solutions, which certainly have interior maxima. This extends to the higher-dimensional case, where one often has solutions to "eigenfunction" equations Δu + cu = 0 which have interior maxima. The sign of c is relevant, as also seen in the one-dimensional case; for instance the solutions to y″ - 2y = 0 are exponentials, and the character of the maxima of such functions is quite different from that of sinusoidal functions.
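The contrast between the two one-dimensional equations can be seen in a small numerical experiment. The sketch below is illustrative only: it picks the particular solutions sin(√2·x) and cosh(√2·x), the intervals [0, 2] and [−1, 1], and a simple grid search, all of which are assumptions made for this example. It checks that each function satisfies its equation up to finite-difference error and then locates the maximum on the chosen interval.

```python
import math

ROOT2 = math.sqrt(2.0)

def second_derivative(f, x, h=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

def sinusoid(x):
    return math.sin(ROOT2 * x)     # a solution of y'' + 2y = 0

def hyperbolic(x):
    return math.cosh(ROOT2 * x)    # a solution of y'' - 2y = 0

def argmax_on_grid(f, lo, hi, n=2000):
    """Grid point in [lo, hi] at which f is largest (first one if tied)."""
    xs = [lo + (hi - lo) * k / n for k in range(n + 1)]
    return max(xs, key=f)

x_test = 0.37
res_sin = second_derivative(sinusoid, x_test) + 2 * sinusoid(x_test)
res_cosh = second_derivative(hyperbolic, x_test) - 2 * hyperbolic(x_test)

sin_peak = argmax_on_grid(sinusoid, 0.0, 2.0)       # interior point near pi / (2 * sqrt(2))
cosh_peak = argmax_on_grid(hyperbolic, -1.0, 1.0)   # an endpoint of the interval
```

The sinusoidal solution peaks strictly inside its interval, while the hyperbolic cosine attains its maximum only at an endpoint, in line with the remark that the sign of the zeroth-order coefficient matters.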