## LaTeX2WP, Princeton grad student seminar, and characteristic polynomial coefficients

November 15, 2009

Posted by Phil Isett in Uncategorized.

(It’ll be pretty cool if it catches on! But for me it probably means I will help to run the blog, and for this purpose I’ll definitely need the converter.)

I need to at least try to write some kind of math, so I’ll explain something which I think is cute: how to express the coefficients of the characteristic polynomial of a matrix in terms of sums of determinants of other matrices constructed from its entries.  Actually, I’ll first give an example which contains all the ideas. Consider a ${3 \times 3}$ matrix ${A}$; let’s say

$\displaystyle A = \left( \begin{array}{ccc} 1 & 2 & 5 \\ 3 & 4 & 7 \\ 6 & 8 & 9 \end{array} \right)$

The characteristic polynomial ${\chi(x)}$ is the determinant of the matrix ${xI - A}$, where $I$ is the ${3 \times 3}$ identity matrix; ${\chi(x)}$ is a degree 3 polynomial with leading coefficient 1. In terms of the exterior power ${\Lambda^3({\mathbb R}^3)}$, we have

$\displaystyle \begin{array}{c} x-1 \\ -3 \\ -6 \end{array} \wedge \begin{array}{c} -2 \\ x-4 \\ -8 \end{array} \wedge \begin{array}{c} -5 \\ -7 \\ x-9 \end{array} = \chi(x)\cdot e_1 \wedge e_2 \wedge e_3 . \ \ \ \ \ (1)$

I discussed the one-dimensional vector space ${\Lambda^3({\mathbb R}^3)}$ and its geometric meaning in a previous post about the Cauchy-Schwarz inequality and integration. We know that ${\chi(x) = x^3 + c_2 x^2 + c_1 x + c_0}$, where ${c_0 = \chi(0)}$, and plugging ${0}$ into the above expression, we see that ${c_0 = (-1)^3 \det(A) = \det(-A)}$. The point of this entry is that we can calculate the other coefficients by differentiating, and use the multi-linearity of the wedge product to differentiate easily.  Below the fold I will give an example of how this computation works out, state the nice, general formula proven by this method, and discuss the geometric meaning of the computation.  (And at the end I will ask a question about this LaTeX2WP/Python business which is still troubling me.)

### Characteristic Polynomial Coefficients
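Just as a sanity check (not part of the argument), the value ${c_0 = \det(-A)}$ for the example matrix can be computed in a few lines of Python; the `det` helper here is a throwaway cofactor expansion written for this illustration:

```python
# Check c_0 = chi(0) = det(-A) for the example matrix, using a
# cofactor expansion along the first row.

A = [[1, 2, 5], [3, 4, 7], [6, 8, 9]]

def det(M):
    if len(M) == 1:
        return M[0][0]
    # expand along the first row, alternating signs
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

negA = [[-a for a in row] for row in A]
print(det(A), det(negA))  # det(-A) = (-1)^3 det(A) for a 3x3 matrix
assert det(negA) == -det(A) == -10
```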

For example, ${c_1 = \chi'(0)}$, and ${\chi'(x)}$ can be obtained by differentiating (1) to see that

$\displaystyle \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \wedge \begin{array}{c} -2 \\ x-4 \\ -8 \end{array} \wedge \begin{array}{c} -5 \\ -7 \\ x-9 \end{array} + \begin{array}{c} x-1 \\ -3 \\ -6 \end{array} \wedge \begin{array}{c} 0 \\ 1 \\ 0 \end{array} \wedge \begin{array}{c} -5 \\ -7 \\ x-9 \end{array} +$

$\displaystyle + \begin{array}{c} x-1 \\ -3 \\ -6 \end{array} \wedge \begin{array}{c} -2 \\ x-4 \\ -8 \end{array} \wedge \begin{array}{c} 0 \\ 0 \\ 1 \end{array} = \chi'(x) \cdot e_1 \wedge e_2 \wedge e_3$

and after performing some parallel shifts (adding a multiple of one factor in a wedge to another factor leaves the product unchanged), this identity reduces to

$\displaystyle \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \wedge \begin{array}{c} 0 \\ x-4 \\ -8 \end{array} \wedge \begin{array}{c} 0 \\ -7 \\ x-9 \end{array} + \begin{array}{c} x-1 \\ 0 \\ -6 \end{array} \wedge \begin{array}{c} 0 \\ 1 \\ 0 \end{array} \wedge \begin{array}{c} -5 \\ 0 \\ x-9 \end{array} +$

$\displaystyle + \begin{array}{c} x-1 \\ -3 \\ 0 \end{array} \wedge \begin{array}{c} -2 \\ x-4 \\ 0 \end{array} \wedge \begin{array}{c} 0 \\ 0 \\ 1 \end{array} = \chi'(x) \cdot e_1 \wedge e_2 \wedge e_3$

Giving the expression

$\displaystyle \chi'(x) = \det\left( \begin{array}{cc} x-4 & -8 \\ -7 & x-9 \end{array} \right) + \det \left( \begin{array}{cc} x-1 & -6 \\ -5 & x-9 \end{array} \right)$

$\displaystyle + \det\left( \begin{array}{cc} x-1 & -3 \\ -2 & x-4 \end{array} \right) = \mbox{tr}\, \Lambda^2(xI - A),$

and we can plug in ${0}$ to obtain ${c_1}$ as the sum of three principal ${2 \times 2}$ subdeterminants of ${A}$.
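For the example, the three subdeterminants at ${x = 0}$ can be tallied directly; here is a quick Python check (the `det2` helper is ad hoc for this illustration):

```python
# Evaluate the three 2x2 determinants above at x = 0 and sum them
# to obtain c_1 for the example matrix.

def det2(a, b, c, d):
    # determinant of [[a, b], [c, d]]
    return a * d - b * c

minors = [
    det2(-4, -8, -7, -9),   # first displayed determinant at x = 0
    det2(-1, -6, -5, -9),   # second displayed determinant at x = 0
    det2(-1, -3, -2, -4),   # third displayed determinant at x = 0
]
c1 = sum(minors)
print(minors, c1)
assert c1 == -43
```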

Proceeding this way in the general case when $A : {\mathbb R}^n \to {\mathbb R}^n$ is linear and counting shared boundary faces, one shows that

$\displaystyle \frac{d}{dx} \mbox{tr} \Lambda^k(xI - A ) = (n-(k-1)) \mbox{tr} \Lambda^{k-1}(xI-A), \quad 1 \leq k \leq n$

(it is sufficient to prove this identity at x = 0).  The characteristic polynomial itself is, of course, $\mbox{tr} \Lambda^n(xI - A)$, so applying the above formula and Taylor expanding about $x = 0$, we obtain the explicit formula in terms of subdeterminants
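As a sanity check, this identity can be verified by brute force for the running ${3 \times 3}$ example, representing each ${\mbox{tr}\, \Lambda^k(xI - A)}$ as a list of polynomial coefficients (a throwaway sketch, not the proof; the helper functions below exist only for this illustration):

```python
from itertools import combinations, permutations

# Verify d/dx tr Λ^k(xI - A) = (n - (k-1)) tr Λ^{k-1}(xI - A) for the
# 3x3 example. tr Λ^k(xI - A) is the sum of the k x k principal minors
# of xI - A; each is a polynomial in x, stored as [c_0, c_1, ...].

A = [[1, 2, 5], [3, 4, 7], [6, 8, 9]]
n = 3

def pmul(p, q):
    # multiply two coefficient lists
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def sign(perm):
    # sign of a permutation via inversion count
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def tr_lambda(k):
    # sum over k-subsets S of det((xI - A)[S, S]), by the Leibniz formula
    total = [0] * (k + 1)
    for S in combinations(range(n), k):
        for perm in permutations(range(k)):
            term = [sign(perm)]
            for i in range(k):
                r, c = S[i], S[perm[i]]
                # entry (r, c) of xI - A as a polynomial of degree <= 1
                entry = [-A[r][c], 1] if r == c else [-A[r][c]]
                term = pmul(term, entry)
            padded = term + [0] * (len(total) - len(term))
            total = [a + b for a, b in zip(total, padded)]
    return total

def deriv(p):
    return [i * c for i, c in enumerate(p)][1:]

for k in range(1, n + 1):
    assert deriv(tr_lambda(k)) == [(n - (k - 1)) * c for c in tr_lambda(k - 1)]
print("identity verified for k = 1, 2, 3")
```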

$\displaystyle \chi(x) = \sum_{k=0}^n \mbox{tr} \Lambda^{n-k}(-A) \cdot x^k$

which can be compared to the expression involving the eigenvalues of $A$.
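One can also test the final formula numerically on the example: assemble the coefficients as sums of principal minors of ${-A}$ and compare against ${\det(xI - A)}$ evaluated directly at a few sample points (again, only an illustrative check with throwaway helpers):

```python
from itertools import combinations

# Check chi(x) = sum_k tr Λ^{n-k}(-A) x^k for the 3x3 example, where
# tr Λ^j(-A) is the sum of the j x j principal minors of -A.

A = [[1, 2, 5], [3, 4, 7], [6, 8, 9]]
n = 3

def det(M):
    # cofactor expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def tr_lambda(M, k):
    if k == 0:
        return 1
    return sum(det([[M[r][c] for c in S] for r in S])
               for S in combinations(range(len(M)), k))

negA = [[-a for a in row] for row in A]
coeffs = [tr_lambda(negA, n - k) for k in range(n + 1)]  # [c_0, c_1, c_2, 1]

for x in (-2, 0, 1, 2):
    xIA = [[(x if r == c else 0) - A[r][c] for c in range(n)] for r in range(n)]
    assert det(xIA) == sum(c * x ** k for k, c in enumerate(coeffs))
print(coeffs)
```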

Now, the same formula holds over an arbitrary commutative ring, because only integers and natural algebraic operations show up here. This allows us to prove the result in ${\mathbb Z}[x_1, x_2, \ldots, x_l]$, using indeterminates for the matrix entries, and then pass to the general case by the universal property of the polynomial ring (plugging arbitrary ring elements in for the indeterminates).

There is a nice geometric interpretation of the above procedure, which is most easily understood in the setting of the real numbers.  We know that the k-th exterior power $\Lambda^k({\mathbb R}^n)$ corresponds to k-parallelograms of vectors in ${\mathbb R}^n$, regarded as equivalent under parallel shifts, so it should be easy to see why the differentiation $\frac{d}{dx} \mbox{tr}\, \Lambda^k(xI - A )$ results in an expression involving parallelograms of dimension one less.  In the case $k = n$, think of the differentiation at $x = 0$ as a limit of difference quotients, where we compare the signed volumes of two nearby n-parallelograms, one of them being

$\eta = \displaystyle [-A](e_1 \wedge \ldots \wedge e_n)$

that is, the image of the standard basis parallelogram under the linear map $-A$, and the other being the very nearby parallelogram

$\eta_x = \displaystyle [xI-A](e_1 \wedge \ldots \wedge e_n)$

with x a very small number.  Thinking of these parallelograms as oriented regions in ${\mathbb R}^n$, the difference in signed volume is easily seen to be an integral supported on essentially the boundary of the parallelogram $\eta$.  Because the perturbation is linear, the quantity integrated on each face basically depends only on the direction of the face (in the limit as x goes to 0), and this is why, upon differentiating once, we obtain the expression we have seen for the fluxes through the faces of $\eta$.  The resulting expression ends up being particularly simple when the perturbation of $-A$ is in the direction of the identity, as in our case, but the picture generalizes to perturbations in general directions, and so one has to go into more detail in order to understand precisely what this flux is for any particular perturbation (I believe it is basically the original volume form, contracted with a certain vector field associated to the perturbation, that ends up being integrated over the boundary).

When one differentiates twice, one similarly obtains integrals over the oriented (n-2)-dimensional boundaries of the $(n-1)$-dimensional faces, which share (n-2)-dimensional faces with each other (although in the general case it is much less obvious that the answer should be supported on the boundary).  This sharing of faces results in the factor of $(n - (k-1))$ in the formula, which counts the number of k-dimensional faces having each (k-1)-dimensional face as part of their boundary.

I hope the picture is clear.  This method of computation can be used to show that differentiating the determinant function at the identity matrix gives the trace of the perturbation direction (which one can in fact take as an intrinsic definition of trace, if one so desires), although that fact is more related to asymptotics at $x = \infty$ for the computation presented above.  By the way, does anybody know any different, intrinsic definitions of trace? (The more geometric, the better; I am very curious.  The one I have given has the geometric meaning of a proportional rate of change of volumes of regions under the flow of a vector field, but I don’t know if there’s a better one.)
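This last fact can also be checked numerically; here is a small finite-difference sketch, with a sample matrix ${B}$ chosen arbitrarily for illustration:

```python
# Check that d/dt det(I + tB) at t = 0 equals tr(B) for a sample 3x3
# matrix B, via a central difference (which cancels the even-order
# terms of det(I + tB) = 1 + t tr(B) + ..., leaving an O(t^2) error).

B = [[2, -1, 0], [3, 5, 1], [-2, 4, 7]]

def det3(M):
    # direct 3x3 determinant
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def I_plus_t_B(t):
    return [[(1.0 if r == c else 0.0) + t * B[r][c] for c in range(3)]
            for r in range(3)]

t = 1e-5
fd = (det3(I_plus_t_B(t)) - det3(I_plus_t_B(-t))) / (2 * t)
trace_B = B[0][0] + B[1][1] + B[2][2]
print(fd, trace_B)  # finite difference vs tr(B)
assert abs(fd - trace_B) < 1e-6
```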

Welp.. The post seems to have turned out OK.  (A huge amount of gratitude goes out to Luca Trevisan for writing his program).  I am slowly learning that it can be better to edit the tex file than to edit the post itself.  I wonder if the excess of plus signs looks stupid.  I wanted to put some kind of brackets around the vectors in the wedges, but doing so turned out to be a disaster.

But now I have a technical question about how to use the program (or Python?), and I would be very grateful for help:

I’m using Windows Vista (sorry), and for this reason (combined with my own stupidity, my lack of Python knowledge, etc.) I had a bit of a headache figuring out how to work this program.  What I ended up doing (and I’m pretty sure this is stupid, but if anyone knows what to do instead, please let me know) is that I literally edited the file latex2wp.py, and in the main body of the code I actually put in the names of the files.  E.g.

inputfile = "C:\Users\[more stuff]\charPolCoeffs.tex"

outputfile = "C:\Users\[the same stuff]\charPolCoeffs.html"

I had opened the .py file directly through IDLE and was able to use the “Run Module” feature in that program.  It worked, but I don’t think this is what Luca had in mind, and in particular is not what he instructed (but since I’m not on Linux, I can’t exactly do as he instructed).  Namely, Luca suggested (assuming, I believe, that I was in the Linux command prompt) that I input into the command line:

python latex2wp.py charPolCoeffs.tex

When I put these things into the Python command prompt (after the arrows >>>), I think it believed I was trying to define a variable named "python"… and perhaps also the variable "latex2wp", but it got very confused upon reaching the '.'.  Basically, I don’t think it was aware of what I was trying to do at all, nor was I aware of how to tell it to go run this '.py' file in a distant directory with the '.tex' file as a parameter.  I don’t know anything about how Python works, so I don’t understand what was wrong exactly or how to fix it.  Any help would be greatly appreciated, although it looks like I can survive with the uncomfortable, makeshift solution of literally editing the file for a while.
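A side note that may illustrate the distinction: a command of that form is meant for the operating system’s shell (cmd.exe on Windows), not for the >>> prompt, which only accepts Python code.  The snippet below is a hypothetical simulation, not part of LaTeX2WP; it shows how an argument typed after the script name in the shell reaches the script through sys.argv:

```python
import subprocess
import sys

# "python latex2wp.py charPolCoeffs.tex" is a shell command, not Python
# code, so the >>> prompt cannot parse it. When run from the shell, the
# interpreter hands the trailing arguments to the script via sys.argv.
# Simulate that here with a throwaway inline script:
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "charPolCoeffs.tex"],
    capture_output=True, text=True,
)
seen = result.stdout.strip()
print(seen)  # the "script" sees the filename as its first argument
```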

Edit (15 Nov 2009):  I have included a discussion of the geometric meaning of the computation and a statement of the general result obtained by the method.