You are currently browsing the monthly archive for April 2012.

Tim Gowers’ post about Polymath paper published

On January 27, 2009, Tim Gowers asked on his blog, “Is massively collaborative mathematics possible?” In the blog post he wrote,

“In short, if a large group of mathematicians could connect their brains efficiently, they could perhaps solve problems very efficiently as well.

The next obvious question is this. Why would anyone agree to share their ideas? Surely we work on problems in order to be able to publish solutions and get credit for them. And what if the big collaboration resulted in a very good idea? Isn’t there a danger that somebody would manage to use the idea to solve the problem and rush to (individual) publication?

Here is where the beauty of blogs, wikis, forums etc. comes in: they are completely public, as is their entire history. To see what effect this might have, imagine that a problem was being solved via comments on a blog post. Suppose that the blog was pretty active and that the post was getting several interesting comments. And suppose that you had an idea that you thought might be a good one. Instead of the usual reaction of being afraid to share it in case someone else beat you to the solution, you would be afraid not to share it in case someone beat you to that particular idea. And if the problem eventually got solved, and published under some pseudonym like Polymath, say, with a footnote linking to the blog and explaining how the problem had been solved, then anybody could go to the blog and look at all the comments. And there they would find your idea and would know precisely what you had contributed. There might be arguments about which ideas had proved to be most important to the solution, but at least all the evidence would be there for everybody to look at.”

So, he did just that! He started a polymath project on his blog to tackle the density Hales-Jewett theorem, a result that had already been proven, but not with an elementary proof. Specifically, “a combinatorial approach to density Hales-Jewett, is about one specific idea for coming up with a new proof for the density Hales-Jewett theorem in the case of an alphabet of size 3” which is often referred to in the blog as DHJ(3). In short, the goal was to combinatorialize the ergodic-theoretic proof of DHJ(3). He wrote,

“Let me briefly try to defend my choice of problem. I wanted to choose a genuine research problem in my own area of mathematics, rather than something with a completely elementary statement or, say, a recreational problem, just to show that I mean this as a serious attempt to do real mathematics and not just an amusing way of looking at things I don’t really care about. This means that in order to have a reasonable chance of making a substantial contribution, you probably have to be a fairly experienced combinatorialist. In particular, familiarity with Szemerédi’s regularity lemma is essential. So I’m not expecting a collaboration between thousands of people, but I can think of far more than three people who are suitably qualified in the above way.”

Things kicked off February 1, 2009 and by March 10, 2009 a solution was being announced! The proof was submitted to the arXiv on October 20, 2009 and now will appear in the Annals of Mathematics.

You can watch it all unfold here:

Read the proof here:

This is the wiki for polymath projects:

This is the polymath blog started by Terence Tao:

In the previous post, two proofs were given for the Cauchy-Schwarz inequality. We will now consider another proof.

Definition 1 Let {M} be an {n\times n} matrix written as a {2\times2} block matrix

\displaystyle M=\left[\begin{array}{cc} A & B\\ C & D \end{array}\right],

where {A} is a {p\times p} matrix, {D} is a {q\times q} matrix, {B} is a {p\times q} matrix, and {C} is a {q\times p} matrix, so {n=p+q}. If {A} is nonsingular, then

\displaystyle M/A=D-CA^{-1}B

is called the Schur complement of {A} in {M}; or the Schur complement of {M} relative to {A}.

The Schur complement probably goes back to Carl Friedrich Gauss (1777-1855) (for Gaussian elimination). To solve the linear system

\displaystyle \left[\begin{array}{cc} A & B\\ C & D \end{array}\right]\left[\begin{array}{c} x\\ y \end{array}\right]=\left[\begin{array}{c} c\\ d \end{array}\right],

that is

\displaystyle \begin{array}{rcl} Ax+By & = & c\\ Cx+Dy & = & d, \end{array}

we mimic Gaussian elimination: if {A} is square and nonsingular, we eliminate {x} by multiplying the first equation by {CA^{-1}} and subtracting the result from the second equation, which gives

\displaystyle (D-CA^{-1}B)y=d-CA^{-1}c.

Note, the matrix {D-CA^{-1}B} is the Schur complement of {A} in {M}, and if it is square and nonsingular, then we can obtain the solution to our system.
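As a rough illustration of the elimination above, here is a minimal Python sketch that solves a small block system through the Schur complement. The blocks (with {p=2}, {q=1}), the right-hand side, and the helper names `matmul`, `matsub`, and `inv2` are all made up for this example; exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    """Entrywise difference X - Y."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(X):
    """Inverse of a nonsingular 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative blocks (assumed values, not from the post): p = 2, q = 1.
A = [[F(2), F(1)], [F(1), F(3)]]
B = [[F(1)], [F(2)]]
C = [[F(4), F(1)]]
D = [[F(5)]]
c = [[F(1)], [F(2)]]   # top block of the right-hand side
d = [[F(3)]]           # bottom block of the right-hand side

A_inv = inv2(A)
CA_inv = matmul(C, A_inv)

S = matsub(D, matmul(CA_inv, B))     # Schur complement M/A = D - C A^{-1} B
rhs = matsub(d, matmul(CA_inv, c))   # d - C A^{-1} c

y = [[rhs[0][0] / S[0][0]]]          # S is 1x1 here, so solving is a division
x = matmul(A_inv, matsub(c, matmul(B, y)))   # back-substitute: x = A^{-1}(c - B y)

print("S =", S[0][0], " y =", y[0][0], " x =", [row[0] for row in x])
```

For larger {q}, the division by {S} would be replaced by solving a {q\times q} system, which is where the nonsingularity of the Schur complement is needed.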

The Schur complement comes up in Issai Schur’s (1875-1941) seminal lemma published in 1917, in which the Schur determinant formula was introduced. By considering elementary operations of partitioned matrices, let

\displaystyle M=\left[\begin{array}{cc} A & B\\ C & D \end{array}\right],

where {A} is square and nonsingular. We can change {M} so that the lower-left and upper-right submatrices become {0}. More precisely, we can make the lower-left and upper-right submatrices {0} by subtracting the first row multiplied by {CA^{-1}} from the second row, and by subtracting the first column multiplied by {A^{-1}B} from the second column. In symbols,

\displaystyle \left[\begin{array}{cc} A & B\\ C & D \end{array}\right]\rightarrow\left[\begin{array}{cc} A & B\\ 0 & D-CA^{-1}B \end{array}\right]\rightarrow\left[\begin{array}{cc} A & 0\\ 0 & D-CA^{-1}B \end{array}\right],

and in equation form,

\displaystyle \left[\begin{array}{cc} I & 0\\ -CA^{-1} & I \end{array}\right]\left[\begin{array}{cc} A & B\\ C & D \end{array}\right]\left[\begin{array}{cc} I & -A^{-1}B\\ 0 & I \end{array}\right]=\left[\begin{array}{cc} A & 0\\ 0 & D-CA^{-1}B \end{array}\right].

Note that we have obtained the following factorization of {M}:

\displaystyle \left[\begin{array}{cc} A & B\\ C & D \end{array}\right]=\left[\begin{array}{cc} I & 0\\ CA^{-1} & I \end{array}\right]\left[\begin{array}{cc} A & 0\\ 0 & D-CA^{-1}B \end{array}\right]\left[\begin{array}{cc} I & A^{-1}B\\ 0 & I \end{array}\right].

By taking the determinants

\displaystyle \left|\begin{array}{cc} A & B\\ C & D \end{array}\right|=\left|\begin{array}{cc} I & 0\\ CA^{-1} & I \end{array}\right|\left|\begin{array}{cc} A & 0\\ 0 & D-CA^{-1}B \end{array}\right|\left|\begin{array}{cc} I & A^{-1}B\\ 0 & I \end{array}\right|,

we obtain Schur’s determinant formula for {2\times2} block matrices,

\displaystyle \det M=\det A\;\det(M/A).
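The determinant formula can be checked on a small example. The sketch below is only an illustration under assumed data: the {3\times3} matrix {M} and its blocks are made up, and `det`, `matmul`, `matsub`, and `inv2` are helper names introduced here, not part of any library.

```python
from fractions import Fraction as F

def det(X):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    if len(X) == 1:
        return X[0][0]
    total = 0
    for j, entry in enumerate(X[0]):
        minor = [row[:j] + row[j + 1:] for row in X[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matsub(X, Y):
    """Entrywise difference X - Y."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inv2(X):
    """Inverse of a nonsingular 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = X
    dt = a * d - b * c
    return [[d / dt, -b / dt], [-c / dt, a / dt]]

# Illustrative blocks (assumed values): A is 2x2, D is 1x1.
A = [[F(2), F(1)], [F(1), F(3)]]
B = [[F(1)], [F(2)]]
C = [[F(4), F(1)]]
D = [[F(5)]]

# Assemble M = [[A, B], [C, D]] row by row.
M = [A[0] + B[0], A[1] + B[1], C[0] + D[0]]

# Schur complement M/A = D - C A^{-1} B.
S = matsub(D, matmul(matmul(C, inv2(A)), B))

lhs = det(M)
rhs = det(A) * det(S)
print(lhs, rhs)
```

Both sides agree for this choice of blocks; the check is of course no substitute for the factorization argument, it only shows the formula in action.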

Mathematician Emilie Virginia Haynsworth (1916-1985) introduced a name and a notation for the Schur complement of a square nonsingular (or invertible) submatrix in a partitioned (two-way block) matrix. The term Schur complement first appeared in Haynsworth’s 1968 paper On the Schur Complement in Basel Mathematical Notes, then in Linear Algebra and its Applications Vol. 1 (1968), AMS Proceedings (1969), and in Linear Algebra and its Applications Vol. 3 (1970).

We will now present a {2\times2} block matrix proof, focusing on {m\times n} complex matrices.

Proof: Let {A,B\in\mathbb{C}^{m\times n}}. Then

\displaystyle M=\left[\begin{array}{cc} I & A^{*}\\ B & I \end{array}\right]\left[\begin{array}{cc} I & B^{*}\\ A & I \end{array}\right]=\left[\begin{array}{cc} I+A^{*}A & A^{*}+B^{*}\\ A+B & I+BB^{*} \end{array}\right]\geq0.

By taking the Schur complement of {I+A^{*}A}, we arrive at

\displaystyle S=I+BB^{*}-(A+B)(I+A^{*}A)^{-1}(A^{*}+B{}^{*})\geq0

and hence

\displaystyle I+BB^{*}\geq(A+B)(I+A^{*}A)^{-1}(A+B)^{*},

which ensures, by taking determinants when {A} and {B} are square, that

\displaystyle \left|\det(A+B)\right|^{2}\leq\det(I+A^{*}A)\;\det(I+BB^{*}).

Equality occurs if and only if {S=0}. By the Guttman rank additivity formula, rank {M=} rank {(I+A^{*}A)+} rank {S}, so {S=0} if and only if rank {M=} rank {(I+A^{*}A)}. Since the leading block {I+A^{*}A} is nonsingular, {M} is nonsingular if and only if {S} is nonsingular.\Box
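To see the determinant inequality on concrete numbers, here is a small sketch for {2\times2} complex matrices. The matrices `A` and `B` are arbitrary choices for illustration, and the helpers `ctranspose`, `matmul`, `matadd`, and `det2` are names introduced here.

```python
def ctranspose(X):
    """Conjugate transpose X^* of a matrix stored as a list of rows."""
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Entrywise sum X + Y."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Arbitrary square complex matrices (assumed values for this sketch).
A = [[1 + 2j, 0j], [1j, 1 + 0j]]
B = [[2 + 0j, 1 - 1j], [0j, 3j]]

# Left side: |det(A + B)|^2.
lhs = abs(det2(matadd(A, B))) ** 2

# Right side: det(I + A^* A) det(I + B B^*); both determinants are real and positive.
det_left = det2(matadd(I2, matmul(ctranspose(A), A))).real
det_right = det2(matadd(I2, matmul(B, ctranspose(B)))).real
rhs = det_left * det_right

print(lhs, "<=", rhs, lhs <= rhs)
```

The gap between the two sides is large for this example; equality would require the Schur complement {S} to vanish.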


[1] Zhang, Fuzhen. Matrix theory: basic results and techniques. Springer Science & Business Media, 2011.

PhD Comics

Definition 1 A vector space {V} over the number field {\mathbb{C}} or {\mathbb{R}} is called an inner product space if it is equipped with an inner product {\left\langle \cdot,\cdot\right\rangle } satisfying for all {u,v,w\in V} and scalar {c},

  1. {\left\langle u,u\right\rangle \geq0}, {\left\langle u,u\right\rangle =0} if and only if {u=0},
  2. {\left\langle u+v,w\right\rangle =\left\langle u,w\right\rangle +\left\langle v,w\right\rangle },
  3. {\left\langle cu,v\right\rangle =c\left\langle u,v\right\rangle }, and
  4. {\left\langle u,v\right\rangle =\overline{\left\langle v,u\right\rangle }}.

{\mathbb{C}^{n}} is an inner product space over {\mathbb{C}} with the inner product

\displaystyle \left\langle x,y\right\rangle =y^{*}x=\overline{y_{1}}x_{1}+\cdots+\overline{y_{n}}x_{n}.

An inner product space over {\mathbb{R}} is usually called a Euclidean space.

The following properties of an inner product can be deduced from the four axioms in Definition 1:

  1. {\left\langle x,cy\right\rangle =\bar{c}\left\langle x,y\right\rangle },
  2. {\left\langle x,y+z\right\rangle =\left\langle x,y\right\rangle +\left\langle x,z\right\rangle },
  3. {\left\langle ax+by,cw+dz\right\rangle =a\bar{c}\left\langle x,w\right\rangle +b\bar{c}\left\langle y,w\right\rangle +a\bar{d}\left\langle x,z\right\rangle +b\bar{d}\left\langle y,z\right\rangle },
  4. {\left\langle x,y\right\rangle =0} for all {y\in V} if and only if {x=0}, and
  5. {\left\langle x,\left\langle x,y\right\rangle y\right\rangle =\left|\left\langle x,y\right\rangle \right|^{2}}.
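A few of these properties can be spot-checked numerically for the standard inner product on {\mathbb{C}^{n}}. The sketch below is only an illustration: the vectors, the scalar, and the function name `inner` are assumptions made for this example.

```python
def inner(x, y):
    """Standard inner product <x, y> = y^* x on C^n:
    linear in the first argument, conjugate-linear in the second."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

# Arbitrary vectors in C^3 and an arbitrary scalar (assumed values).
x = [1 + 1j, 2 + 0j, -1j]
y = [3 + 0j, 1 - 2j, 1j]
c = 2 - 3j

ip = inner(x, y)

# Axiom 4: <x, y> equals the conjugate of <y, x>.
hermitian_gap = abs(ip - inner(y, x).conjugate())

# Property 1: <x, c y> = conj(c) <x, y>.
prop1_gap = abs(inner(x, [c * yi for yi in y]) - c.conjugate() * ip)

# Property 5: <x, <x, y> y> = |<x, y>|^2.
prop5_gap = abs(inner(x, [ip * yi for yi in y]) - abs(ip) ** 2)

print(ip, hermitian_gap, prop1_gap, prop5_gap)
```

Note the convention: with {\left\langle x,y\right\rangle =y^{*}x} the conjugate falls on the second argument, which is why the scalar in property 1 comes out conjugated.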

An important property shared by all inner products is the Cauchy-Schwarz inequality, one of the most useful inequalities in mathematics.

Theorem 1 (Cauchy-Schwarz Inequality) Let {V} be an inner product space. Then for all vectors {x} and {y} in {V} over the field {\mathbb{C}} or {\mathbb{R}},

\displaystyle \left|\left\langle x,y\right\rangle \right|^{2}\leq\left\langle x,x\right\rangle \left\langle y,y\right\rangle .

Equality holds if and only if {x} and {y} are linearly dependent.

The proof of this can be done in a number of different ways. The most common proof is to consider the quadratic function in {t}

\displaystyle \left\langle x+ty,x+ty\right\rangle \ge0

and derive the inequality from the non-positive discriminant. We will first present this proof.

Proof: Let {x,y\in V} be given. If {y=0}, the assertion is trivial, so we may assume that {y\neq0}. Let {t\in\mathbb{R}} and consider

\displaystyle \begin{array}{rcl} p(t) & \equiv & \left\langle x+ty,x+ty\right\rangle \\ & = & \left\langle x,x\right\rangle +t\left\langle y,x\right\rangle +t\left\langle x,y\right\rangle +t^{2}\left\langle y,y\right\rangle \\ & = & \left\langle x,x\right\rangle +2t\, Re\left\langle x,y\right\rangle +t^{2}\left\langle y,y\right\rangle , \end{array}

which is a real quadratic polynomial with real coefficients. Because of axiom (1.), we know that {p(t)\geq0} for all real {t}, and hence {p(t)} can have no real simple roots. The discriminant of {p(t)} must therefore be non-positive

\displaystyle (2\, Re\left\langle x,y\right\rangle )^{2}-4\left\langle y,y\right\rangle \left\langle x,x\right\rangle \leq0

and hence

\displaystyle (Re\left\langle x,y\right\rangle )^{2}\leq\left\langle x,x\right\rangle \left\langle y,y\right\rangle . \ \ \ \ \ (1)

Since this inequality must hold for any pair of vectors, it must hold if {y} is replaced by {\left\langle x,y\right\rangle y}, so we also have the inequality

\displaystyle (Re\left\langle x,\left\langle x,y\right\rangle y\right\rangle )^{2}\leq\left\langle x,x\right\rangle \left\langle y,y\right\rangle \left|\left\langle x,y\right\rangle \right|^{2}

But {Re\left\langle x,\left\langle x,y\right\rangle y\right\rangle =Re\overline{\left\langle x,y\right\rangle }\left\langle x,y\right\rangle =Re\left|\left\langle x,y\right\rangle \right|^{2}=\left|\left\langle x,y\right\rangle \right|^{2}}, so

\displaystyle \left|\left\langle x,y\right\rangle \right|^{4}\leq\left\langle x,x\right\rangle \left\langle y,y\right\rangle \left|\left\langle x,y\right\rangle \right|^{2}. \ \ \ \ \ (2)

If {\left\langle x,y\right\rangle =0}, then the statement of the theorem is trivial; if not, then we may divide equation (2) by the quantity {\left|\left\langle x,y\right\rangle \right|^{2}} to obtain the desired inequality

\displaystyle \left|\left\langle x,y\right\rangle \right|^{2}\leq\left\langle x,x\right\rangle \left\langle y,y\right\rangle .

Because of axiom (1.), {p(t)} can have a real (double) root only if {x+ty=0} for some {t}. Thus, equality can occur in the discriminant condition in equation (1) if and only if {x} and {y} are linearly dependent. \Box

We will now present a matrix proof, focusing on the complex vector space, which is perhaps the simplest proof of the Cauchy-Schwarz inequality.

Proof: For any vectors {x,y\in\mathbb{C}^{n}} we note that

\displaystyle \left[x,y\right]^{*}\left[x,y\right]=\left[\begin{array}{cc} x^{*}x & x^{*}y\\ y^{*}x & y^{*}y \end{array}\right]\geq0.

Since this matrix is positive semidefinite, its determinant is nonnegative. By taking the determinant of the {2\times2} matrix,

\displaystyle \begin{array}{rcl} \left|\begin{array}{cc} x^{*}x & x^{*}y\\ y^{*}x & y^{*}y \end{array}\right| & = & (x^{*}x)(y^{*}y)-(x^{*}y)(y^{*}x)\\ & = & \left\langle x,x\right\rangle \left\langle y,y\right\rangle -\left\langle y,x\right\rangle \left\langle x,y\right\rangle \\ & = & \left\langle x,x\right\rangle \left\langle y,y\right\rangle -\overline{\left\langle x,y\right\rangle }\left\langle x,y\right\rangle \\ & = & \left\langle x,x\right\rangle \left\langle y,y\right\rangle -\left|\left\langle x,y\right\rangle \right|^{2}\geq0, \end{array}

the inequality follows at once,

\displaystyle \left|\left\langle x,y\right\rangle \right|^{2}\leq\left\langle x,x\right\rangle \left\langle y,y\right\rangle .

Equality occurs if and only if the {n\times2} matrix {\left[x,y\right]} has rank 1; that is, {x} and {y} are linearly dependent. \Box
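The Gram determinant argument can be tried out on sample vectors. In this sketch the vectors are arbitrary, `z` is deliberately built as a scalar multiple of `x` to exhibit the equality case, and `inner` and `gram_det` are names introduced for illustration only.

```python
def inner(x, y):
    """Standard inner product <x, y> = y^* x on C^n."""
    return sum(xi * yi.conjugate() for xi, yi in zip(x, y))

def gram_det(x, y):
    """Determinant of [x, y]^* [x, y], i.e. <x,x><y,y> - |<x,y>|^2.
    The value is real, so only the real part is returned."""
    return (inner(x, x) * inner(y, y) - inner(y, x) * inner(x, y)).real

# Two linearly independent vectors in C^3 (assumed values).
x = [1 + 1j, 2 + 0j, -1j]
y = [3 + 0j, 1 - 2j, 1j]

# A vector that is linearly dependent on x.
z = [(2 - 1j) * xi for xi in x]

independent = gram_det(x, y)   # strictly positive: Cauchy-Schwarz is strict
dependent = gram_det(x, z)     # zero: the matrix [x, z] has rank 1

print(independent, dependent)
```

A positive value corresponds to strict inequality in Cauchy-Schwarz, and a zero value to the rank 1 case described in the proof.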


[1] Zhang, Fuzhen. Matrix theory: basic results and techniques. Springer Science & Business Media, 2011.

Now that the work to define the terms in the Taniyama-Shimura conjecture is complete, part 4 presents the famous Taniyama-Shimura conjecture, which led to a proof of Fermat’s Last Theorem.

A Mind for Madness

We’ve done a lot of work so far just to try to define the terms in the Taniyama-Shimura conjecture, but today we should finally make it. Our last piece of information is to write down what the L-function of a modular form is. Since I don’t want to build a whole bunch of theory needed to define the special class of modular forms we’ll be considering, I’ll just say that we actually need to restrict our definition of “modular form” to “normalized cuspidal Hecke eigenform”. I’ll point out exactly why we need this, but it doesn’t change anything in the conjecture except that every elliptic curve actually corresponds to an even nicer type of modular form.

Let {f\in S_k(\Gamma_0(N))} be a weight {k} cusp form with {q}-expansion {\displaystyle f=\sum_{n=1}^\infty a_n q^n}. Since this is an analytic function on the disk, we have the tools and theorems…


Taniyama-Shimura 3: L-Series, which will be crucial in the definition of modularity.

A Mind for Madness

For today, we assume our {d}-dimensional variety {X/\mathbb{Q}} has the property that its middle etale cohomology is 2-dimensional. It won’t hurt if you want to just think that {X} is an elliptic curve. We will first define the L-series via the Galois representation that we constructed last time. Fix {p} a prime not equal to {\ell} and of good reduction for {X}. Let {M=\overline{\mathbb{Q}}^{\ker \rho_X}}. By definition the representation factors through {{Gal} (M/\mathbb{Q})}. For {\frak{p}} a prime lying over {p} the decomposition group {D_{\frak{p}}} surjects onto {{Gal} (\overline{\mathbf{F}}_p/\mathbf{F}_p)} with kernel {I_{\frak{p}}}. One of the subtleties we’ll jump over to save time is that {\rho_X} acts trivially on {I_{\frak{p}}} (it follows from the good reduction assumption), so we can lift the generator of {{Gal} (\overline{\mathbf{F}}_p/\mathbf{F}_p)} to get a conjugacy class {{Frob}_p} whose image under…


Taniyama-Shimura 2: Galois Representations, the standard modern approach to defining modularity for other types of varieties.

A Mind for Madness

Fix some proper variety {X/\mathbb{Q}}. Our goal today will seem very strange, but it is to explain how to get a continuous representation of the absolute Galois group of {\mathbb{Q}} from this data. I’m going to assume familiarity with etale cohomology, since describing Taniyama-Shimura is already going to take a bit of work. To avoid excessive notation, all cohomology in this post (including the higher direct image functors) is done on the etale site.

For those that are intimately familiar with etale cohomology, we’ll do the quick way first. I’ll describe a more hands on approach afterwards. Let {\pi: X\rightarrow \mathrm{Spec} \mathbb{Q}} be the structure morphism. Fix an algebraic closure {v: \mathrm{Spec} \overline{\mathbb{Q}}\rightarrow \mathrm{Spec}\mathbb{Q}} (i.e. a geometric point of the base). We’ll denote the base change of {X} with respect to this morphism {\overline{X}}. Suppose the dimension of {X} is {n}.



Great post on understanding the statement of the famous Taniyama-Shimura conjecture that led to the proof of Fermat’s Last Theorem.

A Mind for Madness

It’s time to return to plan A. I started this year by saying I’d post on some fundamental ideas in arithmetic geometry. The local system thing is hard to get motivated about, since the way I was going to use it in my research seems irrelevant at the moment. My other option was to blog some stuff about class field theory, since there is a reading group on the topic that I belong to this quarter.

The first goal of this new series is to understand the statement of the famous Taniyama-Shimura conjecture that led to the proof of Fermat’s Last Theorem. A lot of people can probably mumble something about the conjecture if they have any experience in algebraic/arithmetic geometry or any of the number theory type fields, but most people probably can’t say anything precise about what the conjecture says (I’ll continue to call it a “conjecture” even…


STEM Poster
Ups and Downs of American STEM Education.
