Proofs

This article needs more work.

Please help by expanding it!

A proof uses evidence (examples, counterexamples, exhaustion of cases) or logic (construction, direct proof, contradiction, induction) to conclusively demonstrate the validity of a previously conjectured mathematical statement (and is then called a theorem, lemma or corollary.) Having a proof is what sets theorems apart from axioms (or postulates) and conjectures.

Structure of a proof

At its most basic, a proof asserts the statement to be proved in as unambiguous a manner as possible, derives obvious consequences from the statement, and from the obvious consequences builds a chain of logical arguments until the statement emerges as a logical consequence of obvious details.

Most of the examples in this article will be drawn from number theory.

Theorem. The sum of any five consecutive integers is divisible by 5.

Proof. Given a set of five consecutive integers, there is a smallest integer and a largest integer. Let's call the smallest integer $\scriptstyle n\,$ . The other integers then are $\scriptstyle n+1\,$ , $\scriptstyle n+2\,$ , $\scriptstyle n+3\,$ and $\scriptstyle n+4\,$ , and their sum is $\scriptstyle n+(n+1)+(n+2)+(n+3)+(n+4)\,$ . Redistributing we obtain $\scriptstyle 5n+1+2+3+4\,=\,5n+10\,$ . Since 10 is a multiple of 5, we can go even further: $\scriptstyle 5(n+2)\,$ . It doesn't matter if $\scriptstyle n+2\,$ is divisible by 5 or not, because the sum of the five consecutive integers works out to that quantity multiplied by 5, and obviously that's divisible by 5 as specified by the theorem. □

The use of the end of proof mark (□) is widespread.

The Wiles proof of Fermat's conjecture (usually called Fermat's last theorem, because Fermat claimed to have a proof, which was never found) is a clear example that a short and/or simple statement can require a very long and/or sophisticated proof, perhaps spanning hundreds of logical steps across dozens of pages. In such cases, it is beneficial for the readers (and for the author) to break down the proof into smaller, not usually interesting on their own, preliminary statements (called lemmas) each with its own proof. Then the final proof, using those preliminary statements, becomes more manageable.

Consider a simple example.

Theorem. The sum of

\scriptstyle m\,

consecutive integers is divisible by

\scriptstyle m\,

if and only if

\scriptstyle m\,

is odd.

Lemma 1. Two consecutive integers are coprime. Given an integer $\scriptstyle a\,\in \,\mathbb {Z} \,$ and another integer $\scriptstyle b\,=\,a\,\pm \,1\,$ , the equality $\scriptstyle \gcd(a,b)\,=\,1\,$ holds.

We take it as axiomatic that 1 is not a prime number, mostly for convenience. The preceding lemma and the following proof would be needlessly convoluted otherwise, with verbiage such as "apart from 1."

Proof. Because $\scriptstyle a\,$ and $\scriptstyle b\,$ are consecutive, this means that $\scriptstyle |a-b|\,=\,1\,$ . Suppose that $\scriptstyle a\,$ and $\scriptstyle b\,$ share a prime factor $\scriptstyle p\,\in \,\mathbb {P} \,\in \,\mathbb {Z} ^{+}\,$ Therefore $\scriptstyle |a-b|\,=\,kp\,$ , with $\scriptstyle k\,>\,0\,$ . Even if $\scriptstyle k\,=\,1\,$ , that would still mean $\scriptstyle kp\,>\,1\,$ , contradicting $\scriptstyle |a-b|\,=\,1\,$ , a consequence of $\scriptstyle a\,$ and $\scriptstyle b\,$ being consecutive. Therefore, $\scriptstyle a\,$ and $\scriptstyle b\,$ share no prime factors.

Lemma 2. The sum of two consecutive integers is odd and not divisible by 2.

Proof. Given two consecutive integers, one is odd, the other is even. Let's call the even integer $\scriptstyle n\,$ . The other is either $\scriptstyle n+1\,$ or $\scriptstyle n-1\,$ , and their sum is either $\scriptstyle 2n+1\,$ or $\scriptstyle 2n-1\,$ . Obviously $\scriptstyle 2n\,$ is even, it has 2 as a prime factor at least twice. With either $\scriptstyle 2n+1\,$ or $\scriptstyle 2n-1\,$ , $\scriptstyle 2n\,$ makes a pair of consecutive integers. Per Lemma 1, they are coprime, they can't both have 2 as a shared prime factor. Therefore both $\scriptstyle 2n+1\,$ and $\scriptstyle 2n-1\,$ are odd, and since they represent the sum of two consecutive integers, that means such a sum is odd.

Lemma 3. The sum of $\scriptstyle m\,$ consecutive integers is divisible by $\scriptstyle m\,$ when $\scriptstyle m\,$ is odd.

Proof. Given a set of $\scriptstyle m\,$ consecutive integers, there is a smallest integer and a largest integer. Let's call the smallest integer $\scriptstyle n\,$ . The other integers then are $\scriptstyle n+1,\,\ldots ,\,n+(m-1)\,$ , and their sum is $\scriptstyle n+(n+1)+\ldots +(n+(m-1))\,$ . Redistributing we obtain $\scriptstyle mn+1+\ldots +(m-1)\,$ . Forgetting about $\scriptstyle mn\,$ for the moment, notice that our addends include 1 and $\scriptstyle m-1\,$ , and that $\scriptstyle 1+(m-1)\,=\,m\,$ . Since $\scriptstyle m\,$ is odd, $\scriptstyle m-1\,$ is even (this follows from Lemma 1), and therefore we can pair up addends from $\scriptstyle 1+(m-1)\,$ to $\scriptstyle ({\frac {m-1}{2}})+({\frac {m-1}{2}}+1)\,$ to obtain $\scriptstyle ({\frac {m-1}{2}})m\,$ . Remembering $\scriptstyle mn\,$ , our sum is thus $\scriptstyle mn+({\frac {m-1}{2}})m\,$ , which we can further rewrite as $\scriptstyle m(n+{\frac {m-1}{2}})\,$ . It doesn't matter if $\scriptstyle n+{\frac {m-1}{2}}\,$ is divisible by $\scriptstyle m\,$ or not, because the sum of the $\scriptstyle m\,$ consecutive integers works out to that quantity multiplied by $\scriptstyle m\,$ , and obviously that's divisible by $\scriptstyle m\,$ as specified by the theorem.

Lemma 4. The sum of $\scriptstyle m\,$ consecutive integers is not divisible by $\scriptstyle m\,$ when $\scriptstyle m\,$ is a singly even number.

See A016825 for the definition of singly even numbers.

Proof. Since $\scriptstyle m\,$ is singly even, that means $\scriptstyle {\frac {m}{2}}\,$ is odd. Therefore, we can divide the run of $\scriptstyle m\,$ consecutive integers into $\scriptstyle {\frac {m}{2}}\,$ pairs of consecutive integers. Per Lemma 2, each pair adds up to an odd number. Our intermediate addends are therefore an odd amount of odd numbers, so the overall sum is then also odd.

Now we are ready to prove the theorem, that the sum of $\scriptstyle m\,$ consecutive integers is divisible by $\scriptstyle m\,$ if and only if $\scriptstyle m\,$ is odd.

Proof. The case of odd $\scriptstyle m\,$ has already been proved by Lemma 3. Lemma 3 would be enough to prove the theorem except that the theorem says "if and only if" as opposed to "when" in the lemma. This means that not only do we have to prove divisibility occurs when $\scriptstyle m\,$ is odd, we also have to prove that it does not occur when $\scriptstyle m\,$ is even. However, Lemma 3 does light the way. If $\scriptstyle m\,$ is doubly even, or divisible by even larger powers of 2, we can still rewrite the sum of $\scriptstyle m\,$ consecutive integers as $\scriptstyle mn+1+\ldots +(m-1)\,$ . But when we match up the addends after $\scriptstyle mn\,$ to obtain more instances of $\scriptstyle m\,$ , we find that since $\scriptstyle m\,$ is even, $\scriptstyle m-1\,$ is odd and therefore when we get to $\scriptstyle ({\frac {m}{2}}-1)+({\frac {m}{2}}+1)\,=\,m\,$ , we have an addend left out, namely $\scriptstyle {\frac {m}{2}}\,$ . Since $\scriptstyle m\,$ is at least doubly even, $\scriptstyle {\frac {m}{2}}\,$ must also be even. But $\scriptstyle m(n+{\frac {m}{2}}-1)+{\frac {m}{2}}\,$ is not divisible by $\scriptstyle m\,$ , though it is divisible by $\scriptstyle {\frac {m}{2}}\,$ . This means that if $\scriptstyle m=2^{\alpha }k\,$ with $\scriptstyle \alpha \,>\,1\,$ and $\scriptstyle k\,$ some odd positive number, then the sum of $\scriptstyle m\,$ consecutive integers is divisible by $\scriptstyle 2^{\alpha -1}\,$ but not $\scriptstyle 2^{\alpha }\,$ and therefore not $\scriptstyle m\,$ . The singly even numbers were taken care of by Lemma 4, and just now we have addressed the other even numbers, thus completing the proof of the theorem. □

The above proof could be abbreviated, but that was not the purpose of the demonstration.

Kinds of proof

Implication proofs

Statement $\scriptstyle A\,$ implies statement $\scriptstyle B\,$ .

Equivalence proofs

Statement $\scriptstyle A\,$ if and only if statement $\scriptstyle B\,$ . (Statement $\scriptstyle A\,$ implies statement $\scriptstyle B\,$ , statement $\scriptstyle B\,$ implies statement $\scriptstyle A\,$ .)

Usually done in two parts.

Existential proofs

Cf. existential quantifier $\scriptstyle \exists \,$ .

There exists $\scriptstyle x\,$ in set $\scriptstyle S\,$ s.t. statement $\scriptstyle A(x)\,$ is true.

\exists x\in S,A(x)\,

There exists $\scriptstyle x\,$ in set $\scriptstyle S\,$ s.t. statement $\scriptstyle A(x)\,$ is false.

\exists x\in S,\neg A(x)\,

Uniqueness proofs

Cf. uniqueness quantifier $\scriptstyle \exists !\,$ .

There exists one and only one $\scriptstyle x\,$ in set $\scriptstyle S\,$ s.t. statement $\scriptstyle A(x)\,$ is true.

\exists !x\in S,A(x)\,

There exists one and only one $\scriptstyle x\,$ in set $\scriptstyle S\,$ s.t. statement $\scriptstyle A(x)\,$ is false.

\exists !x\in S,\neg A(x)\,

Universal proofs

Cf. universal quantifier $\scriptstyle \forall \,$ .

For all $\scriptstyle x\,$ in set $\scriptstyle S\,$ , statement $\scriptstyle A(x)\,$ is true.

\forall x\in S,A(x)\,

Non existence proofs

For all $\scriptstyle x\,$ in set $\scriptstyle S\,$ , statement $\scriptstyle A(x)\,$ is false.

\forall x\in S,\neg A(x)\,

Methods of proof

There are many different ways of proving some things, but these boil down to a few common methods.

Proof by example

A proof by example may be used for existential conjectures.

Proof by example is the simplest, most straightforward and most convincing method of proof. It is also the most limited: not only must an example exist, it must be accessible to the author. Also, the example might prompt more questions than answers.

For example, to prove that there exist numbers $\scriptstyle n$ such that $\scriptstyle \sigma _{1}(n)\,=\,2n\,$ , all we have to do is give one example of perfect numbers, like 28. This doesn't answer questions like: How do we find other perfect numbers? Are there squarefree perfect numbers? 6 might be the only one. Are there odd perfect numbers?

Disproof by counterexample

A disproof by counterexample may be used to disprove a universal conjecture.

For example, someone asserts that "for $p$ prime, $2^{p}-1$ is always a prime number." It suffices to give a counterexample like 2047 or 8388607 (these correspond to $p=11$ and 23 respectively, see A065341 for more counterexamples).

Proof by exhaustion

A proof by exhaustion may only be used when the number of cases to investigate is finite (and small enough). It can be used for universal conjectures (all cases are true, or all cases are false in the case of nonexistence conjectures.)

Proof by construction

In a constructive proof the reader is asked to construct a certain object within certain parameters, but also allowed to choose some parameters. The reader should be convinced by the proof because the author could not realistically foresee how the "free" parameters would be chosen by readers.

Proof by construction is quite common in geometry. For example, to prove that it is possible to circumscribe a square using only straightedge and compass, we could ask the reader to draw a square of whatever size they want. Then we provide instructions on how to use the straightedge and the compass to construct the specified circle.

Theorem. A square can be circumscribed using only straightedge and compass.


Proof. Step 1. Draw a square of any size you wish.	Step 2. Draw a diagonal from the northeast to the southwest corner, and another diagonal from the southeast to the northwest corner.	Step 3. Where the diagonals meet, center your compass. The other end of the compass can go on any corner of the square. □

Proof by deduction

Proof by deduction (deductive proof) is also called direct proof. A deductive proof uses a logic chain from accepted axioms and/or theorems whose conclusion is the theorem which we aimed to prove.

Theorem. All positive multiples of abundant numbers are also abundant. Given a positive abundant number

n

, this means that with

m

being any positive integer whatsoever, the number

mn

is also abundant.

For example, 12 is abundant. By the theorem, so are its multiples: 24, 36, 48, 60, 72, 84, etc. (see A008594).

Proof. Let's label the divisors of $n$ thus: $1,d_{2},\ldots d_{\sigma _{0}(n)-1},n$ . Assign $s=\sigma _{1}(n)-n=\sum _{i=1}^{\sigma _{0}(n)-1}d_{i}$ . Since $n$ is said to be abundant, this means that $s>n$ . We don't know what $m$ 's divisors are, much less which divisors it has in common with $n$ . But we can still deduce at least some of $mn$ 's divisors: $m,md_{2},\ldots md_{\sigma _{0}(n)-1},mn$ . Ignoring $mn$ itself, these listed divisors must add up to $ms$ , since each addend that gave us $s$ has now been multiplied by $m$ . Recall that $s>n$ . Multiplying both sides of that inequality by $m$ , we get $ms>mn$ , meaning that $mn$ is abundant as the theorem stated. □

This example proof also demonstrates it is not always necessary to account for everything. Assuming $m\neq n$ , we could have included both $m$ and $n$ as distinct divisors of $mn$ . And if $\gcd(m,n)=1$ , we could have deduced even more divisors. Regardless, the proof actually ignored 1 as a divisor of $mn$ . Once it was demonstrated that $ms>mn$ , it was unnecessary to go further and obtain the actual value of $\sigma _{0}(mn)$ . (Of course if the proof was part of a larger paper or book, it would eventually be necessary to delve into the consequences of different values for $\gcd(m,n)$ ).

Proof by contradiction

Proof by contradiction starts by assuming that the the statement to be proved is actually false. Consequences are drawn from the falsehood of the statement, and then a chain of logical arguments is followed until a contradiction is reached. Since the falsehood of the statement produces a contradiction, this proves the statement must in fact be true (by the law of excluded middle.)

For example, to prove that there is no largest prime (that there are infinitely many primes), assume that there is in fact a largest prime. Using this assumption, a prime number greater than the largest prime can be found, contradicting the initial assumption that there is a largest prime.

Or, to prove that a certain number $\scriptstyle x\,$ is irrational, we could assume that it is in fact a rational number and that there exist integers $\scriptstyle a\,$ and $\scriptstyle b\,$ such that $\scriptstyle {\frac {a}{b}}\,=\,x\,$ . We then perform certain operations on $\scriptstyle {\frac {a}{b}}\,$ until obtaining a contradiction. This method can work with certain algebraic numbers but might prove elusive with potentially transcendental numbers.

Some living mathematicians even take offense at the idea that a certain dead mathematician's proof is a proof by contradiction and study the original text of the proof to demonstrate that it is not a proof by contradiction.

Others will construct faulty proofs by contradiction to prove their point. A valid proof by contradiction has only one flawed assumption: the initial assumption that the statement to be proved is false. A proof by contradiction deliberately contrived to be faulty will usually have two flawed assumptions: the initial assumption and a hidden assumption that there are only two possible states for a given thing, when in fact there are many more possible states.

Proof by induction

Another way to prove a fact about every element in an infinite well-ordered set is to prove it first about the first element, then prove that if it's true for one element, it is true for the next element in the set. Since it's true for the first element in the set, it's also true for the second element, and since it's true for the second element, it's also true for the third element, etc.

In number theory, this generally means proving something is true for $\scriptstyle n\,=\,1\,$ and then that if it's true for $\scriptstyle n\,$ it's also true for $\scriptstyle n+1\,$ . The truth of the statement for all positive integers thereby follows.

EXAMPLE WILL BE ADDED ANOTHER DAY

Conditional proofs

A conditional proof is a would-be proof, which depends on a not yet proved conjecture, but which is strongly believed to be true, and is thus taken as a hypothesis.

The most famous unproved conjecture, the Riemann Hypothesis, is assumed in many conditional proofs in analytic number theory. Proving the Riemann Hypothesis would cascade through all the conditional proofs depending on its truth, making them all proofs. But disproving the Riemann Hypothesis would transform those conditional proofs into trivial statements.

Faulty/fallacious proofs

A proof can be derailed by a flawed assumption at any step of the process. The first draft of the Wiles proof of Fermat's conjecture had a flaw at about the middle, and it took him a year to fix the problem.

Famous open problems draw lots of claimed proofs which are for the most part ignored by the mathematical establishment, with good reason most of the time. Squaring the circle, proving the non-existence of odd perfect numbers, etc., these have all had claimed proofs over the centuries. Scott Aaronson has identified ten signs a claimed proof is wrong:^[1]

The author doesn't use TeX.
The author doesn't understand the question.
The approach seems to yield something much stronger and probably false.
The approach conflicts with an impossibility result the author ought to be aware of.
The author uses weasel words near the end.
The paper jumps into technicalities without presenting a new idea.
The paper doesn't build on or refer to any previous work.
The paper spends a lot of pages on expository material.
The paper claims it will have practical consequences and deep philosophical implications.
The technique seems too wimpy for the problem at hand.

In the days before TeX (or computers for that matter), the first sign would perhaps have been that the author doesn't use standard mathematical notation, probably inventing new symbols for things that already have well-known, agreed-upon notations. Not that any of this deters would-be circle-squarers.

But sometimes faulty proofs are deliberately so constructed for amusement and edification, or to make a point. Here is an amusing proof by deduction that 2 = 1:

"Theorem." The integers 1 and 2 are equal.

"Proof." Let's say $\scriptstyle a\,=\,b\,$ . This means that $\scriptstyle a^{2}\,=\,ab\,$ , $\scriptstyle a^{2}+a^{2}\,=\,a^{2}+ab\,$ , $\scriptstyle 2a^{2}\,=\,a^{2}+ab\,$ , $\scriptstyle 2a^{2}-2ab\,=\,a^{2}+ab-2ab\,$ , $\scriptstyle 2a^{2}-2ab\,=\,a^{2}-ab\,$ . We rewrite as $\scriptstyle 2(a^{2}-ab)\,=\,a^{2}-ab\,$ . Lastly, we divide both sides by $\scriptstyle a^{2}-ab\,$ , obtaining $\scriptstyle 2\,=\,1\,$ as specified by the "theorem." □^[2]

Can you spot the point at which this proof goes wrong?

Notes

↑ Scott Aaronson, Ten Signs a Claimed Mathematical Breakthrough is Wrong, Shtetl-Optimized.
↑ Anne Rooney, The Story of Mathematics. Arcturus (2009): p. 197

[1] Scott Aaronson, Ten Signs a Claimed Mathematical Breakthrough is Wrong, Shtetl-Optimized.

[2] Anne Rooney, The Story of Mathematics. Arcturus (2009): p. 197

[1]

[2]