πŸ—“οΈ August 2024πŸ‘€Β Β loading

Math Classics: The Triangle Inequality

Yet another proof of the triangle inequality.

poster for: math classics: the triangle inequality
linear-algebramath

In this article, we're proving the triangle inequality, which can be understood as follows:

The shortest distance between two points is a straight line.

I like this theorem because it feels intuitive ... like it needs to be true.

So let's state this theorem in mathematical language. For any two vectors uβƒ—,vβƒ—βˆˆRn\vec{u},\vec{v}\in\R^n, the length of their sum vector (the direct path) is always shorter than the sum of their indivdual path lengths (the detour):

∣∣uβƒ—+vβƒ—βˆ£βˆ£β‰€βˆ£βˆ£uβƒ—βˆ£βˆ£+∣∣vβƒ—βˆ£βˆ£||\vec{u} + \vec{v}|| \leq ||\vec{u}|| + ||\vec{v}||

Here, βˆ£βˆ£β‹…βˆ£βˆ£:Rnβ†’R||\cdot|| : \R^n \to \R denotes a vector's norm, defined by ∣∣xβƒ—βˆ£βˆ£=xβƒ—β‹…xβƒ—||\vec{x}|| = \sqrt{\vec{x} \cdot \vec{x}}, where we use the xβƒ—β‹…yβƒ—\vec{x}\cdot\vec{y} notation for the dot product :RnΓ—Rnβ†’R:\R^n\times\R^n\to\R of two vectors.

Lemma: The Cauchy-Schwarz Inequality

First we'll be proving that the absolute value of a dot product is less than the product of the individual vector's norms, which is known as the Cauchy-Schwarz inequality:

∣uβƒ—β‹…vβƒ—βˆ£β‰€βˆ£βˆ£uβƒ—βˆ£βˆ£β‹…βˆ£βˆ£vβƒ—βˆ£βˆ£β€…β€…β€…β€…β€…β€…(βˆ€uβƒ—,vβƒ—βˆˆRn)|\vec{u}\cdot\vec{v}| \leq ||\vec{u}||\cdot||\vec{v}||\:\:\:\:\:\:\Big(\forall\vec{u},\vec{v}\in\R^n\Big)

Given any two vectors uβƒ—,vβƒ—βˆˆRn\vec{u}, \vec{v} \in \R^n, let us define wβƒ—=uβƒ—βˆ’tvβƒ—\vec{w} = \vec{u} - t\vec{v}, for some arbitrary scalar t∈Rt\in\R. By distributivity and scalar multiplication of the dot product, we have

∣∣wβƒ—βˆ£βˆ£2=∣wβƒ—β‹…wβƒ—βˆ£=∣(uβƒ—βˆ’tvβƒ—)β‹…(uβƒ—βˆ’tvβƒ—)∣=∣uβƒ—β‹…uβƒ—βˆ’2tuβƒ—β‹…vβƒ—+t2vβƒ—β‹…vβƒ—βˆ£\begin{aligned} ||\vec{w}||^2 &= |\vec{w}\cdot\vec{w}|\\ &= |(\vec{u} - t\vec{v})\cdot(\vec{u} - t\vec{v})|\\ &= |\vec{u}\cdot\vec{u} - 2t\vec{u}\cdot\vec{v} + t^2\vec{v}\cdot\vec{v}| \end{aligned}

We're about to use the fact that ∣a+bβˆ£β‰€βˆ£a∣+∣b∣|a + b| \leq |a| + |b| for any a,b∈Ra,b\in\R.

Why is this true? Note how ∣a∣β‰₯a|a|\geq a and ∣b∣β‰₯b|b|\geq b, which implies that ∣aβˆ£β‹…βˆ£b∣β‰₯aβ‹…b|a|\cdot|b|\geq a\cdot b. It follows that a2+2∣a∣∣b∣+b2β‰₯a2+2ab+b2a^2 + 2|a||b| + b^2 \geq a^2 + 2ab + b^2, implying that (∣a∣+∣b∣)2β‰₯(a+b)2(|a| + |b|)^2 \geq (a + b)^2, from which it follows that ∣a∣+∣b∣β‰₯∣a+b∣|a| + |b|\geq |a + b|

Continuing from our previous point:

∣∣wβƒ—βˆ£βˆ£2=∣uβƒ—β‹…uβƒ—βˆ’2tuβƒ—β‹…vβƒ—+t2vβƒ—β‹…vβƒ—βˆ£β‰€βˆ£uβƒ—β‹…uβƒ—βˆ£+2t∣uβƒ—β‹…vβƒ—βˆ£+t2∣vβƒ—β‹…vβƒ—βˆ£=∣∣uβƒ—βˆ£βˆ£2+2t∣uβƒ—β‹…vβƒ—βˆ£+t2∣∣vβƒ—βˆ£βˆ£2\begin{aligned} ||\vec{w}||^2 &= |\vec{u}\cdot\vec{u} - 2t\vec{u}\cdot\vec{v} + t^2\vec{v}\cdot\vec{v}|\\ &\leq |\vec{u}\cdot\vec{u}| + 2t|\vec{u}\cdot\vec{v}| + t^2|\vec{v}\cdot\vec{v}|\\ &= ||\vec{u}||^2 + 2t|\vec{u}\cdot\vec{v}| + t^2||\vec{v}||^2 \end{aligned}

And, since ∣∣wβƒ—βˆ£βˆ£2β‰₯0||\vec{w}||^2 \geq 0, it follows that

∣∣uβƒ—βˆ£βˆ£2+2t∣uβƒ—β‹…vβƒ—βˆ£+t2∣∣vβƒ—βˆ£βˆ£2β‰₯0||\vec{u}||^2 + 2t|\vec{u}\cdot\vec{v}| + t^2||\vec{v}||^2 \geq 0

The key observation here, is that this is a quadratic inequality in the variable tt. Geometrically speaking, this inequality states that a parabola, parameterized by tt, should cross the x-axis at most once.

graph of a red parabola hovering above the x-axis but not quite touching it

Since the well known quadratic formula tells us that a function f(x)=ax2+bx+cf(x) = ax^2 + bx + c intersects with the x-axis at coordinates

x1,x2=12a(βˆ’bΒ±b2βˆ’4ac)x_1,x_2 = \dfrac{1}{2a}\Bigg(-b \pm \sqrt{b^2 - 4ac}\Bigg)

it follows that, for the above inequality to hold, we must have b2βˆ’4ac≀0b^2 - 4ac \leq 0, i.e.:

(2∣uβƒ—β‹…vβƒ—βˆ£)2βˆ’4∣∣uβƒ—βˆ£βˆ£2∣∣vβƒ—βˆ£βˆ£2≀0\big(2|\vec{u}\cdot\vec{v}|\big)^2 - 4||\vec{u}||^2||\vec{v}||^2 \leq 0

By rearranging some terms, we find that (uβƒ—β‹…vβƒ—)2≀(∣∣uβƒ—βˆ£βˆ£β‹…βˆ£βˆ£vβƒ—βˆ£βˆ£)2(\vec{u}\cdot\vec{v})^2 \leq (||\vec{u}||\cdot||\vec{v}||)^2, from which the Cauchy-Schwarz inequality follows.

Proof for The Triangle Inequality

By using the Cauchy-Schwarz inequality, the proof for the triangle inequality becomes relatively straightforward.

Let u⃗\vec{u} and v⃗\vec{v} be arbitrary vectors in Rn\R^n. Then:

∣∣uβƒ—+vβƒ—βˆ£βˆ£2=(uβƒ—+vβƒ—)β‹…(uβƒ—+vβƒ—)=uβƒ—β‹…uβƒ—+2uβƒ—β‹…vβƒ—+vβƒ—β‹…vβƒ—=∣∣uβƒ—βˆ£βˆ£2+2uβƒ—β‹…vβƒ—+∣∣vβƒ—βˆ£βˆ£2β‰€βˆ£βˆ£uβƒ—βˆ£βˆ£2+2∣uβƒ—β‹…vβƒ—βˆ£+∣∣vβƒ—βˆ£βˆ£2β‰€βˆ£βˆ£uβƒ—βˆ£βˆ£2+2∣∣uβƒ—βˆ£βˆ£βˆ£βˆ£vβƒ—βˆ£βˆ£+∣∣vβƒ—βˆ£βˆ£2β€…β€…β€…β€…(Cauchy-Schwarz)=(∣∣uβƒ—βˆ£βˆ£+∣∣vβƒ—βˆ£βˆ£)2\begin{aligned} ||\vec{u} + \vec{v}||^2 &= (\vec{u}+\vec{v}) \cdot (\vec{u}+\vec{v})\\ &= \vec{u}\cdot\vec{u} + 2\vec{u}\cdot\vec{v} + \vec{v}\cdot\vec{v}\\ &= ||\vec{u}||^2 + 2\vec{u}\cdot\vec{v} + ||\vec{v}||^2\\ &\leq ||\vec{u}||^2 + 2|\vec{u}\cdot\vec{v}| + ||\vec{v}||^2\\ &\leq ||\vec{u}||^2 + 2||\vec{u}||||\vec{v}|| + ||\vec{v}||^2\:\:\:\:\text{(Cauchy-Schwarz)}\\ &= \big(||\vec{u}|| + ||\vec{v}||\big)^2 \end{aligned}

From this, the triangle inequality follows: ∣∣uβƒ—+vβƒ—βˆ£βˆ£β‰€βˆ£βˆ£uβƒ—βˆ£βˆ£+∣∣vβƒ—βˆ£βˆ£||\vec{u} + \vec{v}|| \leq ||\vec{u}|| + ||\vec{v}||