In this article, we're proving the triangle inequality, which can be understood as follows:
The shortest distance between two points is a straight line.
I like this theorem because it feels intuitive ... like it needs to be true.
So let's state this theorem in mathematical language. For any two vectors $u, v \in \mathbb{R}^n$, the length of their sum vector (the direct path) is never longer than the sum of their individual path lengths (the detour):
$$\|u + v\| \le \|u\| + \|v\|$$
Here, $\|\cdot\| : \mathbb{R}^n \to \mathbb{R}$ denotes a vector's norm, defined by $\|x\| = \sqrt{x \cdot x}$,
where we use the $x \cdot y$ notation for the dot product $\cdot : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ of two vectors.
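Before proving anything, it can help to see the statement hold on actual numbers. The snippet below is a minimal sketch, not part of the proof: the helper names `dot`, `norm` and `triangle_gap` are made up for illustration, and the choice of 1000 random 5-dimensional vectors is arbitrary.

```python
import math
import random

def dot(x, y):
    """Dot product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm: the square root of x . x."""
    return math.sqrt(dot(x, x))

def triangle_gap(u, v):
    """Return ||u|| + ||v|| - ||u + v||, which the theorem says is >= 0."""
    s = [a + b for a, b in zip(u, v)]
    return norm(u) + norm(v) - norm(s)

rng = random.Random(0)  # fixed seed so the run is repeatable
gaps = []
for _ in range(1000):
    u = [rng.uniform(-10, 10) for _ in range(5)]
    v = [rng.uniform(-10, 10) for _ in range(5)]
    gaps.append(triangle_gap(u, v))

# Smallest observed gap; it should not be negative beyond rounding error.
smallest_gap = min(gaps)
```

A numerical check like this proves nothing, of course, but a negative `smallest_gap` would be a quick sign that we mistyped a definition.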
First we'll be proving that the absolute value of a dot product is at most the product of the two vectors' norms, which is known as the Cauchy-Schwarz inequality:
$$|u \cdot v| \le \|u\| \, \|v\|$$
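As a sanity check on the Cauchy-Schwarz inequality, $|u \cdot v| \le \|u\| \, \|v\|$, here is a small sketch that measures the slack between the two sides. The helper `cauchy_schwarz_slack` is a hypothetical name introduced only for this example, and the test vectors are arbitrary.

```python
import math
import random

def dot(x, y):
    """Dot product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm: the square root of x . x."""
    return math.sqrt(dot(x, x))

def cauchy_schwarz_slack(u, v):
    """Return ||u|| * ||v|| - |u . v|; the inequality says this is >= 0."""
    return norm(u) * norm(v) - abs(dot(u, v))

rng = random.Random(1)  # fixed seed so the run is repeatable
slacks = []
for _ in range(1000):
    u = [rng.uniform(-5, 5) for _ in range(4)]
    v = [rng.uniform(-5, 5) for _ in range(4)]
    slacks.append(cauchy_schwarz_slack(u, v))
min_slack = min(slacks)

# When v is a scalar multiple of u, both sides coincide and the slack is zero.
u = [1.0, 2.0, 2.0]
v = [-3.0 * a for a in u]
parallel_slack = cauchy_schwarz_slack(u, v)
```

The parallel case is worth keeping in mind: it shows the bound cannot be tightened any further.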
Given any two vectors $u, v \in \mathbb{R}^n$, let us define $w = u - tv$, for some arbitrary scalar $t \in \mathbb{R}$.
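By distributivity and scalar multiplication of the dot product, and because a squared norm is never negative, we have
$$0 \le \|w\|^2 = w \cdot w = (u - tv) \cdot (u - tv) = \|u\|^2 - 2t\,(u \cdot v) + t^2 \|v\|^2$$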
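The expansion used in this step is the identity $\|u - tv\|^2 = \|u\|^2 - 2t\,(u \cdot v) + t^2\|v\|^2$. If you'd like to convince yourself that no sign went missing, the sketch below compares both sides for a few values of $t$. The function names `lhs` and `rhs` and the sample vectors are assumptions made for this example only.

```python
def dot(x, y):
    """Dot product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(x, y))

def lhs(u, v, t):
    """Compute ||u - t v||^2 directly from the definition w . w."""
    w = [a - t * b for a, b in zip(u, v)]
    return dot(w, w)

def rhs(u, v, t):
    """Compute the expanded form ||u||^2 - 2 t (u . v) + t^2 ||v||^2."""
    return dot(u, u) - 2.0 * t * dot(u, v) + t * t * dot(v, v)

u = [1.0, 2.0, 3.0]
v = [4.0, -1.0, 0.5]

# Both sides should agree up to floating-point rounding for every t.
differences = [abs(lhs(u, v, t) - rhs(u, v, t)) for t in (-2.0, -0.5, 0.0, 1.0, 3.0)]
```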
We're about to use the fact that $|a + b| \le |a| + |b|$ for any $a, b \in \mathbb{R}$.
Why is this true? Note how $|x| \ge x$ for any real $x$,
which, applied to $x = ab$, gives $|a| \cdot |b| = |ab| \ge a \cdot b$.
It follows that $a^2 + 2|a||b| + b^2 \ge a^2 + 2ab + b^2$, implying that $(|a| + |b|)^2 \ge (a + b)^2$,
from which it follows, by taking square roots of both nonnegative sides, that $|a| + |b| \ge |a + b|$.
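The scalar version of the inequality is easy to poke at by hand. Here is a minimal sketch that tabulates the gap $|a| + |b| - |a + b|$ over a handful of arbitrarily chosen sample values. `abs_gap` is a hypothetical helper name.

```python
import itertools

# A few reals of mixed sign, chosen so that all sums are exact in floating point.
samples = [-3.5, -1.0, 0.0, 0.25, 2.0, 7.0]

def abs_gap(a, b):
    """Return |a| + |b| - |a + b|, which should never be negative."""
    return abs(a) + abs(b) - abs(a + b)

# Gap for every ordered pair of samples.
gaps = {(a, b): abs_gap(a, b) for a, b in itertools.product(samples, repeat=2)}
```

Looking at the table, the gap is zero whenever `a` and `b` share a sign, and positive when the signs differ. That matches the proof: the only place we lose anything is the step $|a||b| \ge ab$.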
The key observation here is that $0 \le \|u\|^2 - 2t\,(u \cdot v) + t^2\|v\|^2$ is a quadratic inequality in the variable $t$.
Geometrically speaking, this inequality states that a parabola, parameterized by $t$, never dips below the horizontal axis, so it can touch that axis at most once.
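To make the geometric picture concrete, we can evaluate the parabola $q(t) = \|v\|^2 t^2 - 2\,(u \cdot v)\,t + \|u\|^2$ on a grid of $t$ values and confirm it stays on or above the axis. This is only a sketch under stated assumptions: the name `q`, the sample vectors and the grid are made up for illustration, and the vertex location $t^* = (u \cdot v) / \|v\|^2$ is the standard vertex formula for a parabola, which requires $v \ne 0$.

```python
def dot(x, y):
    """Dot product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(x, y))

def q(u, v, t):
    """||u - t v||^2 written as a quadratic in t."""
    return dot(v, v) * t * t - 2.0 * dot(u, v) * t + dot(u, u)

u = [1.0, 2.0, 3.0]
v = [4.0, -1.0, 0.5]

# Sample the parabola on t in [-5, 5] in steps of 0.01.
ts = [k / 100.0 for k in range(-500, 501)]
values = [q(u, v, t) for t in ts]
lowest_sampled = min(values)

# The lowest point of the parabola sits at its vertex (assumes v is nonzero).
t_star = dot(u, v) / dot(v, v)
vertex_value = q(u, v, t_star)
```

Since even the vertex value is nonnegative, the parabola never crosses below the axis for these vectors.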
Since the well-known quadratic formula tells us that a function $f(x) = ax^2 + bx + c$ intersects the $x$-axis at coordinates
$$x_{1,2} = \frac{1}{2a}\left(-b \pm \sqrt{b^2 - 4ac}\right)$$
it follows that, for the above inequality to hold, we must have $b^2 - 4ac \le 0$, i.e.: