Here’s a quick tutorial on the L2 or Euclidean norm.
First of all, the terminology is not clear. So let’s start with that.
Many equivalent names
All these names mean the same thing:
Euclidean norm == Euclidean length == L2 norm == L2 distance == norm
Although these names are often used interchangeably, we will use the phrase “L2 norm” here.
Many equivalent symbols
Now also note that the symbol for the L2 norm is not always the same.
Let’s say we have a vector, x = (x₁, x₂, …, xₙ).
The L2 norm is sometimes represented like this, ‖x‖₂. Or sometimes this, ‖x‖. Other times the L2 norm is represented like this, |x|₂. Or even this, |x|.
To help distinguish from the absolute value sign, we will use the double-bar notation, ‖x‖₂, here.
Equation
Now that we have the names and terminology out of the way, let’s look at the typical equations.
‖x‖₂ = √(x₁² + x₂² + ⋯ + xₙ²)

where x is a vector with n elements, x₁ through xₙ.
In words, the L2 norm is defined as follows: 1) square every element in the vector; 2) sum these squared values; and 3) take the square root of this sum.
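The three steps above can be sketched in a few lines of Python (the function name l2_norm is my own, just for illustration):

```python
import math

def l2_norm(v):
    # 1) square each element, 2) sum the squares, 3) take the square root
    return math.sqrt(sum(x * x for x in v))

print(l2_norm([3.0, 4.0]))  # 5.0
```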
A quick example
Let’s work through a simple example. Say x = (3, 4). We compute the L2 norm of this vector:

‖x‖₂ = √(3² + 4²) = √(9 + 16) = √25 = 5

And there you go!
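As a quick sanity check, NumPy’s np.linalg.norm computes the L2 (Euclidean) norm of a vector by default (the vector below is just an illustrative one):

```python
import numpy as np

x = np.array([3.0, 4.0])
# ord defaults to the 2-norm for 1-D arrays
print(np.linalg.norm(x))  # 5.0
```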
So in summary: 1) the terminology is a bit confusing, since there are many equivalent names; 2) the symbols are overloaded; and, finally, 3) we did a small example computing the L2 norm of a vector by hand.
If you are hungry for a code example, I wrote a small MATLAB example (computing L2 distance) here.
Hi Jeremy
Can we know when to use it?
Hi nagdawi84, it’s not clear when to use different normalizations. In a machine learning scenario, an unsatisfying but practical answer is to try a few different normalizations, and choose the one that performs the best on your validation set.
You may also want to try normalization if you’re combining different sources of data with vastly different scales. For example, say you are combining a person’s age with visual features, where the age can range from 0 to 100 and the visual features range from 0 to 1. When the scales of these types of data differ this much, normalizing may help with learning (e.g., training an SVM).
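One simple way to bring a feature like age onto the same 0-to-1 scale as the visual features is min-max scaling. A minimal sketch (the function name and the example ages are mine):

```python
def min_max_scale(values, lo, hi):
    # map each value from the range [lo, hi] onto [0, 1]
    return [(v - lo) / (hi - lo) for v in values]

ages = [20, 50, 100]
print(min_max_scale(ages, 0, 100))  # [0.2, 0.5, 1.0]
```

After scaling, the age feature and the 0-to-1 visual features contribute on comparable scales.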
Hope that helps!