Last updated on November 10th, 2017

There are many ways to normalize vectors. A common preprocessing step in machine learning is to normalize a vector before passing it to a learning algorithm, e.g., before training a support vector machine (SVM).

One way to normalize the vector is to apply `l2-normalization` to scale the vector to have a `unit norm`. “Unit norm” essentially means that if we **squared** each element in the vector, and **summed** them, it would equal `1`.

(Note that a vector normalized this way is also often described as having `unit norm`, or called a `vector of length 1` or a `unit vector`.)
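
To make the definition concrete, here is a minimal sketch that l2-normalizes a single vector by hand using only the standard library. The helper name `l2_normalize` is hypothetical (it is not part of sklearn), and returning an all-zero vector unchanged is an assumption made for this illustration.

```python
import math

def l2_normalize(vector):
    """Return a copy of `vector` scaled to unit l2 norm (hypothetical helper)."""
    # The l2 norm is the square root of the sum of the squared elements.
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        # Assumption: an all-zero vector has no direction, so leave it as-is.
        return list(vector)
    return [v / norm for v in vector]

unit = l2_normalize([-1.0, 0.0, 1.0])

# Squaring each element and summing should give 1, up to rounding error.
sum_of_squares = sum(v * v for v in unit)
print(unit)
print(sum_of_squares)
```

Dividing every element by the same norm changes the length of the vector but not its direction.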

So given a matrix `X`, where the `rows` represent samples and the `columns` represent features of the sample, you can apply `l2-normalization` to normalize each row to a unit norm. This can be done easily in Python using sklearn.

## Here’s how to l2-normalize vectors to a unit vector in Python

    import numpy as np
    from sklearn import preprocessing

    # Two samples, with 3 dimensions.
    # The 2 rows indicate 2 samples,
    # and the 3 columns indicate 3 features for each sample.
    # Float is needed. (The old np.float alias was removed from NumPy; use float.)
    X = np.asarray([[-1, 0, 1], [0, 1, 2]], dtype=float)

    # Before normalization.
    print(X)
    # Output,
    # [[-1.  0.  1.]
    #  [ 0.  1.  2.]]

    # l2-normalize the samples (rows).
    X_normalized = preprocessing.normalize(X, norm='l2')

    # After normalization.
    print(X_normalized)
    # Output,
    # [[-0.70710678  0.          0.70710678]
    #  [ 0.          0.4472136   0.89442719]]
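
To see what happens to each row, here is a rough NumPy-only sketch of the same row-wise computation. It is an illustration and not sklearn's actual implementation. The variable names `row_norms` and `X_manual` are made up for this example, and the zero-norm guard is an assumption about how to handle all-zero rows.

```python
import numpy as np

X = np.asarray([[-1, 0, 1], [0, 1, 2]], dtype=float)

# l2 norm of each row, kept as a column (shape (2, 1)) so it broadcasts against X.
row_norms = np.linalg.norm(X, axis=1, keepdims=True)

# Assumption: treat the norm of an all-zero row as 1 to avoid dividing by zero,
# which leaves such a row unchanged.
row_norms[row_norms == 0] = 1.0

# Divide every element by the norm of its own row.
X_manual = X / row_norms
print(X_manual)
```

Up to floating-point rounding, this should give the same values as the `preprocessing.normalize` call.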

Now what did this do?

It normalized each sample (row) in the X matrix so that the **squared** elements **sum** to 1.

We can check that this is the case:

    # Square all the elements/features.
    X_squared = X_normalized ** 2
    print(X_squared)
    # Output,
    # [[0.5 0.  0.5]
    #  [0.  0.2 0.8]]

    # Sum along each row.
    X_sum_squared = np.sum(X_squared, axis=1)
    print(X_sum_squared)
    # Output,
    # [1. 1.]
    # Yay! The squared elements of each row sum to 1 after being normalized.

As we see, if we square each element, and then sum along the rows, we get the expected value of “1” for each row.
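
The printed values are rounded, so a sum that shows as `1.` may differ from exactly 1 by a tiny floating-point error. If you want to check this in code instead of by eye, one option is to compare with a tolerance. The sketch below recomputes the normalized rows with plain NumPy so that it runs on its own. The names `row_norms` and `all_unit` are just for illustration.

```python
import numpy as np

# Rebuild the normalized matrix from the example data (NumPy only).
X_normalized = np.asarray([[-1, 0, 1], [0, 1, 2]], dtype=float)
X_normalized /= np.linalg.norm(X_normalized, axis=1, keepdims=True)

# The l2 norm of each row should be 1, up to floating-point error.
row_norms = np.linalg.norm(X_normalized, axis=1)

# Compare with a tolerance instead of using == on floats.
all_unit = np.allclose(row_norms, 1.0)
print(all_unit)
```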

More reading and references:

Official Python documentation

Official Python example