There are so many ways to normalize vectors. A common preprocessing step in machine learning is to normalize a vector before passing it to a learning algorithm, e.g., before training a support vector machine (SVM).

One way to normalize the vector is to apply `l2-normalization` to scale the vector to have a `unit norm`. “Unit norm” essentially means that if we **squared** each element in the vector, and **summed** them, it would equal `1`.

(Note: the result of this normalization is also often described as having `unit norm`, or as a `vector of length 1` or a `unit vector`.)
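To make the definition concrete, here is a minimal sketch that normalizes a single vector by hand with NumPy. The vector `v` and its values are made up for illustration; the idea is just to divide the vector by the square root of its sum of squares.

```python
import numpy as np

# A single example vector (values chosen only for illustration).
v = np.array([3.0, 4.0])

# The l2 norm is the square root of the sum of the squared elements.
l2_norm = np.sqrt(np.sum(v ** 2))

# Dividing by the norm rescales the vector to unit length.
v_unit = v / l2_norm

print(l2_norm)              # 5.0
print(v_unit)               # [0.6 0.8]
print(np.sum(v_unit ** 2))  # approximately 1.0 (up to floating-point rounding)
```

Note that a vector of all zeros has a norm of 0, so this by-hand version would divide by zero for it.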

So given a matrix `X`, where the `rows` represent samples and the `columns` represent features of the sample, you can apply `l2-normalization` to normalize each row to a unit norm. This can be done easily in Python using sklearn.

## Here’s how to l2-normalize vectors to a unit vector in Python

    import numpy as np
    from sklearn import preprocessing

    # Two samples, with 3 dimensions.
    # The 2 rows indicate 2 samples,
    # and the 3 columns indicate 3 features for each sample.
    X = np.asarray([[-1, 0, 1],
                    [0, 1, 2]], dtype=float)  # Float is needed.

    # Before normalization.
    print(X)
    # Output,
    # [[-1.  0.  1.]
    #  [ 0.  1.  2.]]

    # l2-normalize the samples (rows).
    X_normalized = preprocessing.normalize(X, norm='l2')

    # After normalization.
    print(X_normalized)
    # Output,
    # [[-0.70710678  0.          0.70710678]
    #  [ 0.          0.4472136   0.89442719]]

Now what did this do?
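One way to check is to redo the same row-wise scaling by hand. The sketch below uses only NumPy and assumes each row is simply divided by its own l2 norm; the names `row_norms`, `X_manual` and `row_sq_sums` are introduced here for illustration.

```python
import numpy as np

# Same two samples (rows) with three features (columns) as above.
X = np.asarray([[-1, 0, 1],
                [0, 1, 2]], dtype=float)

# l2 norm of each row; keepdims=True keeps the shape (2, 1)
# so the division below broadcasts across the columns of each row.
row_norms = np.linalg.norm(X, ord=2, axis=1, keepdims=True)

# Divide every row by its own norm.
X_manual = X / row_norms

# Squaring and summing each normalized row should give (about) 1.
row_sq_sums = np.sum(X_manual ** 2, axis=1)

print(X_manual)
print(row_sq_sums)
```

The values in `X_manual` should match the normalized output printed above, and every entry of `row_sq_sums` should be 1 up to floating-point rounding. A row of all zeros would need separate handling here, since its norm is 0.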
