There are so many ways to normalize vectors… A common preprocessing step in machine learning is to normalize a vector before passing it to a learning algorithm, e.g., before training a support vector machine (SVM).

One way to normalize the vector is to apply `l2-normalization` to scale the vector to have a `unit norm`. “Unit norm” essentially means that if we **squared** each element in the vector, and **summed** them, it would equal `1`.

(Note that this normalization is also often referred to as `unit norm`, a `vector of length 1`, or a `unit vector`.)
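To make the definition concrete, here is a minimal sketch for a single vector using only the standard library. The helper name `l2_normalize` and the choice to return a zero vector unchanged are assumptions made for this illustration, not part of any library.

```python
import math

def l2_normalize(vector):
    """Return a copy of `vector` scaled to unit L2 norm (hypothetical helper)."""
    # The L2 norm is the square root of the sum of squared elements.
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged (a choice made for this sketch).
        return list(vector)
    return [x / norm for x in vector]

v = [3.0, 4.0]  # L2 norm is sqrt(9 + 16) = 5
u = l2_normalize(v)

# After normalization, the squared elements should sum to (approximately) 1.
sum_of_squares = sum(x * x for x in u)
print(u, sum_of_squares)
```

Dividing every element by the same norm keeps the direction of the vector and only changes its length.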

So given a matrix `X`, where the `rows` represent samples and the `columns` represent features of the sample, you can apply `l2-normalization` to normalize each row to a unit norm. This can be done easily in Python using sklearn.

## Here’s how to l2-normalize vectors to a unit vector in Python

    import numpy as np
    from sklearn import preprocessing

    # Two samples, with 3 dimensions.
    # The 2 rows indicate 2 samples,
    # and the 3 columns indicate 3 features for each sample.
    X = np.asarray([[-1, 0, 1],
                    [0, 1, 2]], dtype=float)  # Float is needed.

    # Before normalization.
    print(X)
    # Output,
    # [[-1.  0.  1.]
    #  [ 0.  1.  2.]]

    # l2-normalize the samples (rows).
    X_normalized = preprocessing.normalize(X, norm='l2')

    # After normalization.
    print(X_normalized)
    # Output,
    # [[-0.70710678  0.          0.70710678]
    #  [ 0.          0.4472136   0.89442719]]

Now what did this do?

It normalized each sample (row) in the X matrix so that the **squared** elements **sum** to 1.

We can check that this is the case:

    # Square all the elements/features.
    X_squared = X_normalized ** 2
    print(X_squared)
    # Output,
    # [[0.5 0.  0.5]
    #  [0.  0.2 0.8]]

    # Sum over the rows.
    X_sum_squared = np.sum(X_squared, axis=1)
    print(X_sum_squared)
    # Output,
    # [1. 1.]

    # Yay! Each row sums to 1 after being normalized.

As we see, if we square each element, and then sum along the rows, we get the expected value of “1” for each row.
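If you are curious what this amounts to numerically, the following NumPy-only sketch divides each row by its own L2 norm. It is an illustration of the idea rather than scikit-learn's actual implementation, and the `safe_norms` guard for all-zero rows is an assumption of this sketch.

```python
import numpy as np

# Same example matrix: 2 samples (rows), 3 features (columns).
X = np.asarray([[-1, 0, 1],
                [0, 1, 2]], dtype=float)

# L2 norm of each row, kept as a column vector so it broadcasts against X.
row_norms = np.linalg.norm(X, axis=1, keepdims=True)

# Avoid dividing by zero for all-zero rows (a choice made for this sketch).
safe_norms = np.where(row_norms == 0, 1.0, row_norms)

# Divide every row by its own norm.
X_manual = X / safe_norms

print(X_manual)
print(np.sum(X_manual ** 2, axis=1))  # each entry should be (approximately) 1
```

Using `keepdims=True` keeps the norms in shape `(2, 1)`, so the division applies per row without any explicit loop.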


Just wondering: why do we need to convert vectors to unit norm in ML? What is the reason behind this? Also, I was looking at an example of preprocessing on a stock movement dataset, and the author used normalizer(norm='l2'). Any particular reason behind this? Does it have anything to do with the sparsity of the data? Sorry for so many questions.

Thanks for your questions Saurabh!

> why do we need to convert vectors to unit norm in ML?

We don’t have to. For some machine learning approaches (e.g., random forests), this may not be needed. The intuition for normalizing the vectors is that a vector with a large magnitude is not necessarily more important, so normalizing puts all vectors on the same scale (length 1) and keeps only the relative sizes of their elements.
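Here is a small sketch of that intuition. The vectors `a` and `b` and the helper `unit` are made up for this illustration: `b` points in the same direction as `a` but is 100 times larger, and after L2 normalization the two become the same vector.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 100.0 * a  # same direction as a, but 100x the magnitude

def unit(v):
    """Scale v to unit L2 norm (hypothetical helper; assumes v is not all zeros)."""
    return v / np.linalg.norm(v)

# Before normalization the two samples look very far apart.
distance_before = np.linalg.norm(a - b)

# After normalization only the direction is left, so they coincide.
distance_after = np.linalg.norm(unit(a) - unit(b))

print(distance_before, distance_after)
```

Whether throwing away the magnitude is a good idea depends on the data; if the overall size of a sample carries information, this kind of normalization removes it.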

> the author used normalizer(norm='l2'). Any particular reason behind this? Does it have anything to do with the sparsity of the data?

Was this normalization put on the trainable weights during the training phase? If so, that is usually called L2 regularization, which penalizes weights that have a large magnitude, whereas L1 regularization encourages weights to be sparse (i.e., pushes many weights to exactly 0).
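To make the difference between the L1 and L2 penalties on weights a bit more concrete, here is a rough sketch of a single shrinkage step under each penalty. The weights `w` and the strength `lam` are made-up values, and this is a simplified illustration (one proximal step), not how any particular library trains a model.

```python
import numpy as np

w = np.array([3.0, -0.2, 0.05, -1.5])  # made-up weights
lam = 0.3                              # made-up penalty strength

# L1 step (soft-thresholding): minimizes 0.5 * ||x - w||^2 + lam * ||x||_1.
# Weights with magnitude below lam are set exactly to 0.
w_l1 = np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

# L2 step: minimizes 0.5 * ||x - w||^2 + lam * ||x||^2.
# Every weight is shrunk by the same factor, but none becomes exactly 0.
w_l2 = w / (1.0 + 2.0 * lam)

print(w_l1)
print(w_l2)
```

The L1 step zeroes out the two small weights, while the L2 step only scales all four down, which is the sparsity difference described above.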

You can also preprocess the data using L2 normalization, as in this post. It divides each vector by its L2 norm, so vectors with large elements are scaled down the most.

Hope that helps!