# How to normalize vectors to unit norm in Python

There are many ways to normalize vectors. A common preprocessing step in machine learning is to normalize a vector before passing it into an algorithm, e.g., before training a support vector machine (SVM).

One way to normalize a vector is to apply `l2-normalization`, which scales the vector to have a `unit norm`. "Unit norm" essentially means that if we square each element in the vector and sum them, the result equals `1`.

(Note: such a normalized vector is also referred to as having `unit norm`, or as a `vector of length 1`, or a `unit vector`.)
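As a quick sanity check of this definition, here is a short NumPy sketch (the 3-4-5 vector is my own example, not from the post): dividing a vector by its l2 norm yields a unit vector whose squared elements sum to 1.

```python
import numpy as np

# A vector with a convenient l2 norm: sqrt(3^2 + 4^2) = 5.
v = np.array([3.0, 4.0])
norm = np.linalg.norm(v)  # l2 norm by default

# Dividing by the norm gives the unit vector [0.6, 0.8].
unit = v / norm

# The squared elements sum to 1 (up to floating-point precision).
print(np.sum(unit ** 2))
```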

So given a matrix `X`, where the `rows` represent samples and the `columns` represent features of the sample, you can apply `l2-normalization` to normalize each row to a unit norm. This can be done easily in Python using sklearn.

## Here’s how to l2-normalize vectors to a unit vector in Python

```python
import numpy as np
from sklearn import preprocessing

# Two samples, with 3 dimensions.
# The 2 rows indicate 2 samples,
# and the 3 columns indicate 3 features for each sample.
X = np.asarray([[-1, 0, 1],
                [0, 1, 2]], dtype=np.float64)  # Float is needed.

# Before normalization.
print(X)
# Output,
# [[-1.  0.  1.]
#  [ 0.  1.  2.]]

# l2-normalize the samples (rows).
X_normalized = preprocessing.normalize(X, norm='l2')

# After normalization.
print(X_normalized)
# Output,
# [[-0.70710678  0.          0.70710678]
#  [ 0.          0.4472136   0.89442719]]
```

Now what did this do?

It normalized each sample (row) in the X matrix so that the squared elements sum to 1.

We can check that this is the case:

```python
# Square all the elements/features.
X_squared = X_normalized ** 2
print(X_squared)
# Output,
# [[0.5 0.  0.5]
#  [0.  0.2 0.8]]

# Sum over the rows.
X_sum_squared = np.sum(X_squared, axis=1)
print(X_sum_squared)
# Output,
# [1. 1.]

# Yay! Each row sums to 1 after being normalized.
```

As we can see, if we square each element and then sum along the rows, we get the expected value of `1` for each row.
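For reference, the same row-wise l2-normalization can be sketched in plain NumPy, without sklearn, by dividing each row by its l2 norm. This is a sketch of the idea rather than a drop-in replacement (sklearn's `preprocessing.normalize` also handles edge cases such as all-zero rows and sparse input):

```python
import numpy as np

# Same example matrix: 2 samples (rows), 3 features (columns).
X = np.asarray([[-1, 0, 1],
                [0, 1, 2]], dtype=np.float64)

# l2 norm of each row; keepdims=True gives shape (2, 1),
# so broadcasting divides each row by its own norm.
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
X_normalized = X / row_norms

print(X_normalized)
# Output,
# [[-0.70710678  0.          0.70710678]
#  [ 0.          0.4472136   0.89442719]]
```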