There are so many ways to normalize vectors… A common preprocessing step in machine learning is to normalize a vector before passing it to a learning algorithm, e.g., before training a support vector machine (SVM).
One way to normalize the vector is to apply l2-normalization to scale the vector to have a unit norm. “Unit norm” essentially means that if we squared each element in the vector and summed them, the result would equal 1 (note that the result of this normalization is also often referred to as a unit norm, a vector of length 1, or a unit vector).
So given a matrix X, where the rows represent samples and the columns represent features of the sample, you can apply l2-normalization to normalize each row to a unit norm. This can be done easily in Python using sklearn.
Here’s how to l2-normalize vectors to a unit vector in Python:
import numpy as np
from sklearn import preprocessing
# Two samples, with 3 dimensions.
# The 2 rows indicate 2 samples,
# and the 3 columns indicate 3 features for each sample.
X = np.asarray([[-1,0,1],
                [0,1,2]], dtype=float) # Float is needed (np.float was removed in NumPy 1.24; use the built-in float).
# [[-1. 0. 1.]
# [ 0. 1. 2.]]
# l2-normalize the samples (rows).
X_normalized = preprocessing.normalize(X, norm='l2')
# After normalization.
# [[-0.70710678 0. 0.70710678]
# [ 0. 0.4472136 0.89442719]]
Now what did this do?
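One way to see what the normalization does is to redo it by hand. The following is a minimal NumPy-only sketch, assuming the same two-sample matrix as above and assuming no row is all zeros (dividing by a zero norm would fail); it is an illustration, not necessarily how sklearn implements it internally.

```python
import numpy as np

# Same two samples (rows) with three features (columns) as above.
X = np.asarray([[-1, 0, 1],
                [0, 1, 2]], dtype=float)

# L2 norm of each row: square the elements, sum them, take the square root.
row_norms = np.sqrt((X ** 2).sum(axis=1, keepdims=True))

# Divide each row by its own norm (assumes no all-zero rows).
X_normalized = X / row_norms

# Squaring and summing each normalized row should now give 1.
squared_sums = (X_normalized ** 2).sum(axis=1)
print(X_normalized)
print(squared_sums)
```

Each row is divided by its own length, so the direction of the vector is kept and only its scale changes.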
Continue reading “How to normalize vectors to unit norm in Python”
– you are using the Google App Engine (GAE) development server with Python
– you installed the Anaconda Python distribution
– you want to use the Numpy library with GAE
On Ubuntu and on Mac (but not Windows for some reason), you get this error when trying to deploy:
google app engine ImportError: No module named _ctypes
The tl;dr solution
Create an Anaconda environment using numpy 1.6 and python 2.7:
conda create -n np16py27 anaconda numpy=1.6 python=2.7
Load this specific environment from the command line:
Run your GAE dev server:
That’s it! You can read more details below if you are interested.
Continue reading “Using numpy on google app engine with the anaconda python distribution”
Here’s how to compute true positives, false positives, true negatives, and false negatives in Python using the NumPy library.
Note that we are assuming a binary classification problem here. That is, a value of 1 indicates the positive class, and a value of 0 indicates the negative class. For multi-class problems, these definitions don’t carry over directly.
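As a rough sketch of the idea, the four counts can be obtained by comparing the predicted and true label arrays element-wise. The arrays `y_true` and `y_pred` below are made-up values for illustration only, and the approach shown is one straightforward option rather than necessarily the one used in the full post.

```python
import numpy as np

# Hypothetical labels for illustration: 1 = positive class, 0 = negative class.
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])

# Combine boolean masks, then count the True entries.
tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # predicted positive, actually positive
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # predicted positive, actually negative
tn = int(np.sum((y_pred == 0) & (y_true == 0)))  # predicted negative, actually negative
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # predicted negative, actually positive

print(tp, fp, tn, fn)  # 2 1 2 1
```

The four counts always add up to the number of samples, which is a quick sanity check.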
Continue reading “how to compute true/false positives and true/false negatives in python for binary classification problems”
Here’s how to debug your code when using a Jupyter/IPython notebook: use Tracer()(). Here’s an example using a simple function (based on this lucid explanation).
def add_numbers(y):  # Simple example function (the name is illustrative).
    x = 10
    # One-liner to start the debugger here.
    from IPython.core.debugger import Tracer; Tracer()()
    x = x + y
    for i in range(10):
        x = x+i
    return x
When the debugger reaches the Tracer()() line, a small input field for typing commands will appear under your cell.
Simply type in variable names to check their values, or run other commands. Below I’ve listed some practical Python pdb commands. More can be found here.
Continue reading “How to debug a Jupyter/iPython notebook”
Here’s how to run an IPython Notebook (now called a Jupyter Notebook) on a remote Linux machine without using VNC.
These instructions are expanded on from here, and it’s worth reading through to get more details.
You have two machines:
– local_machine that you are physically working on
– remote_machine that you want to run code on.
And you want to work in the browser on your local_machine, but have the code execute on the remote_machine.
Continue reading “How to run an IPython/Jupyter Notebook on a remote machine”
You are using PyCaffe (Python interface for Caffe) and training a deep neural network directly within Python (although I think the same command holds for MATLAB).
You are on a machine with 2 GPUs and you want to specify which GPU to use for training. This is useful so you can train two different models at the same time on each GPU. Note that here we refer to training two different models on two different GPUs on the same machine, not a single model on two GPUs.
(side note: it seems to me that running two different jobs on the same GPU drastically slows GPU training. It’s so much slower that I only train a single model on a single GPU at a time. Running two different jobs on two different GPUs seems to be okay though)
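A minimal sketch of selecting a GPU from PyCaffe, assuming the standard `caffe.set_device` and `caffe.set_mode_gpu` calls. The helper `select_gpu` and its `caffe_module` parameter are hypothetical names introduced here for illustration, and this is not necessarily the exact approach the full post describes.

```python
def select_gpu(caffe_module, gpu_id):
    """Ask PyCaffe to run on the GPU with the given index.

    `caffe_module` is the imported `caffe` package; passing it in as an
    argument is an illustrative choice, not part of the Caffe API.
    """
    caffe_module.set_device(gpu_id)  # choose which GPU to use (0, 1, ...)
    caffe_module.set_mode_gpu()      # switch computation from CPU to GPU
    return gpu_id

# Typical use (requires PyCaffe and a CUDA-capable GPU):
# import caffe
# select_gpu(caffe, 1)  # run this process on the second GPU
```

To train two models at once, each training process would call this with a different index before creating its solver.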
Continue reading “CAFFE – how to specify which GPU to use in PyCaffe”
So I wanted to add an IPython notebook within a WordPress blog post. The first thing I tried was exporting directly to HTML and copying the HTML directly into the WordPress post. This sort of worked. However, it was very slow to copy all the HTML into the post, and the formatting looked terrible.
Then I came across a way to do so, which you can read about here.
However, I did not want people to be able to directly access the IPython notebook HTML page (if, say, Google indexed it). Rather, I want to direct people to the actual WordPress post.
Here are the steps to add an IPython notebook to a WordPress blog post:
Continue reading “How to display an IPython notebook in a WordPress blog”
I have been working with Theano and it has been a bit of a journey getting the GPU to work. Here are a few notes to remind myself how to do so…
Start Python and check if Theano recognizes the GPU
Python 2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Aug 21 2014, 18:22:21)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
>>> import theano
Using gpu device 0: GeForce GTX 760 Ti OEM
You should see something like the above line showing that Theano finds your GPU.
If you do not see something like the above, then Theano is probably not configured to work with your GPU. But let’s check some more just to be sure.
Continue reading “theano – how to get the gpu to work”
So you want to run the IPython Notebook… and you’re using Anaconda 2.1.0 on some version of Linux.
You are already able to run ipython successfully…
IPython 2.2.0 -- An enhanced Interactive Python
But from the command line, when you try to run the IPython Notebook:
$ ipython notebook
You get a bunch of errors… something about sockets. The last error sticks out…
error: [Errno 99] Cannot assign requested address
A bit of googling shows you some relevant links:
We summarize the two steps needed to get the Anaconda IPython Notebook working here.
Continue reading “Anaconda IPython Notebook – error: [Errno 99] Cannot assign requested address”