Visualizing Data using t-SNE – slides

My goal is to publish all the slides (well, maybe not my first and worst ones) I’ve made over the years for our lab’s reading group. To this end, I’ve posted some old slides (from 2015) that explain in detail the t-SNE algorithm proposed in this paper:

Maaten, L. van der, & Hinton, G. (2008). Visualizing Data using t-SNE. Journal of Machine Learning Research (JMLR), 9, 2579–2605. [pdf]

The authors proposed t-SNE back in 2008. But it seems to have had a sort of revival these last few years, likely due to the number of deep learning papers using it to visualize learned features.

Continue reading “Visualizing Data using t-SNE – slides”

Mastering the Game of Go – slides [paper explained]

This week I presented this work to our weekly reading group:

Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., … Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.

To quickly summarize this work…

Basically, they create a policy network, which is a convolutional neural network that predicts the next move a human player would make from a given board state. They also create a value network, another convolutional neural network, that predicts the outcome of the game (win or lose) given the current board state.
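
To make the two roles concrete, here is a minimal sketch of what each network takes in and gives back. It is not the paper's architecture: the functions `toy_policy` and `toy_value`, the random weights, and the -1/0/1 board encoding are all hypothetical stand-ins, and simple linear scores replace the convolutional layers.

```python
import math
import random

BOARD_POINTS = 19 * 19  # intersections on a standard Go board


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_policy(board, weights):
    """Hypothetical stand-in for the policy network: one probability per point."""
    scores = [w * stone for w, stone in zip(weights, board)]
    return softmax(scores)


def toy_value(board, weights):
    """Hypothetical stand-in for the value network: one score in [-1, 1]."""
    return math.tanh(sum(w * stone for w, stone in zip(weights, board)))


rng = random.Random(0)
# Assumed encoding: -1 = white stone, 0 = empty, 1 = black stone.
board = [rng.choice([-1, 0, 1]) for _ in range(BOARD_POINTS)]
policy_weights = [rng.gauss(0, 1) for _ in range(BOARD_POINTS)]
value_weights = [rng.gauss(0, 0.1) for _ in range(BOARD_POINTS)]

move_probs = toy_policy(board, policy_weights)  # distribution over next moves
outcome = toy_value(board, value_weights)       # predicted result of the game
print(len(move_probs), sum(move_probs), outcome)
```

The point is only the shape of the outputs: the policy gives a probability for every board point, while the value collapses the whole board into a single number.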
Continue reading “Mastering the Game of Go – slides [paper explained]”

How to normalize vectors to unit norm in Python

There are so many ways to normalize vectors… A common preprocessing step in machine learning is to normalize a vector before passing it into some machine learning algorithm, e.g., before training a support vector machine (SVM).

One way to normalize a vector is to scale it to have a length of 1, i.e., a unit norm. There are different ways to define “length”, such as the l1 or l2 norm. If you use l2-normalization, “unit norm” essentially means that if we squared each element in the vector and summed the squares, the result would equal 1.

(Note that a vector normalized this way is often said to have unit norm, or is called a vector of length 1 or a unit vector.)
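
As a minimal sketch of the two definitions of “length” mentioned above, here is a plain-Python version that needs no extra libraries. The function names `l1_normalize` and `l2_normalize` are my own for illustration, and returning a zero vector unchanged is an assumption about how you might want to handle that edge case.

```python
import math


def l1_normalize(v):
    """Scale v so that the absolute values of its elements sum to 1."""
    norm = sum(abs(x) for x in v)
    if norm == 0:
        return list(v)  # assumption: leave the zero vector unchanged
    return [x / norm for x in v]


def l2_normalize(v):
    """Scale v so that the squares of its elements sum to 1."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        return list(v)  # assumption: leave the zero vector unchanged
    return [x / norm for x in v]


v = [3.0, 4.0]
unit_l2 = l2_normalize(v)
unit_l1 = l1_normalize(v)

# Check the defining property of each normalization (up to rounding error).
print(unit_l2, sum(x * x for x in unit_l2))
print(unit_l1, sum(abs(x) for x in unit_l1))
```

In practice you would probably divide a NumPy array by `np.linalg.norm(v)` instead, but the arithmetic is the same.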

Continue reading “How to normalize vectors to unit norm in Python”

state-of-the-art classification methods on computer vision datasets

When choosing among the many academic papers to read, I find a nice heuristic is to pick a paper that performs well on publicly available standardized datasets. So I was happy to come across this webpage that tracks the state-of-the-art classification methods for some well-known computer vision datasets.

Check it out here:
http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

The best machine learning algorithm for classification and regression

[Please note that I’m still very much a novice in this field – and that I change my mind about things often]

I had a hard time naming this post. Here are a few other titles I could have used:

Why Random/Decision Forests are my favorite machine learning algorithm

Why I think Random/Decision Forests are the best machine learning algorithm.

I know there are exceptions to this – there exist scenarios where this title is not true. But rather than giving the vague, unhelpful answer of “it depends”, here’s why I think that Random Forests should be your first and default choice when choosing a machine learning algorithm to use for classification and/or regression.

Here’s a working list (in no particular order) of why I really like working with Random/Decision forests:
Continue reading “The best machine learning algorithm for classification and regression”

Predicting Disability in Patients with Multiple Sclerosis using MRI – MICCAI CSI

Our work on predicting the physical disability level of patients was accepted for an oral presentation at the MICCAI 2013 workshop on Computational Methods and Clinical Applications for Spine Imaging, in Nagoya, Japan.

You can read the paper itself here. It is called:
Novel morphological and appearance features for predicting physical disability from MR images in multiple sclerosis patients

MATLAB – TreeBagger example

Did you know that Decision Forests (or Random Forests; I think they are pretty much the same thing) are implemented in MATLAB? In MATLAB, Decision Forests go under the rather misleading name of TreeBagger.

Here’s a quick tutorial on how to do classification with the TreeBagger class in MATLAB.

Continue reading “MATLAB – TreeBagger example”