My goal is to publish all the slides (well, maybe not my first and worst ones) I’ve made over the years for our lab’s reading group. To this end, I’ve posted some old slides (from 2015) that explain in detail the t-SNE algorithm proposed in this paper:
Maaten, L. van der, & Hinton, G. (2008). Visualizing Data using t-SNE. Journal of Machine Learning Research (JMLR), 9, 2579–2605. [pdf]
The authors proposed t-SNE back in 2008. But it seems to have had a sort of revival these last few years, likely due to the number of deep learning papers using it to visualize learned features.
Basically, they create a policy network, which is a convolutional neural network, that predicts the next move a human player would make from a given board state. They also create a value network, again a convolutional neural network, that predicts the outcome (win or lose) of the game given the current board state. Continue reading “Mastering the Game of Go – slides [paper explained]”
There are so many ways to normalize vectors… A common preprocessing step in machine learning is to normalize a vector before passing it into some machine learning algorithm, e.g., before training a support vector machine (SVM).
One way to normalize the vector is to apply l2-normalization to scale the vector to have a unit norm. “Unit norm” essentially means that if we squared each element in the vector, and summed them, it would equal 1.
(Note that the result of this normalization is also often referred to as a vector with unit norm, a vector of length 1, or a unit vector.)
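To make the l2-normalization described above concrete, here is a minimal sketch in plain Python. The function name `l2_normalize` and the choice to return a zero vector unchanged are my own assumptions for illustration, not something taken from a particular library.

```python
import math

def l2_normalize(vec):
    """Scale vec so that its l2 (Euclidean) norm is 1."""
    # The l2 norm is the square root of the sum of squared elements.
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        # Assumption: the zero vector cannot be scaled to unit length,
        # so it is returned unchanged.
        return list(vec)
    # Dividing every element by the norm gives a vector of length 1.
    return [x / norm for x in vec]

v = [3.0, 4.0]             # norm is sqrt(9 + 16) = 5
unit = l2_normalize(v)
sum_of_squares = sum(x * x for x in unit)
print(unit)
print(sum_of_squares)      # 1, up to floating-point rounding
```

Squaring each element of `unit` and summing gives 1 (up to floating-point rounding), which is exactly the “unit norm” property.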
When choosing among the many academic papers to read, I find a nice heuristic is to pick a paper that performs well on publicly available standardized datasets. So I was happy to come across this webpage that tracks the state-of-the-art classification methods for some well-known computer vision datasets.
[Please note that I’m still very much a novice in this field – and that I change my mind about things often]
I had a hard time naming this post. Here are a few other titles I could have used:
Why Random/Decision Forests are my favorite machine learning algorithm
Why I think Random/Decision Forests are the best machine learning algorithm.
I know there are exceptions to this – there exist scenarios where this title is not true. But rather than giving the vague, unhelpful answer of “it depends”, here’s why I think that Random Forests should be your first and default choice when choosing a machine learning algorithm for classification and/or regression.
Did you know that Decision Forests (or Random Forests; I think they are pretty much the same thing) are implemented in MATLAB? In MATLAB, Decision Forests go under the rather misleading name of TreeBagger.
Here’s a quick tutorial on how to do classification with the TreeBagger class in MATLAB.