Prague travel pictures and deep dreaming


Here is a short summary of my travels to the city of Prague in the Czech Republic, along with corresponding images created using Google’s DeepDream.

What is this DeepDream you speak of?

Basically, DeepDream is a deep neural network that was trained to recognize objects from millions of images. A deep neural network is composed of a stack of layers. These layers learn image filters that, when applied to an image, classify it (e.g., is this an image of a cat or a dog?).

You give DeepDream an image and specify a layer in the neural network. The original image is then slightly perturbed to create a modified image that causes the specified layer in the neural network to be more activated.

Early layers in the neural network are sensitive to low-level concepts like the edges and textures in the image. So if you specify an early layer, your image will be modified to have edges and textures that most activate that selected early layer.

Example CNN. Image goes as input. Early layers (purple) are sensitive to things like edges and textures. Later layers (red) are sensitive to higher level concepts like faces.

Later (or deeper) layers in the neural network are activated when they see higher-level concepts such as faces. So any areas in the original image that slightly look like a face will be modified to look more like a face.
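To make the perturbation step concrete, here’s a toy numpy sketch. Everything in it is made up for illustration: the “layer” is just a fixed linear filter (nothing like a real network layer), but the loop is the same idea – gradient ascent on the image so the layer’s activation grows.

```python
import numpy as np

# Toy stand-ins for an input image and one learned "layer" (a linear filter).
rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
filt = rng.standard_normal((8, 8))

step = 0.1
for _ in range(50):
    # The "layer's" activation is just the dot product of filter and image.
    # For this linear layer, d(activation)/d(image) is simply the filter,
    # so gradient ascent nudges the image toward the filter's pattern.
    image += step * filt

# After the loop, the layer responds much more strongly to the image.
final_activation = np.sum(filt * image)
```

With a real network the gradient comes from backpropagation through the chosen layer instead of a closed form, but the “slightly perturb the image to increase the activation” loop is the same.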

Okay, but now you might ask: what about Prague? How was your trip? Did you like the city?

Yeah it was nice! Thanks for asking. Did you want to see some pictures? Here’s one of an old building.

Original image of some building in Prague… can’t remember what the significance of this picture was.

Let’s try some deep dreaming on this. We’ll use the neural network known as VGG16 (a famous network that performs very well in competitions). We’ll start by telling VGG16 to modify this image so that one of its middle layers becomes more activated. Specifically, we will activate layer conv3_1 of VGG16 (if you don’t know what conv3_1 means, that’s okay – it’s just a technical detail specifying which layer to use). This gives us:

Same Prague building but using DeepDreams with VGG16 conv3_1

Now if we activate a deeper layer, conv5_2, we get this crazy-looking image:

Continue reading “Prague travel pictures and deep dreaming”

Deep features to classify skin lesions – summary and slides

Me nervously just starting to talk about our approach to skin lesion classification.

We presented our work, “Deep Features to Classify Skin Lesions” at ISBI 2016 in Prague! And I’m happy to report that our work was awarded runner-up for the Best Student Paper Award 🙂

In this work, we looked at how to classify skin lesions from images captured with a digital camera (i.e., non-dermoscopy). Our approach was able to distinguish among 10 different types of skin diseases over 1300 images and achieved an accuracy higher than what was previously reported over the same dataset. We did this by applying deep learning (i.e., pretrained convolutional neural networks) to melanoma and non-melanoma skin images.
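To illustrate the general recipe (this is a toy sketch with made-up data, not the paper’s actual pipeline or numbers): pretend a pretrained CNN has already mapped each image to a feature vector, then fit a simple classifier on top of those “deep features” – here a nearest-centroid rule over 10 classes.

```python
import numpy as np

# Made-up "deep features": 10 classes, 20 images each, 64-d vectors,
# as if a pretrained CNN had already been run on every image.
rng = np.random.default_rng(0)
n_classes, n_per_class, dim = 10, 20, 64

true_centers = rng.standard_normal((n_classes, dim)) * 3
feats = np.concatenate([c + rng.standard_normal((n_per_class, dim))
                        for c in true_centers])
labels = np.repeat(np.arange(n_classes), n_per_class)

# "Train": the mean feature vector per class.
centroids = np.stack([feats[labels == k].mean(axis=0)
                      for k in range(n_classes)])

# "Classify": assign each image to its nearest class centroid.
dists = ((feats[:, None, :] - centroids[None]) ** 2).sum(axis=-1)
pred = np.argmin(dists, axis=1)
accuracy = (pred == labels).mean()
```

The real work uses a trained classifier on CNN features rather than this toy rule, but the split into “pretrained feature extractor + classifier” is the key structure.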

Continue reading “Deep features to classify skin lesions – summary and slides”

Deep Visual-Semantic Alignments for Generating Image Descriptions – [slides]

Here are some slides I put together to try and explain/present this great paper:

Karpathy, A., & Fei-Fei, L. (2015). Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR.

Let’s summarize it in two lines:
The authors proposed a way to combine information from an image and a corresponding text caption. They use a Recurrent Neural Network (RNN) to then generate text captions that describe the image.
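As a rough illustration of that second step (untrained toy weights and a made-up five-word vocabulary – nothing from the actual paper): seed an RNN’s hidden state from the image features, then greedily emit one word at a time.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["a", "dog", "on", "grass", "<end>"]   # made-up toy vocabulary
D, H, V = 16, 8, len(vocab)

img_feat = rng.standard_normal(D)              # stand-in for CNN image features
W_ih = rng.standard_normal((H, D)) * 0.1       # image features -> hidden state
W_hh = rng.standard_normal((H, H)) * 0.1       # hidden -> hidden (recurrence)
W_ho = rng.standard_normal((V, H)) * 0.1       # hidden -> word scores

h = np.tanh(W_ih @ img_feat)                   # condition the RNN on the image
words = []
for _ in range(4):
    scores = W_ho @ h
    words.append(vocab[int(np.argmax(scores))])  # greedy decoding
    h = np.tanh(W_hh @ h)                        # untrained step, shape only
```

With trained weights (and the previous word fed back in as input) this loop is what produces a caption; here it just shows how image information enters the recurrence.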

Pretty interesting stuff. This is the first paper where I really took a close look at RNNs as well.

Continue reading “Deep Visual-Semantic Alignments for Generating Image Descriptions – [slides]”

Deep Neural Networks to Segment Neuronal Membranes in Electron Microscopy Images and Detect Mitosis in Breast Cancer Histology Images

Here are some slides I made to present the following two papers to our reading group:

Cireșan, D. C., Giusti, A., Gambardella, L. M., & Schmidhuber, J. (2012). Deep neural networks segment neuronal membranes in electron microscopy images. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger (Eds.), NIPS (pp. 2843–2851). Curran Associates, Inc.

Cireșan, D. C., Giusti, A., Gambardella, L. M., & Schmidhuber, J. (2013). Mitosis detection in breast cancer histology images with deep neural networks. In K. Mori, I. Sakuma, Y. Sato, C. Barillot, & N. Navab (Eds.), MICCAI (Vol. 8150 LNCS, pp. 411–418). Springer Berlin Heidelberg.

These two papers, written by the same authors, used a very similar approach to win two medical image analysis competitions in well known conferences. They give a very nice overview of the approach they took to win these challenges.
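The core trick in both papers is sliding-window classification: for every pixel, crop the patch around it and let a CNN classify that pixel from the patch. A small numpy sketch of the patch-extraction part (the patch size and mirror padding here are illustrative choices, not necessarily the papers’ exact settings):

```python
import numpy as np

def extract_patch(img, row, col, size=5):
    """Return the size x size window centred on (row, col).

    Borders are handled by mirroring the image, so patches near the
    edge are still full-sized.
    """
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    return padded[row:row + size, col:col + size]

img = np.arange(100.0).reshape(10, 10)
patch = extract_patch(img, 0, 0)   # full 5x5 patch even at the corner
```

Classifying every pixel this way is expensive, which is why the papers also discuss tricks to speed up the per-patch evaluation.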
Continue reading “Deep Neural Networks to Segment Neuronal Membranes in Electron Microscopy Images and Detect Mitosis in Breast Cancer Histology Images”

CAFFE – how to specify which GPU to use in PyCaffe

You are using PyCaffe (Python interface for Caffe) and training a deep neural network directly within Python (although I think the same command holds for MATLAB).

You are on a machine with 2 GPUs and you want to specify which GPU to use for training. This is useful so you can train two different models at the same time on each GPU. Note that here we refer to training two different models on two different GPUs on the same machine, not a single model on two GPUs.
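A minimal sketch of the idea: `caffe.set_device` and `caffe.set_mode_gpu` are the relevant PyCaffe calls, made before any nets or solvers are created. The try/except is only there so the snippet degrades gracefully on a machine without Caffe installed.

```python
gpu_id = 1  # zero-based index: 0 is the first GPU, 1 the second

try:
    import caffe
    caffe.set_device(gpu_id)  # bind this Python process to the chosen GPU
    caffe.set_mode_gpu()      # use GPU mode rather than CPU mode
    selected = True
except ImportError:
    # Caffe isn't installed here; the calls above are the PyCaffe ones.
    selected = False
```

Run one training script with `gpu_id = 0` and another with `gpu_id = 1`, and each model trains on its own GPU.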

(side note: it seems to me that running two different jobs on the same GPU drastically slows GPU training. It’s so much slower that I only train a single model on a single GPU at a time. Running two different jobs on two different GPUs seems to be okay though)
Continue reading “CAFFE – how to specify which GPU to use in PyCaffe”

ImageNet classification with Deep Convolutional Neural Networks – [paper explained]

Here are some slides I made on the very interesting work by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton titled, “ImageNet Classification with Deep Convolutional Neural Networks”.

This was really the first time I took a deep 🙂 look at Convolutional Neural Networks (CNNs). I was so blown away by their performance that I have been exploring CNNs ever since.

High-level summary:
Basically, in 2012, Krizhevsky et al. won a competition to classify images into 1000 different classes, and they won by a pretty substantial margin. Since then, most (if not all) of the top competing approaches rely on CNNs to extract strong image features.

This was the first paper I read/presented on the topics of CNNs, so I’m particularly fond of this work. If you find this topic interesting, check out these slides that describe a CNN-based approach to win two medical image analysis competitions.

Hopefully these slides help convey the key ideas from their work.

Continue reading “ImageNet classification with Deep Convolutional Neural Networks – [paper explained]”

state-of-the-art classification methods on computer vision datasets

When choosing among the many academic papers to read, I find a nice heuristic is to pick a paper that performs well on publicly available standardized datasets. So I was happy to come across this webpage that tracks the state-of-the-art classification methods for some well known computer vision datasets.

Check it out here:
http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html

How to compile locally on linux without sudo permissions

Here’s an example of how to compile things locally on linux without sudo permissions.

Why in the world would you want to do this? Well, you might be on a locked-down system (say, at school), and the back and forth with the sys admins may not be effective.

For example, I have been trying to get Caffe to work with Python on my locked-down machine at school – and while the sys admins were helpful enough to install Caffe, I was getting strange errors using different versions of Caffe with different versions of Python. Here’s how I solved one problem, which I hope serves as a useful template and reminder for future issues.
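For packages that use the common autotools flow, the general pattern looks like this. Everything here is a placeholder sketch: the package name and paths are hypothetical, and the configure/make lines are shown as comments because they run inside a package’s unpacked source tree.

```shell
# Install into a directory you own instead of /usr (no sudo needed).
PREFIX="$HOME/local"
mkdir -p "$PREFIX"

# Typical autotools flow, run inside the unpacked source directory:
#   ./configure --prefix="$PREFIX"
#   make && make install

# Then point the compiler, linker, and loader at your local tree:
export PATH="$PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$PREFIX/lib:$LD_LIBRARY_PATH"
export CPATH="$PREFIX/include:$CPATH"
```

Putting the three `export` lines in your `~/.bashrc` makes the locally installed tools stick across sessions.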
Continue reading “How to compile locally on linux without sudo permissions”

theano – how to get the gpu to work

I have been working with Theano and it has been a bit of a journey getting the GPU to work. Here are a few notes to remind myself how to do so…

Start Python and check if Theano recognizes the GPU

$ python
Python 2.7.8 |Anaconda 2.1.0 (64-bit)| (default, Aug 21 2014, 18:22:21)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2

>>> import theano
Using gpu device 0: GeForce GTX 760 Ti OEM

You should see something like the above line showing that Theano finds your GPU.

If you do not see something like the above, then Theano is probably not configured to work with your GPU. But let’s check some more just to be sure.
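For reference, the classic fix on old-style Theano setups (assuming CUDA is already installed and working; newer gpuarray back ends use a different device name) is a `~/.theanorc` along these lines:

```
[global]
device = gpu
floatX = float32
```

With this in place, Theano should print the “Using gpu device …” line on import, as shown above.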
Continue reading “theano – how to get the gpu to work”