This is a PyTorch implementation of the GoogLeNet architecture, which won the 2014 ImageNet competition, as described in the paper "Going deeper with convolutions" by Christian Szegedy et al.
In the paper, the authors train the model on the ImageNet dataset, a huge collection of images of objects in 1000 different classes. It was used as part of the ImageNet competition, which until recently was the main forum for comparing state-of-the-art image recognition models.
Although this model can be applied to the full ImageNet dataset, I do not recommend doing so, as the dataset is VERY large: approximately 138GB.
Instead, if you wish to train the model yourself, I recommend one of two options. The first is to use the sample dataset in the /data directory of the repo, which is a small subset of 11 classes taken from ImageNet. The second is to download your own subset using the ImageNet Downloader project, resize the images to the correct size using ImageMagick and the converter script included in this repo, and then train on that.
If you really want to train with all of ImageNet you can find a few different methods of acquiring it here.
If you lack a graphics card on which to train the model, or you just don't want to go through the hassle of training it yourself, I will be providing a .pkl file containing a serialized version of the model trained on the included dataset.
(I haven't had a chance to post the file yet, but I'll update this README when I do!)
All you need to run this code are the torch and torchvision libraries. To install them, run the following command in the root of your local copy of the repo. Bear in mind, though, that you may wish to visit the PyTorch website to find the most appropriate versions for your system.
sudo pip3 install -r ./requirements.txt
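After installing, it can be useful to confirm that both libraries are visible to the Python interpreter you plan to use. The snippet below is a minimal sketch and not part of this repo: the helper name missing_packages is hypothetical, and it only checks that the packages can be found, not that their versions match requirements.txt.

```python
import importlib.util

# The two libraries this README lists as requirements.
REQUIRED = ("torch", "torchvision")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        # Import only after confirming the package is present.
        import torch

        print("torch version:", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

If the script reports that CUDA is not available, the code should still run, but training will fall back to the CPU and be considerably slower.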