Confusion matrix of bag of SIFT features + NN classifier
For this project we were required to implement scene recognition in several different ways. We started with a simple, easy-to-comprehend tiny image feature paired with a nearest neighbor classifier. From there we moved on to a bag of SIFT feature representation, still using the same nearest neighbor classifier. Lastly, we were tasked with improving on the nearest neighbor classifier by using a more advanced linear SVM classifier.
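To make the baseline concrete, here is a minimal sketch of a tiny image feature paired with a 1-nearest-neighbor classifier. It is plain Python for illustration, not the submitted MATLAB code. The function names, the 4x4 default size, and the zero-mean, unit-length normalization are all assumptions rather than details from the project.

```python
import math

def tiny_image(image, size=4):
    """Shrink a 2-D grayscale image (a list of rows) to size x size by block
    averaging, then flatten it into a zero-mean, unit-length vector.
    Assumes the image is at least size x size pixels."""
    h, w = len(image), len(image[0])
    feat = []
    for by in range(size):
        for bx in range(size):
            y0, y1 = by * h // size, (by + 1) * h // size
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feat.append(sum(block) / len(block))
    # Normalization is an assumption here; it is a common but optional step.
    mean = sum(feat) / len(feat)
    feat = [v - mean for v in feat]
    norm = math.sqrt(sum(v * v for v in feat)) or 1.0
    return [v / norm for v in feat]

def nearest_neighbor_classify(train_feats, train_labels, test_feats):
    """Give each test feature the label of its closest training feature,
    using squared Euclidean distance."""
    predictions = []
    for q in test_feats:
        best = min(
            range(len(train_feats)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_feats[i], q)),
        )
        predictions.append(train_labels[best])
    return predictions
```

The same `nearest_neighbor_classify` function works unchanged on any fixed-length feature, which is why the classifier can be reused when the feature is swapped for a bag of SIFT histogram.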
I'll start with the best-case results I got when testing. Most of these were the result of strategic, or sometimes lucky, parameter tweaking. Unfortunately, bag of SIFT + NN took a huge chunk of time compared to my fastest method.
The parameters used for the highest accuracy are the ones in the submitted source code, with the exception of the bag of SIFT + nearest neighbor classifier, which is where my luck came in. I tried different combinations of vl_dsift parameters (mainly adjusting the size and step) without the 'fast' parameter, and had accepted that I wasn't going to get better than 43% accuracy. After that I set out to make it faster for submission (I was getting 20+ minute runtimes) by adding the 'fast' parameter to my vl_dsift calls. To my surprise, this both cut runtimes dramatically and increased accuracy to 44.7%. I ran it again in hopes it wasn't a fluke and got the same result. I then removed 'fast' from the vl_dsift call in build_vocabulary and, after rebuilding the vocab.mat file, accuracy rose to 47.2%. This was the highest accuracy I got with this method; the submitted code differs only in that 'fast' has been re-added to build_vocabulary.
Unfortunately, I was not able to tweak parameters as much as I had wanted, due to my own procrastination as well as the long runtime of each iteration, so my accuracy is not as high as I would have liked.
The first step in creating bag of SIFT features is to build the vocabulary from the training set.
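A vocabulary for a bag of SIFT model is usually built by clustering descriptors sampled from the training images and keeping the cluster centers as the visual words. The sketch below assumes plain k-means with a simple deterministic initialization. `kmeans_vocabulary` is a hypothetical Python stand-in, not the project's MATLAB build_vocabulary, and short lists stand in for real 128-dimensional SIFT descriptors.

```python
def kmeans_vocabulary(descriptors, vocab_size, iterations=10):
    """Cluster descriptors with plain k-means and return the cluster centers.
    Assumes len(descriptors) >= vocab_size."""
    # Hypothetical initialization: evenly spaced descriptors as starting centers.
    step = len(descriptors) // vocab_size
    centers = [list(descriptors[i * step]) for i in range(vocab_size)]
    for _ in range(iterations):
        # Assignment step: each descriptor joins its nearest center.
        groups = [[] for _ in centers]
        for d in descriptors:
            nearest = min(
                range(len(centers)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(centers[c], d)),
            )
            groups[nearest].append(d)
        # Update step: move each center to the mean of its members.
        # A center with no members keeps its previous position.
        for c, members in enumerate(groups):
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers
```

Because the vocabulary is fixed once it is built, any change to how descriptors are sampled means the saved vocabulary has to be rebuilt before the features are recomputed.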
After building the vocabulary we can move on to creating bags of SIFT features. I opted to use VLFeat's kd-tree functions, which brought my runtimes down to a reasonable level.
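The feature itself can be sketched as a histogram: each descriptor in an image votes for its nearest vocabulary word, and the counts are normalized so that images with different numbers of descriptors stay comparable. This Python sketch uses a brute-force nearest-word search for clarity, where the project used a kd-tree lookup for speed. The function name and the sum-to-one normalization are assumptions.

```python
def bag_of_words_histogram(descriptors, vocabulary):
    """Count how many descriptors fall nearest to each vocabulary word and
    return the counts normalized to sum to 1."""
    counts = [0] * len(vocabulary)
    for d in descriptors:
        # Brute-force search; a kd-tree would replace this loop in practice.
        word = min(
            range(len(vocabulary)),
            key=lambda w: sum((a - b) ** 2 for a, b in zip(vocabulary[w], d)),
        )
        counts[word] += 1
    total = sum(counts) or 1  # avoid dividing by zero for an empty image
    return [c / total for c in counts]
```

The brute-force search costs one distance computation per descriptor per word, which is the part of the pipeline a kd-tree is meant to speed up.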
I was not able to get any proper results from my attempts at using the SVM classifier. I'm not sure where I was going wrong, but I think part of it was the switching of rows and columns between MATLAB and VLFeat. It was driving me insane trying to switch those and keep track of which needed to be transposed, and in the end that, combined with the 5+ minute runtime, just got the best of me.
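One way to keep the row and column bookkeeping straight is to pick a single convention and convert only at the boundary. To my understanding, VLFeat's vl_svmtrain takes one column per sample (a D x N matrix), so features stored one row per image would need a transpose before training. The Python sketch below shows only that bookkeeping for a one-vs-all setup, not a working SVM trainer. All names are hypothetical, and the weights are assumed to come from whatever trainer is used.

```python
def transpose(rows):
    """Turn an N x D list of feature rows into D x N (one column per sample)."""
    return [list(col) for col in zip(*rows)]

def one_vs_all_labels(labels, category):
    """+1 for samples of `category`, -1 for every other sample."""
    return [1 if lab == category else -1 for lab in labels]

def predict(weights, biases, categories, feature):
    """Score one D-dimensional feature against every category's (w, b) pair
    and return the category with the largest w . x + b."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, feature)) + b
        for w, b in zip(weights, biases)
    ]
    return categories[scores.index(max(scores))]
```

With this layout there is one binary problem per category, each trained on the same transposed features with its own +1/-1 labels, and the only transpose happens once, right before training.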
Category name | Accuracy | False positives with true label | False negatives with wrong predicted label
---|---|---|---
Kitchen | 0.410 | Store, Industrial | Store, LivingRoom
Store | 0.320 | TallBuilding, Bedroom | LivingRoom, Kitchen
Bedroom | 0.220 | LivingRoom, TallBuilding | LivingRoom, TallBuilding
LivingRoom | 0.290 | TallBuilding, Store | Bedroom, Office
Office | 0.580 | Bedroom, Kitchen | Kitchen, Kitchen
Industrial | 0.160 | Kitchen, InsideCity | Suburb, LivingRoom
Suburb | 0.690 | LivingRoom, Street | Store, TallBuilding
InsideCity | 0.320 | Suburb, Coast | Bedroom, Store
TallBuilding | 0.400 | Mountain, Street | Street, InsideCity
Street | 0.470 | Industrial, LivingRoom | InsideCity, Office
Highway | 0.610 | Mountain, Coast | TallBuilding, Coast
OpenCountry | 0.470 | Mountain, Coast | Suburb, Mountain
Coast | 0.480 | Mountain, OpenCountry | Bedroom, Highway
Mountain | 0.490 | Industrial, Highway | TallBuilding, Store
Forest | 0.790 | OpenCountry, Store | LivingRoom, Mountain
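As a quick sanity check on the table above, the per-category accuracies can be averaged. The short Python sketch below assumes every category has the same number of test images, in which case the mean of the per-category values equals the overall accuracy.

```python
# Per-category accuracies copied from the table above.
per_category_accuracy = {
    "Kitchen": 0.410, "Store": 0.320, "Bedroom": 0.220, "LivingRoom": 0.290,
    "Office": 0.580, "Industrial": 0.160, "Suburb": 0.690, "InsideCity": 0.320,
    "TallBuilding": 0.400, "Street": 0.470, "Highway": 0.610,
    "OpenCountry": 0.470, "Coast": 0.480, "Mountain": 0.490, "Forest": 0.790,
}

# Assumes equal test-set sizes per category, so a plain mean is appropriate.
mean_accuracy = sum(per_category_accuracy.values()) / len(per_category_accuracy)
hardest = min(per_category_accuracy, key=per_category_accuracy.get)
easiest = max(per_category_accuracy, key=per_category_accuracy.get)

print(f"mean accuracy: {mean_accuracy:.3f}")  # prints "mean accuracy: 0.447"
print(f"hardest: {hardest}, easiest: {easiest}")
```

The mean comes out to 44.7%, which matches the accuracy reported for the configuration with 'fast' enabled in every vl_dsift call. Industrial is the weakest category and Forest the strongest.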