The goal of this project is to introduce us to image recognition. Specifically, we examine the task of scene recognition, starting with very simple methods (tiny images and nearest neighbor classification) and then moving on to more advanced methods (bags of quantized local features and linear classifiers learned by support vector machines). The pipeline is run for the following methods:
Tiny images is one of the simplest possible image representations. The image is downsampled to 16x16 pixels and the resulting 256-dimensional vector is used as the feature vector. This is not a particularly good representation, because it discards all of the high-frequency image content and is not especially invariant to spatial or brightness shifts.
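The tiny image feature can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes the image is already grayscale and stored as a list of pixel rows, uses block averaging as the downsampling method, and adds an optional zero-mean, unit-length normalization (an assumption, commonly used to reduce sensitivity to brightness shifts). The name `tiny_image` and its parameters are hypothetical.

```python
def tiny_image(gray, size=16, normalize=True):
    """Downsample a grayscale image (list of pixel rows) to size x size and flatten it."""
    height, width = len(gray), len(gray[0])
    feature = []
    for r in range(size):
        # Pixel rows covered by output row r (always at least one row).
        r0 = r * height // size
        r1 = max(r0 + 1, (r + 1) * height // size)
        for c in range(size):
            c0 = c * width // size
            c1 = max(c0 + 1, (c + 1) * width // size)
            block = [gray[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            feature.append(sum(block) / len(block))  # block average
    if normalize:
        # Assumption: zero mean and unit length, so a constant brightness
        # offset does not change the feature.
        mean = sum(feature) / len(feature)
        feature = [v - mean for v in feature]
        norm = sum(v * v for v in feature) ** 0.5
        if norm > 0:
            feature = [v / norm for v in feature]
    return feature
```

With `size=16` the result always has 256 entries, regardless of the input resolution.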
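Nearest neighbor and k-nearest-neighbor (KNN) classification are implemented using a KD-tree (KDTREE), which reduces the average search time per query to roughly logarithmic in the number of training examples.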
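To illustrate how a KD-tree supports KNN classification, here is a small self-contained sketch. It is not the KDTREE routine used in the project; the function names (`build_kdtree`, `knn_search`, `knn_classify`) are hypothetical, distances are squared Euclidean, and the class is chosen by majority vote among the k neighbors.

```python
import heapq
from collections import Counter

def build_kdtree(points, labels, depth=0):
    """Recursively build a node: (point, label, split_axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    order = sorted(range(len(points)), key=lambda i: points[i][axis])
    mid = len(order) // 2
    left, right = order[:mid], order[mid + 1:]
    return (
        points[order[mid]], labels[order[mid]], axis,
        build_kdtree([points[i] for i in left], [labels[i] for i in left], depth + 1),
        build_kdtree([points[i] for i in right], [labels[i] for i in right], depth + 1),
    )

def knn_search(node, query, k, heap=None):
    """Collect the k nearest neighbors in a max-heap of (-squared_distance, label)."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    point, label, axis, left, right = node
    sq_dist = sum((p - q) ** 2 for p, q in zip(point, query))
    if len(heap) < k:
        heapq.heappush(heap, (-sq_dist, label))
    elif sq_dist < -heap[0][0]:
        heapq.heapreplace(heap, (-sq_dist, label))
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    knn_search(near, query, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th nearest neighbor (or fewer than k have been found).
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn_search(far, query, k, heap)
    return heap

def knn_classify(tree, query, k=1):
    """Majority vote over the labels of the k nearest training points."""
    votes = Counter(label for _, label in knn_search(tree, query, k))
    return votes.most_common(1)[0][0]
```

The pruning test in `knn_search` is what lets the tree skip most of the training set for a typical query; in very high-dimensional feature spaces this pruning becomes less effective.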
Bags of SIFT uses VL_SIFT to sample dense SIFT descriptors and builds a histogram by finding, for every SIFT feature, the nearest vocabulary centroid with a KD-tree (KDTREE). The bag of SIFT features is implemented with a vocabulary (cluster) size of 200.
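The histogram step can be sketched as follows. This is an illustration under stated assumptions rather than the project's implementation: descriptors and centroids are plain tuples of numbers, the nearest centroid is found by brute force instead of a KD-tree (to keep the example short), and the histogram is normalized to sum to 1, which the text above does not specify. The function names are hypothetical.

```python
def nearest_centroid(descriptor, vocabulary):
    """Index of the vocabulary centroid closest to the descriptor (squared Euclidean)."""
    def sq_dist(centroid):
        return sum((d - c) ** 2 for d, c in zip(descriptor, centroid))
    return min(range(len(vocabulary)), key=lambda i: sq_dist(vocabulary[i]))

def bag_of_words_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word, then normalize."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_centroid(descriptor, vocabulary)] += 1
    total = sum(counts)
    # Assumption: normalize so images with different numbers of
    # descriptors produce comparable histograms.
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)
```

With a vocabulary of 200 centroids, each image is represented by a 200-bin histogram regardless of how many SIFT descriptors were sampled from it.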
Linear SVM uses VL_SVM. The classifiers are trained with several different values of the regularization parameter lambda.
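A sweep over lambda can be sketched as below. This is not VL_SVM: the trainer is a simple Pegasos-style stochastic subgradient method swapped in so the example is self-contained, it handles a single binary problem with labels +1 and -1 and no bias term, and the function names, iteration count, and seed are all hypothetical. A multi-class pipeline would typically train one such classifier per category.

```python
import random

def train_linear_svm(X, y, lam, iterations=2000, seed=0):
    """Pegasos-style SGD on the hinge loss with L2 regularization weight lam."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, iterations + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink the weights (regularization step).
        w = [(1.0 - eta * lam) * wj for wj in w]
        if margin < 1.0:
            # The sample violates the margin: move towards classifying it correctly.
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def accuracy(w, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    correct = 0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi))
        correct += (1 if score >= 0 else -1) == yi
    return correct / len(X)

def sweep_lambda(train_X, train_y, val_X, val_y, lambdas):
    """Return (best_lambda, {lambda: validation accuracy})."""
    scores = {
        lam: accuracy(train_linear_svm(train_X, train_y, lam), val_X, val_y)
        for lam in lambdas
    }
    best = max(scores, key=scores.get)
    return best, scores
```

A typical call would be `sweep_lambda(train_X, train_y, val_X, val_y, [1e-5, 1e-4, 1e-3, 1e-2])`, keeping the lambda with the highest held-out accuracy.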
The highest accuracy achieved so far is 78%, with Fisher encoding (cluster size 200) and a non-linear SVM with an RBF kernel, lambda = 10^-4.
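For reference, the RBF (Gaussian) kernel that makes the SVM non-linear compares two feature vectors as exp(-gamma * ||x - y||^2). The sketch below computes it for plain Python sequences; the bandwidth parameter `gamma` and the function names are assumptions, since the value used in the project is not stated here.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian/RBF kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(A, B, gamma=1.0):
    """Kernel values between every row of A and every row of B."""
    return [[rbf_kernel(a, b, gamma) for b in B] for a in A]
```

A kernel SVM solver would be trained on `kernel_matrix(train, train)` and evaluated with `kernel_matrix(test, train)`; identical vectors always get a kernel value of 1.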
- Vocabulary Size Analysis
- Cross Validation Method
- Scale-spaced features
- Complementary features (GIST)
- Fisher Encoding
- Gaussian/RBF Kernel
- Spatial Pyramids
Analysis of different feature types across classification algorithms and cluster sizes. The code was also run for cluster size 400, but given the trade-off between run time and accuracy I chose not to display those results.
Accuracy (mean of diagonal of confusion matrix) is 0.261
Category name | Accuracy | False positives with true label | False negatives with wrong predicted label
---|---|---|---
Kitchen | 0.480 | Industrial, Forest | Highway, Forest
Store | 0.340 | Office, TallBuilding | TallBuilding, Coast
Bedroom | 0.300 | InsideCity, Store | OpenCountry, Mountain
LivingRoom | 0.250 | Office, Kitchen | Bedroom, Office
Office | 0.590 | Mountain, Bedroom | Industrial, InsideCity
Industrial | 0.330 | LivingRoom, TallBuilding | InsideCity, Store
Suburb | 0.570 | Coast, InsideCity | Kitchen, Street
InsideCity | 0.300 | Suburb, Coast | TallBuilding, Store
TallBuilding | 0.580 | Store, Forest | LivingRoom, Coast
Street | 0.500 | Kitchen, Store | Highway, Suburb
Highway | 0.750 | Suburb, Industrial | Coast, Mountain
OpenCountry | 0.310 | Forest, Kitchen | Suburb, Coast
Coast | 0.670 | Mountain, Mountain | InsideCity, OpenCountry
Mountain | 0.700 | Store, Bedroom | Kitchen, Forest
Forest | 0.640 | Mountain, Street | Suburb, Kitchen
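The overall accuracy (mean of the diagonal of the confusion matrix) and the per-category accuracies can be computed from true and predicted labels roughly as follows. This is a minimal sketch: it assumes labels are category-name strings and that each row of the confusion matrix is normalized by the number of test images in that category; the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, categories):
    """Row-normalized confusion matrix: rows are true categories, columns are predictions."""
    index = {name: i for i, name in enumerate(categories)}
    counts = [[0] * len(categories) for _ in categories]
    for true, predicted in zip(true_labels, predicted_labels):
        counts[index[true]][index[predicted]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Each row sums to 1, so the diagonal entry is that category's accuracy.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def mean_diagonal_accuracy(matrix):
    """Average of the per-category accuracies on the diagonal."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)
```

Because every category contributes equally to the mean, a category with few test images counts as much as one with many.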