This project implements scene recognition using tiny images, nearest neighbor classification, bags of quantized local features, and linear classifiers learned by support vector machines (SVMs). The table below compares the feature and classifier combinations and their accuracy.
Feature | Classifier | Accuracy |
---|---|---|
Tiny Image | Nearest Neighbor | 21.3% |
Bag of SIFT | Nearest Neighbor | 52.1% |
Bag of SIFT | SVM | 68.1% |
Bag of SIFT & GIST | Nearest Neighbor | 60.7% |
Bag of SIFT & GIST | SVM | 76.7% |
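
The simplest combination in the table, tiny images with nearest neighbor, can be sketched as follows. This is a minimal illustration and not the project's actual code: the function names `tiny_image` and `nearest_neighbor_predict`, the 16×16 target size, the zero-mean and unit-length normalization, and the Euclidean 1-nearest-neighbor rule are all assumptions, since the report does not state these details.

```python
import numpy as np

def tiny_image(img, size=16):
    """Shrink a 2-D grayscale image to size x size by block averaging,
    then flatten it and normalize to zero mean and unit length.
    (The size and the normalization are assumptions, not from the report.)"""
    h, w = img.shape
    bh, bw = h // size, w // size  # block height and width
    # Crop so the image divides evenly into size x size blocks.
    cropped = img[:bh * size, :bw * size].astype(np.float64)
    small = cropped.reshape(size, bh, size, bw).mean(axis=(1, 3))
    vec = small.ravel()
    vec = vec - vec.mean()
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def nearest_neighbor_predict(train_feats, train_labels, test_feats):
    """Label each test feature with the label of its closest training
    feature under squared Euclidean distance (1-NN)."""
    diff = test_feats[:, None, :] - train_feats[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)   # shape: (n_test, n_train)
    nearest = dist2.argmin(axis=1)    # index of closest training sample
    return [train_labels[i] for i in nearest]
```

A tiny image discards most high-frequency detail, which is consistent with this combination being the weakest row in the table.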
The per-category classification results are shown below. The overall accuracy (the mean of the diagonal of the confusion matrix) reaches 0.767.
Category name | Accuracy | False positives (true label) | False negatives (wrong predicted label) |
---|---|---|---|
Kitchen | 0.670 | Bedroom, LivingRoom | Store, Bedroom |
Store | 0.660 | InsideCity, InsideCity | TallBuilding, TallBuilding |
Bedroom | 0.650 | Kitchen, Kitchen | Kitchen, LivingRoom |
LivingRoom | 0.510 | Street, Bedroom | Bedroom, Bedroom |
Office | 0.960 | LivingRoom, TallBuilding | Kitchen, Kitchen |
Industrial | 0.680 | Store, OpenCountry | OpenCountry, Mountain |
Suburb | 0.980 | OpenCountry, Industrial | LivingRoom, Industrial |
InsideCity | 0.750 | Store, Street | Office, Store |
TallBuilding | 0.820 | Industrial, Store | Coast, InsideCity |
Street | 0.800 | Highway, InsideCity | Highway, Industrial |
Highway | 0.850 | Coast, Coast | Street, OpenCountry |
OpenCountry | 0.590 | Coast, Bedroom | Coast, Coast |
Coast | 0.820 | Mountain, OpenCountry | OpenCountry, OpenCountry |
Mountain | 0.870 | Forest, OpenCountry | OpenCountry, OpenCountry |
Forest | 0.900 | Highway, TallBuilding | Mountain, Mountain |
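
The overall accuracy can be checked against the per-category figures. The sketch below treats each per-category accuracy in the table above as one diagonal entry of a row-normalized confusion matrix, which is an assumption about how the matrix was built, and averages them. The helper name `mean_diagonal_accuracy` is hypothetical.

```python
# Per-category accuracies copied from the table above. Each value is
# assumed to be one diagonal entry of a row-normalized confusion matrix.
per_category_accuracy = {
    "Kitchen": 0.670, "Store": 0.660, "Bedroom": 0.650,
    "LivingRoom": 0.510, "Office": 0.960, "Industrial": 0.680,
    "Suburb": 0.980, "InsideCity": 0.750, "TallBuilding": 0.820,
    "Street": 0.800, "Highway": 0.850, "OpenCountry": 0.590,
    "Coast": 0.820, "Mountain": 0.870, "Forest": 0.900,
}

def mean_diagonal_accuracy(diagonal):
    """Average the diagonal entries of a confusion matrix."""
    values = list(diagonal)
    return sum(values) / len(values)

overall = mean_diagonal_accuracy(per_category_accuracy.values())
print(f"{overall:.3f}")  # prints 0.767
```

The mean of the 15 values matches the reported 0.767. Note that this equals the fraction of correctly classified test images only when every category contributes the same number of test images.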
The results show that the SVM outperforms nearest neighbor (NN): accuracy rises from 52.1% to 68.1% with bag of SIFT, and from 60.7% to 76.7% with bag of SIFT plus GIST, a gain of 16.0 percentage points in both cases. Adding GIST raises accuracy by 8.6 percentage points with both NN and SVM. There may still be room for improvement if the GIST and SIFT descriptors are combined in a smarter way.
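
The differences quoted above are in percentage points and follow directly from the comparison table. A short check, using only the reported accuracies (the `gain` helper is a hypothetical name introduced for illustration):

```python
# Accuracies (in percent) copied from the comparison table.
accuracy = {
    ("Bag of SIFT", "Nearest Neighbor"): 52.1,
    ("Bag of SIFT", "SVM"): 68.1,
    ("Bag of SIFT & GIST", "Nearest Neighbor"): 60.7,
    ("Bag of SIFT & GIST", "SVM"): 76.7,
}

def gain(after, before):
    """Accuracy difference in percentage points, rounded to one decimal."""
    return round(accuracy[after] - accuracy[before], 1)

# Effect of replacing nearest neighbor with an SVM, per feature set.
svm_gain_sift = gain(("Bag of SIFT", "SVM"),
                     ("Bag of SIFT", "Nearest Neighbor"))
svm_gain_both = gain(("Bag of SIFT & GIST", "SVM"),
                     ("Bag of SIFT & GIST", "Nearest Neighbor"))

# Effect of adding GIST to bag of SIFT, per classifier.
gist_gain_nn = gain(("Bag of SIFT & GIST", "Nearest Neighbor"),
                    ("Bag of SIFT", "Nearest Neighbor"))
gist_gain_svm = gain(("Bag of SIFT & GIST", "SVM"),
                     ("Bag of SIFT", "SVM"))

print(svm_gain_sift, svm_gain_both)  # prints 16.0 16.0
print(gist_gain_nn, gist_gain_svm)   # prints 8.6 8.6
```

In these results the two gains are additive: switching the classifier and adding GIST each contribute the same number of points regardless of the other choice.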