Project 4 / Scene Recognition with Bag of Words

In this project, two types of image representations (tiny images and bags of SIFT features) and two types of classifiers (nearest neighbor and SVM) are used to classify scenes.

Tiny Images

In this image representation, the original image is resized to 16x16 and normalized to have zero mean and unit length.
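The normalization step can be sketched as follows. This is a minimal illustration in plain Python, not the project's code: the function names are hypothetical, and the nearest-pixel downsampling is only a stand-in, since the report does not say which resize method was used.

```python
import math

def resize_nearest(image, size=16):
    """Resample a 2-D list of pixel values to size x size by nearest-pixel
    sampling (an assumed stand-in for the unspecified resize method)."""
    rows, cols = len(image), len(image[0])
    return [[image[r * rows // size][c * cols // size] for c in range(size)]
            for r in range(size)]

def tiny_image_feature(image, size=16):
    """Return a flattened size*size feature with zero mean and unit length."""
    small = resize_nearest(image, size)
    flat = [float(v) for row in small for v in row]
    mean = sum(flat) / len(flat)
    centered = [v - mean for v in flat]          # zero mean
    norm = math.sqrt(sum(v * v for v in centered))
    # A constant image has zero norm after centering; return it as all zeros.
    return centered if norm == 0 else [v / norm for v in centered]
```

Subtracting the mean before scaling makes the feature insensitive to overall brightness, and the unit-length scaling makes it insensitive to contrast.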

Bag of SIFTs

In this image representation, the original image is first smoothed using vl_imsmooth. The bin size is set to 8 and the magnification coefficient to 3. The step size is 10 when building the vocabulary and 8 when computing the actual bag of SIFT features, and the 'fast' option is used to reduce the running time to 300 seconds. To increase accuracy, instead of incrementing only the histogram entry with the minimum distance, the five histogram entries corresponding to the five smallest distances are each increased by one.
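The five-nearest histogram update described above can be sketched as follows. This is a plain-Python illustration under stated assumptions: the function names are hypothetical, squared Euclidean distance is assumed, and no histogram normalization is applied because the report does not mention one.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bag_of_words_histogram(descriptors, vocabulary, top_k=5):
    """Count, for every descriptor, its top_k nearest vocabulary entries.

    With top_k=1 this reduces to the usual hard assignment, where only the
    entry with the minimum distance is incremented.
    """
    hist = [0] * len(vocabulary)
    for d in descriptors:
        dists = [squared_distance(d, word) for word in vocabulary]
        # Indices of the top_k smallest distances.
        nearest = sorted(range(len(vocabulary)), key=dists.__getitem__)[:top_k]
        for idx in nearest:
            hist[idx] += 1
    return hist
```

Each descriptor contributes `top_k` counts in total, so a descriptor that lies between several vocabulary entries no longer has to commit to a single one.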

Nearest Neighbor

The k-nearest-neighbor classifier labels each test instance according to its k nearest neighbors in the training set. Here k is set to five.
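A minimal sketch of such a classifier is shown below. It assumes squared Euclidean distance and a majority vote among the neighbors; the report specifies neither, and `knn_predict` is a hypothetical name.

```python
from collections import Counter

def knn_predict(train_feats, train_labels, test_feat, k=5):
    """Predict a label by majority vote among the k nearest training features."""
    dists = [sum((a - b) ** 2 for a, b in zip(feat, test_feat))
             for feat in train_feats]
    # Indices of the k closest training examples.
    nearest = sorted(range(len(train_feats)), key=dists.__getitem__)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

For example, with k = 5 a test feature whose five closest training examples are labeled Kitchen, Kitchen, Kitchen, Bedroom, Store would be classified as Kitchen.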

SVM

The SVM used is a linear SVM with a lambda of 0.001. In theory, increasing the regularization increases the training error but can decrease the test error by reducing overfitting, while too much regularization hurts both. Choosing a reasonable value of lambda balances this trade-off.
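The role of lambda can be illustrated with a common form of the regularized linear SVM objective, a regularization term plus the average hinge loss. Whether the solver used in this project minimizes exactly this expression is an assumption, and `svm_objective` is a hypothetical helper written for illustration only.

```python
def svm_objective(w, b, X, y, lam=0.001):
    """Return lam/2 * ||w||^2 plus the mean hinge loss.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    """
    regularizer = 0.5 * lam * sum(wi * wi for wi in w)
    hinge = 0.0
    for xi, yi in zip(X, y):
        score = sum(wi * v for wi, v in zip(w, xi)) + b
        hinge += max(0.0, 1.0 - yi * score)   # zero once the margin is >= 1
    return regularizer + hinge / len(X)
```

A larger `lam` penalizes large weights more heavily, which trades some fit on the training data for a simpler decision boundary.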

Scene classification results visualization


With a random classifier, the accuracy (mean of the diagonal of the confusion matrix) is 0.072.

With tiny images and the nearest neighbor classifier, the accuracy (mean of the diagonal of the confusion matrix) is 0.215.

With bag of SIFTs and the nearest neighbor classifier, the accuracy (mean of the diagonal of the confusion matrix) is 0.499.

With bag of SIFTs and the SVM classifier, the accuracy (mean of the diagonal of the confusion matrix) is 0.604.
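The accuracy measure used above can be sketched as follows. The sketch assumes the confusion matrix holds raw counts with one row per true category, so each row is normalized before the diagonal is averaged; `mean_diagonal_accuracy` is a hypothetical name.

```python
def mean_diagonal_accuracy(confusion_counts):
    """Average the per-category accuracies (diagonal of the row-normalized matrix)."""
    rates = []
    for i, row in enumerate(confusion_counts):
        total = sum(row)
        rates.append(row[i] / total if total else 0.0)
    return sum(rates) / len(rates)

# Per-category accuracies listed in the table in this report.
per_category = [0.430, 0.350, 0.420, 0.260, 0.650, 0.290, 0.830, 0.590,
                0.800, 0.750, 0.710, 0.480, 0.800, 0.750, 0.950]
overall = round(sum(per_category) / len(per_category), 3)
```

Here `overall` evaluates to 0.604, which matches the figure reported for bag of SIFTs with the SVM classifier.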

[Image columns: Sample training images, Sample true positives]

Category name   Accuracy   False positives with true label   False negatives with wrong predicted label
Kitchen         0.430      Bedroom, LivingRoom               Store, InsideCity
Store           0.350      Kitchen, LivingRoom               LivingRoom, Office
Bedroom         0.420      LivingRoom, Store                 Kitchen, Suburb
LivingRoom      0.260      Bedroom, Kitchen                  Bedroom, Kitchen
Office          0.650      Kitchen, Bedroom                  Kitchen, LivingRoom
Industrial      0.290      LivingRoom, Street                LivingRoom, Office
Suburb          0.830      InsideCity, Mountain              Bedroom, Forest
InsideCity      0.590      Industrial, Street                TallBuilding, Bedroom
TallBuilding    0.800      Industrial, Kitchen               Forest, Forest
Street          0.750      Mountain, LivingRoom              Highway, Industrial
Highway         0.710      Mountain, Suburb                  Coast, Coast
OpenCountry     0.480      TallBuilding, Street              Coast, Coast
Coast           0.800      Highway, OpenCountry              Highway, OpenCountry
Mountain        0.750      Forest, OpenCountry               OpenCountry, Highway
Forest          0.950      Bedroom, TallBuilding             OpenCountry, Street