Brief

  • Due: Sunday March 3 11:55pm
  • Starter code: starter code
  • Submit to Gradescope
  • This portion (HW2) counts for 12% of your total grade

In this homework, we will learn different ways of visualizing and using data gradients, including saliency maps, fooling images, class visualizations, and style transfer. This homework is divided into two parts:

  • Understand network visualization and implement saliency maps, fooling images, class visualizations
  • Understand and implement style transfer

Note that this homework is adapted from Assignment 3 of the Stanford CS231n course.


Setup

Assuming you already have the homework 1 dependencies installed, here is some preparatory work you need to do. First, install the future package:

pip install future

Then download the imagenet_val_25 dataset:

cd cs7643/datasets
bash get_imagenet_val.sh

We will use PyTorch (v1.0) to complete the problems in this homework; the code has been tested with Python 2.7 on Linux and Mac.

Throughout this homework, we will use SqueezeNet, which should allow you to easily run all the experiments on a CPU. If GPU resources are not a problem for you, you are encouraged to use a larger model for the experiments, but please highlight the backbone network you use in your implementation if you do so.

Switching the backbone network is quite easy in PyTorch. You can refer to the torchvision model zoo for more information.
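
For example, loading a pretrained backbone from the torchvision model zoo takes a single call. The snippet below is only a sketch (SqueezeNet is the default for this homework; the ResNet-18 line is a hypothetical alternative):

import torchvision.models as models

# Default backbone for this homework: small enough to run on a CPU.
model = models.squeezenet1_1(pretrained=True)

# A larger backbone (e.g. ResNet-18) is a one-line swap, but remember to
# highlight the backbone you used in your write-up.
# model = models.resnet18(pretrained=True)

# We only need gradients with respect to the input image, so freeze the weights.
for param in model.parameters():
    param.requires_grad = False
model.eval()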


Part 1

Open notebook NetworkVisualization-Pytorch.ipynb. We will explore the use of image gradients for generating new images, by studying and implementing key components in three papers:

  1. Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR Workshop 2014.
  2. Szegedy et al, “Intriguing properties of neural networks”, ICLR 2014
  3. Yosinski et al, “Understanding Neural Networks Through Deep Visualization”, ICML 2015 Deep Learning Workshop

You will need to read these papers first, and then we will guide you through some problems to deepen your understanding of them.

Q1.1: Saliency Maps (10 points)

You need to implement the compute_saliency_maps function, referring to Section 3 of the first paper, which describes a method for understanding which parts of an image are important for classification by visualizing the gradient of the correct class score with respect to the input image. First compute the loss over the correct-class scores, and then compute the gradients with a backward pass.
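
The sketch below illustrates the idea only; the tensor names, the use of gather, and the exact interface are assumptions, not the starter-code signature:

import torch

def saliency_map_sketch(X, y, model):
    # X: input images of shape (N, 3, H, W); y: correct labels of shape (N,).
    model.eval()
    X = X.clone().detach().requires_grad_(True)

    scores = model(X)                                           # (N, num_classes)
    correct_scores = scores.gather(1, y.view(-1, 1)).squeeze()  # (N,)

    # Backward pass from the (summed) correct-class scores to the input pixels.
    correct_scores.sum().backward()

    # Saliency: absolute gradient, maxed over the color channels -> (N, H, W).
    saliency, _ = X.grad.abs().max(dim=1)
    return saliency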

Q1.2: Generating Fooling Images (10 points)

Several papers have suggested ways to perform optimization over the input image to construct images that break a trained ConvNet. Given a trained ConvNet, an input image, and a desired label, we can add a small amount of noise to the input image to force the ConvNet to classify it as having the desired label. You need to generate a fooling image in make_fool_image, referring to the second paper. You should perform gradient ascent on the score of the target class, stopping when the model is fooled.
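
A minimal sketch of that loop, assuming a single input image of shape (1, 3, H, W) and a hypothetical step size (this is not the starter-code interface for make_fool_image):

import torch

def fooling_image_sketch(X, target_y, model, lr=1.0, max_iter=100):
    # Gradient ascent on the target-class score, starting from the input image.
    X_fooling = X.clone().detach().requires_grad_(True)

    for _ in range(max_iter):
        scores = model(X_fooling)
        if scores.argmax(dim=1).item() == target_y:
            break                           # the model is fooled, stop early

        scores[0, target_y].backward()      # gradient of the target-class score

        with torch.no_grad():
            g = X_fooling.grad
            X_fooling += lr * g / g.norm()  # normalized ascent step on the pixels
            X_fooling.grad.zero_()

    return X_fooling.detach()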

Q1.3 Class Visualization (10 points) [Extra Credit for CS4803DL]

You need to implement the create_class_visualization function, referring to the third paper. By starting with a random noise image and performing gradient ascent on a target class, we can generate an image that the network will recognize as the target class.
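
A rough sketch of the core loop, with hypothetical hyperparameters and without the periodic jittering/blurring that the notebook also asks for:

import torch

def class_visualization_sketch(target_y, model, img_size=224,
                               l2_reg=1e-3, lr=25, num_iter=100):
    # Start from small random noise and ascend the (L2-regularized) class score.
    img = (0.1 * torch.randn(1, 3, img_size, img_size)).requires_grad_(True)

    for _ in range(num_iter):
        score = model(img)[0, target_y]
        objective = score - l2_reg * (img ** 2).sum()  # maximize this
        objective.backward()

        with torch.no_grad():
            img += lr * img.grad / img.grad.norm()
            img.grad.zero_()

    return img.detach()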

Notes

  • To receive full credit for Part 1, your generated outputs should be similar to those in the referenced papers.

Part 2

Another application of image gradients is style transfer, which has recently become quite popular. In this notebook, we will study and implement the style transfer technique from "Image Style Transfer Using Convolutional Neural Networks" (Gatys et al., CVPR 2016).

The general idea is to take two images (a content image and a style image), and produce a new image that reflects the content of one but the artistic style of the other. We will do this by first formulating a loss function that matches the content and style of each respective image in the feature space of a deep network, and then performing gradient descent on the pixels of the image itself.

Open notebook StyleTransfer-Pytorch.ipynb. Implement the loss functions for this task and the training update code.

Q2.1 Implement Content Loss (3 points)

Content loss measures how much the feature map of the generated image differs from the feature map of the source image. Implement the content_loss function and pass the content_loss_test.
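
In essence it is a weighted sum of squared differences between two feature maps; roughly (the argument names follow the notebook's description but may not match the starter-code signature exactly):

def content_loss_sketch(content_weight, content_current, content_original):
    # Both feature maps have shape (1, C, H, W).
    return content_weight * ((content_current - content_original) ** 2).sum()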

Q2.2 Implement Style Loss (6 points)

First, compute the Gram matrix, which represents the correlations between the responses of each filter, by implementing the gram_matrix function and passing gram_matrix_test. Then implement the style_loss function and pass style_loss_test. Each of these functions is worth 3 points.
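
As a sketch (the interfaces below are assumptions), the Gram matrix of an (N, C, H, W) feature map and the resulting style loss look roughly like:

import torch

def gram_matrix_sketch(features, normalize=True):
    # Correlations between filter responses: (N, C, H, W) -> (N, C, C).
    N, C, H, W = features.shape
    F = features.view(N, C, H * W)
    gram = torch.bmm(F, F.transpose(1, 2))
    if normalize:
        gram = gram / (C * H * W)
    return gram

def style_loss_sketch(feats, style_layers, style_targets, style_weights):
    # Weighted squared Gram-matrix differences, summed over the chosen layers.
    loss = 0.0
    for i, layer in enumerate(style_layers):
        G = gram_matrix_sketch(feats[layer])
        loss = loss + style_weights[i] * ((G - style_targets[i]) ** 2).sum()
    return loss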

Q2.3 Implement Total Variation Loss (3 points)

Implement the total variation regularization loss in tv_loss, which is the sum of the squares of differences in the pixel values for all pairs of pixels that are next to each other (horizontally or vertically). You need to both pass tv_loss_test and provide an efficient vectorized implementation to receive full credit.
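
A vectorized version only needs two shifted slices of the image; for example (a sketch, not necessarily matching the expected tv_loss signature):

def tv_loss_sketch(img, tv_weight):
    # img has shape (1, 3, H, W); compare each pixel with its right and lower neighbor.
    h_diff = img[:, :, :, 1:] - img[:, :, :, :-1]  # horizontal neighbors
    v_diff = img[:, :, 1:, :] - img[:, :, :-1, :]  # vertical neighbors
    return tv_weight * ((h_diff ** 2).sum() + (v_diff ** 2).sum())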

Q2.4 Finish Style Transfer (6 points)

Read the style_transfer function and figure out what all the parameters, inputs, solvers, etc. are. The update rule in the following block is left for you to finish: forward the generated image through the criterion (loss) functions, then perform the backward pass and the update.
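
Conceptually, each iteration is an ordinary PyTorch optimization step over the image pixels. The sketch below assumes the loss terms are wrapped as closures that already capture their targets and weights, which simplifies the notebook's actual interface:

import torch

def style_transfer_step_sketch(img, optimizer, extract_features,
                               content_loss_fn, style_loss_fn, tv_loss_fn):
    # img has requires_grad=True; optimizer is e.g. torch.optim.Adam([img], lr=...).
    optimizer.zero_grad()

    feats = extract_features(img)            # feature maps of the current image
    loss = content_loss_fn(feats) + style_loss_fn(feats) + tv_loss_fn(img)

    loss.backward()                           # gradients with respect to the pixels
    optimizer.step()                          # update the image in place
    return loss.item()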

To receive full credit, you need to generate output images similar to the given examples in the following block.

Q2.5 Feature Inversion (2 points)

If you have implemented things correctly, what you have done enables another cool application. In an attempt to understand the types of features that convolutional networks learn to recognize, the paper "Understanding Deep Image Representations by Inverting Them" (Mahendran and Vedaldi, CVPR 2015) attempts to reconstruct an image from its feature representation. We can easily implement this idea using image gradients from the pretrained network, which is exactly what we did above (but with two different feature representations).

Notes

  • Q2.1 ~ Q2.3: For each loss function in Part 2, you must pass the unit test to receive full credit; otherwise the score for that function will be 0.
  • Q2.4: For the final output, you are expected to generate images similar to the example shown in StyleTransfer-Pytorch.ipynb to receive full credit.
  • Q2.5: Just run it and generate the outputs. If your previous implementations are correct, you will receive full credit.

Summary Deliverables

Code Submission

Submit your results by uploading a zip file called hw2.zip, created with the following command:

cd assignment/
./collect_submission.sh

As a sanity check, the zip file should contain the following components:

  1. All the IPython notebook files. (1 notebook file for each part)
  2. For Part 1, include everything under cs7643/ except for the datasets and build folders.
  3. For Part 2, include everything under styles/.
  4. All the .py files in the starter code should be included in your submission.

Write-Up Submission

Step 1: Convert all IPython notebooks to PDF files with the following command

  jupyter-nbconvert --to pdf filename.ipynb

You should have 2 PDF files in total. Please make sure you have saved the most recent version of your Jupyter notebooks before running this command.

Step 2: Combine all PDF files with your write-up

Please assign pages accordingly for the write-up submission. Failing to do so will result in a penalty.

Notes

  1. You should only upload ONE PDF file to the HW2 Writeup section, and then assign the pages properly as you did for PS2.
  2. You should upload hw2.zip, which includes no PDF file, to the HW2 Code section.

References:

  1. CS231n Convolutional Neural Networks for Visual Recognition