- Due: Sunday March 3 11:55pm
- Starter code: starter code
- Submit to Gradescope
- This portion (HW2) counts for 12% of your total grade
In this homework, we will learn different ways of visualizing and using data gradients, including saliency maps, fooling images, class visualizations, and style transfer. This homework is divided into two parts:
- Understand network visualization and implement saliency maps, fooling images, class visualizations
- Understand and implement style transfer
Note that this homework is adapted from Assignment 3 of the Stanford CS231n course.
Assuming you already have the homework 1 dependencies installed, here is some prep work you need to do. First, install the `future` package:

```
pip install future
```

Then download the ImageNet validation images:

```
cd cs7643/datasets
bash get_imagenet_val.sh
```
We will use PyTorch (v1.0) to complete the problems in this homework; it has been tested with Python 2.7 on Linux and Mac.
Throughout this homework we will use SqueezeNet, which should let you perform all the experiments easily on a CPU. If GPU resources are not a problem for you, you are encouraged to use a larger model for the remaining experiments, but please highlight the backbone network you use in your implementation if you do.
Switching the backbone network is quite easy in PyTorch; you can refer to the torchvision model zoo for more information.
- Iandola et al, “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5MB model size”, arXiv 2016
NetworkVisualization-Pytorch.ipynb. We will explore the use of image gradients for generating new images, by studying and implementing key components in three papers:
- Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. “Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps”, ICLR Workshop 2014.
- Szegedy et al, “Intriguing properties of neural networks”, ICLR 2014
- Yosinski et al, “Understanding Neural Networks Through Deep Visualization”, ICML 2015 Deep Learning Workshop
You will need to read these papers first; then we will guide you through a set of problems to understand them more deeply.
Q1.1: Saliency Maps (10 points)
You need to implement the `compute_saliency_maps` function, referring to Section 3 of the first paper, which describes a method for understanding which parts of an image are important for classification by visualizing the gradient of the correct class score with respect to the input image. You first compute the loss over the correct class scores, and then compute the gradients with a backward pass.
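The computation can be sketched as follows (a minimal sketch, not the official solution; the argument shapes are assumptions, with `X` a batch of images `(N, 3, H, W)` and `y` the correct class indices `(N,)`):

```python
import torch

def compute_saliency_maps(X, y, model):
    """Sketch of saliency maps: gradient of the correct class score
    with respect to the input pixels, one backward pass per batch."""
    model.eval()
    X = X.clone().detach().requires_grad_(True)  # track gradients w.r.t. pixels

    scores = model(X)                                     # (N, num_classes)
    correct = scores.gather(1, y.view(-1, 1)).squeeze(1)  # true-class scores
    correct.sum().backward()                              # one backward pass

    # Saliency: absolute gradient, max over the 3 color channels -> (N, H, W)
    saliency, _ = X.grad.abs().max(dim=1)
    return saliency
```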
Q1.2: Generating Fooling Images (10 points)
Several papers have suggested performing optimization over the input image to construct images that break a trained ConvNet. Given a trained ConvNet, an input image, and a desired label, we can add a small amount of noise to the input image to force the ConvNet to classify it as the desired label. You need to generate a fooling image in `make_fool_image`, referring to the second paper. You should perform gradient ascent on the score of the target class, stopping when the model is fooled.
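In outline, the gradient-ascent loop might look like this (a hedged sketch; the signature, learning rate, and normalized-gradient step are illustrative assumptions, not the required interface):

```python
import torch

def make_fool_image(X, target_y, model, lr=1.0, max_iter=100):
    """Sketch: gradient ascent on the target class score until the
    model classifies the perturbed image as target_y."""
    model.eval()
    X_fool = X.clone().detach().requires_grad_(True)

    for _ in range(max_iter):
        scores = model(X_fool)
        if scores.argmax(dim=1).item() == target_y:
            break  # the model is fooled; stop early
        scores[0, target_y].backward()
        with torch.no_grad():
            g = X_fool.grad
            X_fool += lr * g / g.norm()  # normalized *ascent* step
            X_fool.grad.zero_()
    return X_fool.detach()
```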
Q1.3 Class Visualization (10 points) [Extra Credit for CS4803DL]
You need to implement the `create_class_visualization` function, referring to the third paper. By starting from a random noise image and performing gradient ascent on a target class score, we can generate an image that the network recognizes as the target class.
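The core loop can be sketched as below (illustrative only: the L2 regularization term, hyperparameter values, and signature are assumptions; the full assignment also adds periodic jitter and blurring, which are omitted here):

```python
import torch

def create_class_visualization(target_y, model, img_size=224,
                               lr=25.0, l2_reg=1e-3, num_iters=100):
    """Sketch: start from small random noise and ascend the
    L2-regularized target class score."""
    model.eval()
    img = torch.randn(1, 3, img_size, img_size).mul_(0.01).requires_grad_(True)

    for _ in range(num_iters):
        # Maximize (score of target class) - (L2 penalty on the image).
        score = model(img)[0, target_y] - l2_reg * img.pow(2).sum()
        score.backward()
        with torch.no_grad():
            img += lr * img.grad / img.grad.norm()
            img.grad.zero_()
    return img.detach()
```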
- To receive full credit for Part 1, you need to generate outputs similar to those in the referenced papers.
Another application of image gradients is style transfer, which has recently become quite popular. In this notebook, we will study and implement the style transfer technique from:
The general idea is to take two images (a content image and a style image), and produce a new image that reflects the content of one but the artistic style of the other. We will do this by first formulating a loss function that matches the content and style of each respective image in the feature space of a deep network, and then performing gradient descent on the pixels of the image itself.
StyleTransfer-Pytorch.ipynb. Implement the loss functions for this task and the training update code.
Q2.1 Implement Content Loss (3 points)
Content loss measures how much the feature map of the generated image differs from the feature map of the source image. Implement the `content_loss` function and pass the corresponding unit test.
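One common formulation, sketched here under assumed argument names (a scalar weight and two feature maps of shape `(1, C, H, W)`), is the weighted sum of squared feature differences:

```python
import torch

def content_loss(content_weight, content_current, content_original):
    """Sketch: weighted squared error between the generated image's
    feature map and the source image's feature map."""
    return content_weight * (content_current - content_original).pow(2).sum()
```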
Q2.2 Implement Style Loss (6 points)
First, compute the Gram matrix, which represents the correlations between the responses of each filter, by implementing the `gram_matrix` function and passing `gram_matrix_test`. Then implement the `style_loss` function and pass `style_loss_test`. Each function is worth 3 points.
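A sketch of both pieces follows; the argument layout of `style_loss` (a list of feature maps, layer indices, precomputed style Gram targets, and per-layer weights) and the `H * W * C` normalization are assumptions about the starter-code conventions:

```python
import torch

def gram_matrix(features, normalize=True):
    """Sketch: Gram matrix of an (N, C, H, W) feature map -> (N, C, C)."""
    N, C, H, W = features.shape
    f = features.view(N, C, H * W)
    gram = torch.bmm(f, f.transpose(1, 2))  # channel-channel correlations
    if normalize:
        gram = gram / (H * W * C)
    return gram

def style_loss(feats, style_layers, style_targets, style_weights):
    """Sketch: weighted squared error between Gram matrices of the
    current features and the style image's precomputed Gram targets."""
    loss = 0.0
    for i, layer in enumerate(style_layers):
        g = gram_matrix(feats[layer])
        loss = loss + style_weights[i] * (g - style_targets[i]).pow(2).sum()
    return loss
```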
Q2.3 Implement Total Variation Loss (3 points)
Implement the total variation regularization loss in `tv_loss`: the sum of squared differences in pixel values for all pairs of adjacent pixels (horizontally or vertically). You need to both pass `tv_loss_test` and provide an efficient vectorized implementation to receive full credit.
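A vectorized version needs no loops at all; shifted slices line up each pixel with its right and lower neighbor (the signature below, taking an image of shape `(1, 3, H, W)` and a scalar weight, is an assumption):

```python
import torch

def tv_loss(img, tv_weight):
    """Sketch: vectorized total-variation loss via shifted slices."""
    # Squared differences between horizontal neighbors (along width) ...
    h_var = (img[:, :, :, 1:] - img[:, :, :, :-1]).pow(2).sum()
    # ... and between vertical neighbors (along height).
    v_var = (img[:, :, 1:, :] - img[:, :, :-1, :]).pow(2).sum()
    return tv_weight * (h_var + v_var)
```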
Q2.4 Finish Style Transfer (6 points)
Complete the `style_transfer` function and work out what all the parameters, inputs, solvers, etc. are. The update rule in the following block is left for you to finish: implement it by forwarding the image through the criterion (loss) functions and performing the backward pass and update.
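The key idea is that the image itself is the parameter being optimized. One iteration of the loop might look like this (a sketch; `loss_fn` stands in for the combined content + style + TV criterion, and the names are illustrative):

```python
import torch

def style_transfer_step(img, optimizer, loss_fn):
    """Sketch of one update: forward through the combined criterion,
    backward to get pixel gradients, then step the optimizer."""
    optimizer.zero_grad()
    loss = loss_fn(img)   # forward: evaluate content + style + TV losses
    loss.backward()       # backward: gradients w.r.t. the image pixels
    optimizer.step()      # update the pixels themselves
    return loss.item()
```

In practice the optimizer is constructed over the image, e.g. `torch.optim.Adam([img], lr=...)` with `img.requires_grad_(True)`.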
You need to generate output pictures similar to the given examples in the following block to receive full credit.
Q2.5 Feature Inversion (2 points)
If you have implemented everything correctly, your code can do one more cool thing. In an attempt to understand the types of features that convolutional networks learn to recognize, the following paper reconstructs an image from its feature representation. We can easily implement this idea using image gradients from the pretrained network, which is exactly what we did above (but with two different feature representations).
- Aravindh Mahendran, Andrea Vedaldi, “Understanding Deep Image Representations by Inverting them”, CVPR 2015
- Q2.1 ~ Q2.3: For each of the loss functions in Part 2, you need to pass the unit test to receive full credit; otherwise the score is 0.
- Q2.4: For the final output, you are expected to generate images similar to the example shown in StyleTransfer-Pytorch.ipynb to receive full credit.
- Q2.5: Just run it and generate the outputs. If your previous implementation is correct, you will receive full credit.
Submit the results by uploading a zip file called hw2.zip, created with the following commands:

```
cd assignment/
./collect_submission.sh
```
For sanity check, the zip file should contain the following components:
- All the IPython notebook files. (1 notebook file for each part)
- For Part 1, include everything under cs7643/ except for the datasets and build folders
- For Part 2, include everything under
- All the .py files in the starter code should be included in your submission.
Step 1: Convert all IPython notebooks to PDF files with the following command:

```
jupyter-nbconvert --to pdf filename.ipynb
```
You should have two PDF files in total. Please make sure you have saved the most recent version of your Jupyter notebooks before running this command.
Step 2: Combine all PDF files with your write-up.
Please assign pages accordingly for the write-up submission. Failing to do so will result in a penalty.
- You should only upload ONE PDF file to the HW2 Writeup section, and then assign the pages properly as you did for PS2.
- You should upload hw2.zip, which includes no PDF files, to the HW2 Code section.