Deep Learning for Perception
Georgia Tech, Spring 2015
Projects
1 Description
The purpose of the project is to demonstrate a more in-depth understanding of the material by applying it to a real-world dataset. If you are conducting research, or hope to, this is your chance to take your favorite problem (whether it's in the fields we have covered or something completely different) and write a publication-worthy report. In fact, the final project report will be structured very similarly to a publication. The project consists of three main stages, listed in the next section.
The projects will be conducted by teams of at least 3. If you have found a partner, please let me know via email (title "8803DL Teaming"). Otherwise, I will assign teams using Catme, which forms teams based on a questionnaire and a combination of factors such as schedule, interests, etc.
2 Parts and Dates (Tentative)
- Proposal
  - Due February 9th, 10AM (moved from February 6th)
  - Accounts for 5% of the entire class grade
  - This is where you present your idea in the form of a problem statement, describe what has already been done in the field, and design an approach. We will guide you to make sure it is well-scoped, neither too large nor too small.
- Mid-Term Progress
  - Due March 23rd, 10AM (moved from March 6th)
  - Accounts for 5% of the entire class grade
  - This is to ensure that you have made some progress on the project, and will involve a description of the final/refined project proposal as well as any progress that has been made.
- Final Presentation (Due April 20th)
  - This will involve a description of your implementation, an analysis of your final results, and any lessons or problems you encountered.
3 What is Required
Project Proposal:
- A 2-page report detailing the following:
  - Problem statement/motivation: What problem are you trying to solve? Why is this hard right now?
  - Related Work: What has been done in similar fields/problems? What are the limitations of current approaches?
  - Approach and techniques: What is your proposed approach to solving the problem? How does it compare to existing approaches?
    - Note: It's OK for these projects to be similar to existing approaches, although if you want to publish the results they will have to have some novelty.
  - Data set: What data are you planning to apply your approach to?
    - Note: I would highly recommend choosing a public dataset, of which there are plenty these days! See for example these NLP/text datasets, and these image datasets. It's OK (and typically recommended) to take a subset to reduce training time, especially during development!
  - Experimental Methodology: What specific experiments are you planning to conduct? How do they test the specific problem you want to solve?
- A 5-minute presentation highlighting the above.
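Regarding the note above about subsetting: a quick way to shrink a dataset while keeping every class represented is to sample a fixed number of examples per class. A minimal sketch in Python/NumPy (the function name and the synthetic arrays are illustrative, not part of any required starter code):

```python
import numpy as np

def take_subset(X, y, n_per_class, seed=0):
    """Keep at most n_per_class examples of each class to cut training time."""
    rng = np.random.RandomState(seed)
    keep = []
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        keep.extend(rng.choice(idx, size=min(n_per_class, len(idx)), replace=False))
    keep = np.array(keep)
    rng.shuffle(keep)  # avoid class-sorted order
    return X[keep], y[keep]

# Synthetic stand-in for a real dataset (e.g. a slice of MNIST):
X = np.random.rand(1000, 784)
y = np.repeat(np.arange(10), 100)
X_small, y_small = take_subset(X, y, n_per_class=20)
```

Developing against the small subset first, then scaling up once the pipeline works, usually saves far more time than it costs.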
Midterm Progress
- Submit in PDF form to t-square: a 3-page report detailing the following:
  - Problem statement/motivation:
    - What problem are you trying to solve? Why is this hard right now?
    - This can be taken from the previous proposal if nothing has changed, but please update it if things have changed!
  - Approach and techniques:
    - What is your proposed approach to solving the problem? How does it compare to existing approaches?
    - This can be taken from the previous proposal if nothing has changed, but please update it if things have changed!
  - Tasks to complete the planned approach:
    - This should be a set of tasks that have to be completed, anywhere from ingesting data, creating training/testing/validation sets, and implementing the algorithm (and the steps required for that), to running the experiments (and a list of the experiments), etc.
    - Include the dataset and the experiments you will conduct (again, these can come from the proposal if nothing has changed).
    - For each task, list the group members who will be responsible for it.
    - Important note: We recommend having several goals, starting with an easy/medium one that you know you will be able to achieve, and then a more ambitious one that requires more tasks. Make sure that the easy/medium goals are achievable to ensure you have something to show at the end.
  - What has been completed and what's left to do:
    - A description of which of the above tasks have been completed so far.
    - A description of which of the above tasks remain, and a plan (timeline) for accomplishing them.
- Submit in PPT/PDF form to t-square: an 8-minute presentation highlighting the above.
- You will be asked to fill out a review of your team members during class.
Final Poster Presentation/Report
- Submit a report (minimum of 6 pages, in a conference paper format) containing the following:
  - Problem statement/motivation (see above)
  - Approach and techniques (see above)
    - Make sure to describe how/why you chose the model you did, how you designed the networks/architectures, how you tuned the hyper-parameters, and how you determined details (filter sizes, etc.).
    - In more detail, the in-depth analysis can include:
      - Architecture (new): If you are designing a new architecture or a major variation of an existing one, describe your design choices and why they were made (e.g. your intuition for them). This includes: Why did you choose the specific model class you did (CNN, RNN, SAE, HTM, etc.)? Why is it appropriate to your problem? What kind of information are you trying to leverage (spatial, temporal, sequential, etc.)? Why did you choose the number of layers you did? Why did you choose the specifics of your network (for example, filter sizes and counts for CNNs, or the number of LSTM units for RNNs)? Why did you choose the particular hyperparameters (learning rate, regularization, etc.)?
      - Architecture (existing): If you are leveraging an existing network from related work, you should still describe its design choices and why you think the authors made them. In addition, try either variations of the architecture or an exploration/analysis of its behavior/hyperparameters: for example, verifying the authors' claims about design choices, or visualizing what the network has learned. Perform an in-depth analysis of the network across some aspect you find interesting.
      - Algorithms/Optimization: If you implemented algorithms (new layers, optimization methods, etc.), then of course you can choose some set of architectures/problems and focus on the algorithms. But still explain why you chose the particular optimization methods, and perform an in-depth analysis of the results (under what situations do different techniques work better than others, when do they fail, etc.).
    - For at least a subset of the design choices, provide some data/justification for the choices you had to make (e.g. did the network overfit? Show that in the learning curve). If you made a design choice from the beginning, show data that supports or refutes your initial intuition (it's OK to be wrong!). If you made a design choice in response to something you saw after your initial training/testing, show what that was.
  - Related Work (see above)
  - Experimental Design (see above)
  - Results and analysis
    - NOTE: You can update your results until April 26th, 11:55pm via a separate document. This should include only the results, which can be part of existing or new experiments, and a small writeup on how they change some of the conclusions/discussion in the original.
    - Analyze the results in terms of training/convergence, cross-validation results, and final accuracies. Is the model overfitting? Is there not enough capacity? Would more regularization be useful? Does it seem to be modeling the aspects of the problem that you expected? Provide data and visualizations to support your statements.
    - How do the results compare to other approaches (either ones you implemented, or published work on the same/similar datasets if available)?
  - Discussion, Conclusions, and Future Work
    - What do the results show at a higher level?
    - What conclusions can you draw? Analyze your initial claims, whether they have been proven/disproven, and why.
    - What future work could be done if you continued this project?
- Submit in PDF form, and print out, a 36x24" poster containing the sections above.
  - On Monday, we will have two 25-minute sessions (7 projects each) where the groups of 7 will stand by their posters and the other groups will walk around to view them, ask questions, etc.
  - Students will review and rate each other's posters, and the top 3 will be asked to present on Wednesday, where a more detailed presentation can be made and we can have a good discussion. Note that accepting the invitation to present is optional for the teams, but it is an honor to be chosen!
- Submit a full archive of your code (.tgz) containing instructions for installation (dependencies, etc.) as well as full copies of major dependent libraries.
  - The groups will be asked to demo the code to the TA during office hours.
  - Document what you implemented specifically, and which specific files you added (wrote from scratch) or modified (documenting the specific changes you made).
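As a concrete example of the learning-curve justification asked for in the report: recording training and validation loss per epoch and checking the gap between them is often enough to label a run as over- or under-fitting. A rough sketch (the threshold and the loss values are illustrative placeholders, not real results):

```python
def diagnose(train_loss, val_loss, tol=0.05):
    """Crude learning-curve check: a growing train/val gap suggests overfitting;
    two high, flat curves suggest underfitting (not enough capacity)."""
    gap = val_loss[-1] - train_loss[-1]
    val_rising = val_loss[-1] > min(val_loss) + tol
    if gap > tol and val_rising:
        return "overfitting: consider more regularization or more data"
    if train_loss[-1] > tol and abs(gap) <= tol:
        return "underfitting: consider more capacity or longer training"
    return "looks reasonable"

# Synthetic curves (illustrative only): training keeps improving while validation turns up.
train = [1.00, 0.60, 0.30, 0.15, 0.08]
val   = [1.10, 0.70, 0.50, 0.55, 0.65]
verdict = diagnose(train, val)
```

In the report itself, plot the two curves rather than just the verdict; the plot is the data/justification the rubric asks for.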
4 Teaming/Honor Policy
You may use online resources, including conference/journal papers, books, etc., and you must cite any resources you use. If you use any public code or libraries, you must cite them and prominently note them in your documentation. However, your project must have a significant implementation component that goes beyond any code/libraries you use, and you must highlight where your implementation is located within the code you provide us.
You can discuss algorithms and theory on the whiteboard with other individuals or teams in the class, but under no circumstances can you use code or text from other teams.
5 Example Project Ideas
There are three different flavors of possible topics:
- Experimental: Development of a set of hypotheses, implementation of the necessary components, and an experimental design to test them.
  - In [1] and [2], several tricks and recommendations are made for using backpropagation. Take 3-4 of the recommendations that require implementation, hypothesize what their contribution/effect on performance measures will be, implement them, and conduct a set of experiments to verify or disprove your hypotheses.
- Implementation of New Techniques or Combination of Techniques:
  - We covered stochastic gradient descent (backpropagation) for optimizing neural networks, but there are several other non-linear optimization techniques that use higher-order derivatives to try to do better. Pick your favorite method (L-BFGS, conjugate gradient, etc.), implement it, and perform some experiments to see which performs better on a dataset or set of datasets.
- Application: Take a new or existing application, design an approach, implement it, and validate your results rigorously via cross-validation.
  - Mini-ImageNet: Take a subset of the ImageNet data (from the Large-Scale Visual Recognition Challenge) and experiment with architectures and techniques to improve the results.
  - Poker: Apply deep learning to poker playing, taking in card and player information and outputting the best action.
  - Microsoft Kinect: Apply deep learning to RGBD (RGB and depth) or hand/finger tracking devices.
  - Stock Market Prediction: Use RNNs or similar temporal approaches to predict stock prices.
  - Head tracking: Extract human head pose from pictures or video without face information.
  - Path Planning: Given a model of the environment, plan a path or policy (a mapping from current state to action).
  - Robotics:
    - Object affordance: From images and videos, learn to predict how an object will move when we manipulate it.
    - Material recognition: Similarly, estimate what the material of an object's surface is.
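For the optimizer-comparison idea above, a useful sanity check before touching a real network is to race the methods on a small convex problem. This sketch compares plain gradient descent with heavy-ball momentum on an ill-conditioned quadratic (the step size, momentum value, and matrix are illustrative choices, not a prescribed setup):

```python
import numpy as np

def grad(w):
    # Gradient of f(w) = 0.5 * w^T A w for an ill-conditioned diagonal A.
    return np.diag([1.0, 25.0]) @ w

def run(momentum=0.0, lr=0.03, steps=200):
    w = np.array([1.0, 1.0])
    v = np.zeros_like(w)
    for _ in range(steps):
        v = momentum * v - lr * grad(w)  # heavy-ball update; momentum=0 is plain GD
        w = w + v
    return np.linalg.norm(w)  # distance from the optimum at w = 0

plain = run(momentum=0.0)
heavy = run(momentum=0.9)
# With a curvature ratio of 25, momentum typically ends much closer to the optimum.
```

The same harness extends naturally to the higher-order methods mentioned above (L-BFGS, conjugate gradient) before moving the comparison onto a real dataset.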
[1] Efficient BackProp, Y. LeCun, L. Bottou, G. Orr, and K. Muller. In G. Orr and K. Muller (Eds.), Neural Networks: Tricks of the Trade, Springer, 1998.
[2] Stochastic Gradient Tricks, Léon Bottou. In G. Montavon, G. B. Orr, and K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade, Reloaded, pp. 430–445, Lecture Notes in Computer Science (LNCS 7700), Springer, 2012.
6 Example Datasets
- MNIST
- ImageNet (take a subset!!!)