Deep Learning for Perception
Georgia Tech, Spring 2016

1 Description

The purpose of the project is to demonstrate a more in-depth understanding of the material by applying it to a real-world dataset. If you are conducting research or hope to, this is your chance to take your favorite problem (whether it’s in the fields we have covered or something completely different) and write a publication-worthy report. In fact, the final project report will be structured much like a publication. The project will consist of three main stages, listed in the next section.
The projects will be conducted by teams of at least 3. If you have found a partner, please let me know via email with the subject “8803DL Teaming”. Otherwise, I will assign teams using Catme, which forms teams from a questionnaire and a combination of factors such as schedule and interests.

2 Parts and Dates (Tentative)

3 What is Required

Project Proposal:
Midterm Progress: Due 03/28 at 10 am on T-Square
Final Poster Presentation/Report

4 Teaming/Honor Policy

You may use online resources, including conference/journal papers, books, etc., and you must cite any resources you use. If you use any public code or libraries, you must cite them and list them prominently in your documentation. However, your project must have a significant implementation component that goes beyond any code/libraries you use, and you must highlight where your implementation is located within the code you provide us.
You can discuss algorithms and theory on the whiteboard with other individuals or teams in the class, but under no circumstances can you use code or text from other teams.

5 Example Project Ideas

Three flavors of topics are possible:
  1. Experimental: Development of a set of hypotheses, implementation of the necessary components, and experimental design to test them.
    1. In [1] and [2], several tricks and recommendations are made for using backpropagation. Take 3-4 of the recommendations which require implementation, hypothesize what their contribution/effect on performance measures will be, implement them, and conduct a set of experiments to verify or disprove your hypotheses.
  2. Implementation of Existing or New Techniques, or Combination of Techniques:
    1. We covered stochastic gradient descent (with gradients computed by backpropagation) for optimizing neural networks, but there are several other non-linear optimization techniques that use higher-order (curvature) information to try to do better. Pick your favorite method (L-BFGS, conjugate gradient, etc.), implement it, and perform some experiments to see which performs better on a dataset or set of datasets.
  3. Application: Take a new or existing application, design an approach, implement it, and validate your results rigorously via cross-validation.
    1. Mini-ImageNet: Take a subset of the ImageNet data (from the Large-Scale Visual Recognition Challenge) and experiment with architectures and techniques to improve the results.
    2. Poker: Apply deep learning to poker playing, taking in card and player information and outputting the best action.
    3. Microsoft Kinect: Apply deep learning to data from RGBD (RGB and depth) or hand/finger tracking devices.
    4. Stock Market Prediction: Use RNNs or similar temporal approaches to predict stock prices.
    5. Head tracking: Extract human head pose from pictures or video without face information.
    6. Path Planning: Given a model of the environment, plan a path or policy (a mapping from the current state to an action).
    7. Robotics
      1. Object affordance: Learn and predict, from images and videos, the direction in which objects move when they are manipulated.
      2. Material recognition: Similarly, estimate the surface material of an object from images and videos.
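As a rough illustration of the kind of comparison described in idea 2.1, the sketch below pits plain (full-batch) gradient descent against linear conjugate gradient on a small synthetic quadratic objective. This is a toy setup chosen only to keep the example short: it is not a neural network, it uses full gradients rather than stochastic ones, and the function names (make_quadratic, gradient_descent, conjugate_gradient) and all parameter values are hypothetical. A real project would apply nonlinear conjugate gradient or L-BFGS to an actual network and dataset.

```python
import numpy as np


def make_quadratic(dim=20, cond=100.0, seed=0):
    """Build a random symmetric positive-definite matrix A and vector b (toy data)."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # random orthogonal basis
    eigvals = np.linspace(1.0, cond, dim)                 # condition number = cond
    a = q @ np.diag(eigvals) @ q.T
    b = rng.standard_normal(dim)
    return a, b


def gradient_descent(a, b, tol=1e-8, max_iter=100000):
    """Minimize 0.5*w'Aw - b'w with a fixed step of 1 / (largest eigenvalue of A)."""
    w = np.zeros_like(b)
    step = 1.0 / np.linalg.eigvalsh(a).max()
    for it in range(max_iter):
        grad = a @ w - b
        if np.linalg.norm(grad) < tol:
            return w, it
        w = w - step * grad
    return w, max_iter


def conjugate_gradient(a, b, tol=1e-8, max_iter=100000):
    """Minimize the same objective with linear conjugate gradient."""
    w = np.zeros_like(b)
    r = b - a @ w            # residual, equal to the negative gradient
    d = r.copy()             # first search direction is steepest descent
    rs = r @ r
    for it in range(max_iter):
        if np.sqrt(rs) < tol:
            return w, it
        ad = a @ d
        alpha = rs / (d @ ad)        # exact line search along d
        w = w + alpha * d
        r = r - alpha * ad
        rs_new = r @ r
        d = r + (rs_new / rs) * d    # next direction, A-conjugate to the previous ones
        rs = rs_new
    return w, max_iter


if __name__ == "__main__":
    a, b = make_quadratic()
    _, gd_iters = gradient_descent(a, b)
    _, cg_iters = conjugate_gradient(a, b)
    print(f"gradient descent iterations:   {gd_iters}")
    print(f"conjugate gradient iterations: {cg_iters}")
```

Counting iterations to a fixed gradient-norm tolerance is only one possible performance measure. For a project, wall-clock time and held-out accuracy on the chosen dataset would likely be more informative.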

[1] Y. LeCun, L. Bottou, G. Orr and K. Müller, Efficient BackProp. In G. Orr and K. Müller (Eds.), Neural Networks: Tricks of the Trade, Springer, 1998.
[2] L. Bottou, Stochastic Gradient Tricks. In G. Montavon, G. B. Orr and K.-R. Müller (Eds.), Neural Networks: Tricks of the Trade, Reloaded, pp. 430–445, Lecture Notes in Computer Science (LNCS 7700), Springer, 2012.

6 Prior Year’s Projects