Project 1: Learning from Unlabeled Data [Easy]

Motivation:
 What problem did this project try to solve?

Training neural networks in supervised learning settings has achieved remarkable success, and in the past year or so the focus has shifted to methods that use weak labels (e.g., just a point on an object when the goal is object detection that outputs a bounding box or segmentation) and unlabeled data. This project involves either reimplementing these methods, analyzing their performance (especially generalizability), or applying them to new domains beyond image classification (3D data, for example).
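One of the simplest semi-supervised ideas mentioned above is self-training (pseudo-labeling): train on the labeled set, then adopt the model's confident predictions on unlabeled data as new labels. The sketch below is purely illustrative; the 1-nearest-neighbor "model", the confidence score, and the threshold are all toy assumptions, not from any specific paper.

```python
# Minimal self-training (pseudo-labeling) sketch on toy 1-D data.
# The 1-NN classifier and confidence heuristic are illustrative only.

def nn_predict(labeled, x):
    """1-nearest-neighbor prediction plus a crude confidence score."""
    dist, label = min((abs(xl - x), yl) for xl, yl in labeled)
    confidence = 1.0 / (1.0 + dist)  # closer neighbor -> higher confidence
    return label, confidence

def self_train(labeled, unlabeled, threshold=0.5, rounds=3):
    """Repeatedly adopt confident predictions as pseudo-labels."""
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(rounds):
        still_unlabeled = []
        for x in unlabeled:
            label, conf = nn_predict(labeled, x)
            if conf >= threshold:
                labeled.append((x, label))   # adopt the pseudo-label
            else:
                still_unlabeled.append(x)    # revisit in a later round
        unlabeled = still_unlabeled
    return labeled

seed = [(0.0, "a"), (10.0, "b")]      # small labeled set
pool = [0.5, 1.0, 9.0, 9.5, 5.0]      # unlabeled pool
result = self_train(seed, pool)
```

In a real project the 1-NN predictor would be a neural network and the confidence would come from its softmax output, but the loop structure is the same.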

Who cares? If you are successful, what difference will it make?

The use of unlabeled data is a topic almost everyone is interested in, given the difficulty and cost of annotating data.

Approaches:
 How is the problem approached today, and what are the limits of current practice?

Some of the recent methods for semi-supervised learning (a combination of a small amount of labeled data and unlabeled data) and self-supervised learning (only unlabeled data) include:

What’s the availability of the baselines? Is there any standard implementation, and are they open-sourced?

Yes, there are quite a few open-source reimplementations of these methods.

Metrics:
 What are the common datasets and benchmarks?
 What’s the availability of the datasets? Are they open-sourced? Is there any restriction regarding how students should use them?
 What is the state-of-the-art approach, if applicable?

Typically simple image classification datasets are used (CIFAR-10/100 up to ImageNet).

What is the ideal size for a team tackling this project?

Any size is fine (the usual 3-4 is recommended).

What are the computing resources required to compute a baseline? (CPU/GPU days)

Some of these methods can be computationally heavy or require a lot of GPU memory, so take care in choosing which methods to implement, and manage expectations about which datasets you can realistically replicate results on.

Project 2 (New for 2021!): Behavior Classification from Videos [Medium-Hard]

This challenge website poses the problem of classifying behavior (e.g. of mice) from videos. There are several levels of difficulty including vanilla classification, style transfer, and transfer learning.

Project 3 (New for 2021!): Pruning and quantization

Neural networks have a lot of redundancy, and the precision (e.g., 32-bit float) with which weights are represented is not always necessary. A number of research groups have looked at reducing this through pruning and quantization. See how far you can take it! Note that pruning may not yield actual computational savings unless your software/hardware supports sparse operations; however, you can judge success by how many weights you can prune.
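To make the two ideas concrete, here is a toy sketch of magnitude pruning (zero out the smallest weights) and uniform symmetric quantization on a plain list of weights. Real projects would use framework utilities (e.g., PyTorch's `torch.nn.utils.prune`); the functions and parameters below are illustrative assumptions.

```python
# Toy magnitude pruning and uniform quantization on a flat weight list.
# Illustrative only; frameworks provide real versions of both operations.

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights."""
    k = int(len(weights) * sparsity)           # number of weights to drop
    cutoff = sorted(abs(w) for w in weights)[k - 1] if k else -1.0
    return [0.0 if abs(w) <= cutoff else w for w in weights]

def quantize_uniform(weights, bits=8):
    """Uniform symmetric quantization to the given bit width."""
    scale = max(abs(w) for w in weights) / (2 ** (bits - 1) - 1)
    # Snap each weight to the nearest representable level.
    return [round(w / scale) * scale for w in weights]

pruned = prune_by_magnitude([0.1, -0.05, 0.9, -0.8, 0.02, 0.3], sparsity=0.5)
quantized = quantize_uniform([0.9, 0.45, -0.3], bits=8)
```

Note that this stores the quantized values back as floats; actual deployment would store the integer codes plus the scale to realize memory savings.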

Project 4: Graph Neural Networks [Medium]

Graph neural networks allow us to learn on structured data such as graphs (think social networks, web page links, etc.) by learning feature embeddings for nodes, edges, and entire graphs in order to perform some prediction. Implement a simple graph neural network to process various sources of data, including Wikipedia, knowledge graphs, etc. There is a large amount of data and many PyTorch/TensorFlow models available for this. One popular task is link prediction: given all of the existing nodes and edges in the training set, predict whether the graph should have an edge between two nodes.
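The core GNN operation is simple: each node aggregates features from its neighbors, and link prediction can then score node pairs by embedding similarity. The sketch below shows one untrained, GCN-style propagation step and a dot-product link score; the graph, features, and averaging rule are toy assumptions (real models add learned weight matrices and nonlinearities).

```python
# Toy GCN-style propagation plus dot-product link scoring.
# Illustrative only: no learned weights, one propagation step.

def propagate(features, adjacency):
    """One message-passing step: average a node's own and neighbors' features."""
    new = {}
    for node, feat in features.items():
        neighbors = adjacency.get(node, [])
        agg = list(feat)                       # start from the node's own features
        for n in neighbors:
            for i, v in enumerate(features[n]):
                agg[i] += v                    # sum neighbor features
        new[node] = [v / (len(neighbors) + 1) for v in agg]
    return new

def link_score(features, u, v):
    """Dot product of embeddings: higher means an edge is more likely."""
    return sum(a * b for a, b in zip(features[u], features[v]))

feats = {"a": [1.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 1.0]}
adj = {"a": ["b"], "b": ["a"], "c": []}
emb = propagate(feats, adj)
# Nodes "a" and "b" share features and an edge, so their link score
# should exceed that of the unrelated pair ("a", "c").
```

Libraries such as PyTorch Geometric or DGL provide trained, batched versions of exactly this pattern.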

Project 5: Cool Applications [Easy-Advanced]

There are a number of interesting applications that one can tackle:

  • [Easy] Converting an image of food into a recipe example blog
  • [Medium] Image captioning: There is a lot of work on image captioning, which combines language and vision processing. You can implement some existing models which utilize attention heavily (e.g. https://panderson.me/up-down-attention/) or investigate how large-scale pre-trained language models make everything better.
  • [Medium] 3D object classification: A number of methods have been applied to 3D point cloud data, with many datasets such as ModelNet. A number of interesting architectures can be implemented, including PointNet or PointPillars.
  • [Medium/Advanced] Cloning someone’s voice from a few samples (can use LibriSpeech dataset) example paper

Project 6: ICLR Reproducibility Challenge

The ICLR reproducibility challenge is typically due in the Fall, but you can follow the guidelines and reproduce your favorite paper, write up a report according to the instructions, and submit later!

© 2020 Georgia Tech