Company: Georgia Tech
- General Information
- Job Type: Part-time
- Location: Atlanta, GA
- Compensation: $20/hour
- Contact Information
- Name: Nick Feamster
- Email: firstname.lastname@example.org
- Address: Yogesh Mundada <email@example.com> and Nick Feamster <firstname.lastname@example.org>
- Website: http://noise-lab.net/
Have you ever wondered how many sites have your credit card number, or how many have some version of your password? Do you think you might have reused your banking password on another site? What if you decided that you wanted to “clean up” your personal information on some of the sites where you’ve leaked it? Would you even know where to start?
Appu (http://appu.gtnoise.net/) is a Chrome extension that keeps track of what we call your privacy footprint on the Web. Every time you enter personally identifiable information (address, credit card information, password, etc.) into a Web site, Appu computes a cryptographic hash of that information, associates the hash with that site, and stores it to keep track of where you have entered which information. If you ever re-enter the same password on a different site, Appu will immediately warn you that you have reused a password and tell you where you used it before.
In designing Appu, we have tackled many interesting problems from computer science, in areas such as security and cryptography, algorithms, data structures, and distributed computing. We expect even more exciting problems to come up as the extension becomes more popular and gains more users. You can read more about it at: http://appu.gtnoise.net/appu.html
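As a rough illustration of the hash-and-track idea described above, here is a minimal Python sketch. It is not Appu's actual code (Appu is a Chrome extension). The class name `FootprintTracker`, the use of a per-user salt with HMAC-SHA256, and the example site names are all assumptions made for this example.

```python
import hashlib
import hmac
import os
from collections import defaultdict


class FootprintTracker:
    """Toy model of a privacy footprint (hypothetical, not Appu's real code)."""

    def __init__(self, salt=None):
        # Assumption: a random per-user salt that never leaves the user's machine.
        self._salt = salt if salt is not None else os.urandom(16)
        # Maps a digest to the set of sites where that value was entered.
        self._sites_by_digest = defaultdict(set)

    def _digest(self, value):
        # Only this keyed hash is stored, never the value itself.
        return hmac.new(self._salt, value.encode("utf-8"), hashlib.sha256).hexdigest()

    def record(self, site, value):
        """Record that `value` was entered on `site`.

        Returns the sorted list of other sites where the same value
        was recorded earlier (an empty list means no reuse).
        """
        digest = self._digest(value)
        earlier = sorted(self._sites_by_digest[digest] - {site})
        self._sites_by_digest[digest].add(site)
        return earlier


tracker = FootprintTracker()
tracker.record("bank.example", "hunter2")
reused_on = tracker.record("pizza.example", "hunter2")
if reused_on:
    print("Warning: this password was already used on", ", ".join(reused_on))
```

Storing a keyed hash rather than the raw value means the tracker can recognize a repeated entry without keeping the password itself.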
This job is a good opportunity to work with world-class computer science researchers and get hands-on experience with a real, deployed system that has many users.
We have a few challenging problems that we need students to work on.
Project 1: Manage and test cracked password database for Appu extension:
Number of persons: One undergrad (3rd or 4th year)
Time: Expected time to finish the test framework: 3-4 weeks. The student can continue gathering and adding password lists throughout Spring 2014.
Appu can tell users if any of their passwords belong to a list that has already been cracked. We have to be intelligent about this: the list of cracked passwords is too large to share with each user, but we can’t ask users to send us all their passwords in the clear (or hashed). This project involves use of a probabilistic data structure (a Bloom filter) to allow users to query our database without revealing their passwords.
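To make the data structure concrete, here is a minimal Bloom filter sketch in Python. It is an illustration under assumptions, not the project's implementation: the class name, the sizing parameters, and the double-hashing scheme built on SHA-256 are choices made for this example. One way such a filter could be used (again an assumption) is to build it from the cracked list on the server and ship the compact bit array to each user, so the lookup happens locally.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive all bit positions from two 64-bit halves
        # of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self._bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


cracked = BloomFilter(num_bits=8192, num_hashes=5)
for password in ["123456", "password", "letmein"]:
    cracked.add(password)

print("password" in cracked)  # an added item is always reported as present
```

A Bloom filter can report a password as cracked when it is not (a false positive), but it never misses one that was added, which is the property a warning feature needs.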
This project consists of the following tasks:
Develop a Python test script that checks whether all passwords from a particular list are correctly stored in our set of cracked passwords.
Gather lists of cracked passwords from various sources and add them to our collection. (We already have about 37 million passwords.)
If adventurous, the student can also use GPU-based optimizations to compute cryptographic hashes and speed things up even more.
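The first task above could start from something like the following Python sketch. It assumes one password per line in each list and treats the cracked-password store as any object that supports the `in` operator (a plain `set` here, as a stand-in for the real store). The function names are hypothetical.

```python
def clean_lines(lines):
    """Yield one password per non-empty line, without the line ending."""
    for line in lines:
        password = line.rstrip("\r\n")
        if password:
            yield password


def find_missing(passwords, cracked):
    """Return the passwords that `cracked` does not report as present."""
    return [pw for pw in passwords if pw not in cracked]


def check_list(path, cracked):
    """Check every password in the file at `path`; return the missing ones."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_missing(clean_lines(handle), cracked)


# Stand-in data for a quick self-check.
store = {"123456", "password", "qwerty"}
sample = ["123456\n", "\n", "qwerty\r\n", "dragon\n"]
missing = find_missing(clean_lines(sample), store)
print(len(missing), "password(s) missing from the store")
```

An empty result means every password in the list was found. Printing only the count, rather than the passwords themselves, keeps the script's output safe to paste into a report.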
Project 2: Categorize websites using online classification tools:
Number of persons: One undergrad (2nd or 3rd year)
Requirements: Should know (or be willing to learn) Python.
Time: 2 weeks.
There is no denying that users reuse passwords, and Appu can detect this. One thing we are interested in studying with Appu is how password reuse occurs across different categories of sites. For instance, is a given user reusing the same password just for all food delivery sites, or are they also using that same password for their personal banking?
Currently, we classify websites manually. The goal of this project is to automate this process.
Fortunately, there is a service that can help considerably with categorizing websites automatically:
It works for many sites (e.g., http://dropbox.com returns “Personal Network Storage”), so the bulk of the work will be to write a Python script that takes a list of sites and categorizes them. However, there are some sites that the service does not verify (e.g., http://4chan.org returns “URL- Pornography - Unverified”). For such sites, it may be necessary to consult another categorization service, or to perform some automated categorization of our own.
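The overall shape of such a script might look like the Python sketch below. The categorization service's interface is not described here, so the lookups are passed in as plain functions. `lookup` and `fallback` are hypothetical stand-ins, and the rule "a result containing the word Unverified needs a second opinion" is an assumption based on the example above.

```python
UNKNOWN = "Uncategorized"


def needs_second_opinion(category):
    """True when the primary service gave no answer or an unverified one."""
    return category is None or "unverified" in category.lower()


def categorize_sites(sites, lookup, fallback=None):
    """Map each site to a category.

    `lookup` and `fallback` are callables taking a URL and returning a
    category string, or None when the service has no answer.
    """
    results = {}
    for site in sites:
        category = lookup(site)
        if needs_second_opinion(category) and fallback is not None:
            second = fallback(site)
            if second is not None:
                category = second
        results[site] = category if category is not None else UNKNOWN
    return results


# Stand-in lookups backed by dictionaries. A real script would query the
# categorization services here. The fallback answer is made up.
primary = {
    "http://dropbox.com": "Personal Network Storage",
    "http://4chan.org": "URL- Pornography - Unverified",
}.get
secondary = {"http://4chan.org": "Message Boards"}.get

sites = ["http://dropbox.com", "http://4chan.org", "http://example.org"]
for site, category in categorize_sites(sites, primary, secondary).items():
    print(site, "->", category)
```

Passing the lookups in as functions keeps the loop independent of any one service, so a second service or a home-grown classifier can be plugged in later without changing it.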
Project 3: Manage, create and test FPIs for Appu extension:
Number of persons: One undergrad (2nd or 3rd year)
Time: Testing of existing FPIs: 5 days. Thirty new FPIs: 3 weeks. Developing the test framework and continuing to add new FPIs: Rest of Spring 2014.
Fetch-Personal-Info (FPI) is a domain-specific language that we use in the Appu project to download a user’s personal information from various sites. We already have 67 FPIs. However, since this is a form of screen scraping and website content keeps changing, we have to make sure that the existing FPIs keep working. We also have to add new FPIs for newly discovered account sites. Tasks involved are:
Test the 67 existing FPIs and make sure they work properly. If any of them don’t, fix them.
Create at least 30 new FPIs. We have several design principles for creating FPIs that we will expect the student to conform to.
Create a Selenium-based (http://docs.seleniumhq.org) test framework that will automatically test all FPIs and make sure all of them are working correctly.
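A test framework along these lines could begin as a small harness like the Python sketch below. The FPI language itself is not shown in this posting, so the sketch does not run real FPIs. Instead, each check is a hypothetical `FpiCheck` record listing CSS selectors that an FPI is assumed to depend on, and the harness reports which of them no longer match. It also ignores logging in, which real account pages would require. The only Selenium calls used are `webdriver.Chrome()`, `driver.get()`, and `driver.find_elements()`.

```python
from dataclasses import dataclass


@dataclass
class FpiCheck:
    """Hypothetical description of what one FPI expects to find on a page."""
    name: str
    url: str
    required_selectors: list


def run_check(driver, check):
    """Load the page and return the selectors that matched nothing."""
    driver.get(check.url)
    return [
        selector
        for selector in check.required_selectors
        # "css selector" is the value of Selenium's By.CSS_SELECTOR.
        if not driver.find_elements("css selector", selector)
    ]


def run_all(driver, checks):
    """Run every check; return {check name: list of problems} for failures."""
    failures = {}
    for check in checks:
        try:
            missing = run_check(driver, check)
        except Exception as exc:  # a page that fails to load counts as a failure
            failures[check.name] = ["error: %s" % exc]
            continue
        if missing:
            failures[check.name] = missing
    return failures


def make_chrome_driver():
    """Create a real browser session (requires Selenium and Chrome)."""
    from selenium import webdriver  # imported lazily so the harness loads without it
    return webdriver.Chrome()
```

Because the harness only needs an object with `get` and `find_elements`, it can be exercised with a fake driver before pointing it at live sites with `run_all(make_chrome_driver(), checks)`.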
Project 4: General testing for Appu extension:
Number of persons: One/Two undergrad(s) (1st year onwards)
Requirements: Should be smart enough to know what is expected of the software and whether it works accordingly or not.
Time: Rest of Spring 2014.
This task will involve generating and running unit and regression tests across the many Appu features.
How to Apply: Submit your resume, along with a statement of which of the four projects you would like to work on, to Yogesh Mundada.