Vinay Bettadapura
Ph.D. student, Computer Science
Computational Perception Lab (CPL)
College of Computing (CoC), Georgia Tech
Advisor: Prof. Irfan Essa
Before coming to Georgia Tech, I received my Master's in Computer Science from Columbia University, where I worked with Prof. Peter Belhumeur. I also work part-time at Google and have previously worked as a Software Engineer in Subex's Telecommunication Fraud Management Group.
Research Interests
My research interests are in the areas of Computer Vision, Machine Learning and Ubiquitous Computing. In particular, I am interested in the role of "context" in activity recognition and event understanding.


Research Projects
Video Based Assessment of OSATS Using Sequential Motion Textures
A fully automated framework for video-based surgical skill assessment is presented that incorporates the sequential and qualitative aspects of surgical motion in a data-driven manner. The framework replicates the Objective Structured Assessment of Technical Skills (OSATS), which provides both an overall and an in-detail evaluation of the basic suturing skills required of surgeons. Video analysis techniques are introduced that incorporate sequential motion aspects into motion textures. Significant performance improvement over standard bag-of-words and motion-analysis approaches is demonstrated. The framework is evaluated in a case study involving medical students with varying levels of expertise performing basic surgical tasks in a surgical training lab.

Here is the M2CAI 2014 paper [M2CAI 14] (winner of the Best Paper Award).
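To give a flavor of the motion-texture idea, here is a minimal sketch, not the paper's actual pipeline: frames are compared by simple differencing (a crude stand-in for optical flow), motion magnitudes are quantized, and co-occurrence statistics serve as the "texture". The function names, toy frame format and parameters are all illustrative assumptions.

```python
from itertools import product

def motion_magnitude(prev, curr):
    """Per-pixel absolute frame difference, a crude stand-in for optical flow."""
    return [[abs(c - p) for p, c in zip(pr, cr)] for pr, cr in zip(prev, curr)]

def quantize(mag, levels=4, max_val=255):
    """Bin motion magnitudes into a few discrete levels."""
    return [[min(levels - 1, v * levels // (max_val + 1)) for v in row] for row in mag]

def cooccurrence_features(q, levels=4):
    """Normalized counts of horizontally adjacent level pairs: a simple
    'texture' descriptor of the motion field."""
    counts = {p: 0 for p in product(range(levels), repeat=2)}
    total = 0
    for row in q:
        for a, b in zip(row, row[1:]):
            counts[(a, b)] += 1
            total += 1
    if total == 0:
        return [0.0] * levels ** 2
    return [counts[p] / total for p in sorted(counts)]

def sequential_motion_textures(frames):
    """One texture descriptor per consecutive frame pair, kept in temporal
    order: the 'sequential' part that plain bag-of-words models discard."""
    return [cooccurrence_features(quantize(motion_magnitude(a, b)))
            for a, b in zip(frames, frames[1:])]
```

Keeping the descriptors ordered (rather than pooling them into one histogram) is what lets downstream assessment reason about how the motion evolves over the task.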

Activity Recognition From Videos Using Augmented Bag-of-Words
Data-driven techniques to augment Bag-of-Words (BoW) models are explored, allowing for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. The goal is to address the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information inherent in activity streams. In addition, the use of randomly sampled regular expressions to discover and encode patterns in activities is proposed. Applications include long-term activity recognition, skill assessment, functional categorization and anomaly detection.

Here is the CVPR 2013 Project Webpage (PDF and Code).
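A minimal sketch of the augmentation idea: a plain BoW histogram is extended with counts of discovered temporal patterns. The paper discovers patterns with randomly sampled regular expressions; here, frequent bigrams are the simplest stand-in, and all names and data are illustrative.

```python
from collections import Counter

def bow_histogram(words, vocab):
    """Standard Bag-of-Words: order-free counts of quantized features."""
    counts = Counter(words)
    return [counts[w] for w in vocab]

def augmented_bow(words, vocab, top_k=3):
    """Append counts of the most frequent bigrams, restoring some of the
    temporal ordering that the plain histogram throws away."""
    base = bow_histogram(words, vocab)
    bigrams = Counter(zip(words, words[1:]))
    top = [pair for pair, _ in bigrams.most_common(top_k)]
    return base + [bigrams[p] for p in top], top
```

Two activities with identical word counts but different orderings now get different augmented histograms, which is exactly the failure mode of standard BoW that the project targets.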

Detecting Insider Threats in a Real Corporate Database of Computer Usage Activity
This project is a multi-university collaboration with SAIC and DARPA to develop, integrate and evaluate new approaches for detecting the weak signals characteristic of insider threats on organizations' information systems. The system combines structural and semantic information from a real corporate database of monitored activity on users' computers to detect independently developed red-team inserts of malicious insider activities. Several algorithms have been developed and tested on approximately 5.5 million actions per day from approximately 5,500 users.

Our main contribution has been the design and development of Vector Space Models (VSMs) for insider threat detection. Here is the KDD 2013 paper [KDD 13]
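The VSM idea can be sketched as follows: each user-day of monitored activity becomes a count vector over an action vocabulary, and a day is scored by how far it sits from the population in cosine terms. This is a toy illustration under assumed names and data, not the deployed system.

```python
import math

def action_vector(actions, vocab):
    """One user-day of monitored activity as counts over an action
    vocabulary: the 'document' of the vector space model."""
    return [actions.count(a) for a in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def anomaly_scores(day_vectors):
    """Score each user-day by its cosine dissimilarity from the population
    mean vector: higher means more unusual."""
    n = len(day_vectors)
    centroid = [sum(col) / n for col in zip(*day_vectors)]
    return [1.0 - cosine(v, centroid) for v in day_vectors]
```

Cosine similarity compares the shape of a user's activity mix rather than its raw volume, so a heavy but typical user scores low while an unusual mix of actions scores high.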

Recognizing Water-Based Activities in the Home Through Infrastructure-Mediated Sensing
In this project, we explore infrastructure-mediated sensing combined with a vector space model learning approach as the basis of an activity recognition system for the home. We propose the use of a single-sensor water-based system for recognizing eleven high-level activities in the kitchen and bathroom, such as cooking, shaving, etc. Results from our two studies show that our system can estimate activities with an overall accuracy of 82.69% for one individual and 70.11% for a group of 23 participants.

As far as we know, this work is the first to employ infrastructure-mediated sensing for inferring high-level human activities in a home setting. Here is the UbiComp 2012 paper [UbiComp 12]
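A simplified sketch of one plausible processing chain, not the paper's actual method: the single sensor's water-flow signal is segmented into on/off events, each event is summarized by duration and mean flow, and events are labeled by the nearest activity prototype. Thresholds, prototypes and names here are all assumed for illustration.

```python
def extract_events(stream, threshold=1.0):
    """Split a water-flow signal into on/off events; each event becomes
    a (duration, mean_flow) feature pair."""
    events, start = [], None
    for i, v in enumerate(stream + [0.0]):  # sentinel closes a trailing event
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            seg = stream[start:i]
            events.append((len(seg), sum(seg) / len(seg)))
            start = None
    return events

def nearest_centroid(feature, centroids):
    """Label an event by its closest activity prototype (squared Euclidean)."""
    def dist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(centroids, key=lambda label: dist(feature, centroids[label]))
```

Even these two crude features separate short high-flow events (hand washing) from longer moderate-flow ones (filling a pot), which hints at why a single whole-home sensor can carry so much activity information.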

Activity Recognition from Wide Area Motion Imagery
This project aims to recognize anomalous activities in aerial videos. My work is part of the Persistent Stare Exploitation and Analysis System (PerSEAS) research program, which aims to develop software systems that can automatically and interactively discover actionable intelligence from airborne, wide area motion imagery (WAMI) in complex urban environments.

A glimpse of this project can be seen here.

Leafsnap: An Electronic Field Guide

This project aims to simplify the process of plant species identification using visual recognition software on mobile devices such as the iPhone. This work is part of an ongoing collaboration with researchers at Columbia University, University of Maryland and the Smithsonian Institution. My major contribution to this project was the server's database integration and management. I also worked on stress-testing the backend server to improve its performance and scalability.

The free iPhone app can be downloaded from the app-store. Here is the project webpage and here is a video explaining the app's usage. Finally, Leafsnap in the news!

Visual Attributes for Face Verification

The project involves face verification in uncontrolled settings with non-cooperative subjects. The method is based on binary attribute classifiers trained to recognize the degrees of various visual attributes such as gender, race and age. Here is the project page.

I was a part of this research at Columbia University from December 2009 to May 2010. I mainly worked on Boosting to improve the classifiers' performance.

Face Recognition Using Gabor Wavelets

The face representation is based on a Gabor wavelet transform. The features are extracted using a carefully chosen symmetrical Gabor wavelet matrix, and a Multi-Layer Perceptron is used for classification. The designed system is insensitive to small changes in head pose and to homogeneous or step illumination changes, and is robust against facial hair and glasses for small datasets.

This was my undergraduate thesis supervised by Dr. C. N. S. Ganesh Murthy, Principal Scientist at Mercedes-Benz Research and Development, Bangalore, India. Here is the project report [FACE REC]
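For intuition, here is a minimal construction of the real part of a 2D Gabor kernel, a Gaussian-windowed oriented sinusoid, and a small symmetric filter bank over orientations and wavelengths. The specific sizes and parameter values are illustrative, not those of the thesis.

```python
import math

def gabor_kernel(size=7, theta=0.0, lambd=4.0, sigma=2.0, gamma=0.5):
    """Real part of a 2D Gabor filter: a Gaussian envelope modulating a
    sinusoid oriented at angle theta, used to extract oriented features."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)   # rotate coordinates
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / lambd))
        kernel.append(row)
    return kernel

def filter_bank(orientations=4, scales=2):
    """A small symmetric bank covering several orientations and wavelengths."""
    return [gabor_kernel(theta=o * math.pi / orientations, lambd=4.0 * (s + 1))
            for o in range(orientations) for s in range(scales)]
```

Convolving a face image with such a bank and feeding the responses to an MLP is the general pattern the thesis follows; the symmetry of the kernels is what gives tolerance to small pose and illumination changes.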


Publications
Link to my Google Scholar page.
  1. Y. Sharma, V. Bettadapura, et al., "Video Based Assessment of OSATS Using Sequential Motion Textures", 5th MICCAI Workshop on Modeling and Monitoring of Computer Assisted Interventions (M2CAI 2014), Boston, USA, September 2014. [M2CAI 14] (won the Best Paper Award)

  2. T. E. Senator, et al., "Detecting Insider Threats in a Real Corporate Database of Computer Usage Activity", 19th ACM SIGKDD Conf. on Knowledge Discovery and Data Mining (KDD 2013), Chicago, USA, August 2013. [Acceptance Rate: 17.4% (126/726)] [KDD 2013]

  3. V. Bettadapura, G. Schindler, T. Ploetz, I. Essa, "Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition", 26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2013), Portland, USA, June 2013. [Acceptance Rate: 25.2% (472/1870)] [CVPR 13] [Project Webpage]

  4. E. Thomaz, V. Bettadapura, G. Reyes, M. Sandesh, G. Schindler, T. Ploetz, G. Abowd, I. Essa, "Recognizing Water-Based Activities in the Home Through Infrastructure-Mediated Sensing", 14th ACM Conference on Ubiquitous Computing (UbiComp 2012), pp. 85-94, Pittsburgh, USA, September 2012. [Acceptance Rate: 19% (58/301)] [UbiComp 12]

  5. V. Bettadapura, "Face Expression Recognition and Analysis: The State of the Art", Tech Report, arXiv:1203.6722, April 2012. [FACE EXP REC]

  6. V. Bettadapura, D. R. Sai Sharan, "Pattern Recognition with Localized Gabor Wavelet Grids", IEEE Conference on Computational Intelligence and Multimedia Applications, vol. 2, pp. 517-521, Sivakasi, India, December 2007. [ICCIMA 07]

  7. V. Bettadapura, B. S. Shreyas, C. N. S Ganesh Murthy, "A Back Propagation Based Face Recognition Model Using 2D Symmetric Gabor Features", IEEE Conference on Signal Processing, Communications and Networking, pp. 433-437, Chennai, India, February 2007. [ICSCN 07]

  8. V. Bettadapura, B. S. Shreyas, "Face Recognition Using Gabor Wavelets", 40th IEEE Asilomar Conference on Signals, Systems and Computers, pp. 593-597, Pacific Groves (Monterey Bay), California, USA, October 2006. [ASILOMAR 06]


Work Experience
  1. Google - Software Engineering Intern (August 2013 - Present): Working on activity and event understanding using videos, images and sensor data.

  2. Google Geo - Software Engineering Intern (May 2013 - August 2013): Worked with the Google Earth and Maps team on improving the quality of the satellite imagery.

  3. Google Research - Software Engineering Intern (May 2012 - August 2012): Worked with the Video Content Analysis team in developing algorithms and building systems for object detection and categorization in YouTube videos.

  4. Subex - Software Engineer (June 2006 - December 2008): Designed and developed telecommunication fraud-protection and anomaly-detection systems. Worked on the mathematical modeling of user behaviors, data mining to detect anomalies in the signals, and the design and development of the back-end server, database and web interfaces.


Courses
Fall 2011
Knowledge-Based AI (CS 7637) - Prof. Ashok Goel
Numerical Linear Algebra (MATH 6643) - Prof. Silas Alben
Special Problems (CS 8903) - Prof. Irfan Essa

Spring 2011
Machine Learning (CS 7641) - Prof. Charles Isbell
Special Problems (CS 8903) - Prof. Irfan Essa

Fall 2010
Computer Vision (CS 7495) - Prof. Jim Rehg
Grad Studies (CS 7001) - Prof. Gregory Abowd and Prof. Nick Feamster
Special Problems (CS 8903) - Prof. Irfan Essa

Spring 2010
Operating Systems (COMS W4118) - Prof. Junfeng Yang
Projects in Computer Science (COMS E6901) - Prof. Peter Belhumeur
Research Assistantship (COMS E9910) - Prof. Peter Belhumeur

Fall 2009
Analysis of Algorithms (COMS W4231) - Prof. Clifford Stein
Biometrics (COMS W4737) - Prof. Peter Belhumeur
Projects in Computer Science (COMS E6901) - Prof. Peter Belhumeur

Spring 2009
Programming Languages and Translators (COMS W4115) - Prof. Alfred Aho
Computational Aspects of Robotics (COMS W4733) - Prof. Peter Allen
Visual Interfaces to Computers (COMS W4735) - Prof. John Kender
Machine Learning (COMS 4771) - Prof. Tony Jebara


Course Projects
Automatic Geo-Tagging of Photos Using Google Street View Images

The goal of this project was to develop a system that automatically geo-tags an image by comparing it with a large collection of geo-tagged images (Google Street View images, in our case). SIFT descriptors are computed for the images and the matching is done using a KD-Tree. This project is an implementation based on the work of Schindler et al. (CVPR 2007) and Zamir et al. (ECCV 2010). This project was done as a part of the 'Computer Vision' course at Georgia Tech (instructor: Prof. Jim M. Rehg).

Here is the project presentation [GEO-TAG]
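The matching step can be sketched as follows: every descriptor in the query image votes for the geo-tag of its nearest database descriptor, and the majority vote tags the photo. The real system matches 128-dimensional SIFT descriptors through a KD-tree; this toy version uses short vectors and a linear scan, and all names and locations are made up.

```python
from collections import Counter

def nearest_tag(desc, database):
    """Geo-tag of the closest database descriptor. A linear scan stands in
    for the KD-tree index used over SIFT descriptors in the real system."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(database, key=lambda entry: d2(desc, entry[0]))[1]

def geotag_image(query_descs, database):
    """Each query descriptor votes for the location of its nearest
    neighbor; the majority vote geo-tags the photo."""
    votes = Counter(nearest_tag(d, database) for d in query_descs)
    return votes.most_common(1)[0][0]
```

Voting makes the result robust to individual mismatched descriptors, which is essential when the database contains millions of Street View features.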

Solving Raven's Matrices Using Visual and Propositional Reasoning

The goal of this project is to learn about the close relationship between learning and problem solving. In this project, we explore this relationship by considering several problems from the Raven's test of intelligence (Raven's matrices). We develop techniques to solve the Raven's matrices using both propositional and visual reasoning. This project was done as a part of the 'Knowledge Based AI' course at Georgia Tech (instructor: Prof. Ashok K. Goel).

Here are the project reports: solving the Raven's matrices using Propositional Reasoning [GEO-TAG], using Visual Reasoning [GEO-TAG], and using a combination of Visual and Propositional Reasoning [GEO-TAG]

The SN*W Programming Language

The SN*W Programming Language is a special-purpose declarative language designed for Genetic Programming, allowing programmers to easily harness the power of Genetic Algorithms (GAs). A SN*W program is a simple description of an organism structure along with simple methods for construction, mutation, selection and recombination. The SN*W compiler translates these descriptions into a full environmental simulation. The language was developed by a team of five as part of the Programming Languages and Translators course at Columbia under the guidance of Prof. Alfred V. Aho.

Here is the complete SN*W Report (includes the Reference Manual and Tutorial) [CV]
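The simulation a SN*W program describes boils down to the classic GA loop: construct a population, repeatedly select the fitter half, and refill with recombined, mutated children. A generic sketch with a toy "organism" (a bit string scored by its number of 1s); everything here is illustrative rather than SN*W syntax or semantics.

```python
import random

def evolve(fitness, construct, mutate, recombine,
           pop_size=20, generations=50, seed=0):
    """Generic GA loop: build a population, then repeatedly keep the fitter
    half and refill it via crossover followed by mutation."""
    rng = random.Random(seed)
    pop = [construct(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        children = [
            mutate(recombine(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + children
    return max(pop, key=fitness)

# Toy organism: a 16-bit string whose fitness is the number of 1s.
best = evolve(
    fitness=sum,
    construct=lambda rng: [rng.randint(0, 1) for _ in range(16)],
    mutate=lambda g, rng: [b ^ (rng.random() < 0.05) for b in g],
    recombine=lambda a, b, rng: a[: len(a) // 2] + b[len(b) // 2:],
)
```

In SN*W terms, the four lambdas correspond to the construction, mutation, recombination and (implicitly, via sorting) selection methods a program declares; the compiler supplies the surrounding loop.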

Guess Who? - An iPhone Application for Real-Time Face Recognition Using Side Profiles

The goal of this project was to develop a face recognition system that could recognize people from side-profile images. The system was designed to be invariant to head tilt and pose. An iPhone application was developed to showcase the system's real-time capabilities: the user takes a profile picture of a person with their iPhone and uploads it to the server (a Ruby on Rails application). The server performs the recognition and sends back the results, which are displayed in the iPhone UI. The entire request-process-response loop takes no longer than 3.5 seconds on average. This project was done as a part of the Biometrics course at Columbia (instructor: Prof. Peter N. Belhumeur).

Here is the project report [GUESS WHO]

Visual Combination Lock

The goal of this project was to take a sequence of visual images and determine from them whether the user has placed some body part(s) in a predetermined sequence of locations and/or poses. If the sequence of images matches the predetermined sequence, the user's access is 'APPROVED'; otherwise it is 'DENIED'. An arbitrary predetermined sequence of hand gestures was used, in which the user displays a combination of numbers with their fingers, followed by a specific hand rotation and closure of the fist. The 'Visual Lock' opens only if the hand gestures match the predetermined sequence, which can be changed to handle any (controlled) hand gestures. This project was done as a part of the Visual Interfaces to Computers course at Columbia (instructor: Prof. John R. Kender).

Here is the project report [VISUAL LOCK]
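The matching logic can be sketched as a two-step process: collapse noisy per-frame recognition output into a gesture sequence, then require an exact in-order match against the stored combination. Labels, thresholds and function names below are illustrative assumptions, not the project's actual code.

```python
from itertools import groupby

def frames_to_gestures(frame_labels, min_run=3):
    """Collapse noisy per-frame classifier output into a gesture sequence:
    repeated labels merge into one gesture, and runs shorter than min_run
    frames are treated as recognition noise and dropped."""
    runs = ((label, len(list(group))) for label, group in groupby(frame_labels))
    return [label for label, n in runs if n >= min_run]

def check_combination(frame_labels, combination):
    """'APPROVED' only when the de-noised gesture sequence matches exactly."""
    return "APPROVED" if frames_to_gestures(frame_labels) == combination else "DENIED"
```

The min_run filter is what makes the lock tolerant of one or two misclassified frames while still rejecting any out-of-order or missing gesture.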

Columbia Map Assist

The goal of this project was to develop a 'Columbia Map Assistant' that would describe the location of a visitor to the Columbia campus and give the visitor directions from one building to another. The first task was to use the given map to encode the buildings' shapes, determine their spatial relationships to each other, and filter out any relationships that are unnecessary because they can be easily inferred. The second task was to use these descriptions to generate a natural-language description that unambiguously indicates how to reach the goal from the source. This project was done as a part of the Visual Interfaces to Computers course at Columbia (instructor: Prof. John R. Kender).

Here is the project report [MAP ASSIST]

Visual Information Retrieval

The goal of this project was to write and analyze algorithms that explore different ways of measuring the degree of similarity among images. A set of images of fruits and vegetables, along with a few random objects (distractors), was used. The algorithm performs a color-based match and a texture-based match, and then uses the total match to decide the similarity between images. This kind of algorithm is useful for retrieving images based on visual content rather than associated labels or other metadata. This project was done as a part of the Visual Interfaces to Computers course at Columbia (instructor: Prof. John R. Kender).

Here is the project report [VISUAL RETRIEVAL]
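A minimal sketch of the color-plus-texture matching idea: a coarse RGB histogram is compared by histogram intersection, a crude "edginess" measure stands in for the texture descriptor, and a weighted sum gives the total match. Image format, weights and bin counts are illustrative assumptions.

```python
def color_histogram(pixels, bins=4):
    """Coarse per-channel RGB histogram, normalized to sum to 1."""
    hist = [0] * (3 * bins)
    for r, g, b in pixels:
        for c, v in enumerate((r, g, b)):
            hist[c * bins + min(bins - 1, v * bins // 256)] += 1
    return [h / (3 * len(pixels)) for h in hist]

def texture_signature(gray, thresh=32):
    """Fraction of horizontally adjacent pixels with a large intensity jump:
    a crude stand-in for a texture descriptor."""
    jumps = total = 0
    for row in gray:
        for a, b in zip(row, row[1:]):
            jumps += abs(a - b) > thresh
            total += 1
    return jumps / total if total else 0.0

def similarity(img1, img2, w_color=0.6):
    """Total match: weighted color-histogram intersection plus texture closeness."""
    color = sum(min(a, b) for a, b in zip(color_histogram(img1["pixels"]),
                                          color_histogram(img2["pixels"])))
    texture = 1.0 - abs(texture_signature(img1["gray"]) - texture_signature(img2["gray"]))
    return w_color * color + (1 - w_color) * texture
```

Ranking the database by this score retrieves images by what they look like, so a red apple matches a red tomato more than a blue distractor, regardless of any labels.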




Contact
vinay [at]
304F, College of Computing Building
801, Atlantic Drive
Atlanta, GA 30332
Also on:
LinkedIn Blogger Google+ Facebook




