Award Recognizes Professor's Lasting Contributions to Information Retrieval
Unsupervised learning relies on finding relevant information in large databases. This is possible thanks in part to groundbreaking research that College of Computing Professor Santosh Vempala and his collaborators conducted 20 years ago.
The team's lasting contributions to information retrieval were recently recognized with a test-of-time award. The annual conference Principles of Database Systems (PODS) honored their paper, “Latent Semantic Indexing: A Probabilistic Analysis,” with its Gems of PODS award at this year’s conference.
The 1998 paper analyzed a popular spectral algorithm and introduced the very first topic model, now a standard in unsupervised learning.
The researchers discovered that if a database is viewed as a matrix, a computer algorithm can perform singular-value decomposition, a matrix reduction technique that pulls out the most significant directions to explain the data. This step not only involves minimal distortion of data, but it actually yields better retrieval results than the full original matrix.
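To make the idea concrete, here is a minimal sketch of rank-reduced retrieval on a toy term-document matrix. It is an illustration only, not the paper's analysis or code: the five-term vocabulary, the four documents, and every function name are hypothetical, and a simple power iteration stands in for a library singular-value decomposition routine so the example has no dependencies.

```python
import math
import random

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def top_left_singular_vectors(A, k, iters=300, seed=0):
    """Approximate the k most significant term-space directions of A
    by power iteration on A A^T, orthogonalizing against earlier ones."""
    rng = random.Random(seed)
    At = [list(col) for col in zip(*A)]
    basis = []
    for _ in range(k):
        v = normalize([rng.gauss(0, 1) for _ in A])
        for _ in range(iters):
            v = matvec(A, matvec(At, v))       # apply A A^T
            for u in basis:                    # remove directions already found
                c = dot(u, v)
                v = [x - c * y for x, y in zip(v, u)]
            v = normalize(v)
        basis.append(v)
    return basis

# Hypothetical toy data: rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "flower", "petal"]
A = [
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 1],  # petal
]
docs = [list(col) for col in zip(*A)]
query = [1, 0, 0, 0, 0]  # the single word "car"

# Keep only the two most significant directions.
basis = top_left_singular_vectors(A, k=2)

def project(vec):
    return [dot(u, vec) for u in basis]

# Document 1 mentions "automobile" and "engine" but never "car".
raw_sim = cosine(query, docs[1])
lsi_sim = cosine(project(query), project(docs[1]))
lsi_sim_flower = cosine(project(query), project(docs[2]))

print("raw similarity to doc 1:", raw_sim)
print("reduced-space similarity to doc 1:", round(lsi_sim, 3))
print("reduced-space similarity to doc 2:", round(abs(lsi_sim_flower), 3))
```

In the full matrix, the query "car" shares no term with the document about an automobile and its engine, so the two look unrelated. In the two-direction space they line up, because both terms co-occur with "engine", while the documents about flowers stay unrelated.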
“This was one of the first provable techniques for automatically extracting information from data,” Vempala said.
The model has been significantly enhanced in the decades since the paper was published. In that time, the work has influenced prominent computing fields such as spectral methods, data mining, machine learning, and deep neural networks.
Vempala wrote the paper as a summer intern at IBM with Prabhakar Raghavan (now VP of Engineering at Google), together with Columbia University Professor Christos Papadimitriou (then at Berkeley), and Meiji University Professor Hisao Tamaki.
As part of the award, Papadimitriou gave a talk during the PODS conference about the paper and its influence.