Over the past 20 years, the Internet has opened up an entire universe of information and made it available to (according to recent statistics) a third of the world’s population, literally at their fingertips. Ironically, one area for which information is still hard to come by is the Internet itself—its performance as a conglomeration of digital networks, and the actors that influence that performance.
College of Computing students and faculty are trying to change that.
Grouped together under the broad heading of “Internet transparency,” a number of current research projects are intended to find, analyze and distribute the kind of data that makes Internet users more knowledgeable about their networks, the performance delivered by their Internet service providers (ISPs) and possible interference with the data they receive.
In particular, a team of students and researchers in the lab of Associate Professor Nick Feamster is working both to find the kind of performance information that’s currently missing from ISP advertisements and other network measurements, and to find ways of detecting censorship of network data.
“People pay for an Internet plan, but often they have no idea what they’re actually getting,” says Srikanth Sundaresan, a Ph.D. student in computer science. “It’s not just speed; a lot of people tend to conflate Internet performance with speed. There are other aspects to performance that are not well known outside of the research community. We feel users really should know about them, but those factors have a big influence on the performance they get.”
Sundaresan is working with Feamster on a project, BISMark (Broadband Internet Service benchMark), that has gathered performance data by deploying specially programmed routers in 30 homes (and counting) around Atlanta. The boxes, intended to function as normal routers and otherwise be invisible to users, periodically run active network measurement tests and automatically send the data back to Sundaresan and the team.
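For readers curious what an "active" measurement looks like in practice, here is a minimal sketch in Python. It is not BISMark's code, and the article does not say which tests the routers run. The sketch assumes one simple stand-in test, the time taken to open a TCP connection, and the names tcp_connect_ms, summarize and run_round are hypothetical.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=80, timeout=3.0):
    """Time one TCP handshake to host:port in milliseconds (None on failure).

    Assumption: connection setup time stands in for whatever tests a real
    measurement router would run.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:  # covers timeouts, refused connections and DNS failures
        return None
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Reduce one round of probes to the figures a collector might store."""
    ok = [s for s in samples_ms if s is not None]
    return {
        "sent": len(samples_ms),
        "lost": len(samples_ms) - len(ok),
        "median_ms": statistics.median(ok) if ok else None,
    }


def run_round(probe, count=5):
    """Call `probe` `count` times back to back and summarize the results."""
    return summarize([probe() for _ in range(count)])
```

A scheduler could call run_round(lambda: tcp_connect_ms("example.com")) every few minutes and upload the resulting summary. Passing the probe in as a function keeps the timing logic separate from the bookkeeping, so the summary step can be checked without touching the network.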
Meanwhile, on the censorship side, Ph.D. student Sam Burnett is working with Feamster on a Google-funded project not only to detect data tampering but also to find ways to mitigate it. (Collage, a technology the two developed to embed messages in user-generated content such as Flickr images, received considerable media attention in fall 2010.) As with the network performance studies, the first step is to gather reliable information.
“We’re building an extension for a web browser,” Burnett says. “When you try to visit a web page and you’re prevented from doing so, the extension will collect some information about why that might be the case, and it will report these data back to us, so we can aggregate amongst a bunch of people. The end goal is to provide an interface so that, when you try to visit a page and can’t, it’ll just tell you right then and there: ‘We couldn’t load the page because your ISP is down,’ or ‘Your government is blocking this web page.’”
Above the politics
“Anytime you deal with Internet censorship, you’re inherently dealing with politics,” Burnett continues. “But we’re scientists, so we focus on what we observe, we can actually see from the data, as opposed to thinking too much about policy. Still, it’s something that people in general care about. It’s nice to go talk to my family about my research and see that they find it interesting.”
Abhishek Jain’s motivations are professional. The undergraduate CS major had worked for the GVU Center, building on his personal interests in human-computer interaction. Sundaresan and the BISMark project needed an interface for users to check their own network data, and Jain was a natural fit for the task.
“We’re using different tools, like graphs, to plot the different performance metrics,” Jain says. “This will make it easy for users to look up, for example, what their average connection speed has been for the past few days. It helps them make better decisions and gives them more information about what’s going on.”
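The per-day averages Jain describes could be computed along these lines. This is a hedged sketch, not the project's interface code: the function name daily_average_mbps and the input format (an ISO-8601 timestamp paired with a throughput reading in Mbit/s) are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def daily_average_mbps(samples):
    """Average throughput per calendar day.

    `samples` is an iterable of (ISO-8601 timestamp, Mbit/s) pairs, a
    hypothetical format. Returns {"YYYY-MM-DD": mean Mbit/s}, ordered by date.
    """
    by_day = defaultdict(list)
    for stamp, mbps in samples:
        day = datetime.fromisoformat(stamp).date().isoformat()
        by_day[day].append(mbps)
    # ISO date strings sort chronologically, so a plain sort is enough.
    return {day: sum(vals) / len(vals) for day, vals in sorted(by_day.items())}
```

Each entry in the result is one point on a "speed by day" graph. Grouping by calendar day rather than averaging every sample together means a single bad evening shows up as a dip instead of being smoothed away.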
“Some ISPs limit the access to specific applications, such as streaming or peer-to-peer applications,” says Nazanin Magharei, a postdoctoral researcher in Feamster’s lab. “This usually happens without any notices to the Internet users. It is important for the users to know if their ISP is limiting their access to certain applications, and this is the goal of the project that I am working on.”
Magharei’s job is to examine the data gathered by Sundaresan’s routers and try to characterize the performance of different applications across various ISPs and service plans, as well as determine the effects different applications have on network properties.
Concurrent with the BISMark study, Sundaresan worked with Feamster to help the Federal Communications Commission review data from a much larger deployment of measurement devices in some 4,200 homes across the United States. For the Georgia Tech team, the next step is global: Feamster’s team is looking for volunteers to host the devices around the world, and that in itself is a challenge.
“We need control centers wherever we go, especially when we take them so far out of the country,” Sundaresan says. “The maintenance becomes very difficult because we’re physically so far away, so we need some kind of local support, some kind of university or corporate setting where, if something goes wrong and the server needs a reboot or some small hack, we can ask someone to do it.”
Data collection is also a central challenge on the censorship side: “If you’re using [this browser extension we want to build] in China or in Iran, the government may not want you to contribute this information,” Burnett says. “So you need to have some way of getting the data out, and also giving it back to people.”
“Ideally, we would have perfect data—we’d just collect the entire browsing history,” says Burnett, explaining the data-collection challenge from the personal level. “But you can’t really do that because no one will agree to it. If I told you, ‘I’m going to collect your entire browsing history, and trust me, I’m going to anonymize it. Just trust me.’ Are you going to trust me?”
“Researchers always need data, but the problem of data collection is a bit of a chicken-and-egg problem: to get meaningful results, you need data, but in order to get data, you first need to demonstrate to people that you can provide something meaningful,” Feamster says. “Our approach to this Catch-22 has always been to build tools that solve problems that people want solved. In the case of these projects, that problem is providing the user meaningful, relevant information about their Internet performance.
“As Internet connectivity and access to information starts to become more of a fundamental human right, users want tools that tell them about the connectivity they’re getting and the information that they can or cannot access,” he continues. “As we strive to provide these tools to users, we hope not only to provide better information to users about their network connectivity, but also to improve the communications network itself to better serve the needs of its users.”