GVU's WWW User Survey: Question Database


Sponsor: Colleen Kehoe & Kim Morton
colleen@cc.gatech.edu, psg95km@prism.gatech.edu
260A CoC
Area: WWW, programming, databases


Problem


The World-Wide Web is clearly one of the most popular Internet resources. Yet because of its distributed, global nature, very little is known about its users, their characteristics, and why they use the Web. A better understanding of these users and their reasons for accessing the Web will lead to better Web-related tools and technologies and will make the Web more usable for everyone. The Graphics, Visualization, & Usability (GVU) Center's World Wide Web User Surveys are a public service effort. To date, over 90,000 responses have been collected through seven surveys, and a basic analysis of the responses is available to the public free of charge. These results are widely cited and are used by researchers, developers, businesses, and policy makers for a variety of purposes. The survey contains questions on demographics, Web usage, privacy, online purchasing, virtual banking, politics, webmasters, Web authors, and other topics. It is one of the more high-profile projects in the GVU Center.

The code that runs the survey is written in Perl and runs as CGI scripts. Currently, every question asked in the survey is embedded in the source code, which makes adding and removing questions a tedious process. We would eventually like to rewrite parts of the survey code so that questions are selected (perhaps on the fly) from a database. We have identified a series of steps that we believe will allow a smooth transition from our current setup to that goal:

  1. Design a database to hold the current set of questions that also allows the history of each question to be recorded (e.g., when it was last asked, how many times it has been asked). Some questions are also adaptive: they may have follow-up questions associated with particular answers. These relationships must be captured in the database as well. Put the current set of questions (about 200) in the database.
  2. Write a program that allows someone to select questions from the database for a particular questionnaire. The output of this program should be identical to the format currently used to store the questions so that it can be integrated into the current survey. Even so, some changes to the current survey code will be needed. Identify the affected pieces and separate them out so that they can be generated as well.
  3. Fully integrate the database output with the survey code (i.e., make the changes identified in the previous step).
  4. Investigate the feasibility of having the survey draw questions directly from the database. (The reason for doing this is so that questions can be presented in a random order or to only 50% of respondents, for example.) The server is very heavily used while we are collecting responses, so performance estimates should be part of this step.
  5. Build it!
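
To make steps 1 and 4 more concrete, here is a minimal sketch of one possible design. It is written in Python with the standard sqlite3 module purely for illustration; the survey itself is Perl, and the choice of database is open. Every table name, column name, function name, and sample row below is a hypothetical assumption, not part of the existing survey code. The sketch stores questions, records each time a question is asked (so "how many times" and "when last" can be queried), links answers to follow-up questions, and picks questions for one respondent in random order with optional per-question sampling.

```python
import random
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE question (
    id   INTEGER PRIMARY KEY,
    text TEXT NOT NULL
);
CREATE TABLE answer (
    id          INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES question(id),
    text        TEXT NOT NULL,
    -- Adaptive questions: choosing this answer triggers a follow-up question.
    followup_id INTEGER REFERENCES question(id)
);
CREATE TABLE asked (
    question_id INTEGER NOT NULL REFERENCES question(id),
    survey      TEXT NOT NULL,  -- label of the survey that used the question
    asked_on    TEXT NOT NULL   -- ISO date, so MAX() gives the latest
);
"""


def build_demo_db():
    """Create an in-memory database filled with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO question (id, text) VALUES (?, ?)", [
        (1, "Do you maintain a Web site?"),
        (2, "How many pages does your site have?"),
        (3, "How often do you use the Web?"),
    ])
    conn.executemany(
        "INSERT INTO answer (question_id, text, followup_id) VALUES (?, ?, ?)",
        [(1, "Yes", 2), (1, "No", None), (3, "Daily", None), (3, "Weekly", None)],
    )
    conn.executemany("INSERT INTO asked VALUES (?, ?, ?)", [
        (1, "demo-a", "2000-01-01"),
        (1, "demo-b", "2000-02-01"),
        (3, "demo-b", "2000-02-01"),
    ])
    conn.commit()
    return conn


def history(conn, question_id):
    """Return (times_asked, last_asked_on) for one question."""
    return conn.execute(
        "SELECT COUNT(*), MAX(asked_on) FROM asked WHERE question_id = ?",
        (question_id,),
    ).fetchone()


def followups(conn, question_id):
    """Map each answer text to the follow-up question id it triggers, if any."""
    rows = conn.execute(
        "SELECT text, followup_id FROM answer WHERE question_id = ? ORDER BY id",
        (question_id,),
    )
    return dict(rows.fetchall())


def top_level_ids(conn):
    """Ids of questions that are not the follow-up of any answer."""
    rows = conn.execute(
        "SELECT id FROM question WHERE id NOT IN "
        "(SELECT followup_id FROM answer WHERE followup_id IS NOT NULL) "
        "ORDER BY id"
    )
    return [row[0] for row in rows]


def questionnaire_for(conn, rng, sample_rates=None):
    """Pick top-level questions for one respondent, in random order.

    sample_rates maps a question id to the fraction of respondents who
    should see it (e.g. 0.5 for 50%); questions not listed are always shown.
    """
    sample_rates = sample_rates or {}
    chosen = [q for q in top_level_ids(conn)
              if rng.random() < sample_rates.get(q, 1.0)]
    rng.shuffle(chosen)
    return chosen
```

For example, questionnaire_for(conn, random.Random(), {3: 0.5}) would show question 3 to roughly half of respondents. A real design would also need to reproduce the format currently used to store the questions (step 2) and to be measured under the server load described in step 4; neither is attempted here.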

Obviously, we don't expect any person or group to do all of this in 2 weeks. More reasonably, we would like to see each person or team take one step. Since this is a real development project, students who work on it later in the quarter must build on the work that has already been done. However, we will allow more than one person or team to work on the same step at the same time in order to explore different solutions to the problems. Students are especially encouraged to work together (in groups of four or fewer) on this project, or at the very least to discuss it with each other.

Background

Deliverables

It depends on which stage of the process we're in, but generally:

Not all deliverables will be applicable in all cases -- check with Colleen to be sure.

Evaluation
Evaluation is based on how well your work fits in with the larger plan for the survey code. The evaluation of the deliverables will take into account the student's familiarity with Perl, databases, UNIX, etc.

Please see Colleen before beginning this project!