GVU's 10th WWW User Survey

This is the main document for the Graphics, Visualization, & Usability Center's (GVU) 10th WWW User Survey. GVU runs the Surveys as a public service and, as such, all results are available online (subject to certain terms and conditions). The 10th Survey ran from October 10, 1998 through December 15, 1998. The GVU Survey is now also sponsored by a Corporate Council that provides financial support to the survey effort as well as new directions for the surveys to explore. Special pointers to the survey were provided by Yahoo, MindSpring, and DoubleClick.

The winners of the $100 cash prizes for the Tenth Survey are: Chris D. (New Jersey), Joyce R. (North Carolina), Michael F. (California), Alethea B. (Texas), Caroline P. (Canada), Eric H. (Washington), Chris S. (Pennsylvania), Rochelle W. (New Zealand), and Ingrid M. (California). Congratulations and thanks to all who participated!

Over 5,000 web users participated in the survey. Questions were asked on the following topics:
Basic Sections:
Electronic Commerce:
Special Sections:

Get an overview of the findings by reading:

Tenth Survey Report [HTML]
Read about previous surveys in: 
General Survey Information (Past & Future Surveys) 
Special Presentation of Selected Results for the WWW History Day (April 1997) 
Published Papers & Presentations on the Surveys 
Media Responses, Press Releases, & Appearances
Understand how the results are collected by reading: 
Survey Methodology and Limitations of the results, and Technical Information 
Dig into the details by looking at the: 
Tables and Graphs (GIF) for each question
Conduct your own analysis by using our: 
Collected Datasets, and 
Original Questionnaires
Look at other web and internet surveys at: 
Cyber Dialogue 
CyberAtlas - a good starting point 
Nua Internet Surveys - monthly coverage of major surveys 
More sources... 
Read the fine print: 
Special Thanks 
Copyright Information 
The WWW-Surveying Mailing List 
WWW Corporate Council

Executive Summary


The Executive Summary is available as a separate document.



Survey Methodology

The Internet presents a unique problem for surveying. At the heart of the issue is the methodology used to collect responses from individual users. Since there is no central registry of all Internet users, completing a census, where an attempt is made to contact every user of the Internet, is neither practical nor financially feasible. As such, Internet surveys attempt to answer questions about all users by selecting a subset of users to participate in the survey. This process of determining a set of users is called sampling, since only a sample of all possible users is selected.

Sampling

There are two types of sampling: random and non-probabilistic. Random sampling creates a sample using a random process for the selection of elements from the entire population. Thus, each element has an equal chance of being chosen to become part of the sample. To illustrate, suppose that the universe of entities consists of a hat that contains five slips of paper. A method to select elements from the hat using a random process would be to 1) shake the contents of the hat, 2) reach into the hat, and 3) pick a slip of paper with one's eyes closed. This process ensures that each slip of paper has an equal chance of being selected. As a result, one could not claim that some slips of paper were favored over others, causing a bias in the sample.
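
As an aside, the hat-drawing procedure above is easy to express in code; the following is a minimal sketch in Python, where the slips and the sample size are purely illustrative:

```python
import random

# Five hypothetical slips of paper in the hat.
slips = ["slip-1", "slip-2", "slip-3", "slip-4", "slip-5"]

# random.sample() draws without replacement, giving every slip an
# equal chance of selection -- the software analogue of shaking the
# hat and drawing with one's eyes closed.
sample = random.sample(slips, k=2)
print(sample)  # e.g. ['slip-4', 'slip-1']
```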

Given that the sample was selected using a random process, and each element had an equal chance of being selected for the sample, results obtained from measuring the sample can be generalized to the entire population. This statistical affordance is why random sampling is widely used in surveys. After all, the whole purpose of a survey is to collect data on a group and have confidence that the results are representative of the entire population. Random digit dialing, also called RDD, is a form of random sampling where phone numbers are selected randomly and interviews are conducted over the phone.
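
To make the generalization claim concrete, the usual margin-of-error calculation for a proportion estimated from a simple random sample can be sketched as follows; the sample size and observed proportion here are hypothetical:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# With 1,000 randomly selected respondents and an observed proportion
# of 0.5, the population estimate is accurate to within about +/-3.1%.
print(round(margin_of_error(0.5, 1000), 3))  # 0.031
```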

Non-probabilistic sampling does not ensure that elements are selected in a random manner. It is then difficult to guarantee that certain portions of the population were not excluded from the sample, since elements do not have an equal chance of being selected. To continue with the above example, suppose that the slips of paper are colored. A non-probabilistic methodology might select only certain colors for the sample. It becomes possible that the slips of paper that were not selected differ in some way from those that were selected. This would indicate a systematic bias in the sampling methodology. Note that it is entirely possible that the colored slips that were not selected did not differ from the selected slips, but this could only be determined by examining both sets of slips.
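
Continuing the colored-slips illustration in code (the colors and the selection rule are invented for the example), a non-probabilistic rule can silently exclude part of the population:

```python
# Hypothetical hat of colored slips.
slips = [("red", 1), ("red", 2), ("blue", 3), ("blue", 4), ("blue", 5)]

# A non-probabilistic rule: only red slips are ever selected.
sample = [slip for slip in slips if slip[0] == "red"]

# Blue slips had zero chance of selection. If blue slips differ from
# red ones in some way, the sample carries a systematic bias -- and
# without examining the excluded slips, we cannot know.
print(sample)  # [('red', 1), ('red', 2)]
```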

Self-selection

Since there is no centralized registry of all users of the Internet and users are spread out all over the world, it becomes quite difficult to select users from the entire population at random. To simplify the problem, most surveys of the Internet focus on a particular region of users, typically the United States, though surveys of European, Asian, and Oceanic users have also been conducted. Still, the question becomes how to contact users and get them to participate. The traditional methodology is to use RDD. While this ensures that the phone numbers, and thus users, are selected at random, it potentially suffers from other problems as well, namely self-selection.

Self-selection occurs when the entities in the sample are given a choice to participate. If a set of members in the sample decides not to participate, the ability of the results to generalize to the entire population is reduced. This decrease in the confidence of the survey occurs because the group that decided not to participate may differ in some manner from the group that participated. It is important to note that self-selection occurs in nearly all surveys of people. In the case of RDD, if a call is placed to a number in the sample and the user hangs up the phone, self-selection has occurred. Likewise, if in a mail-based survey certain users do not respond, self-selection has occurred. While there are techniques like double sampling to deal with those members who chose not to participate or respond, most surveys do not employ these techniques due to their high cost.
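
A small simulation can make the effect of self-selection concrete; the population mix and response rates below are invented purely for illustration:

```python
import random

random.seed(0)

# Hypothetical population: 30% heavy Web users, 70% light users.
population = ["heavy"] * 3_000 + ["light"] * 7_000

# Self-selection: suppose heavy users agree to respond 80% of the
# time, while light users respond only 20% of the time.
response_rate = {"heavy": 0.8, "light": 0.2}
respondents = [p for p in population if random.random() < response_rate[p]]

# Heavy users are over-represented among respondents relative to
# their true 30% share of the population.
share = sum(1 for r in respondents if r == "heavy") / len(respondents)
print(f"heavy users among respondents: {share:.0%}")  # roughly 63%
```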

 

GVU's WWW User Survey Methodology

Unlike most other surveys, GVU's WWW User Surveys are conducted over the Web, i.e., participants respond to questionnaires posted on the Web. In fact, GVU pioneered the field of Web-based surveying in January of 1994 with the first publicly accessible Web-based survey. The GVU Center conducts the surveys every six months as a public service to the WWW community.

The GVU Surveys employ non-probabilistic sampling. Participants are solicited in the following manner:

There are several points to be made here. First, the above methodology has evolved due to the fact that there is no broadcast mechanism on the Web that would enable participants to be selected or notified at random. As such, the methodology attempts to propagate the presence of the surveys through diverse media. Second, high exposure sites are sites that capture a significant portion of all WWW user activity as measured by PC-Meter. These sites are specifically targeted to increase the likelihood that the majority of WWW users will have been given an equal opportunity to participate in the surveys. Additionally, content-neutral sites are chosen from the list of most popular sites to reduce the chance of imposing a systematic bias in the results. Finally, the Seventh Survey was the first survey to experiment with the random rotation of banners through advertising networks. The ability of the advertising networks to randomly rotate banners is relatively new, one that did not exist during the first three years of GVU's Surveys. This ability goes a long way towards ensuring that members of the WWW community have been selected at random. Since this technique is still quite experimental, its effect on the reliability of the results cannot yet be determined, though we will be examining this effect in future research.
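
For illustration only, random banner rotation of the kind the advertising networks perform might look like the following sketch; the banner pool is hypothetical and this is not GVU's or the networks' actual code:

```python
import random

# Hypothetical pool of survey banners rotated across page views.
banners = ["gvu-survey-a.gif", "gvu-survey-b.gif", "gvu-survey-c.gif"]

def pick_banner() -> str:
    # Each page view independently draws a banner at random, so every
    # visitor has the same chance of being shown the survey invitation.
    return random.choice(banners)

print(pick_banner())
```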

New to the Sixth Survey was the introduction of incentive cash prizes. Respondents who completed at least four questionnaires became eligible for one of several $250 US awards. Our initial investigation into the effect of including incentives in the design of the surveys reveals that while the overall number of respondents did not increase tremendously, the total number of completed questionnaires did increase significantly. Compared to the Third Survey, which had over 23,000 respondents to the General Questionnaire and 60,000 completed questionnaires (an average of 2.6 completed questionnaires/user), the Seventh Survey received over 19,000 responses to the General Questionnaire and close to 88,000 completed questionnaires (an average of 4.6 completed questionnaires/user). The effect of offering incentives on self-selection is an open research issue, though it is a technique that has been employed widely throughout traditional survey methodologies, e.g., Nielsen's set-top box sample. For the Ninth Survey, ten respondents were chosen to receive a $100 cash prize.
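
As a quick check of the arithmetic, the per-user averages quoted above follow directly from the (approximate) totals:

```python
# Completed questionnaires per respondent, from the totals above.
print(round(60_000 / 23_000, 1))  # Third Survey: ~2.6 per user
print(round(88_000 / 19_000, 1))  # Seventh Survey: ~4.6 per user
```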

Since random sampling techniques are not employed consistently throughout the methodology, the ability of the collected data to generalize to the entire population is reduced, because certain members of the Web user community may not have had an equal chance to participate. The characteristics of these users may differ significantly from those of the users who did participate in the surveys. As it turns out, comparison of GVU's WWW User Survey results to other published WWW user data gathered with random sampling techniques reveals that the main areas where GVU's Surveys show a bias are the experience, intensity of usage, and skill sets of the users, but not the core demographics of users. Intuitively this makes sense, as only those users who are able to use the WWW are able to participate in the Surveys, whereas a set of RDD respondents may claim to be able to use the Internet or to have used the Web at some time in the past. Such users are not likely to be included in the GVU results. However, for many marketing needs, this bias is exactly what is desired of the data: real data from real users online today.

Given the limitations that exist in the data as a result of the methodology, we make the following recommendation to those using the data presented within this report:

Despite the evidence to support the Survey results, we remain unconvinced that the Survey's sampling methodology is optimal and welcome suggestions and further comments on this subject.



Technical Information

Descriptive Statistics

Most analyses were conducted using SPSS 8.0 for Windows NT. Additional analyses were conducted with Excel 98 on Windows NT.

Execution

The Surveys were executed on a dedicated quad-processor Sun SPARC 20. All HTML pages were generated on the fly via our Survey Engine (written in Perl). For more information about how the Survey Engine actually works, see the write-up in the paper on the Second Survey results. For those interested in more information about the Adaptive Java Surveying Applet, please see the write-up in Surveying the Territory: GVU's Five WWW User Surveys, Colleen M. Kehoe & James E. Pitkow, The World Wide Web Journal, Vol. 1, No. 3. Please direct inquiries about the availability of the survey code to: www-survey@cc.gatech.edu.
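
The Survey Engine itself was written in Perl; purely as a sketch of the general idea of generating questionnaire pages on the fly (and not GVU's actual code), a CGI-era engine might assemble HTML like this:

```python
# Hypothetical sketch of on-the-fly questionnaire generation; the
# question names and form action are invented for illustration.
questions = [
    ("age", "What is your age?"),
    ("years_online", "How many years have you used the Web?"),
]

def render_form(questions: list[tuple[str, str]]) -> str:
    rows = [
        f'<p>{text}<br><input type="text" name="{name}"></p>'
        for name, text in questions
    ]
    return (
        '<html><body><form method="post" action="/survey">'
        + "".join(rows)
        + '<input type="submit" value="Submit"></form></body></html>'
    )

print(render_form(questions))
```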



Special Thanks

Special thanks go to Georgia Tech's College of Computing's Computer Network Services for their excellent expert support, especially: Dan Forsyth, Peter Wan, Karen Barrett, YingQing Wang and David Leonard.

Questionnaires and advice were contributed by:

Additional thanks are extended to:

The fabulous artwork used as the logo for these pages was created and generously loaned to the Surveys by artist/graphic designer Allyana Ziolko.



Copyright 1994-1997
Georgia Tech Research Corporation
Atlanta, Georgia 30332-0415
ALL RIGHTS RESERVED
Usage Restrictions 
For more information or to submit comments: 
send e-mail to www-survey@cc.gatech.edu.
 GVU's WWW Surveying Team
GVU Center, College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280
 
Corporate Council: Sun Microsystems, Andersen Consulting, NCR, CyberDialogue, Yahoo, Scientific Atlanta, and Top Jobs on the Net.
With special thanks to DoubleClick and MindSpring for their support in advertising the Tenth Survey.