GVU's 6th WWW User Survey
This is the main document for the Graphics, Visualization, & Usability Center's (GVU) 6th WWW User Survey. GVU runs the Surveys as a public service and as such, all results are available online at no charge (subject to certain terms and conditions). The 6th Survey was run from October 10, 1996 through November 10, 1996 and was endorsed by the World Wide Web Consortium (W3C) (which exists to develop common standards for the evolution of the Web) and INRIA (the acting European host for the W3C in collaboration with CERN, where the Web originated). The $250 US cash prize winners are Donald P. from Michigan and Sue M. from California.
Over 59,400 unique responses were collected from over 15,000 unique respondents. Questions were asked on the following areas:
Basic Sections: Consumer Sections:
- General Demographics
- Web and Internet Usage
- Data Privacy, Censorship, etc.
- Security of Transactions
- Information Gathering Behavior
- Purchasing Behavior
- Opinions of Vendors
- HTML Authoring, Java, etc.
- Web/Internet Service Providers
Table of Contents
- Executive Summary
- The Results
- High Level Summary and Trend Analysis (HTML)
- Graphic Presentation of Tables and Graphs (GIF)
- Bulleted Lists of Interpretations for each question (ASCII)
- Collected Datasets
- Original Questionnaires
- Other Survey Information
- General Survey Information (Past & Future Surveys)
- Special Presentation of Selected Results for the WWW History Day
- Survey Methodology and Limitations of the Survey Results
- Technical Information
- Published Papers & Presentations on the Surveys
- Media Responses, Press Releases, & Appearances
- Other Sources of Internet/WWW Statistics and Demographics
- Copyright Information
- Miscellaneous Information
The Internet represents the most viable and fertile testbed for future global interactive systems. Many golden opportunities are readily leveraged off knowledge of how this evolving medium is and is not being utilized and by whom. Given the rapid rate of change of Internet related technologies and its user base, examination of a snapshot of the user population and usage patterns, even if performed with the utmost attention, can be misleading. Behind the numbers that represent current users are trends and emerging traits that paint the real picture. With knowledge of past and current patterns, one can comfortably make decisions about the future (at least as comfortable as decisions go on the Internet).
GVU's WWW User Surveys pioneered the field of Web-based surveying in January of 1994. This was just after the introduction of CGI (Common Gateway Interface) and HTML Forms--technologies that made communication between users and sites possible. Bear in mind that this was quite a long time ago for the WWW, a period that predated Netscape and Java. Since then, GVU's Surveys have been conducted every six months, providing one of the oldest sets of data on WWW and Internet demographics and usage.
These pages cover the latest results from GVU's Sixth WWW User Survey, conducted October 10 through November 10, 1996. Longitudinal analysis incorporating data from previous surveys is integrated into the report, yielding some of the most complete coverage of the user population available. Just the same, presentation of all the results is an arduous task (please forgive any typos and spelling errors). We've created close to 300 graphs (See: graphs and tables) of the results and added our interpretation to each question asked in the Survey. These interpretations are also available in a separate, non-graphical format ideal for printing and offline reading (See: bulleted lists of the findings). The bulleted lists provide an easy way for users to scan the results non-graphically first, and then inspect the graphs for only those questions of interest. Needless to say, there are a lot of interesting results, and the high level summary below points out the more interesting findings. In addition, PDF files of the entire set of HTML pages presented herein will be available very soon.
For all questions, analyses between the following groups were performed: European vs US users, Female vs Male users, and by age (19-25, 26-50, 51+). Throughout the course of the surveys, we've experimented with different stratifications and found these to be the most revealing.
As the Internet and the WWW continue their explosive growth, we will continue to provide the community with data from GVU's Surveys. Thanks for your interest and participation in the surveys and we look forward to your participation next time around starting April 10, 1997!
Jim Pitkow &
GVU's WWW User Survey Team
High Level Summary and Trend Analysis
Cultural and Societal Impact
What do users feel is the most important issue facing the Internet?
The largest category of respondents (35.9%) said that censorship was the most important issue facing the Internet today. That was followed by privacy (26.2%) and navigation (14.1%). The issues that were the least cited as most important were cultural and language issues. Among European respondents, navigation outranked privacy as the second most important issue. And among women, privacy outranked censorship as the most important issue. Although the top 3 concerns had the same relative ranking for each age group (censorship, privacy, navigation), younger people were far more concerned with censorship than older users. Conversely, older people were more concerned with navigation.
What do users feel about the continued dominance of the English language on the Web?
For this question, respondents could choose more than one response. The statement most strongly agreed with (59.2%) was that the web's impact on language and culture will be more helpful than harmful. Other statements that were strongly agreed with were that the web would help business, would unify languages, and unify people. European respondents especially felt that the web would help business and unify languages. More respondents from Europe than those from the US felt that the web would cause a loss of linguistic and cultural diversity. This is not that surprising given that Europe is more diverse than the US and that much of the current "culture and language" of the web comes from the US. Respondents over age 50 agreed more strongly than younger respondents with the positive impacts of the web (more helpful than harmful, will help business, and will unify people and languages). Younger respondents were more likely to see the web as harmful and causing a loss of diversity.
Since getting on the Internet, users have become...
Almost half (46.1%) of the respondents felt more connected to people who share their interests since coming online. This provides some evidence for the claim that the Internet is more than just an information source; rather, it is building new communities based on common interests instead of common geographic locations.
What's the average age?
The average age of users responding to the Sixth survey is 34.9 years old. Average age has been slowly but steadily increasing since the Fourth survey (Fourth: 32.7 years, Fifth: 33.0 years). Also consistent with previous surveys is the observation that, on average, women are slightly younger than men and Europeans are significantly younger than US respondents.
What's the gender ratio & how has this changed over time?
The gender ratio is nearly identical to the Fifth survey, with 31.4% female and 68.6% male. European users are still predominantly male (80.2%). There was a slight increase in the percentage of women over age 50 in this survey (Sixth: 27.1%, Fifth: 24.7%). It is interesting to compare these percentages to the First Survey conducted in January 1994, where 95% of the users were male, and the Third Survey, where 82% of the users were male.
What about location, marital status, & occupations?
The percentage of respondents from the US increased in this survey to 82.7%. This is even higher than the percentage in the Fourth survey (80.6%). 83% of female respondents were from the US, but all locations were more gender-balanced in this survey compared to previous surveys. Older respondents are more likely to be from the US than younger respondents (89% of those over 50 compared to 75% of those 19-25).
There was a slight increase in the percentage of respondents who are married in all categories. 45.7% report being married and 36.7% report being single. More Europeans than respondents from the US report being either single or living with another. Almost 3/4 of those age 19-26 are single, while almost 3/4 of those age 50 and over are married.
There was a slight increase in the percentage of users in Management and Professional categories. European users were more likely to be in Computers or Education than their US counterparts. Women are only half as likely as men to be in Computer related fields, but are equally likely to be in Management or Professional positions. More than half of those age 19-25 are in Education (which includes being a student). Those aged 26-50 are more likely to be in Computer fields than any other.
How willing are users to pay for access to Web sites?
More than 2/3 of respondents (67.6%) reported that they were not willing to pay fees for accessing web materials. This number is up slightly from the Fifth survey. Respondents' unwillingness to pay may stem from their perception of the value of the information currently available on the web and may change as people become used to high-quality professional sites. Alternatively, it may be a reflection of the fact that many users are already paying a service provider for access and may not be willing to pay again for content. Of those who were willing to pay, most preferred a subscription model. Women and younger respondents were slightly less willing to pay than other categories of respondents.
This was a refined questionnaire originally launched in the Fifth Survey. It investigates the political profile of Web users as well as their online political activities.
What is the political profile of Web users?
This was a new question for this survey and will probably be revised for the next survey. The results, however, do provide an interesting first pass at understanding the political makeup of the web community. For this question, respondents were asked to state whether they agreed with, disagreed with or weren't sure for 10 statements. Five of them dealt with personal issues and five with economic issues. Each answer was given a certain number of points. Each respondent was then given a score for personal and economic issues and plotted on the two-dimensional graph below. Please take a look at the graph. The sizes of the dots in the graph indicate how many respondents fell into that category. The definitions of the various terms are as follows:
- Left-Liberals prefer self-government in personal matters and central decision-making on economics. They want the government to serve the disadvantaged in the name of fairness. Leftists tolerate social diversity, but work for economic equality.
- Libertarians are self-governors in both personal and economic issues. They believe the government's only purpose is to protect people from coercion and violence. They value individual responsibility, and tolerate economic and social diversity.
- Centrists favor selective government intervention and emphasize practical solutions to current problems. They tend to keep an open mind on new issues. Many centrists feel that the government serves as a check on excessive liberty.
- Right-Conservatives prefer self-government on economic issues, but want official standards in personal matters. They want the government to defend the community from threats to its moral fiber.
- Authoritarians want government to advance society and individuals through expert central planning. They often doubt whether self-government is practical. Left-authoritarians are also called socialists, while fascists are right-authoritarians.
These definitions and the questions used are Copyright 1995-6 by Advocates for Self-Government, who are otherwise unaffiliated with this survey.
The largest group of web users fell in the Centrist category (38.4%) followed by Left-liberal (27.3%) and Libertarian (25.1%). Most puzzling is the large dot at 100% in both categories. We are still looking into why this particular point is so large and out of proportion with the ones around it.
What are their voting behaviors?
This question was only asked of those who said they were registered to vote. Of those registered, 50-60% voted in the most recent local, national, and legislative elections. Older voters were more than twice as likely to have voted in national and legislative elections. However, more than half of younger voters voted in the most recent local election. This can probably be explained by the fact that most European respondents fall into the younger age groups, and Europeans report primarily participating in local elections.
We predicted correctly that issues of data privacy would become increasingly important as the Internet became a part of many people's daily lives. This refined questionnaire, which was originally launched in the Fifth Survey, provides some of the freshest insights into users' knowledge of and concerns about data privacy issues.
How often do people falsify online registration information?
This question was rephrased since the last survey, so the results are not directly comparable. For this survey, 63.1% of respondents said they had never provided false information to a site when registering. 3.4% preferred not to say, which leaves 33.5% who have provided false information. Of those who have provided false information, most (66.5%) do so infrequently (less than 25% of the time). Only 33.5% provide false information frequently (more than 25% of the time). A smaller percentage of females than males report ever having falsified information. Also, the likelihood of having provided false information decreases with age.
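The remainder quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are ours, and the last two figures are derived values that do not appear in the survey tables. They assume the 66.5%/33.5% split applies to everyone who reported ever providing false information.

```python
# Percentages quoted in the paragraph above.
never_falsified = 63.1       # never provided false information
preferred_not_to_say = 3.4   # declined to answer

# What is left over after removing the two groups above.
have_falsified = round(100.0 - never_falsified - preferred_not_to_say, 1)

# Split among those who have falsified (assumed to share the same base).
infrequent_share = 66.5      # less than 25% of the time
frequent_share = 33.5        # more than 25% of the time

# Derived, illustrative figures: share of ALL respondents in each group.
infrequent_of_all = round(have_falsified * infrequent_share / 100, 1)
frequent_of_all = round(have_falsified * frequent_share / 100, 1)

print(have_falsified, infrequent_of_all, frequent_of_all)
```

Under that assumption, roughly one respondent in nine would be a frequent falsifier.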
Why do people not register at sites?
The most widely cited reason for not registering is that the terms and conditions of how the collected information is going to be used are not clearly specified (70.15%). Users also feel very strongly that revealing the requested information is not worth being able to access the site (69.95%). Thus, while the foremost problem of unclear terms and conditions of use can be easily rectified, the latter problem of weighing the demographics collected against access to a site is not as straightforward. An equally difficult issue is building trust between entities. Over 62% report that they do not trust the collecting site. Efforts that attempt to help ensure the data privacy standards of sites, like E-Trust, may be able to help alleviate this lack of trust.
It turns out that the time it takes to complete the form is a factor (38.9%), but not as significant as the others. Much of the remaining difficulty resides in the type of information collected, with 45.33% not registering because of postal mail requirements, 30.74% because of name requirements, and 21.99% because of email requirements. Thus, proposals that call for business cards to be built into the browser, along with protocols that would enable them to be easily deposited at sites, are not a cure-all for this problem but will help somewhat.
What information do people think ought to be automatically recorded during a Web transaction?
Three out of four users agree that sites ought to be able to record the page that is requested (76.60%) and the time of the page request (74.42%). Under half (43.71%) feel that the browser that users are using ought to be loggable. The machine name/address (27.00%), the operating system the user operates (26.83%), the user's email address (21.03%), and the location of the user (19.70%) were all not high on people's list of things to record per page request. It is interesting to note that all of the above information except email and location can be reliably gathered for every page request by most users of the WWW. When asked about an identifier that would uniquely label users across sessions at a site, less than one out of every five (19.08%) thought that this should be possible. Yet such identifiers, known as cookies, already exist and are widely supported by browsers. There is already evidence of controversy over the use of, and lack of control over, cookies between the technically savvy portions of the user community and the advertising community, which desires fine-grained measurement of usage.
What are some of their opinions on various issues surrounding anonymity?
Privacy and anonymity go hand-in-hand, but exactly how does the Web community feel about the specific issues surrounding anonymity on the Internet? This question asked people to rate their agreement/disagreement on a 5-point scale, with '1' representing strong disagreement, '5' representing strong agreement and '3' neutrality. Nearly everyone felt strongly that people ought to be able to have private communications over the Internet (4.70). People tend to seriously value the anonymous nature of the Internet (4.46). Most people prefer anonymous payment systems (3.93) and feel that the Internet needs new laws to protect privacy (3.79). While people tend to agree that they ought to be able to take on multiple roles/aliases on the Internet (3.67), the community seems to be all over the board on the use of key escrow systems (3.09), with nearly half stating agreement with a key escrow system and half expressing disagreement.
What would users like to be done about spamming?
From this survey, people are very clear that they do not like to receive mass emailings, i.e., be spammed, but what do they propose to do about it? The majority of people responded in favor of an opt-out system, where a registry would contain the addresses of people who do not wish to receive mass emailings. Note that this is similar to the system already in place in the US that exists to remove people from junk mailing lists. Over 16% responded in favor of imposing an 'impact' fee on the agencies sending the mail. Exactly what this impact fee would be or how it would be implemented was not specified in the question. Somewhat surprisingly, only 5.89% voted in favor of government regulation making spamming illegal. This suggests that the online community favors the co-existence of users and spammers, but with users having the final say. Women and older respondents were more in favor of an opt-out registry than their counterparts (59.38% of females vs 47.38% of males, and 55.91% of those 50+ vs 48.66% of those 19-25).
WWW Usage & Preferences
Where do people access the Web from?
As with the Fifth survey, the majority of respondents report that they primarily access the web from home (63.6%). This is an increase from the Fifth survey where the percentage was 55.4%. In Europe, however, only 36.7% report having their primary access from home (most report having it from work). Across all age groups, most access the web primarily from home, but that is especially true for users over age 50 (77.6%).
How often do people use their Web browser?
While the number of times browsers are used per day has remained stable since the last survey, the number of hours people use the Web has increased, with one in five users (20.05%) reporting using their browsers over 20 hours per week. Just under one third (30.01%) spend 10 to 20 hours a week on the Web, with 17% spending 7 to 9 hrs/wk and 17.76% spending 4 to 6 hrs/wk. Casual use of under 5 hours per week is down from 16.87% in the Fifth Survey to 15.18%, further emphasizing the trend towards increased usage. For comparison, in the Third Survey conducted in April of 1995, only 28.46% of the users spent more than 10 hrs/wk on the Web. Eighteen months later, nearly twice as many users (50.06%) spend more than 10 hrs/wk! US, female, and older users are more likely to spend less time on the Web than their counterparts.
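The 50.06% figure is simply the sum of the two heaviest usage categories, and the "nearly twice" comparison can be checked the same way. This is a minimal sketch using only the percentages quoted above; the variable names are ours.

```python
# Sixth Survey: share of respondents by weekly hours of browser use.
over_20_hours = 20.05
ten_to_20_hours = 30.01

# Third Survey (April 1995): share spending more than 10 hrs/wk.
third_survey_over_10 = 28.46

# More than 10 hrs/wk in the Sixth Survey is the sum of the top two bins.
sixth_survey_over_10 = round(over_20_hours + ten_to_20_hours, 2)

# Growth factor over the eighteen months between the two surveys.
growth_factor = sixth_survey_over_10 / third_survey_over_10

print(sixth_survey_over_10, round(growth_factor, 2))
```

The growth factor comes out somewhat below 2, which is consistent with the "nearly twice as many" wording.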
Why do people use their Web browsers?
For this question, users were allowed to mark more than one answer. These responses are almost identical to the responses for the Fifth and Fourth Surveys. The most common Web activity is simply browsing (77.08%) followed by entertainment (63.79%), education (53.29%), and work (50.9%). Compared to a year ago, shopping is up to 18.83% from 11.1% in the Fourth Survey and 14.91% in the Fifth Survey. This represents a moderate and steady growth of the Web for shopping, a trend that is expected to continue as online transactions become easier and more choices become available. Europeans tend to report less recreational uses of the Web than do US users.
What are the main problems with using the Web?
Speed continues to be the number one problem of Web users (76.55%), and has been since the Fourth Survey when the question was first introduced. This is not to say that the problem has been getting worse, as the number who complained of speed is down from 80.9% in the Fifth Survey, but still higher than the 69.9% in the Fourth. This effect is most likely due to the changes in the connection speed of users to the Internet. The next biggest problems are "finding known info" (34.09%), organizing collected information (31.03%), and being able to find pages already visited (13.41%). Cost does not seem to be an issue, with only 7.75% reporting this as a problem. Given that the average household income of Web users is well above that of the general population, this is not very surprising and cannot be taken to mean that the Web is currently affordable for all. The only notable difference between genders was the problem of finding information: 31.01% of males and 40.33% of females reported this problem. This difference was found in the Fifth Survey as well. No major differences were reported across age groups.
How often do people use the Web instead of watching TV?
Almost 37% of respondents claim that they use the Web instead of watching TV on a daily basis. An additional 29.03% say the Web replaces TV on a weekly basis, usually more than once a week. This pattern almost exactly mirrors the pattern found in the Fifth Survey. These numbers, taken together with the finding that email is used on a par with the phone, paint a tremendously strong picture of the rapid integration of the Internet and World Wide Web into the fabric of the lives of those who currently use them. This is truly an amazing time.
How fast are people's connections to the Internet?
Modems still dominate the day for Web users, with just over half of the users (51.40%) using 28.8 Kb/sec modems and 19.69% using 14.4 Kb/sec modems. This represents a significant shift from the Fifth survey, where only 39.0% were using 28.8 Kb/sec and 25.5% were using 14.4 Kb/sec modems. The trend of an increasing number of respondents connecting at speeds less than or equal to 28.8 Kb/sec is still occurring, with 71.59% of the respondents using 28.8 Kb/sec or less, up from 65.5% in the Fifth and 61% in the Fourth Survey.
Purchasing, Security, and Vendors
This set of questionnaires provides an in-depth view of not only what purchases people make online, but also where they gather product information, comparisons of online commerce to other mediums, attitudes towards security and characteristics of Web vendors.
What do people purchase and gather information about on the WWW?
The use of the Web for gathering purchase related material and making actual purchases has increased significantly in the past year. The most popular items bought and information gathered on are computer software and hardware. Over half of the users report using the Web to gather information on software and hardware, with more people using the Web for items over $50 than for items under $50. This gathering corresponds to purchases as well, with between 15% and 30% of the users making online purchases for hardware and software of various prices (15.09% for hardware under $50, 17.43% for hardware over $50, 29.11% for software under $50, and 20.84% for software over $50).
Other popular items sought and bought over the Web include: travel arrangements (48.87% sought, 20.63% bought), books and magazines (43.16% sought, 18.89% bought), and musical tapes, CDs, and albums (36.65% sought, 13.66% bought). This is a substantial increase compared to a year ago, when 44% sought and 9% bought travel arrangements; 39% sought and 11% bought books and magazines; and 38% sought and 9% bought musical tapes, CDs, and albums. Stated differently, twice as many people made purchases of travel arrangements, 6% more bought books, and 4% more bought music. Just about all other areas, including apparel, legal services, and personal items, have also shown increases in gathering and purchases through the Web. As reported in Primary Uses of the Web, shopping has nearly doubled to roughly one out of every five users shopping via the Web. The Web is truly becoming a viable medium for electronic commerce, albeit slowly.
How much have people spent in the past six months and how much do they intend to spend?
Over a third (35.85%) of the users report spending less than $10 on purchases made through the WWW in the past six months. Slightly over 20% report spending between $10 and $99, with an additional 29.50% reporting spending over $100 through the Web. The amount of large ticket item spending has increased over the past year, as only 22% reported spending over $100 in the Fourth Survey. The share of purchases under $10, though, has decreased from 55% in the Fourth Survey. As has been experienced with past surveys, users typically overestimate the amount they intend to spend in the next six months, so the following numbers may turn out to be somewhat smaller. For example, in the Third Survey, 35% of the users anticipated spending over $100 via the Web, but only 22% actually did as reported in the Fourth Survey. This time around, 32% of the users anticipate spending $100 or more, with 16% expecting to spend between $10 and $99.
Are people comfortable with sending credit card information over the WWW?
This question asked users to state their agreement (5) or disagreement (1) on a 5-point scale about providing credit card information through the Web. Overall the trend is towards increased trust in the Web for transactions, though security concerns are a primary reason for not buying on the Web (average 3.39). This sentiment has decreased slightly over the past year, where in the Fourth Survey the average was 3.56. Not purchasing over the Web due to security concerns continues to bother women more than men (3.76 women vs 3.17 men). Providing credit card information through the Web is considered riskier than giving over the phone (3.36), riskier than giving to an unknown store (3.06), and riskier than faxing to an offline vendor (2.95).
Users are divided in their agreement that providing credit card information over the Web is just plain foolish, though the average (2.95) suggests that slightly more people disagree with this statement. Compared to a year ago, users disagree with this statement more (3.56 Fourth Survey), revealing an increase in the perceived security of Web transactions. Older users are more cautious of Web security than younger users.
What do users think of Web vendors?
This question asked users to rate the importance of vendor characteristics on a scale of unimportant (1) to important (5). Users were then asked to compare Web vendors against traditional vendors on a scale where higher numbers indicate preference for Web vendors over traditional vendors. As far as importance, all characteristics were rated important, with security being the most important characteristic for users (4.646). This was followed by reliability (4.641), quality of information provided (4.600), timely delivery (4.456), and ease of contacting (4.04). Other issues like ease of ordering, refunds, and cancelations fell between 4.195 and 4.275. Surprisingly, the characteristics at the bottom of the ranking were lowest price (4.119) and ease of payment process (4.043). This profile suggests a very quality-oriented customer who expects top quality service. The ranking within the profile has changed very little since the Fourth Survey over a year ago.
Given that security was the number one issue, users reflected a preference for traditional vendors (2.305) over Web vendors. This was the weakest characteristic for Web vendors, followed by easy refunds (2.8733), reliability (2.809), ease of canceling orders (2.887), customer service (2.895), and ease of payment process (2.970). Web vendors showed strengths over traditional vendors in the areas of ease of contacting (3.709), lowest prices (3.709), easy to order (3.614), and quality of information (3.461).
Web Authors and Java
This is a refined set of questions launched in the Sixth Survey that asks Web authors about their uses and perceptions of Java, a programming language developed at Sun Microsystems which can be used to add interactivity to Web pages.
Have you used Java and do you plan to use it in the future?
The percentage of authors who have programmed in Java increased from 17.3% in the Fifth survey to 24.4% in the Sixth. A higher percentage of Europeans have used Java, probably because of their stronger programming backgrounds.
Percentages increased slightly both for those who plan to use Java and those who do not. As programmers and businesses become more familiar with Java, they have a better idea of whether it suits their needs or not. Consequently, respondents are slightly more sure this time as to whether or not they will be using Java in the next year.
What are the major advantages of Java?
This question asked Web authors what they thought the major advantages of Java were. Respondents could choose more than one answer. More than half of respondents cited Java's platform independence as a major advantage. About a third identified the fact that Java doesn't require special permissions to run (unlike CGI programs) as an advantage and 21.7% thought the level of interactivity it provided was an advantage. Europeans more than Americans saw the platform independence and interactivity as major advantages.
What are authors' perceptions and knowledge of Java's security?
The largest category of users said that they didn't know how secure Java was (45.4%). Of those who gave it a rating, 8.3% thought it was very insecure, 28.0% said somewhat insecure, 54.5% said somewhat secure and 9.0% thought it was very secure. This represents a distinct shift toward more trust in Java's security measures from the Fifth survey.
Survey Methodology
The Internet presents a unique problem for surveying. At the heart of the issue is the methodology used to collect responses from individual users. Since there is no central registry of all Internet users, completing a census, where an attempt is made to contact every user of the Internet, is neither practical nor feasible financially. As such, Internet surveys attempt to answer questions about all users by selecting a subset of users to participate in the survey. This process of determining a set of users is called sampling, since only a sample of all possible users is selected.
There are two types of sampling: random and non-probabilistic. Random sampling creates a sample using a random process for selection of elements from the entire population. Thus, each element has an equal chance of being chosen to become part of the sample. To illustrate, suppose that the universe of entities consists of a hat that contains five slips of paper. A method to select elements from the hat using a random process would be to 1) shake the contents of the hat, 2) reach into the hat, and 3) pick a slip of paper with one's eyes closed. This process would ensure that each slip of paper had an equal chance of being selected. As a result, one could not claim that some slips of paper were favored over the others, causing a bias in the sample.
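The hat example can be simulated to see the "equal chance" property in practice. The following is a minimal sketch, not part of the survey's tooling: the function names, the slip labels, the fixed seed, and the number of draws are all assumptions made for illustration. Python's `random.Random.choice` stands in for reaching into the hat with one's eyes closed.

```python
import random
from collections import Counter


def draw_slip(hat, rng):
    """Pick one slip so that every slip has the same chance of selection."""
    return rng.choice(hat)


def selection_frequencies(hat, draws, seed=0):
    """Repeat the draw many times (with replacement) and return how
    often each slip was picked, as a fraction of all draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(draw_slip(hat, rng) for _ in range(draws))
    return {slip: counts[slip] / draws for slip in hat}


# Hypothetical universe: a hat holding five slips of paper.
hat = ["slip 1", "slip 2", "slip 3", "slip 4", "slip 5"]
frequencies = selection_frequencies(hat, draws=50_000)

for slip, share in frequencies.items():
    print(f"{slip}: {share:.3f}")
```

With many draws, each slip's share should settle close to one in five, which is what it means for no slip to be favored over the others.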
Given that the sample was selected using a random process, and each element had an equal chance of being selected for the sample, results obtained from measuring the sample can generalize to the entire population. This statistical affordance is why random sampling is widely used in surveys. After all, the whole purpose of a survey is to collect data on a group and have confidence that the results are representative of the entire population. Random digit dialing, also called RDD, is a form of random sampling where phone numbers are selected randomly and interviews of people are conducted over the phone.
Non-probabilistic sampling does not ensure that the elements are selected in a random manner. It is then difficult to guarantee that certain portions of the population were not excluded from the sample, since elements do not have an equal chance of being selected. To continue with the above example, suppose that the slips of paper are colored. A non-probabilistic methodology might select only certain colors for the sample. It becomes possible that the slips of paper that were not chosen differ in some way from those that were selected. This would indicate a systematic bias in the sampling methodology. Note that it is entirely possible that the colored slips that were not selected did not differ from the selected slips, but this could only be determined by examining both sets of slips.
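A small sketch of the colored-slip example follows. The colors, the numeric values written on the slips, and the `select_by_color` helper are all hypothetical, made up to show how selecting by color can bias a measurement.

```python
from statistics import mean

# Hypothetical population of (color, value) slips; the values are invented.
slips = [("red", 10), ("red", 12), ("blue", 30), ("blue", 28), ("green", 20)]

def select_by_color(population, allowed_colors):
    """Non-probabilistic selection: keep only slips of the allowed colors."""
    return [slip for slip in population if slip[0] in allowed_colors]

sample = select_by_color(slips, {"red"})

population_mean = mean(value for _, value in slips)
sample_mean = mean(value for _, value in sample)
excluded_mean = mean(value for color, value in slips if color != "red")

print(population_mean, sample_mean, excluded_mean)
```

In this made-up data the red slips carry lower values than the rest, so a red-only sample understates the population average. As the text notes, the difference only becomes visible once the excluded slips are examined as well.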
Since there is no centralized registry of all users of the Internet and users are spread out all over the world, it becomes quite difficult to select users from the entire population at random. To simplify the problem, most surveys of the Internet focus on a particular region of users, which is typically the United States, though surveys of European, Asian, and Oceanic users have also been conducted. Still, the question becomes how to contact users and get them to participate. The traditional methodology is to use RDD. While this ensures that the phone numbers and thus users are selected at random, it potentially suffers from other problems as well, namely self-selection.
Self-selection occurs when the entities in the sample are given a choice to participate. If a set of members in the sample decides not to participate, it reduces the ability of the results to generalize to the entire population. This decrease in the confidence of the survey occurs because the group that decided not to participate may differ in some manner from the group that participated. It is important to note that self-selection occurs in nearly all surveys of people. In the case of RDD, if a call is placed to a number in the sample and the user hangs up the phone, self-selection has occurred. Likewise, if in a mail-based survey certain users do not respond, self-selection has occurred. While there are techniques like double sampling to deal with those members who chose not to participate or respond, most surveys do not employ these techniques due to their high cost.
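The effect of self-selection can be illustrated with a short simulation. Everything here is an assumption made for the sketch: the population of weekly hours online, the sample sizes, and especially the response model in `responds`, which simply supposes that heavier users are more willing to take part.

```python
import random
from statistics import mean

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: weekly hours online, uniform between 0 and 40.
population = [rng.uniform(0, 40) for _ in range(20_000)]

# Random sample: every person has the same chance of being contacted.
contacted = rng.sample(population, 5_000)

def responds(hours, rng):
    """Assumed response model: the chance of taking part grows with usage."""
    return rng.random() < hours / 40

# Self-selection: only some of the contacted people choose to participate.
respondents = [hours for hours in contacted if responds(hours, rng)]

contacted_mean = mean(contacted)     # stays close to the population mean
respondent_mean = mean(respondents)  # pulled upward by self-selection

print(round(contacted_mean, 1), round(respondent_mean, 1))
```

Even though the people contacted were chosen at random, the average among those who actually respond drifts away from the population average, because the decision to participate is related to the quantity being measured.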
GVU's WWW User Survey Methodology
Unlike most other surveys, GVU's WWW User Surveys are conducted over the Web, i.e., participants respond to questionnaires posted on the Web. In fact, GVU pioneered Web-based surveying in January of 1994 with the first publicly accessible Web-based survey. The GVU Center conducts the surveys every six months as a public service to the WWW community.
The GVU Surveys employ non-probabilistic sampling. Participants are solicited in the following manner:
- Announcements on Internet related newsgroups (e.g., comp.infosystems.www.announce, comp.internet.net-happenings, etc.),
- Banners placed on specific pages on high exposure sites (e.g., Yahoo, Lycos, etc.),
- Banners randomly rotated through high exposure sites (e.g., Webcrawler, etc.),
- Announcements made to the www-surveying mailing list, a list maintained by GVU's WWW User Surveys composed of people interested in the surveys, and
- Announcements made in the popular media (e.g., newspapers, trade magazines, etc.).
There are several points to be made here. First, the above methodology has evolved due to the fact that there is no broadcast mechanism on the Web that would enable participants to be selected or notified at random. As such, the methodology attempts to propagate the presence of the surveys through diverse media. Second, high exposure sites are sites that capture a significant portion of all WWW user activity as measured by PC-Meter. These sites are specifically targeted to increase the likelihood that the majority of WWW users will have been given an equal opportunity to participate in the surveys. Additionally, content neutral sites are chosen from the list of most popular sites to reduce the chance of imposing a systematic bias in the results. Finally, the Sixth Survey is the first survey to experiment with the random rotation of banners at high exposure sites. The ability for sites to randomly rotate banners is relatively new; it did not exist during the first two years of GVU's Surveys (1994 and 1995). This ability goes a long way towards ensuring that members of the WWW community have been selected at random. Since this technique is still quite experimental, its effect on the reliability of the results cannot yet be determined, though we will be examining this effect in future research.
Also new to the Sixth Survey was the introduction of an incentive: cash prizes. Respondents who completed at least four questionnaires became eligible for one of several $250 US awards. Our initial investigation into the effect of including incentives in the design of the surveys reveals that while the overall number of respondents did not increase significantly, the total number of completed questionnaires did increase significantly. Compared to the Third Survey, which had over 23,000 respondents to the General Questionnaire and 60,000 completed questionnaires (average x complete questionnaires/user), the Sixth Survey received over 15,000 responses to the General Questionnaire and close to 59,000 completed questionnaires (average x complete questionnaires/user). The effect of offering incentives on self-selection is an open research issue, though it is a technique that has been employed widely throughout traditional survey methodologies, e.g., Nielsen's set-top box sample, etc.
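As a rough check on the comparison above, the rounded figures quoted in the text can be divided to estimate completed questionnaires per respondent. Because the text gives the counts only as "over" and "close to", the results below are approximations derived here for illustration, not averages reported by the survey authors; the function name is likewise hypothetical.

```python
def avg_per_respondent(questionnaires, respondents):
    """Approximate completed questionnaires per General Questionnaire respondent."""
    return questionnaires / respondents

# Rounded figures as quoted in the text; the exact counts are not given there.
third_survey = avg_per_respondent(60_000, 23_000)
sixth_survey = avg_per_respondent(59_000, 15_000)

print(f"Third Survey: about {third_survey:.1f} per respondent")
print(f"Sixth Survey: about {sixth_survey:.1f} per respondent")
```

On these rounded inputs the ratio rises from roughly 2.6 to roughly 3.9, which is consistent with the observation that completed questionnaires increased even though the number of respondents did not.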
Since random sampling techniques are not employed consistently throughout the methodology, the ability of the collected data to generalize to the entire population is reduced, because certain members of the Web user community may not have had an equal chance to participate. The characteristics of these users may differ significantly from those of users who did participate in the surveys. As it turns out, comparisons of the GVU's WWW User Survey results with other published WWW user data that use random sampling techniques reveal that the main area where GVU's Surveys show a bias is in the experience, intensity of usage, and skill sets of the users, but not in the core demographics of users [1]. Intuitively this makes sense, as only those users who are able to use the WWW are able to participate in the Surveys, whereas a set of RDD users may claim to be able to use the Internet or to have used the Web at some time in the past. These users are not likely to be included in the GVU results. However, for many marketing needs, this bias is exactly what is desired of the data: real data from real users online today.
Given the limitations that exist in the data as a result of the methodology, we make the following recommendations to those using the data presented within this report:
- We recommend that the GVU data be used with the understanding that, compared with random digit dial surveys, the data is biased towards more experienced and more frequent users.
- We recommend that users interested in understanding the complete spectrum of the Internet and WWW communities augment the GVU data with random sample surveys.
Despite the evidence to support the Survey results, we remain unconvinced that the Survey's sampling methodology is optimal and welcome suggestions and further comments on this subject.
Technical Information
Statistical Inferences
All analyses were performed using Splus version 3.3 for Unix.
The Surveys were load balanced using three dedicated quad processor Sun Sparc 20s. All HTML pages were generated on the fly via our Survey Engine (written in Perl). For more information about how the Survey Engine actually works, see the write-up in the paper on the Second Survey Results. For those interested in more information about the Adaptive Java Surveying Applet, please see the write-up in Surveying the Territory: GVU's Five WWW User Surveys, Colleen M. Kehoe & James E. Pitkow, The World Wide Web Journal, Vol. 1, no. 3. Please direct inquiries about the availability of the survey code to: email@example.com.
Special Thanks
Special thanks go to Georgia Tech's College of Computing's Computer Network Services for their excellent expert support, especially: Dan Forsyth, Bryan Rank, Peter Wan, Karen Barrett, and David Leonard.
Questionnaires and advice were contributed by:
- Consumer - Sunil Gupta (Univ of Michigan),
- Politics - Roger Hurwitz (MIT), John C. Mallery (MIT), and Mark S. Bonchek (MIT), and
- Privacy - Peter Neumann (SRI), Dave Redell & the CPSR Working Group on Privacy and Civil Liberties, Brian Behlendorf (Organic), and Marc Rotenberg (EPIC).
The fabulous artwork used as the logo for these pages was created and generously loaned to the surveys by the following artist/graphic designer: Allyana Ziolko
Additional thanks are extended to:
- Dr. Jarek Rossignac (GVU's Director),
- Dr. Jim Foley (GVU's former Director),
- Randy Carpenter (GVU's Systems Administrator),
- Greg Calhoun, Emil Sarpa, and John Dutra of Sun Microsystems, who made the equipment loan possible,
- FIND/SVP's Emerging Technologies Research Group, for underwriting a portion of the survey execution,
- Michael Tchong for assistance with the graphic design and promotion,
- and last but not least to GVU & Georgia Tech!!
For more information or to submit comments:
send e-mail to firstname.lastname@example.org.