GVU's Third WWW User Survey


[ Survey Home ] [ 3rd Survey Home ] [ Graphs ] [ Bulleted Lists ] [ Datasets ]

GVU runs the surveys as a PUBLIC SERVICE and as such,
ALL RESULTS ARE FREE (subject to certain terms and conditions).


This is the home page for the Graphics, Visualization, & Usability Center's (GVU) 3rd WWW User Survey. The 3rd survey was run from April 10th 1995 through May 10th 1995 and was endorsed by both the World-Wide Web Consortium (W3C) and the National Center for Supercomputing Applications (NCSA). Over 13,000 unique responses were collected across five sets of questionnaires: General Demographics, WWW Browser Usage, Authoring Information, Consumer Attitudes & Preferences, and a pre-test directed towards Web Service Providers. This page provides access to the following:

Executive Summary

The survey team is really excited. The General Demographic Questionnaire received just over 13,000 unique responses from April 10 to May 10, more than 10 times the response rate to any other online survey. Plus, as many of you already know, GVU's WWW User Surveys pioneered the use of the Web as a surveying tool by being the first public Web-based survey in January of 1994 and by adding adaptive questioning in October of the same year. To date, we still have the most sophisticated surveying software on the Web, which permits users who have already taken a previous survey to verify their identity and correct only the questions whose values have changed, enforces question completion, detects multiple submissions allowing for overwriting of previously submitted responses, and uses multiple identification verification methods.

In a nutshell, GVU's Third WWW User Survey is the oldest & largest online survey - largest both in terms of the number of questions and the number of responses. As such, we are especially pleased to be able to offer all results and analyses free of charge, as a public service effort to the Web and Internet communities. We strongly believe that all participants, regardless of ability to pay, should have access to the most complete data gathered on Web users to date. Remember though, the data presented on the following pages is only a snapshot - we do not make any claims about the representativeness of the data with respect to the entire Web population.

To facilitate dissemination of the results, we've created over 200 graphs (See: graphs and tables) of the results and added our interpretation to each question asked as part of the survey. These interpretations are also available in a separate non-graphical format (See: bulleted lists of the findings). Within these pages, you will find comparisons that show the relationships between European & US users, Prodigy users, and Women & Men. These comparisons reveal some interesting characteristics about the Web users we sampled. For more about the differences between groups, see our high-level summary & trends analysis. Don't forget to see the Consumer Surveys pages, developed and analyzed by the Hermes team, as part of GVU's WWW User Surveys. Also, check out Erik Granerad's non-USA analysis of the collected datasets.

We've released the collected datasets and have made most material available via anonymous ftp at ftp.cc.gatech.edu in /pub/gvu/www/survey/survey-04-1995. We appreciate your patience - all good work takes time - especially since Colleen and I do all survey-related undertakings in our spare hours.

Thanks
Jim Pitkow &
Colleen Kehoe

High Level Summary and Trend Analysis

General Demographics

Analyses of the data for the Third Survey resulted in many interesting findings. Overall, we observed substantial shifts in demographics between the users who filled out the first two surveys and those who filled out the third. The users in the Third Survey represent less and less the "technology developers/pioneers" of the first survey (primarily young, computer savvy users) and more of what we refer to as the "early adopters/seekers of technology." The adopters are not typically provided access to the Web through work or school, but actively seek out local or major Internet access providers, like Prodigy.

Why all this mentioning of Prodigy?

Due to an agreement between the Hermes team (they develop and analyze the consumer attitudes questionnaires) and Prodigy (the first major online Internet access provider to enable Web access), a link to the surveys was placed on Prodigy's Web entry page for 10 days during the surveying period. This provided us with the ability to compare Prodigy's users to users in general - the first comparison of these two populations that we know of. Additionally, we stratified the respondents by location (Europe & USA) and gender (Women & Men) and performed statistical tests for differences between groups.

What's the average age?

One category that has changed considerably over time is age. The mean age for the Third Survey is 35.01 (median 35.00), up almost four years from the Second Survey. Also, only 30.44% were between the ages of 21 and 30, compared to 56% of the respondents for the First Survey. We observe no statistically significant differences across gender for age (average age for women is 35.15 years old vs 35.05 for men).

What's the gender ratio & how has this changed over time?

As for gender, 15.5% of the users were female, 82.0% male and 2.5% chose to "Rather not say!" Compared to the Second Survey, women represent a 6% increase and men an 8% decrease. Compared to the First Survey in January of 1994, this represents a 10% increase for women and a 12% decrease for men. This trend is quite linear (R squared = .98) and suggests an even male/female ratio could be achieved during the first quarter of 1997. Granted, it would be nice to have more data points to increase the confidence of these predictions (it's a good thing we started the surveys when we did). In summary, there exists a trend for the Web towards older users and towards more balanced gender ratios.
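
The linear trend described above can be illustrated with an ordinary least-squares fit. This is only a sketch, not the analysis actually run for the survey: it assumes the quoted changes are percentage points (which would put the female share at roughly 5.5%, 9.5% and 15.5% for the three surveys) and it assumes the survey number is used as the x-axis. The function and variable names are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Assumed inputs: survey number (1, 2, 3) and female share in percent,
# back-calculated from the changes quoted in the text (an assumption).
survey_number = [1, 2, 3]
female_share = [5.5, 9.5, 15.5]

slope, intercept, r_squared = fit_line(survey_number, female_share)
print(f"slope = {slope:.2f} points per survey, R squared = {r_squared:.3f}")
```

Under these assumptions the fit gives an R squared of about 0.987, close to the reported value. With only three data points, any extrapolation from such a line should be read with caution.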

Also, we note that in the US, 17.1% of the users were female, 80.3% male and 2.6% chose to "Rather not say!" For Prodigy, the ratios were even more in favor of women, with 19.1% female and 78.8% male. This four-to-one m/f ratio more closely reflects the proportions outside the Web and also suggests that as more major online services join the Web and Internet, more balanced female/male ratios are likely to occur. The US and Prodigy ratios also indicate that the US is integrating women more quickly into the user population than other parts of the world.

What's the average and median income?

The overall median income is between $50,000 and $60,000 US dollars, with an estimated average income of $69,000. European respondents continue to lag in income, with an average income of $53,500 US dollars. Prodigy users' income is the highest of all sampled groups, with a median income between $60,000 and $75,000 and an estimated average income of $80,000 US dollars.

What about location, marital status, & occupations?

For classification by major geographical location, 80.6% of the respondents were from the US, 9.8% from Europe, and 5.8% from Canada and Mexico, with all other major geographical locations being represented, but to a lesser degree. Steps towards replicating the survey in other continents and providing some multilingual support might alter these differences. Overall, 50.3% of the users are married, with 40.0% being single. Users who reported being divorced accounted for 5.7%. Occupation-wise, Computer-related fields (31.4%) and Education-related fields (including students) (23.7%) still represent the majority of respondents. Professional (21.9%), Management (12.2%), and "Other" occupations (10.8%) fill out the other categories. 82.3% of the respondents are white, with none of the other groups reporting over 5% of the responses. In summary, the respondents are typically white, married, North American, with computer or educational occupations.

How willing are users to pay for access to Web sites?

Overall, 22.6% of the respondents stated outright that they would not pay fees to access material from WWW sites. This is the same ratio as observed in the 2nd survey. Additionally, there were no statistically significant differences found between the Prodigy and non-Prodigy response distributions for this question. This implies that as the Web increases its user base, we'd expect to continue to find a roughly 20% negative response to paying for access to Web sites.

The distribution of primary computing platforms across all sampled populations closely resembles computer marketing reports: 52.0% Windows, 26.2% Macintosh, & 8.8% Unix. These three platforms account for 87% of all platforms reported.

WWW Usage & Preferences

How often do people use their Web browser?

While our survey does not answer the question, "How many Web users are there?" it does provide insight into potentially more interesting areas like why people use the Web and in what manner. Overall, people spend a considerable amount of time on the Web, with 41% of the users reporting that they use their browser between 6 and 10 hours/week and 21% between 11 and 20 hours/week, an increase of 5% and 6% since last October. Over 72% responded that they use their Web browser at least once a day! These findings are very encouraging to services like electronic news that attempt to provide daily content - the audience is tuned-in and present.

Why do people use their Web browser?

The most common use of browsers is simply for browsing (82.6%) followed by entertainment (56.6%) and work (50.9%). The category with the least number of responses is shopping (10.5%). More users from Europe primarily use their browsers for academic research than users in the US (45.1% vs. 32.6%).

What do people do with their Web browser and with what regularity?

The following questions were scored on a 1 (never) to 9 (regularly) scale. The most popular activity for using Web browsers is to replace other browsers (6.7) like FTP, Gopher, etc. Other categories include accessing: reference information (6.2), electronic news (5.7) and product information (5.1). Thus, we find support for the notion that Web browsers are becoming the default interface to the Internet. The least frequently cited activity for using a Web browser is shopping (2.9), which may very well be due to the lack of both merchandise on the Web and ubiquitous secure payment schemes. Interestingly, these responses are quite similar to those from the Second Survey. For more on consumer attitudes and preferences, see: the Consumer Surveys portion of the User Surveys, which were developed and analyzed by the Hermes team.

How likely are people to archive documents found on the Web?

In general, users print and save documents with approximately the same regularity (3.9 for print and 4.5 for save). These numbers are right around the "Sometimes" option (4.5), which indicates that not many documents are archived off the Web. This finding is supported by the research done by Catledge & Pitkow on Web Browsing Strategies (See: Third WWW Conference Proceedings maintained by Elsevier Science B.V.), which observed low archiving rates for actual users.

How fast are people's connections to the Internet?

The most common connection speed is 14.4 Kb/sec (43.8%) followed by 10 Mb/sec (13.1%). This uneven distribution is a result of the Prodigy users, 73.2% of whom have 14.4 Kb/sec connections, and of those who have connections provided via work or school.

Information Providers/HTML Authors

How easy was it for people to learn HTML & how did they learn it?

Good news - HTML, the markup language used for writing Web documents, is easy to learn. Most users (82.0%) spent between 1 and 6 hours learning HTML. Many users learned HTML in only 1 to 3 hours (55.2%). On a difficulty scale with a maximum of 9.0, CGI was rated the most difficult (5.0) followed by FORMS (4.0), ISMAP (3.9), and HTML overall (2.5). Interestingly, none of these averages are near the maximum difficulty rating.

How did users learn about HTML?

On-line documentation was consulted by 88.4% of users in learning HTML. The next two most popular sources, books and friends, were consulted by only 29.2% and 25.2% of users, respectively (respondents were allowed to choose more than one answer).

When queried about charging for advertising on their site, the vast majority of Webmasters replied that the question was "Not Applicable" (70.6%) or that they "Don't Allow Ads" (24.0%) for a total of 94.6%. For those that do allow ads, the largest percentage (3.25%) charge under $50 per week. Only 0.36% charge over $510 per week.

What about HTTP servers?

As for HTTP servers, the most popular server is NCSA's (38.6%) followed by MacHTTP (20.8%) and CERN's (18.5%). In Europe, however, the most popular server is CERN's (34.9%). Only a small percentage of sites operate a proxy server (12.6%) and most HTTP servers do not mirror other sites (91.5%). The most common server connection speed is 10 Mb/sec (32.3%). The next most common are 1 Mb/sec with 18.0% and 56 Kb/sec with 14.1%, indicating ample throughput to the Internet for over half of the HTTP servers (the bottleneck is on the client side).

Limitations of the Results

Highly distributed, heterogeneous, electronic surveying is a new field, especially with respect to the Web. Our adaptive WWW based surveying techniques are pioneering and as such, require conservative interpretation of collected data due to the absence of time-tested validation and correction metrics. Basically, our survey suffers from two problems: sampling and self-selection. Essentially, when people decide to participate in a survey, they select themselves. This decision may reflect some systematic selecting principle (or judgment) that affects the collected data. Just about all surveys suffer from self-selection problems. That is, when a potential respondent hangs up on a telephone-based surveyor, self-selection has occurred. Likewise, when a potential respondent does not send back a direct mail survey, self-selection has occurred.

The other issue is sampling. There are essentially two types of sampling: random and non-probabilistic. Random selection is intended to ensure equal representation amongst populations. To accomplish this, steps need to be taken to get respondents in a random manner, e.g. drawing numbers out of a hat. Our survey uses non-probabilistic sampling, which does not use randomization techniques to get respondents. This reduces the ability of the gathered data to generalize to the entire user population.
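
To make the distinction concrete, the following sketch contrasts a simple random sample with a self-selected group of respondents on a synthetic population. Everything in it is hypothetical: the population size, the 30% share of members with the trait, and the response probabilities are invented for illustration and are not survey data.

```python
import random

rng = random.Random(1995)  # fixed seed so the run is repeatable

# Hypothetical population: True marks members with some trait (30% of them).
population = [i < 3000 for i in range(10000)]
true_share = sum(population) / len(population)

# Random sampling: every member has the same chance of being drawn.
random_sample = rng.sample(population, 1000)
random_share = sum(random_sample) / len(random_sample)

# Self-selection: members with the trait are assumed to respond more often
# (50% versus 10%), so they are over-represented among respondents.
respondents = [has_trait for has_trait in population
               if rng.random() < (0.5 if has_trait else 0.1)]
self_selected_share = sum(respondents) / len(respondents)

print(f"true share:          {true_share:.3f}")
print(f"random sample:       {random_share:.3f}")
print(f"self-selected group: {self_selected_share:.3f}")
```

The random sample should land near the true share, while the self-selected group overstates it. This is the sense in which non-probabilistic sampling limits how far results generalize.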

Since the Web does not have a broadcast mechanism (yet) we used the following diverse mediums to attract respondents:

One could argue that this diversified exposure minimizes any systematic effect introduced via the sampling method. We tend to agree, though we have taken steps to further explore this issue. Specifically, we designed the Third Survey to enable us to determine how the respondents found out about the survey. This allows us to group respondents accordingly and look for significant differences between these user populations.

Overall, 50% of the users found out about the survey via other WWW pages, with 20.3% finding out via "Other" sources, and 17.9% finding out via Usenet newsgroup announcements. WWW based listservers/mailing lists (e.g. www-announce) accounted for nearly 6% of all respondents finding out about the survey, and thus are not tremendously effective means to attract attention or draw samples from. People remembering to take the survey was the least effective method cited (0.44%). This indicates that relying on people to remember that it is time to participate in a survey is not a very effective way to accomplish longitudinal tracking of populations. While very few users found out about the 3rd survey via the www-survey mailing announcement (1.1%) compared to other methods, we note that the 142 users who did respond accounted for 1/4 of the survey mailing list at the time. Thus, exclusive mailing lists seem to be a pretty efficient way to announce the beginning of a survey to people.
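
As a quick consistency check on the mailing list figures above, the arithmetic can be sketched as follows. The variable names are illustrative, and the results are only approximate because the quoted percentages are rounded.

```python
# Figures quoted in the text.
mailing_list_respondents = 142
share_of_all_respondents = 0.011   # 1.1%
share_of_mailing_list = 0.25       # "1/4 of the survey mailing list"

# Implied totals (approximate, since the inputs are rounded).
implied_total_respondents = mailing_list_respondents / share_of_all_respondents
implied_list_size = mailing_list_respondents / share_of_mailing_list

print(round(implied_total_respondents))  # roughly 12,900
print(round(implied_list_size))          # roughly 570
```

The implied total of about 12,900 respondents is in line with the roughly 13,000 responses reported, given that 1.1% is a rounded figure.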

There were no significant differences between the response profiles of women and men for the following categories: remembering to take the survey, other Web pages, the newspaper, other sources, and listserver announcements. There were differences found for: finding out via friends, magazines, Usenet news, and the www-survey mailing list. Given the low effectiveness of all but other Web pages and Usenet news announcements, most of these differences lead to nominal effects.
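
One way a gender-by-source comparison of this kind could be checked is a chi-squared test of independence on a contingency table. The text does not say which test was applied to this particular question, so the sketch below is an illustration only. The counts are hypothetical and are NOT survey data, and the statistic is computed without a continuity correction.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts, NOT survey data:
# rows = women, men; columns = found via Usenet, found via another source.
counts = [[250, 1750],
          [2050, 8950]]

statistic = chi_squared_statistic(counts)
degrees_of_freedom = (len(counts) - 1) * (len(counts[0]) - 1)
CRITICAL_VALUE_DF1_P01 = 6.635  # chi-squared critical value, df = 1, p = 0.01

print(f"chi-squared = {statistic:.2f}, df = {degrees_of_freedom}")
print("significant at p <= 0.01:", statistic > CRITICAL_VALUE_DF1_P01)
```

A table whose rows are exactly proportional (for example `[[10, 20], [20, 40]]`) gives a statistic of zero, which is a handy sanity check on the function.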

Thus, the surveys do not appear to suffer critically from sampling biases with respect to gender. If a segment of the Web user population were excluded, statistically, we'd expect to find similar response distribution for women and men. Still, we remain unconvinced that the survey's sampling methodology is optimal and welcome suggestions and further comments on this subject.

Technical Information

Statistical Inferences

All analyses were performed using Splus version 3.1 for Unix. Tests for significant interactions amongst variables were performed using the classical chi-squared test for independence of categorical data, with significance being determined at the p <= 0.01 level. Tests for differences between stratified samples were performed using a two-sided alternative for the Wilcoxon rank sum statistic. Since all tests included N > 49, the normal approximation was used. In the event of ties, the Lehmann approximation was used. Significance was determined at the p <= 0.01 level and confirmed by checking that Z was either <= -2.58 or >= 2.58.
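
The rank sum comparison described above can be sketched in plain Python. This is not the Splus code used for the survey: the function name and the sample data are hypothetical, and the tie handling shown (midranks plus the textbook tie-corrected variance) is assumed, not confirmed, to correspond to the approximation named in the text. The cutoff used is the two-sided 1% critical value of the standard normal, about 2.576.

```python
import math


def rank_sum_z(sample_a, sample_b):
    """Z statistic for the Wilcoxon rank-sum test (normal approximation).

    Ties receive midranks and the variance carries the usual tie correction.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])

    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        tie_size = j - i
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += tie_size ** 3 - tie_size
        i = j

    mean = n_a * (n + 1) / 2
    variance = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    return (rank_sum_a - mean) / math.sqrt(variance)


# Hypothetical responses on a 1-9 scale for two strata (not survey data).
group_a = [3] * 20 + [5] * 20 + [7] * 20
group_b = [2] * 20 + [4] * 20 + [6] * 20

Z_CRITICAL = 2.576  # two-sided critical value for p = 0.01
z = rank_sum_z(group_a, group_b)
print(f"Z = {z:.2f}; significant at p <= 0.01: {abs(z) >= Z_CRITICAL}")
```

Swapping the two samples flips the sign of Z but not its magnitude, so the two-sided decision does not depend on which stratum is listed first.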

Execution

The surveys were run on a dedicated Sun Sparc 2. All HTML pages were generated on the fly via our survey software and query engine (written in Perl). For more information about how the surveys actually work, see: the write-up in the paper on the Second Survey Results. For inquiries about the availability of the survey code, contact: www-survey@cc.gatech.edu.

In Appreciation

We owe the loan of the fabulous artwork to Allyana Ziolko (please contact Allyana: allyana@cc.gatech.edu for comments and permission to use) and technical support to Michael Mealling (OIT), Dan Forsyth (CoC), Randy Carpenter (GVU), Kipp Jones (CoC) & Dave Leonard (CoC). Of course, the resources necessary for the surveys would not be possible without support from the GVU administrative staff and Dr. James Foley, GVU's Director. Special thanks to Melissa House for helping with all the graphs.

For more information or to submit comments:
send e-mail to www-survey@cc.gatech.edu.

GVU's WWW Surveying Team
Graphics, Visualization, & Usability Center
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280
Copyright 1995
Georgia Tech Research Corporation
Atlanta, Georgia 30332-0415
ALL RIGHTS RESERVED
Usage Restrictions