Usability Experiments

Introduction

What is a Usability Experiment?

Usability is often used synonymously with "user friendly", a term the computer industry used to describe its software and hardware to customers. "User friendly" was a deceptive description because it narrowed the user's needs down to a single dimension: whether or not the interface was friendly. Usability is still somewhat narrow, but it has multiple components that are pertinent to interface design. Six are listed: learning, efficiency, memory, errors, satisfaction, and longevity.

An experiment is one or more tasks (or tests) given to an individual to measure a tool or the individual's knowledge. The phrase "usability experiment" therefore implies one or more tasks used to evaluate an interface in at least the categories named above: learning, efficiency, memory, errors, satisfaction, and longevity. As interfaces become more complex, these experiments are essential in marketing and selling anything that a user interacts with on a daily or casual basis.


Usability Experiments in Interface Design

Parallel to the development of an interface, user participation plays a major role in determining the final outcome of the interface's design. Usability experiments are the way to obtain this input objectively. If conducted ethically, they give the developer an idea of what needs improvement and what is effective. The designer must view the experiment as an important part of the interface design lifecycle and should repeat these usability experiments at each stage of new and significant development in the design. In usability experiments, the designer must employ a close prototype of the actual interface to obtain "real world" results; otherwise, the evaluated information will be misleading and can lead to a false positive or false negative conclusion.





Developing Usability Experiments

Test Objectives

For any experiment, test plans and goals are defined and agreed upon in advance, because the feedback obtained will have a significant impact on the design. The type of evaluation, formative or summative, determines the stage of the design at which the usability experiments are run. Summative evaluation is done on a completed interface design and measures the effectiveness of that design. Formative evaluation is the iterative process of assessing the interface at different phases of development, which allows the designer to implement enhancements along the way.

In test planning, issues are addressed:

Each should be considered carefully and selected appropriately to induce normal performance of the tasks.


Getting a Sample Population of Users

Test users must be as similar to the actual user population as possible. In some cases, however, test users can be human-computer interface experts who can predict the performance of the interface. This option suits companies with small budgets. Realistically, though, the expert will not be in the environment in which the interface will be used, and the results may be biased by the "lab-like" atmosphere.

The sample population should have a proportional number of novice users and expert users. The expert users can evaluate the new advanced features of a new interface design and provide substantial input by comparing the interface to others they are familiar with. The novice users reveal how intuitive the interface is, which affects the learning curve by shrinking or expanding the gap between novice and expert users. The usability experiments should include some tests that are shared and some that differ for each group. Some tests are too simple for an expert, and others may be too complex for the novice; either may cause erratic results. Too many such tests will not benefit the designer. Experiments should closely match the user's current level and possibly reach beyond it, but not too far.

Two basic ways to employ the test users are between-subjects testing and within-subjects testing. In within-subjects testing, every test user evaluates all test systems. A transfer of skill often happens as the user moves on to another system, so former novices will do much better on each subsequent system. This must be taken into account when analyzing the data. Between-subjects testing assigns different users to different systems and compares the results for each test system. This method requires a large number of users and random assignment to each test group to avoid biasing the data.
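
The two ways of employing test users described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, user labels, and system labels are hypothetical, and rotating the starting system in the within-subjects case is one common way to spread the transfer-of-skill effect across systems, not a procedure prescribed by this report.

```python
import random


def assign_between_subjects(users, systems, seed=None):
    """Randomly split users into one group per system (between-subjects)."""
    rng = random.Random(seed)
    shuffled = list(users)
    rng.shuffle(shuffled)
    groups = {system: [] for system in systems}
    # Deal users out round-robin so group sizes differ by at most one.
    for index, user in enumerate(shuffled):
        groups[systems[index % len(systems)]].append(user)
    return groups


def rotate_within_subjects(users, systems):
    """Give every user all systems, rotating the starting system per user.

    Assumption: rotating the order means no single system is always
    evaluated last, when the user has already gained the most skill.
    """
    orders = {}
    for index, user in enumerate(users):
        shift = index % len(systems)
        orders[user] = systems[shift:] + systems[:shift]
    return orders


if __name__ == "__main__":
    users = ["u1", "u2", "u3", "u4", "u5", "u6"]   # hypothetical test users
    systems = ["A", "B", "C"]                      # hypothetical test systems
    print(assign_between_subjects(users, systems, seed=1))
    print(rotate_within_subjects(users, systems))
```

Passing a fixed seed makes the random assignment repeatable, which helps when the grouping has to be documented alongside the results.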



The Learning Curves of the Novice User and the Expert User.  Each curve reflects
the performance results of an experiment when one group or the other is the
focus of that experiment.  The point of intersection is the optimum, where you
achieve the "best of both worlds."

Experimental Testing

Pre-testing with an experimenter who is an HCI expert is a good way to understand the effectiveness of your tasks. This individual can help determine whether your task goals and objectives will be accomplished using these benchmark tests, and whether each goal can be quantified in some form. This is normally called pilot testing. Many software designers do this in order to have robust prototypes for usability testing.

Heuristics of Testing

One must remember that users are human; therefore, all tasks should take into account their emotions and well-being. One of a user's main concerns is how intelligent they appear after a task. The administrator must assure each participant that he or she is not what is being measured. The interface is being evaluated, and the participant is simply providing vital information about it.

The individuals are reassured that their performance will not be disclosed to any manager or to other users. No one except the administrator or evaluator will have this confidential knowledge. The administrator must take every precaution in the evaluation, such as a closed environment and a hidden video camera (if necessary), with the administrator nearby. This eliminates intrusion from passersby and the effect of "someone looking over the shoulder". Some may feel that an isolated setup is not realistic and that a normal environment reflecting the actual ambiance of the application would be better. This, of course, leads into a discussion of naturalistic versus experimental observation, which is beyond the scope of this report to present in depth.




Symbolic View of a Usability Test Lab.


Methods of Gathering Data

Each step taken before this point will have some effect on the accuracy and validity of the data. This section connects directly to the goals of the experiments. Pre-testing prepares the evaluator to gather the correct information. The sample population of users presents a realistic view of how efficient and effective the interface is and what improvements are necessary.

Either subjective or objective methods are employed, and each has its own techniques for gathering its data. Objective methods measure time, the number of steps needed to complete a task, errors, percentages, and so on. Subjective methods include having the user think aloud through each task, holding a discussion after each task is completed, and conducting an interview at the end of the experiment. The two can be combined so that the data is not biased toward either.
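
As a rough illustration of the objective measures mentioned above, the following Python sketch summarizes time, steps, and errors across several participants for one task. The record fields and the choice of "percent completed" as the percentage measure are assumptions made for this example; the report does not specify which percentages are collected.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    """One participant's objective results for one task (hypothetical fields)."""
    user: str
    seconds: float    # time taken on the task
    steps: int        # number of steps the user performed
    errors: int       # number of errors observed
    completed: bool   # whether the user finished the task


def summarize(results):
    """Return mean time, steps, and errors plus the completion percentage."""
    if not results:
        raise ValueError("no results to summarize")
    finished = sum(1 for r in results if r.completed)
    return {
        "mean_seconds": mean(r.seconds for r in results),
        "mean_steps": mean(r.steps for r in results),
        "mean_errors": mean(r.errors for r in results),
        "percent_completed": 100.0 * finished / len(results),
    }


if __name__ == "__main__":
    # Made-up sample values, used only to show the call.
    sample = [
        TaskResult("u1", 30.0, 10, 1, True),
        TaskResult("u2", 50.0, 14, 3, False),
    ]
    print(summarize(sample))
```

Subjective data, such as think-aloud notes or interview answers, would be kept alongside these figures rather than folded into them.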






Implementations (Future and Present)

To reduce costs, designers and HCI experts are developing better techniques for predicting the usability of interfaces. Theoretical approaches such as the Goals, Operators, Methods, and Selection rules (GOMS) analytical technique can reduce the amount of usability testing and design work needed.
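
To give a feel for how an analytical prediction works, here is a minimal Python sketch of the Keystroke-Level Model, a simplified member of the GOMS family. It is not the full GOMS technique named above. The operator times are commonly quoted approximations and should be treated as assumptions to be calibrated for a real user population; the function name and operator string format are hypothetical.

```python
# Approximate operator times in seconds. These are commonly quoted
# Keystroke-Level Model values, used here as illustrative assumptions.
OPERATOR_SECONDS = {
    "K": 0.2,    # press a key or button
    "P": 1.1,    # point at a target with the mouse
    "H": 0.4,    # move the hands between keyboard and mouse
    "M": 1.35,   # mentally prepare for the next action
}


def estimate_task_seconds(operators):
    """Sum the operator times for a task written as a string such as 'MPK'."""
    total = 0.0
    for op in operators:
        if op not in OPERATOR_SECONDS:
            raise ValueError(f"unknown operator: {op!r}")
        total += OPERATOR_SECONDS[op]
    return total


if __name__ == "__main__":
    # Hypothetical task: think, point at a field, click, move the hands
    # to the keyboard, then type four characters.
    print(estimate_task_seconds("MPKHKKKK"))
```

Comparing such estimates for two candidate designs can suggest which one is likely to be faster for practiced users before any test users are recruited.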

Technological solutions such as Computer-Aided Usability Engineering (CAUSE) are in use for gathering objective data as a task is performed, which frees the evaluator to gather subjective data. CAUSE measures keystrokes, time, number of errors, and so on. A future goal is to have the user think aloud and to record the audio while the task is performed.
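
The kind of automatic logging described above can be pictured with a small Python sketch. This is not the CAUSE tool or its interface, which this report does not describe; the class, method names, and event kinds are hypothetical and only show how keystrokes, errors, and elapsed time might be recorded while a task is performed.

```python
import time


class SessionLog:
    """Records timestamped events for one user performing one task."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the log can be exercised without waiting.
        self._clock = clock
        self._start = clock()
        self.events = []  # list of (seconds_since_start, kind, detail)

    def record(self, kind, detail=""):
        """Store an event such as a keystroke or an error with its timestamp."""
        self.events.append((self._clock() - self._start, kind, detail))

    def count(self, kind):
        """Return how many events of the given kind were recorded."""
        return sum(1 for _, event_kind, _ in self.events if event_kind == kind)

    def elapsed(self):
        """Return seconds from session start to the last recorded event."""
        return self.events[-1][0] if self.events else 0.0


if __name__ == "__main__":
    log = SessionLog()
    log.record("key", "a")
    log.record("error", "wrong menu opened")
    log.record("key", "b")
    print(log.count("key"), log.count("error"), log.elapsed())
```

Because the log collects the objective figures on its own, the evaluator can concentrate on observing the user and noting subjective impressions.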

Usability experimentation is in a continual cycle of enhancement to defray costs and to give the designer and the user the best possible user interface for any particular application.