CS6751 : Topic Report
Heuristic Evaluation

Prepared by Rajesh Vijayan

Introduction

There are basically four ways to evaluate a user interface : formally by some analysis technique, automatically by a computerized procedure, empirically by experiments with test users, and heuristically by simply looking at the interface and passing judgement according to one's own opinion. Formal analysis models have not yet reached the stage where they can be generally applied in real software development projects. Automatic evaluation is completely infeasible except for a few primitive checks. Therefore, current practice is to do empirical evaluation if one wants a good and thorough evaluation of a user interface. Unfortunately, in most practical situations, people do not actually conduct empirical evaluations because they lack the time, expertise, inclination, or simply the tradition to do so.

As its name suggests, heuristic evaluation is not a pure analysis technique. It might be thought of as "analysis by a team of analysts using a variety of informed models". A more descriptive term applied to heuristic evaluation is Usability Inspection : an inspection is carried out and a list of problems affecting usability is drawn up. The heuristic evaluation method relies on two techniques in combination. First, it employs a team of evaluators rather than relying on one person to carry out the evaluation. Second, a set of design heuristics is used to guide the evaluation.

Usability Heuristics

Molich and Nielsen [1990a] have listed nine heuristics which can be used to generate ideas while critiquing a system. These nine principles seem to be well suited as the basis for practical heuristic evaluation. They correspond more or less to principles that are generally recognized in the user interface community, and almost all usability problems fit well into one of these categories.

Nielsen [1994] has further refined the heuristics developed by Molich and Nielsen [1990a] into a revised set of ten heuristics that has more explanatory power. Further discussion of these heuristics can be found at Nielsen's site related to heuristic evaluation.

Team of evaluators

In order to test the practical applicability of heuristic evaluation, Nielsen and Molich [1990b] conducted four experiments in which a number of evaluators were presented with an interface design and asked to comment on it. They found that individual evaluators were mostly bad at doing heuristic evaluations, finding only between 20% and 51% of the usability problems in the interfaces they evaluated. On the other hand, they found that the overall result can be dramatically improved by forming aggregates of evaluators, since the "collected wisdom" of several evaluators is not just equal to that of the best evaluator in the group. Aggregates of evaluators are formed by having several evaluators conduct a heuristic evaluation and then collecting the usability problems found by each of them to form a larger set.
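The aggregation step described above amounts to taking the union of the problem sets reported by the individual evaluators. The following is a minimal sketch of that idea only. The function names, evaluator labels, and problem identifiers are hypothetical and invented for illustration; they are not data from the cited experiments.

```python
from typing import Dict, Set


def aggregate_problems(reports: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of the usability problems reported by each evaluator."""
    found: Set[str] = set()
    for problems in reports.values():
        found |= problems
    return found


def proportion_found(found: Set[str], known_problems: Set[str]) -> float:
    """Fraction of the known usability problems that appear in `found`."""
    return len(found & known_problems) / len(known_problems)


# Hypothetical example data: six known problems, three independent reports.
known = {"P1", "P2", "P3", "P4", "P5", "P6"}
reports = {
    "evaluator_a": {"P1", "P2"},
    "evaluator_b": {"P2", "P3", "P4"},
    "evaluator_c": {"P1", "P5"},
}

aggregate = aggregate_problems(reports)

for name, problems in reports.items():
    print(name, round(proportion_found(problems, known), 2))
print("aggregate", round(proportion_found(aggregate, known), 2))
```

In this made-up example no single evaluator finds more than half of the problems, while the aggregate covers five of the six, which is the sense in which the collected result exceeds that of the best individual.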

They also found that three to five evaluators in an aggregate would be able to detect about two thirds of the usability problems. The result can be seen in Figure 1.


Figure 1 : Curve showing the proportion of usability problems in an interface found by heuristic evaluation using various numbers of evaluators. The curve represents the average of six case studies of heuristic evaluation.
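One simple way to reason about the shape of such a curve is to assume that every evaluator finds each problem independently and with the same probability. This independence assumption and the function name below are illustrative and are not taken from the studies cited in this report. Only the 20% and 51% individual detection rates come from the text above. Under those assumptions, the expected proportion found by n evaluators with detection rate p is 1 - (1 - p)^n.

```python
def expected_proportion_found(num_evaluators: int, detection_rate: float) -> float:
    """Expected share of problems found by at least one evaluator.

    Assumes (hypothetically) that evaluators find problems independently,
    each with the same probability `detection_rate`.
    """
    if num_evaluators < 0:
        raise ValueError("num_evaluators must be non-negative")
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1.0 - (1.0 - detection_rate) ** num_evaluators


# Lowest and highest individual detection rates reported in the text.
for rate in (0.20, 0.51):
    for n in (1, 3, 5):
        share = expected_proportion_found(n, rate)
        print(f"rate={rate:.2f} evaluators={n} expected share={share:.2f}")
```

With the lower rate of 20%, five evaluators reach an expected share of about 0.67 under this simplified model. Real evaluators tend to overlap in the problems they find, so measured curves need not match this idealized one.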

They also found that, in order to produce better results than the individual evaluations, the evaluators should do their evaluations independently of each other and only compare results after each of them has looked at the design and written his/her evaluation report. Otherwise the evaluators may bias each other towards a certain way of approaching the analysis and therefore discover only certain usability problems.

In another study, Nielsen [1992] found that usability specialists were much better than those without usability expertise at finding usability problems by heuristic evaluation. Furthermore, usability specialists with expertise in the specific kind of interface being evaluated did much better than regular usability specialists without such expertise, especially with regard to certain usability problems that were unique to that kind of interface.


Figure 2 : Average proportion of usability problems found as a function of number of evaluators in a group performing the heuristic evaluation

In Figure 2, "novice evaluators" refers to evaluators with no usability expertise, "regular specialists" refers to usability specialists, and "double specialists" refers to usability specialists who also have experience with the particular kind of interface being evaluated. As can be seen from Figure 2, the double specialists found significantly more usability problems than did the regular usability specialists. If double specialists are used, a smaller group can evaluate the interface. A group size of two to three has been recommended.

Nielsen [1992] also found that major usability problems have a higher probability of being found by heuristic evaluation than minor problems. Problems with the lack of clearly marked exits are harder to find than problems violating the other heuristics, and additional effort should be made to identify these usability problems.

Conclusions

A number of advantages are claimed for heuristic evaluation. They include

  • It is cheap.
  • It is intuitive and it is easy to motivate people to do it.
  • It does not require advance planning.
  • It can be used early in the development process.

A disadvantage of the method is that it sometimes identifies usability problems without providing direct suggestions to solve them. The method is also biased by the current mindset of the evaluators and normally does not generate breakthroughs in the evaluated design.

Heuristic Evaluation sites on the WWW

  • Jakob Nielsen's online writing on heuristic evaluation.
  • Participatory heuristic evaluation : Process-Oriented extension to discount usability.
  • Heuristic evaluation material.

References

  • Molich, R., and Nielsen, J. (1990a). Improving a human-computer dialogue : What designers know about traditional interface design. Communications of the ACM (March 1990).
  • Nielsen, J., and Molich, R. (1990b). Heuristic evaluation of user interfaces. Proc. ACM CHI'90 Conf. (Seattle, WA, April 1-5), 249-256.
  • Nielsen, J. (1992). Finding usability problems through heuristic evaluation. Proc. ACM CHI'92 Conf. (Monterey, CA, May 3-7), 373-380.
  • Nielsen, J. (1994). Enhancing the explanatory power of usability heuristics. Proc. ACM CHI'94 Conf. (Boston, MA, April 24-28), 152-158.