ABSTRACT
In the user-centered approach to software design and
development, end-users act as evaluators in usability tests
at various points during the development life-cycle. Some
usability professionals argue that these usability tests
simply reflect the preferences of the participants and
should not be used in place of objective performance
measures. In an attempt to strengthen the validity of the
user-centered approach, the present study examined the
association between subjective preference measures and
objective performance measures in relation to the user's
hardware and software use and familiarity. The results
suggest that not only do the subjective ratings of end-user
evaluators often differ from objective performance
measures, but also that this relationship is dependent on
the user's past computer experience.
INTRODUCTION
One activity in a user-centered approach to software
usability testing involves the evaluation of the product
software by end-users. However, the testing methods
used in this approach vary. Some usability professionals
use subjective ratings, while others use objective
measures, and still others use a combination. Several
studies have examined the extent of agreement between
the two types of measures [3, 4, 5]. The results of these
studies have been mixed, suggesting the influence of other
factors.
It is possible that hardware and software familiarity of
end-user evaluators could strongly influence their
subjective ratings of a software interface/system. Such
an effect might explain the lack of correspondence
between subjective ratings and objective measures.
Thus, a better understanding of this relationship would
help usability professionals recruit an appropriate
distribution of end-users for the evaluation process.
Information regarding the potential relationship between
subjective and objective usability measures may also be
helpful when only subjective ratings of systems are
available.
METHOD
Participants
Volunteer participants (N=12) stated that they had
experience using a computer keyboard and mouse. The
participants received course credit for participating.
Survey information
The computer experience survey was designed to show
individual differences between evaluators. The survey
addressed participants' years of experience with
computers, experience with different types of systems,
hardware and software familiarity and use, and
demographic information. Measures of preference,
perceived ease-of-use, and expectation of high
performance were collected using a 7-point Likert scale
with left and right anchors of "not at all" and
"very much", respectively.
Task
A simple data retrieval task was used to ensure that
participants with varying levels of computer experience
would be able to understand and evaluate the processes
involved. The three interfaces used in the data retrieval
task were 1) a command line, 2) a 2-level menu, and 3) a
listbox. All three interfaces accessed the same database.
The difference between interfaces was related to the
method used to access the information. The
command line interface had participants type in a
customer name in order to retrieve the customer's account
number. The 2-level menu had participants
access information by first clicking the button for the
alphabetical range containing the customer name (e.g.,
A-E, F-H), then searching the list of customer names
within that range and clicking the appropriate
button to access the customer's account number. The
listbox interface used a listbox "widget" that contained
every customer name in alphabetical order; the
participant accessed a customer's account information by
manipulating the listbox to display the customer name
and then clicking on the appropriate customer name
directly.
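
The three access methods described above can be sketched in a few lines of Python. This is only an illustration, not the software used in the study: the customer names, account numbers, and the ranges beyond A-E and F-H are hypothetical, as are the function names.

```python
# Hypothetical customer database: name -> account number (made-up values).
CUSTOMERS = {
    "Adams": "1001",
    "Baker": "1002",
    "Foster": "1003",
    "Irwin": "1004",
    "Young": "1005",
}

# Alphabetical ranges for the first menu level. The text gives only
# "A-E, F-H, etc.", so the last two ranges are assumptions.
RANGES = [("A", "E"), ("F", "H"), ("I", "P"), ("Q", "Z")]


def command_line_lookup(typed_name):
    """Command line: the participant types the full customer name."""
    return CUSTOMERS.get(typed_name)  # None if the name is not found


def menu_lookup(range_index, name_index):
    """2-level menu: click a range button, then a name button within it."""
    low, high = RANGES[range_index]
    names = sorted(n for n in CUSTOMERS if low <= n[0].upper() <= high)
    return CUSTOMERS[names[name_index]]


def listbox_lookup(row_index):
    """Listbox: one alphabetical list of every name; click a row directly."""
    names = sorted(CUSTOMERS)
    return CUSTOMERS[names[row_index]]
```

All three functions read the same dictionary, mirroring the point that the interfaces differed only in how the participant reached the record, not in the data behind it.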
Procedure
At the beginning of the study, participants were asked to
complete a survey related to computer use and
familiarity. The order in which participants used the
interfaces was counterbalanced. A demonstration of
each interface was then given. Participants were asked to
access and record a list of customer account numbers
using each of the three interfaces. Measures of
preference, perceived ease-of-use, and expected
performance were collected from each participant at the
following times: 1) after the demonstration, 2) after ten
practice trials, and 3) after an experimental task (20 trials
per interface). The computer system recorded the time to
access each customer account number using each
interface.
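
One way to obtain a counterbalanced design with three interfaces and twelve participants is full counterbalancing, in which every possible ordering is used equally often. The sketch below assumes that scheme; the paper does not state which counterbalancing method was used, and the function name is hypothetical.

```python
from itertools import permutations

INTERFACES = ("command line", "2-level menu", "listbox")


def counterbalanced_orders(n_participants, conditions=INTERFACES):
    """Assign each participant one ordering, cycling through all orderings.

    Assumes full counterbalancing: with 3 conditions there are 3! = 6
    orderings, so 12 participants yield each ordering exactly twice.
    """
    orders = list(permutations(conditions))
    return [orders[i % len(orders)] for i in range(n_participants)]


assignments = counterbalanced_orders(12)
for participant, order in enumerate(assignments, start=1):
    print(participant, " -> ".join(order))
```

Under this assumption each interface appears in each serial position for four participants, so practice and fatigue effects are spread evenly across interfaces.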
Analysis
The subjective ratings and objective performance
measures were checked for agreement. Information from
the computer use and familiarity survey was then tested
as a predictor of the agreement score using stepwise
regression.
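
The analysis can be illustrated with a minimal sketch. It assumes that agreement is scored as 1 when a participant's highest-rated interface is also the one with the shortest access time, and 0 otherwise; the paper does not define the agreement score, so this scoring rule and all function names are assumptions. Only the first step of a forward stepwise procedure is shown (choosing the single predictor with the largest R²), not a full stepwise regression.

```python
def agreement(ratings, times):
    """Return 1 if the highest-rated interface is also the fastest, else 0.

    ratings: interface name -> subjective rating (higher is better)
    times:   interface name -> mean access time (lower is better)
    Ties are broken by dictionary order, which is an arbitrary choice.
    """
    preferred = max(ratings, key=ratings.get)
    fastest = min(times, key=times.get)
    return int(preferred == fastest)


def r_squared(x, y):
    """R^2 for a one-predictor least-squares fit of y on x.

    Both x and y must vary; a constant sequence would divide by zero.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def best_first_predictor(predictors, y):
    """First forward-selection step: the predictor with the largest R^2."""
    return max(predictors, key=lambda name: r_squared(predictors[name], y))
```

For example, a participant who rates the listbox highest but retrieves records fastest with the menu would receive an agreement score of 0 under this rule.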
RESULTS
Fewer than half (41.7%) of the participants rated the
listbox interface as the best, while performance
measures showed the menu interface to be
superior with regard to data access time. All participants
rated the command line interface lowest, and
performance data supported this rating. Familiarity
with a variety of types of software (R² = .6163;
p < .0134) and the amount of time participants had
been using computers on a regular basis (R² = .4504;
p < .0169) reliably predicted whether participants' ratings
corresponded with performance measures.
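
As a quick arithmetic check on the reported 41.7%, the short sketch below finds which head counts out of the twelve participants round to that percentage. It assumes all twelve participants provided a rating; the function name is hypothetical.

```python
N_PARTICIPANTS = 12  # sample size stated in the Participants section


def counts_matching(percent, n=N_PARTICIPANTS):
    """Head counts out of n that round to the given one-decimal percentage."""
    return [k for k in range(n + 1) if round(100 * k / n, 1) == percent]


print(counts_matching(41.7))  # prints [5]
```

Under that assumption, 41.7% corresponds to five of the twelve participants.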
DISCUSSION
The present study examined the relationship between
subjective ratings and objective measures from software
usability tests and its relation to measures of computer
experience. The results indicate that subjective ratings
and objective measures of performance often do not
correspond. There is evidence that making decisions to
satisfy preferences will not automatically lead to optimal
user performance [1]. The results of the
present study suggest that computer experience of end-
user evaluators can influence their subjective ratings of a
software interface/system. The results also suggest that
measures of computer familiarity and use can predict the
relationship between subjective and objective measures
of software usability. Such an effect might explain the
lack of correspondence between subjective ratings and
objective measures often found in studies of the validity
of the user-centered approach.
The results of the study have implications for the
methodology of user-centered software development and
usability testing. A better understanding of the
relationship between computer experience and the
association between subjective and objective usability
measures would help usability professionals better
interpret the results of usability tests based on end-user
ratings. Information regarding the potential relationship
between subjective and objective usability measures may
also allow usability professionals to recruit the most
appropriate evaluators given the limitations of many
usability testing methodologies. When only subjective
rating methods are available to test a system, evaluators
with greater experience and exposure to hardware and
software may be the best candidates to evaluate software
design.
CONCLUSIONS
The subjective ratings and objective measures of
performance of software usability evaluators do not
necessarily correspond. This level of correspondence is,
in part, dependent on past computer experience.
REFERENCES
1. Bailey, R. W. Performance vs. preference. In
Proceedings of the Human Factors & Ergonomics
Society 37th Annual Meeting, Human Factors &
Ergonomics Society, Santa Monica, California, 1993,
282-286.
2. Greene, S. L., Gould, J. D., Bois, S. J., Rasamny, M.,
& Meluson, A. Entry and selection-based methods of
human-computer interaction. Human Factors, 34(1),
1992, 97-113.
3. Hayhoe, D. Sorting-based menu categories.
International Journal of Man-Machine Studies, 33, 1990,
677-695.
4. Keyson, D. K., & Parsons, K. C. Designing the user
interface using interfaces. Behaviour & Information
Technology, 10(6), 1990, 443-457.
5. Nielsen, J., & Levy, J. Measuring Usability:
Preference vs. Performance. Communications of the
ACM, 37(4), 1994, 66-75.