Usability Research: The Need for Standards

by Jerome Carter on June 27, 2016

With the possible exception of those living under rocks, everyone knows that EHR usability is a hot topic.  The million-dollar question is what to do about it.   Understanding the nature of usability problems is always a good place to start.  As one would expect, the number of usability studies reported in the literature has increased significantly in recent years.  So, what have we learned? Unfortunately, it is hard to tell.

Determining the usability of a system requires objective standards or measures by which one can judge. What is the ideal number of steps to order a test, enter a note, look up an old history, or initiate follow-up of an abnormal result? No one knows, and since there are no objective standards for these processes, all results tend to be local (either for a given EHR system or a specific type of user). Beyond processes, the same questions could be asked about user interface elements. Is there a font size, text color, window position, etc. that would be ideal for all EHR users? Users differ in terms of clinical experience, problem-solving skills, cognitive support needs, ability to perceive color, and in other ways. There is no such thing as an “average” EHR user who can be represented by a panel of peers.

Here is the definition of usability as offered by Zhang and Walji (1):

TURF and ISO definitions of usability differ in the difference between “effective” in ISO and “useful” in TURF and between “efficient” in ISO and “usable” in TURF. Under TURF, “useful” refers to how well the system supports the work domain where the users accomplish the goals for their work, independent of how the system is implemented. A system is fully useful if it includes the domain and only the domain functions that are essential for the work, independent of implementations. Full usefulness is an ideal situation; it is rarely achieved in real systems. Usefulness also changes with the change of the work domain, with the development of new knowledge, with the availability of innovations in technology. Usefulness can be measured by the percentage of domain functions that are in the EHR over all domain functions (those in the system and those not in the system), and the ratio of domain functions vs. non-domain functions in the system.

Under TURF, a system is usable if it is easy to learn, efficient to use, and error-tolerant. How usable a system is can be measured by learnability, efficiency, and error tolerance. Learnability refers to the ease of learning and re-learning. It can be measured by examining how much time and effort are required to become a skilled performer for the task, such as the number of trials needed to reach a preset level of performance, the number of items that need to be memorized, and the number of sequences of task steps that need to be memorized. Learnability usually correlates positively with efficiency but it could be independent of efficiency and sometimes correlates negatively with efficiency (e.g., an interface optimized for ease of learning may not be optimized for efficiency). Efficiency refers to the effort required to accomplish a task.
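The measures named in the quoted passage can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names (`usefulness_measures`, `trials_to_criterion`), the example function lists, and the timing data are all hypothetical and do not come from Zhang and Walji. It also assumes someone has already enumerated the domain functions for the work in question.

```python
def usefulness_measures(domain_functions, system_functions):
    """Compute the two usefulness measures described in the quoted passage.

    domain_functions: functions the work domain requires (hypothetical list).
    system_functions: functions the EHR actually offers (hypothetical list).
    """
    domain = set(domain_functions)
    system = set(system_functions)
    covered = domain & system      # domain functions present in the system
    non_domain = system - domain   # system functions outside the work domain

    # Share of all domain functions that the system provides.
    coverage = len(covered) / len(domain) if domain else None
    # Ratio of domain to non-domain functions within the system.
    ratio = len(covered) / len(non_domain) if non_domain else None
    return {"coverage": coverage, "domain_to_non_domain": ratio}


def trials_to_criterion(task_times, target_seconds):
    """Return the first trial (1-based) that meets a preset performance level.

    This is one learnability measure mentioned in the passage; returns None
    if the target is never reached.
    """
    for trial, seconds in enumerate(task_times, start=1):
        if seconds <= target_seconds:
            return trial
    return None


if __name__ == "__main__":
    # Illustrative data only, not taken from any study.
    domain = ["order test", "enter note", "review history", "follow up result"]
    system = ["order test", "enter note", "billing code lookup"]
    print(usefulness_measures(domain, system))
    print(trials_to_criterion([95, 70, 48, 41], target_seconds=50))
```

The arithmetic is trivial. The hard part is the input: deciding what counts as a domain function for a given clinical task, and what performance level to preset, is exactly where agreed standards are missing.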

Pay particular attention to the concept of “learnability” as described above.   Simple systems make it easy to determine learnability.  For instance, buying a song from iTunes the first time might be confusing, but Apple has a vested interest in making sure any confusion disappears quickly.  The task of buying an item, no matter what type, is the same.  Now, move this discussion to a more complex system. 

As long as the task is repetitive, then learning should come quickly, assuming the system is not horribly designed.   So, for a given clinical task, what is the ideal number of steps based on user background? How many discrete clinical tasks are there?  Answering these questions requires detailed process maps for key clinical processes—I have yet to see such a thing in the wild.  In addition, all user traits mentioned above (clinical experience, cognitive needs, etc.) come into play.   

Evaluating the usability of any system according to the subjective criteria offered by ISO and TURF leaves considerable latitude in designing test instruments and interpreting the results. Even so, many studies have been published. Without standards for process measures and user profiles, generalization of outcomes is nearly impossible, as there is nothing to assure that researchers are using the same terms and concepts in the same way. Lacking standards, results suffer from poor external validity, applying to one system and one set of users at one time and one place. Discerning global precepts or design rules is nearly impossible unless everyone measures the same phenomena against the same standards under the same conditions, which brings me to the article of the week.

Ellsworth and colleagues (2) sought to determine the comparability of EHR usability evaluation results. They conducted a systematic review looking at methodology and reporting trends. The first thing worth noting is the small number (120) of published studies that met their criteria for inclusion.

Even though our initial search identified nearly 5000 potential studies, only a very small fraction of these truly applied usability evaluation standards and were therefore eligible for our review.

Considering the importance accorded work/task support in usability definitions, it is disconcerting to see that only seven percent of the included studies used task analysis or clinical workflow analysis as part of the evaluation. The largest share of studies (37%) relied on surveys, and even here the authors noted that nearly half of survey-based studies failed either to use validated instruments or to describe how the survey was developed. The authors come to a sobering conclusion:

However, a review of the literature on EHR evaluations demonstrates a paucity of quality published studies describing scientifically valid and reproducible usability evaluations conducted at various stages of EHR system development and how findings from these evaluations can be used to inform others in similar efforts. The lack of formal and standardized reporting of usability evaluation results is a major contributor to this knowledge gap, and efforts to improve this deficiency will be one step of moving the field of usability engineering forward.

I think the authors have a long wait ahead because while some degree of usability is vested in the system, the users and their processes/tasks are major determinants as well.   Supporting clinicians as they work requires a much more detailed map of clinical work than is currently extant.  The reality is that clinical workflows modeled with flowcharts are hopelessly inadequate for mapping process properties (e.g., control-flow, data movement, and resource usage) as they occur in clinical work.  Trying to create standards for usability without such information will prove to be wasted effort.   Helping someone accomplish a task requires knowing exactly what they are trying to do and why—THEN that knowledge has to be transformed into code.

Another stumbling block for usability researchers is the EHR itself. Data-centric systems are not optimized to support processes; instead, they focus on capturing and presenting data. Usability evaluations of data-centric systems will reveal process support flaws. However, if there is no intention of moving to a process-centric design, the usability of the underlying system can improve only so much. Software architecture has to be part of usability evaluations because every other aspect of usability is grounded in the system’s internal design.

I agree with the authors: EHR usability research needs standards for evaluation and reporting of results. Furthermore, better ways of analyzing and modeling processes are essential if clinical work support is going to be a credible dimension of EHR usability. Finally, the authors suggest that usability studies should be done early in the design process to assure the best possible system. No argument here. I would add a caveat: unless there is a willingness to acknowledge the limitations of data-centric designs in supporting clinical work, even the best usability/UCD efforts will run into a wall.

1. Zhang J, Walji MF. TURF: toward a unified framework of EHR usability. J Biomed Inform. 2011;44(6):1056–1067.

2. Ellsworth MA, Dziadzko M, O’Horo JC, Farrell AM, Zhang J, Herasevich V. An appraisal of published usability evaluations of electronic health records via systematic review. J Am Med Inform Assoc. 2016 Apr 23. [E-pub]