Human Performance Regression Testing
Amanda Swearngin, Myra B. Cohen, Bonnie E. John, Rachel K. E. Bellamy
Supplementary Data -- ICSE 2013
Abstract

As software systems evolve, new interface features such as keyboard shortcuts and toolbars are introduced. While it is common to regression test the new features for functional correctness, there has been less focus on systematic regression testing for usability, due to the effort and time involved in human studies. Cognitive modeling tools such as CogTool provide some help by computing predictions of user performance, but they still require manual effort to describe the user interface and tasks, limiting regression testing efforts. In recent work, we developed CogTool-Helper to reduce the effort required to generate human performance models of existing systems. We build on this work by providing task-specific test case generation and present our vision for human performance regression testing (HPRT) that generates large numbers of test cases and evaluates a range of human performance predictions for the same task. We examine the feasibility of HPRT on four tasks in LibreOffice, find several regressions, and then discuss how a project team could use this information. We also illustrate that we can increase efficiency with sampling by leveraging an inference algorithm. Samples that take approximately 50% of the runtime lose at most 10% of the performance predictions.


Experiment Settings
Setup

Below, we provide the GUI and EFG files extracted for each version of each task (Menus Only, Menus + Keyboards, Menus + Keyboards + Toolbars). We also provide the sets of rules (in our XML format) that we used to prune the generated test cases, along with a zip file of each set of generated test cases.

The tool we use to perform the test case generation, GUITAR, can be found here: http://guitar.sourceforge.net/. The applications used for our tasks are part of LibreOffice (www.libreoffice.org).

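| Task | Module | Version | Rules, GUI, EFG | # Test Cases |
|---|---|---|---|---|
| Format Text | Writer | Menus Only | Rules, GUI, EFG | 3 |
| Format Text | Writer | Menus + Keyboards | Rules, GUI, EFG | 24 |
| Format Text | Writer | Menus + Keyboards + Toolbars | Rules, GUI, EFG | 81 |
| Insert Hyperlink | Writer | Menus Only | Rules, GUI, EFG | 2 |
| Insert Hyperlink | Writer | Menus + Keyboards | Rules, GUI, EFG | 8 |
| Insert Hyperlink | Writer | Menus + Keyboards + Toolbars | Rules, GUI, EFG | 18 |
| Absolute Value | Calc | Menus Only | Rules, GUI, EFG | 4 |
| Absolute Value | Calc | Menus + Keyboards | Rules, GUI, EFG | 32 |
| Absolute Value | Calc | Menus + Keyboards + Toolbars | Rules, GUI, EFG | 72 |
| Insert Table | Impress | Menus Only | Rules, GUI, EFG | 3 |
| Insert Table | Impress | Menus + Keyboards | Rules, GUI, EFG | 12 |
| Insert Table | Impress | Menus + Keyboards + Toolbars | Rules, GUI, EFG | 36 |
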
Results for RQ1

Our results for Research Question 1 are below. The CogTool (.cgt) project file for each row is provided. To view the CogTool files, download CogTool (http://cogtool.hcii.cs.cmu.edu/) and open them as a project.


| Task | Version | No. Test Cases | Mean Time (s) | Min Time (s) | Max Time (s) | SD | Project |
|---|---|---|---|---|---|---|---|
| Format Text | M | 3 | 13.8 | 13.7 | 13.8 | 0.1 | .cgt |
| Format Text | MK | 24 | 13.2 | 12.3 | 14.1 | 0.6 | .cgt |
| Format Text | MKT | 81 | 11.8 | 8.6 | 14.1 | 1.7 | .cgt |
| Insert Hyperlink | M | 2 | 20.5 | 19.5 | 21.6 | 1.5 | .cgt |
| Insert Hyperlink | MK | 8 | 20.1 | 18.3 | 21.6 | 1.4 | .cgt |
| Insert Hyperlink | MKT | 18 | 19.8 | 17.6 | 21.6 | 1.3 | .cgt |
| Absolute Value | M | 4 | 18.1 | 17.9 | 18.3 | 0.1 | .cgt |
| Absolute Value | MK | 32 | 18.3 | 17.7 | 18.8 | 0.2 | .cgt |
| Absolute Value | MKT | 72 | 17.8 | 14.1 | 18.9 | 1.6 | .cgt |
| Insert Table | M | 3 | 12.8 | 12.7 | 12.9 | 0.1 | .cgt |
| Insert Table | MK | 12 | 12.7 | 12.3 | 13.3 | 0.3 | .cgt |
| Insert Table | MKT | 36 | 12.3 | 11.3 | 13.3 | 0.4 | .cgt |

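
As a quick check on how the summary columns can be read, the sketch below recomputes the Insert Hyperlink (M) row. With only two test cases, the Min and Max entries are the two predictions themselves (19.5 and 21.6). Assuming the SD column is a sample standard deviation (an assumption; this page does not say which form is used), the result agrees with the reported 1.5. The function name `summarize` is ours and is not part of CogTool or CogTool-Helper.

```python
import statistics


def summarize(times):
    """Return mean, min, max, and sample SD for a list of predicted task times."""
    return {
        "mean": statistics.mean(times),
        "min": min(times),
        "max": max(times),
        # Sample SD (n - 1 denominator); an assumption about the SD column.
        "sd": statistics.stdev(times),
    }


# Insert Hyperlink, Menus Only: two test cases, so min and max are the data.
row = summarize([19.5, 21.6])
print({name: round(value, 2) for name, value in row.items()})
```

The same function could be applied to the per-test-case predictions exported from any of the .cgt projects, once those values are available as a list of numbers.
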
Results for RQ2

The results below are averaged over 5 runs for each sample size; each run uses a random sample drawn from the full generated set of test cases for that task. The set of test cases is provided as well as the resulting project file for each run. To view the CogTool project files (.cgt), download CogTool (http://cogtool.hcii.cs.cmu.edu/) and open them as a project. A zip file is also given with all of the generated test cases for that run.
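
The sampling step can be sketched as follows. The sample sizes listed in the RQ2 results (for example 4, 8, 20, and 41 out of 81 test cases for Format Text) are consistent with rounding the percentage half up. That rounding rule, the function names, the seed, and the test case identifiers are our assumptions for illustration, not a description of the authors' scripts.

```python
import random


def sample_size(total, percent):
    """Round total * percent / 100 half up, using integer arithmetic."""
    return (total * percent + 50) // 100


def draw_runs(test_cases, percent, runs=5, seed=0):
    """Draw `runs` independent random samples, each without replacement."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    k = sample_size(len(test_cases), percent)
    return [rng.sample(test_cases, k) for _ in range(runs)]


# Hypothetical identifiers standing in for the 81 Format Text test cases.
cases = [f"t{i}" for i in range(81)]
samples = draw_runs(cases, 50)
print(len(samples), len(samples[0]))
```

Integer arithmetic is used for the rounding because Python's built-in `round` rounds halves to the nearest even number, which would give 40 rather than 41 for 50% of 81.
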

Test cases for the 'All' version of each task can be found in the Setup table above (Menus + Keyboards + Toolbars version). Results for the 'All' versions can be found in the RQ1 results table (MKT version). Event-Flow Graphs (EFGs) for each set of test cases can also be found in the Setup table (Menus + Keyboards + Toolbars version).

Design Construction CogTool Analysis
| Task (Sample % / Size) | Run Time (s) | % Reduction | No. Methods | No. Inferred | Mean (s) | Min (s) | Max (s) | Test Cases | Project |
|---|---|---|---|---|---|---|---|---|---|
| Format Text (5% / 4) | 445.7 | 93.8 | 12.8 | 8.8 | 12.2 | 10.2 | 13.7 | | |
| Format Text (10% / 8) | 800.2 | 88.9 | 41.4 | 33.4 | 11.9 | 8.8 | 14.0 | | |
| Format Text (25% / 20) | 1869.6 | 74.1 | 76.2 | 56.2 | 11.8 | 8.6 | 14.1 | | |
| Format Text (50% / 41) | 3668.3 | 49.2 | 81.0 | 40.0 | 11.8 | 8.6 | 14.1 | | |
| Format Text (All) | 7215.9 | - | 81 | - | 11.8 | 8.6 | 14.1 | -- | -- |
| Insert Hyperlink (5% / 1) | 187.5 | 89.8 | 1.0 | 0.0 | 19.6 | 19.6 | 19.6 | | |
| Insert Hyperlink (10% / 2) | 293.7 | 84.0 | 3.6 | 1.6 | 20.3 | 19.5 | 21.1 | | |
| Insert Hyperlink (25% / 5) | 579.1 | 68.5 | 15.6 | 10.6 | 19.8 | 17.6 | 21.6 | | |
| Insert Hyperlink (50% / 9) | 967.5 | 47.3 | 18.0 | 9.0 | 19.8 | 17.6 | 21.6 | | |
| Insert Hyperlink (All) | 1836.8 | - | 18 | - | 19.8 | 17.6 | 21.6 | -- | -- |
| Absolute Value (5% / 4) | 877.3 | 93.7 | 14.8 | 10.8 | 17.6 | 15.2 | 18.8 | | |
| Absolute Value (10% / 7) | 1423.1 | 89.7 | 25.8 | 18.8 | 16.9 | 14.1 | 18.7 | | |
| Absolute Value (25% / 18) | 3561.4 | 74.3 | 56.4 | 38.4 | 17.0 | 14.1 | 18.9 | | |
| Absolute Value (50% / 36) | 6974.6 | 49.7 | 69.6 | 33.6 | 17.1 | 14.1 | 18.9 | | |
| Absolute Value (All) | 13864.9 | - | 72 | - | 17.1 | 14.1 | 18.9 | -- | -- |
| Insert Table (5% / 2) | 300.9 | 92.2 | 3.6 | 1.6 | 12.3 | 11.8 | 12.7 | | |
| Insert Table (10% / 4) | 519.7 | 86.6 | 6.4 | 2.4 | 12.3 | 11.8 | 12.8 | | |
| Insert Table (25% / 9) | 1036.3 | 73.2 | 19.4 | 10.4 | 12.3 | 11.4 | 13.1 | | |
| Insert Table (50% / 18) | 2069.9 | 46.5 | 32.8 | 14.8 | 12.4 | 11.4 | 13.3 | | |
| Insert Table (All) | 3867.2 | - | 36 | - | 12.3 | 11.3 | 13.3 | -- | -- |

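
Two of the columns in the RQ2 results can be derived from the others, which gives a simple consistency check. The sketch below assumes that the % Red column is the run time saved relative to the 'All' row, and that the number of inferred methods is the number of methods minus the sample size. Both readings reproduce the reported values, but they are our interpretation of the column headers, and the function names are ours.

```python
def percent_reduction(sample_time, full_time):
    """Run time saved by the sample, as a percentage of the full run time."""
    return 100.0 * (1.0 - sample_time / full_time)


def inferred_methods(num_methods, sample_size):
    """Methods beyond those supplied by the sampled test cases (our reading)."""
    return num_methods - sample_size


# Format Text, 50% sample (41 test cases) versus all 81 test cases.
print(round(percent_reduction(3668.3, 7215.9), 1))
print(round(inferred_methods(81.0, 41), 1))
```

Under this reading, the 50% Format Text sample recovers all 81 methods while roughly halving the run time, whereas the 50% Insert Table sample recovers an average of 32.8 of 36 methods.
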
Acknowledgments
We thank Peter Santhanam (IBM Research) for pointing out the connection between usability and functional GUI testing and Atif Memon (University of Maryland) for providing us with the newest releases of GUITAR and technical support. This work is supported in part by IBM, the National Science Foundation through awards CCF-0747009, CNS-0855139, and CNS-1205472, and by the Air Force Office of Scientific Research, award FA9550-10-1-0406. The views and conclusions in this paper are those of the authors and do not necessarily reflect the position or policy of IBM, NSF, or AFOSR.