The Limits of Expert Text Entry Speed on Mobile Keyboards with Autocorrect

Improving mobile keyboard typing speed increases in value as more tasks move to a mobile setting. Autocorrect is a powerful way to reduce the time it takes to manually fix typing errors, which results in typing speed increase. However, recent user studies of autocorrect uncovered an unexplored side-effect: participants’ aversion to typing errors despite autocorrect. We present the first computational model of typing on keyboards with autocorrect, which enables precise study of expert typists’ aversion to typing errors on such keyboards. Unlike empirical typing studies that last days, our model evaluates the effects of typists’ aversion to typing errors for any autocorrect accuracy in seconds. We show that typists’ aversion to typing errors adds a self-imposed limit on upper bound typing speeds, which decreases the value of highly accurate autocorrect. Our findings motivate future designs of keyboards with autocorrect that reduce typists’ aversion to typing errors to increase typing speeds.

The Limits of Expert Text Entry Speed on Mobile Keyboards with Autocorrect. Nikola Banovic, Ticha Sethapakdi, Yasasvi Hari, Anind K. Dey, Jennifer Mankoff. Mobile HCI 2019.

A picture of a Samsung phone. The screen says: Block 2. Trial 6 of 10. this camera takes nice photographs. The user has begun typing with errors: "this camera tankes l". Error correction offers 'tankes', 'tankers', and 'takes', and a soft keyboard is shown below that.

An example mobile device with a soft keyboard: A) text entry area, which in our study contained study progress, the current phrase to transcribe, and an area for transcribed characters, B) automatically suggested words, and C) a miniQWERTY soft keyboard with autocorrect.

A bar plot showing typing speed (WPM, y axis) against accuracy (0 to 1). The bars start at 32 WPM (for 0 accuracy) and go up to approx 32 (for accuracy of 1).
Our model estimated expected mean typing speeds (lines) for different levels of typing error rate aversion (e) compared to mean empirical typing speed with automatic correction and suggestion (bar plot) in WPM across Accuracy. Error bars represent 95% confidence intervals.
Four bar plots showing error rate in the uncorrected, corrected, autocorrected, and manually corrected conditions. Uncorrected error rates are approximately 0 to 0.05 as accuracy increases. Corrected error rates go from 0.10 to 0.005 as accuracy goes from 0 to 1. Autocorrected error rates go from 0 to about 0.1 as accuracy goes from 0 to 1. Manually corrected error rates are variable but all below 0.05 as accuracy goes from 0 to 1.
Median empirical error rates across Accuracy in session 3 with automated correction and suggestion. Error bars represent minimum and maximum error rate values, and dots represent outliers
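
The speed and error measures in the captions above are not defined in this summary. A minimal sketch, assuming the conventional text-entry definitions (words per minute with five characters per word, and error rates taken as a fraction of all correct and incorrect characters); the function names and the sample counts are illustrative and are not taken from the study:

```python
def words_per_minute(transcribed: str, seconds: float) -> float:
    """Conventional WPM: (length - 1) characters over elapsed time, 5 chars per word."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (len(transcribed) - 1) / seconds * 60.0 / 5.0


def error_rates(correct: int, incorrect_not_fixed: int, incorrect_fixed: int) -> dict:
    """Uncorrected, corrected, and total error rates from character counts."""
    total = correct + incorrect_not_fixed + incorrect_fixed
    if total == 0:
        raise ValueError("no characters entered")
    return {
        "uncorrected": incorrect_not_fixed / total,
        "corrected": incorrect_fixed / total,
        "total": (incorrect_not_fixed + incorrect_fixed) / total,
    }


# Made-up trial: the phrase from the screenshot, typed in 12 seconds.
phrase = "this camera takes nice photographs"
wpm = words_per_minute(phrase, seconds=12.0)
rates = error_rates(correct=33, incorrect_not_fixed=1, incorrect_fixed=2)
print(f"{wpm:.1f} WPM", rates)
```

Whether the study split corrected errors into autocorrected and manually corrected ones in exactly this way is not stated here, so treat the three-way breakdown as a simplification.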

Leveraging Routine Behavior and Contextually-Filtered Features for Depression Detection among College Students

The rate of depression in college students is rising, which is known to increase suicide risk, lower academic performance and double the likelihood of dropping out. Researchers have used passive mobile sensing technology to assess mental health. Existing work on finding relationships between mobile sensing and depression, as well as identifying depression via sensing features, mainly utilizes single data channels or simply concatenates multiple channels. There is an opportunity to identify better features by reasoning about co-occurrence across multiple sensing channels. We present a new method to extract contextually filtered features on passively collected, time-series data from mobile devices via rule mining algorithms. We first employ association rule mining algorithms on two different user groups (e.g., depression vs. non-depression). We then introduce a new metric to select a subset of rules that identifies distinguishing behavior patterns between the two groups. Finally, we consider co-occurrence across the features that comprise the rules in a feature extraction stage to obtain contextually filtered features with which to train classifiers. Our results reveal that the best model with these features significantly outperforms a standard model that uses unimodal features by an average of 9.7% across a variety of metrics. We further verified the generalizability of our approach on a second dataset, and achieved very similar results.

Leveraging Routine Behavior and Contextually-Filtered Features for Depression Detection among College Students. Xuhai Xu, Prerna Chikersal, Afsaneh Doryab, Daniella Villalba, Janine M. Dutcher, Michael J. Tumminia, Tim Althoff, Sheldon Cohen, Kasey Creswell, David Creswell, Jennifer Mankoff, and Anind K. Dey. IMWUT, Article No. 116. 10.1145/3351274

A pipeline starting with data collection (including from mobile phone sensors, campus map, and Fitbit) which feeds into feature extraction. This is piped into association rule mining, and features plus rules are combined to create contextually filtered features, which are then piped into a machine learning classifier. Ground truth comes from the BDI-II questionnaire.
The high-level pipeline of the integration of rule mining algorithms and machine learning models. The dashed frame highlights the novel contribution of the paper. We designed a new metric to select the top rules from the rule set generated by ARM. We also proposed a new approach to extract contextually filtered features based on the top rules. Finally, we use these features to train classifiers.
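
The pipeline can be illustrated with a toy example. This is a minimal sketch, not the paper's implementation: the rule-selection metric is not described in this summary, so the gap in rule confidence between the two groups is used as a stand-in, and all item names, values, and function names are hypothetical.

```python
from itertools import permutations


def support(itemset, windows):
    """Fraction of time windows that contain every item in itemset."""
    return sum(itemset <= w for w in windows) / len(windows)


def confidence(antecedent, consequent, windows):
    """P(consequent | antecedent), or 0.0 if the antecedent never occurs."""
    base = support(antecedent, windows)
    return support(antecedent | consequent, windows) / base if base else 0.0


def top_rules(group_a, group_b, k=1):
    """Rank single-item rules x -> y by how differently they hold in the two groups.

    The confidence gap is a stand-in for the paper's selection metric.
    """
    items = sorted(set().union(*group_a, *group_b))
    scored = []
    for x, y in permutations(items, 2):
        a, c = frozenset([x]), frozenset([y])
        gap = abs(confidence(a, c, group_a) - confidence(a, c, group_b))
        scored.append((gap, x, y))
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored[:k]


def contextually_filtered_mean(records, context):
    """Mean of a numeric feature over only the windows where the context holds."""
    values = [value for items, value in records if context <= items]
    return sum(values) / len(values) if values else float("nan")


# Each window is the set of discretized sensor states seen in one time slot (made up).
group_a = [{"night", "phone_on"}, {"night", "phone_on"}, {"night", "asleep"}, {"day", "phone_on"}]
group_b = [{"night", "asleep"}, {"night", "asleep"}, {"night", "phone_on"}, {"day", "phone_on"}]
best = top_rules(group_a, group_b, k=2)
print(best)

# One person's windows paired with screen minutes; keep only the rule's context.
records = [({"night", "phone_on"}, 30.0), ({"day", "phone_on"}, 10.0), ({"night", "asleep"}, 0.0)]
night_screen = contextually_filtered_mean(records, frozenset(["night"]))
print(night_screen)
```

The filtered value (screen time at night only) would then be one input to a classifier, in place of the unfiltered daily average.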

Lyme Disease’s Impact

An ongoing and very personal thread of research in our group (due to my own journey with Lyme disease, which I occasionally blog about here) is research into the impacts of Lyme disease and opportunities for supporting patients who have it. From a patient perspective, Lyme disease is as tough to deal with as many better-known conditions [1].

Lyme disease can be difficult to navigate because of the disagreements about its diagnosis and the disease process. In addition, it is woefully underfunded and understudied, given that the CDC estimates around 300,000 new cases occur per year (similar to the rate of breast cancer) [2].

Bar chart showing that Lyme disease is woefully understudied.

As an HCI researcher, I started out trying to understand the relationship that Lyme Disease patients have with digital technologies. For example, we studied the impact of conflicting information online on patients [3] and how patients self-mediate the accessibility of online content [4]. It is my hope to eventually begin exploring technologies that can improve quality of life as well.

However, one thing patients need right away is peer-reviewed evidence about the impact that Lyme disease has on patients (e.g. [1]) and the value of treatment for patients (e.g. [2]). Here, as a technologist, the opportunity is to work with big data (thousands of patient reports) to unpack trends and model outcomes in new ways. That research is still in the formative stages, but in our most recent publication [2] we use straightforward subgroup analysis to demonstrate that treatment effectiveness is not adequately captured simply by looking at averages.

This chart shows that there is a large subgroup (about a third) of respondents to our survey who reported positive response to treatment, even though the average response was not positive.
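
The point about averages can be reproduced with a few lines of arithmetic. The sketch below uses invented change scores (positive means the respondent reported improvement); it is not the survey data, and the variable names are illustrative only.

```python
from statistics import mean

# Hypothetical self-reported change scores; NOT data from the survey.
responses = [3, 2, 4, -2, -3, -1, -2, -4, -1]

overall = mean(responses)
improved = [r for r in responses if r > 0]
share_improved = len(improved) / len(responses)

print(f"average response: {overall:.2f}")
print(f"share reporting improvement: {share_improved:.0%}")
print(f"average within that subgroup: {mean(improved):.2f}")
```

Here the overall average is negative even though a third of the respondents improved, which is the pattern a single average treatment effect would hide.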

There are many opportunities and much need for further data analysis here, including documenting the impact of differences such as gender on treatment (and access to treatment), and developing interventions that can help patients to track symptoms, manage interactions with and between doctors, and navigate accessibility and access issues.

[1] Johnson, L., Wilcox, S., Mankoff, J., & Stricker, R. B. (2014). Severity of chronic Lyme disease compared to other chronic conditions: a quality of life survey. PeerJ, 2, e322.

[2] Johnson, L., Shapiro, M. & Mankoff, J. Removing the mask of average treatment effects in chronic Lyme Disease research using big data and subgroup analysis.

[3] Mankoff, J., Kuksenok, K., Kiesler, S., Rode, J. A., & Waldman, K. (2011, May). Competing online viewpoints and models of chronic illness. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 589-598). ACM.

[4] Kuksenok, K., Brooks, M., & Mankoff, J. (2013, April). Accessible online content creation by end users. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 59-68). ACM.


Understanding gender equity in author order assignment

Academic success and promotion are heavily influenced by publication record. In many fields, including computer science, multi-author papers are the norm. Evidence from other fields shows that norms for ordering author names can influence the assignment of credit. We interviewed 38 students and faculty in human-computer interaction (HCI) and machine learning (ML) at two institutions to determine factors related to assignment of author order in collaborative publication in the field of computer science. We found that women were concerned with author order earlier in the process:

Our female interviewees reported raising author order in discussion earlier in the process than men.

Interview outcomes informed metrics for our bibliometric analysis of gender and collaboration in papers published between 1996 and 2016 in three top HCI and ML conferences. We found expected results overall — being the most junior author increased the likelihood of first authorship, while being the most senior author increased the likelihood of last authorship. However, these effects disappeared or even reversed for women authors:

Comparison of regression weights for author rank (blue) with author rank crossed with gender (orange). Regression was predicting author position (first, middle, last)

Based on our findings, we make recommendations for assignment of credit in multi-author papers and interpretation of author order, particularly with respect to how these factors affect women.


Jennifer Mankoff, Dimeji Onafuwa, Kirstin Early, Nidhi Vyas, Vikram Kamath: Understanding the Needs of Prospective Tenants. COMPASS 2018: 36:1-36:10

EDigs is a research project group at Carnegie Mellon University working on sustainability. Our research is focused on helping people find a perfect rental through machine learning and user research.

We sometimes study how our members use EDigs in order to learn how to build software support for successful social communities.

Screenshot of the eDigs website showing a mobile app, Facebook and Twitter feeds, and information about eDigs.

Aversion to Typing Errors

Quantifying Aversion to Costly Typing Errors in Expert Mobile Text Entry

Text entry is an increasingly important activity for mobile device users. As a result, increasing text entry speed of expert typists is an important design goal for physical and soft keyboards. Mathematical models that predict text entry speed can help with keyboard design and optimization. Making typing errors when entering text is inevitable. However, current models do not consider how typists themselves reduce the risk of making typing errors (and lower error frequency) by typing more slowly. We demonstrate that users respond to costly typing errors by reducing their typing speed to minimize typing errors. We present a model that estimates the effects of risk aversion to errors on typing speed. We estimate the magnitude of this speed change, and show that disregarding the adjustments to typing speed that expert typists use to reduce typing errors leads to overly optimistic estimates of maximum errorless expert typing speeds.
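
The qualitative argument can be illustrated with a toy calculation. This is not the authors' model: the quadratic error curve, the fixed correction time, the aversion weight, and every constant below are assumptions chosen only to show how overweighting the cost of errors leads to a slower chosen speed and lower achieved throughput.

```python
def seconds_per_correct_char(speed_cps, fix_seconds=2.0, k=0.002, aversion=1.0):
    """Perceived time per character: keying time plus weighted expected fixing time.

    Assumes (hypothetically) that error probability grows as k * speed**2.
    aversion > 1 means the typist overweights the cost of an error.
    """
    p_error = min(1.0, k * speed_cps ** 2)
    return 1.0 / speed_cps + aversion * p_error * fix_seconds


def chosen_speed(aversion, speeds):
    """Speed the typist picks when minimizing the perceived cost."""
    return min(speeds, key=lambda s: seconds_per_correct_char(s, aversion=aversion))


def actual_wpm(speed_cps):
    """Throughput actually achieved at that speed (true error cost, aversion = 1)."""
    return 60.0 / 5.0 / seconds_per_correct_char(speed_cps)


speeds = [s / 10 for s in range(10, 101)]  # 1.0 to 10.0 characters per second
neutral = chosen_speed(1.0, speeds)
averse = chosen_speed(3.0, speeds)
print(f"neutral: {neutral:.1f} chars/s, {actual_wpm(neutral):.1f} WPM")
print(f"averse:  {averse:.1f} chars/s, {actual_wpm(averse):.1f} WPM")
```

Under these assumptions, a model that ignores the aversion term would predict the neutral speed, which overestimates what the error-averse typist achieves.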

Nikola Banovic, Varun Rao, Abinaya Saravanan, Anind K. Dey, and Jennifer Mankoff. 2017. Quantifying Aversion to Costly Typing Errors in Expert Mobile Text Entry. (To appear) In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New York, NY, USA.

Modeling & Generating Routines

Leveraging Human Routine Models to Detect and Generate Human Behaviors

An ability to detect behaviors that negatively impact people’s wellbeing and show people how they can correct those behaviors could enable technology that improves people’s lives. Existing supervised machine learning approaches to detect and generate such behaviors require lengthy and expensive data labeling by domain experts. In this work, we focus on the domain of routine behaviors, where we model routines as a series of frequent actions that people perform in specific situations. We present an approach that bypasses labeling each behavior instance that a person exhibits. Instead, we weakly label instances using people’s demonstrated routine. We classify and generate new instances based on the probability that they belong to the routine model. We illustrate our approach on an example system that helps drivers become aware of and understand their aggressive driving behaviors. Our work enables technology that can trigger interventions and help people reflect on their behaviors when those behaviors are likely to negatively impact them.
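
The idea of classifying and generating behavior instances from per-routine models can be sketched as follows. The summary above does not say which model family is used, so this example swaps in a simple first-order Markov chain with add-one smoothing; the driving actions, sequences, and function names are invented for illustration. Each training sequence carries only the weak label of the group that demonstrated it.

```python
import math
import random


def fit_markov(sequences, states, alpha=1.0):
    """First-order transition probabilities with add-alpha smoothing."""
    counts = {s: {t: alpha for t in states} for s in states}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {s: {t: c / sum(row.values()) for t, c in row.items()} for s, row in counts.items()}


def log_likelihood(seq, model):
    """Log probability of the transitions in seq under a routine model."""
    return sum(math.log(model[p][n]) for p, n in zip(seq, seq[1:]))


def classify(seq, models):
    """Name of the routine model under which seq is most probable."""
    return max(models, key=lambda name: log_likelihood(seq, models[name]))


def generate(model, start, length, rng):
    """Sample a new behavior instance from a routine model."""
    seq = [start]
    while len(seq) < length:
        row = model[seq[-1]]
        seq.append(rng.choices(list(row), weights=list(row.values()))[0])
    return seq


STATES = ["accelerate", "coast", "brake"]
# Weak labels: every sequence inherits the label of the driver group it came from.
aggressive = [["accelerate", "accelerate", "brake", "accelerate"],
              ["accelerate", "brake", "accelerate", "accelerate"]]
calm = [["coast", "coast", "brake", "coast"],
        ["accelerate", "coast", "coast", "brake"]]
models = {"aggressive": fit_markov(aggressive, STATES), "calm": fit_markov(calm, STATES)}

label = classify(["accelerate", "brake", "accelerate"], models)
sample = generate(models["calm"], "coast", 5, random.Random(0))
print(label, sample)
```

A generated sample from the non-aggressive model is the kind of instance that could be shown to a driver as an example of what to do instead.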

Nikola Banovic, Anqi Wang, Yanfeng Jin, Christie Chang, Julian Ramos, Anind K. Dey, and Jennifer Mankoff. 2017. Leveraging Human Routine Models to Detect and Generate Human Behaviors. (To appear) In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems (CHI ’17). ACM, New York, NY, USA.


Watch-ya-doin is an innovative experience-based sampling framework for longitudinal data collection and analysis. Our system consists of a smartwatch and an Android device working unobtrusively to track data. Our goal is to train on and recognize a specific activity over time. We use a simple wrist-worn accelerometer to predict eating behavior and other activities. Such sensors are inexpensive to deploy and easy to maintain, since battery life is a whole week using our application.
Our primary application area is AT abandonment. About 700,000 people in the United States have an upper limb amputation, and about 6.8 million face fine motor and/or arm dexterity limitations [1]. Assistive technology (AT), ranging from myo-electric prosthetics to passive prosthetics to a variety of orthotics, can help in rehabilitation and improve independence and the ability to perform everyday tasks. Yet AT is not used to its full potential, with abandonment rates ranging from 23% to 90% for prosthetics users, and high abandonment of orthotics as well. Given the cost of these devices, this is an enormous waste of a significant financial investment in developing, fabricating, and providing the device, as well as potentially leading to frustration, insufficient rehabilitation, increased risk of limb-loss associated co-morbidities, and overall a reduced quality of life for the recipient.
To address this, we need objective and accurate information about AT use. Current data is limited primarily to questionnaires or skill testing during office visits. Apart from being limited by subjectivity and evaluator bias, survey tools are also not appropriate for estimating quality of use. A patient may, more or less accurately, report using his or her AT for a certain number of hours a day, but this does not indicate which tasks it was used for, which makes it difficult to evaluate how appropriate or helpful it was. In addition, neither reported use time nor skill testing is sufficient to predict abandonment once AT is deployed.

Our next steps include generalizing our approach to AT (such as upper limb prosthetics) and expanding it to include a wider variety of tracked activities. In addition, we will develop a longitudinal data set that includes examples of abandonment. This will allow the creation of algorithms that can characterize the type and quality of use over the lifecycle of AT and predict abandonment.

[1] U.S. Census 2001

Modeling Human Routines

Modeling and Understanding Human Routine Behavior

Human routines are blueprints of behavior, which allow people to accomplish their purposeful repetitive tasks and activities. People express their routines through actions that they perform in the particular situations that triggered those actions. An ability to model routines and understand the situations in which they are likely to occur could allow technology to help people improve their bad habits, inexpert behavior, and other suboptimal routines. In this project we explore generalizable routine modeling approaches that encode patterns of routine behavior in ways that allow systems, such as smart agents, to classify, predict, and reason about human actions under the inherent uncertainty present in human behavior. Such technologies can have a positive effect on society by making people healthier, safer, and more efficient in their routine tasks.


Modeling and Understanding Human Routine Behavior
Nikola Banovic, Tofi Buzali, Fanny Chevalier, Jennifer Mankoff, and Anind K. Dey
In Proceedings of the 2016 ACM annual conference on Human Factors in Computing Systems (CHI ’16). ACM, New York, NY, USA.
Honorable Mention Award

Dynamic question ordering

Diagram of FOCUS method for dynamic, cost-sensitive feature selection

In recent years, surveys have been shifting online, offering the possibility for adaptive questions, where later questions depend on responses to earlier questions. We present a general framework for dynamically ordering questions, based on previous responses, to engage respondents, improving survey completion and imputation of unknown items. Our work considers two scenarios for data collection from survey-takers. In the first, we want to maximize survey completion (and the quality of necessary imputations) and so we focus on ordering questions to engage the respondent and collect hopefully all the information we seek, or at least the information that most characterizes the respondent so imputed values will be accurate. In the second scenario, our goal is to give the respondent a personalized prediction, based on information they provide. Since it is possible to give a reasonable prediction with only a subset of questions, we are not concerned with motivating the user to answer all questions. Instead, we want to order questions so that the user provides information that most reduces the uncertainty of our prediction, while not being too burdensome to answer.
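
The second scenario can be sketched as a greedy loop that, given the answers so far, picks the question whose answer is expected to shrink prediction uncertainty the most, with a penalty for burden. This is an illustration under stated assumptions rather than the FOCUS method itself: uncertainty is approximated by the variance of the target among past respondents who match the current answers, and the data set, costs, and names are invented.

```python
from statistics import pvariance


def matching(rows, answers):
    """Past respondents whose answers agree with everything known so far."""
    return [r for r in rows if all(r[q] == v for q, v in answers.items())]


def expected_variance(rows, answers, question, target):
    """Expected variance of the target after the respondent answers `question`."""
    pool = matching(rows, answers)
    total = 0.0
    for value in {r[question] for r in pool}:
        subset = [r[target] for r in pool if r[question] == value]
        total += len(subset) / len(pool) * pvariance(subset)
    return total


def next_question(rows, answers, costs, target, trade_off=1.0):
    """Unanswered question with the lowest expected variance plus weighted cost."""
    remaining = [q for q in costs if q not in answers]
    return min(
        remaining,
        key=lambda q: (expected_variance(rows, answers, q, target) + trade_off * costs[q], q),
    )


# Hypothetical past responses: home attributes and monthly energy use (kWh).
rows = [
    {"size": "small", "heating": "gas", "pool": "no", "kwh": 300},
    {"size": "small", "heating": "electric", "pool": "no", "kwh": 500},
    {"size": "small", "heating": "gas", "pool": "no", "kwh": 320},
    {"size": "small", "heating": "electric", "pool": "no", "kwh": 480},
    {"size": "large", "heating": "gas", "pool": "no", "kwh": 800},
    {"size": "large", "heating": "electric", "pool": "no", "kwh": 820},
    {"size": "large", "heating": "gas", "pool": "yes", "kwh": 1400},
    {"size": "large", "heating": "electric", "pool": "yes", "kwh": 1420},
]
costs = {"size": 1.0, "heating": 2.0, "pool": 1.0}  # assumed answering burden

first = next_question(rows, {}, costs, "kwh")
after_yes = next_question(rows, {"pool": "yes"}, costs, "kwh")
after_no = next_question(rows, {"pool": "no"}, costs, "kwh")
print(first, after_yes, after_no)
```

The follow-up question differs depending on the first answer, which is what makes the ordering dynamic. The scale of trade_off relative to the variance is arbitrary here and would need tuning in practice.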

Kirstin Early, Stephen E. Fienberg, Jennifer Mankoff. (2016). Test time feature ordering with FOCUS: Interactive predictions with minimal user burden. In Proceedings of 2016 ACM Conference on Pervasive and Ubiquitous Computing. Honorable Mention: Top 5% of submissions.