
Summit Success!

On Tuesday and Wednesday of this week we ran a 3D printing summit which focused on the intersection of computer science and additive manufacturing. We had an outstanding lineup of speakers who explored topics ranging from the challenges users face in understanding tasks both simple and complex, to applications of optimization and machine learning to design, to a wide variety of materials, to printing in the large. Each talk was wildly different and hugely fascinating.

One of the best parts of the event was that each speaker led a discussion group after their talk in which attendees had the opportunity to dig in depth into questions they had about the topics presented. To me, those discussions were often the highlight of the event.

One topic I want to highlight was applications of additive manufacturing and the constraints they bring with them. The discussion ranged widely here, from fuel nozzles to tooling to jewelry. Important distinctions that came up include the size of the market (jewelry and tooling are both surprisingly large, for example) and the degree of oversight needed for products to succeed.

Another very interesting question that came up was the extent to which the automated algorithms we saw presented are able to capture the whole toolchain. The toolchain itself is a topic for another day, as from an end user’s perspective it is needlessly complex. Worse, details such as the size of a printer nozzle may have profound implications for high level design decisions. If we can capture these relationships in models, they are more likely to succeed.

I was only able to attend about a third of the discussions (since they ran in parallel), and glancing at the notes from other groups I see topics as wide ranging as 3D scanning, the value of hardware vs software advances, wireless charging, and carbon fiber. I wish I could be a fly on the wall in every one of these discussions!

 

 

Capacity Building for Accessibility @ CMU

I just finished an inspiring day attending a capacity building for accessibility workshop at CMU organized by Carol Frieze and sponsored in part by AccessComputing (whose founder, Richard Ladner, was the keynote speaker). At the event, Carol and Jeff Bigham were honored with the AccessComputing Capacity Building Award, a well deserved honor.

 

It was wonderful to see the number of people in the room, and to realize just how strong and rich the accessibility community at CMU has become over the years since I arrived. Just to list a few of the faculty and post docs in the school of computer science who do accessibility research and were represented at the summit in some way, we have Henny Admoni, Chieko Asakawa, Jeff Bigham, Carol Frieze, Scott Hudson, Jennifer Mankoff, Luz Rello, Aaron Steinfeld and Sidd Srinivasa. Many others work in the area, such as folks associated with the Quality of Life Center that recently ended (its leaders were Dan Siewiorek and Rory Cooper). Of course that brings up the Department of Rehabilitation Science and Technology at the University of Pittsburgh, a rich resource and partner in much of our disability work, which Rory is a member of.

I’ll close by quoting our keynote speaker, whose goal is to see not only increasing attention to accessibility research, but increasing inclusion of people with disabilities in computing:

Computing fields need more people with disabilities because their expertise and perspectives spark innovation

 

 

 

Writing Advice

As I’m ramping up for the CHI deadline once again, I find myself not only writing, but teaching about writing. I was lucky enough to learn writing from some experts during graduate school, and have had a lot of practice since. This year, in response partly to a request from my students, I’m trying to put excerpts that feel like nuggets I’ve repeated many times all in one place. So here are some common writing issues that I’m commenting on in CHI papers this year (in no particular order):

Introduction: What is the main promise, obstacle, and solution of the paper, at the level that you address it? I.e. don’t tell me that cats would be better off if their owners could only be more subservient (promise), cat owners don’t know what their cats want from them (obstacle) and we contribute a tool that helps cats and cat owners communicate (solution). In addition to being too high level, the promise and the obstacle don’t even entirely match up. Instead, tell me the specific communication problem you are probably working on (such as cats, being fastidious, needing their litter box cleaned right away), the specific obstacle (owners tend to place it where they can’t smell it) and the specific solution (a litter box smell sensor smart phone app?). Then generalize (this is an instance of better cat/owner communication, etc.). I’ve only been a cat owner for less than a year, so those of you who know cats better can probably think of a better example :).

Section structure: Every section needs some sort of story, which you should lay out in an overview paragraph (if you feel your text is redundant, fix that later, first get the goals down in overview form). Then there’s the section meat, and hopefully something at the end about what we learned. Not all that dissimilar from the whole paper. Every paragraph needs the same thing. Something along the lines of why the paragraph exists (i.e. a topic sentence that ties to the rest of the section in some way), the meat, and then an ending (what we learned). Again, redundancy is better at first and then you can streamline.

Related work: Each related work paragraph should start with a topic statement explaining what it is about (presumably with respect to the topics you have already established as important). If there are other things you cover in related work, explain what they are and why you cover them in the paragraph where you touch on them. You might also try to end each related work paragraph with a summary sentence that restates the major benefit and open questions left by the work you just described. This should not be about your research, just about the gaps that are present.

Vocabulary: Papers often have jargon in them. For those of you who have a tendency to use too much jargon, pick a small number of new terms you will use, introduce them, and then use them consistently. For those of you who have a tendency to use too little jargon (i.e., if you find yourself saying ‘our model’ even when there are three different ones in your paper), follow the same advice :).

Keep a list of your writing quirks, whatever they are, as your editors uncover them for you. Inconsistent use of -? Lack of italics on Latin? Forgetting to check that your reference program didn’t introduce errors? Excess word phrases such as ‘in order to’? When you get comments about wording, spelling, grammar, syntax, punctuation, and formatting, add them to a hit list of things you spend 10 minutes checking before you send out each draft (and especially before you submit!). This will make life much easier for your reviewers.

Cutting … that should probably be a whole separate post. Just remember that more concise writing is usually better writing, so you almost always should cut words before content.

Niggles: I always assume that if I have a niggle, so will my reviewers (kind of how if one student asks something in class, probably others have the same question). So if you have a niggle of a doubt about anything, add a comment to your paper, or try to address it, or talk it over with someone.

I’m sure I will have more, but this is a start (and feel free to add your own ideas in the comments).

(Photo credit threecheersformcr_xo@Flickr)

FABulous CHI 2016

At the CHI 2016 conference this week, there were a slew of presentations on the topic of fabrication. First of course I have to highlight our own Megan Hofmann who presented our paper, Helping Hands, a study of participatory design of assistive technology that highlights the role of rapid prototyping techniques and 3D printing. In addition, Andrew Spielberg (MIT & Disney Intern) presented RapID (best), a joint project with Disney Research Pittsburgh which explores the use of RFID tags as a platform for rapidly prototyping interactive physical devices by leveraging probabilistic modeling to support rapid recognition of tag coverage and motion.

I was really excited by the breadth and depth of the interest at CHI in fabrication, which went far beyond these two papers. Although I only attended by robot (perhaps a topic for another blog post), attending got me to comb through the proceedings looking for things to go to — and there were far more than I could possibly find time for! Several papers looked qualitatively at the experiences and abilities of novices, from Hudson’s paper on newcomers to 3D printing to Booth’s paper on problems end users face constructing working circuits (video; couldn’t find a pdf) to Bennett’s study of e-NABLE hand users and identity.

There were a number of innovative technology papers, including Ramakers’ Rube Goldberg-esque RetroFab, Groeger’s HotFlex (which supports post-processing at relatively low temperatures), and Peng’s incremental printing while modifying. These papers fill two full sessions (I have only listed about half). Other interesting and relevant presentations (not from this group) included a slew of fabrication papers, a study of end user programming of physical interfaces, and studies of assistive technology use including the e-NABLE community.

Two final papers I have to call out because they are so beautiful: Kazi’s Chronofab (look at the video) and a study in patience, Efrat’s Hybrid Bricolage for designing hand-smocked artifacts (video, again can’t find the paper online).

The Data Pipeline

Visualization of accounts that could be associated with ISIS

I’ve been teaching a course on using data for three years now, and it feels ‘finished’ enough that it is perhaps worth writing about. When I began the course, I had the goals of (1) exploring the human side of data, meaning the ways in which what we know about people impacts our ability to make, process, interpret, and use data, (2) making the course accessible to a broad variety of people (not just programmers), and (3) organizing the course around modules that would produce nice portfolio items. The result is the course at data.cmubi.org. While it has evolved over the years, it’s always included at least a few beginner programmers, and the projects have been interesting opportunities for students to explore issues like interactive machine learning, data visualization, and topic areas they care a great deal about.

A big emphasis of the course is on data cleaning: understanding deeply the flaws in your data, from bias in data collection to missing values in data files. Many (hopefully most) of the projects below have significant sections documenting their sources and efforts / decision making around this topic.

Another big emphasis in the course is on understanding what the data will be used for, and by whom. Tied to this, we talk extensively about intelligibility in machine learning, the importance of narrative in visualization (and visualization in general), and the importance of defining the question you are answering.

Here are some of the highlights over the last three years:


Bus bunching is a phenomenon that can impact bus wait times. One of my 2016 students has been collecting data and extensively studying the phenomenon. His final project in the class drew on this data set and explores visual representations of the phenomenon.

 

Yelp data is always an area of interest. In 2014 … In 2015 students explored which state has the best pizza :). In 2016, the ‘Bon Yinzers’ developed a wonderful series of visualizations of factors that affect popularity of Pittsburgh restaurants. They uncovered some interesting phenomena such as the unexpectedly off-cycle checkin times of the most active Yelp users in Pittsburgh.

San Francisco Crime Alert explores the likelihood of different types of crime in different SF area neighborhoods. Their prediction algorithm gives you a way to explore the prevalence of major and minor crime in terms of time of year, time of day, and location.

In 2015, a group collected and analyzed a data set of tweets by potential ISIS supporters, with the goal of ultimately engaging others in helping to label such data and understand how ISIS supporter accounts differ from other accounts with sometimes similar tweets (e.g. news accounts or bloggers).

Often, the goal of students in the class is more about policy than about end users. In 2015, Healt$care explored the quality of healthcare and its relationship to dollars spent across the U.S. in a highly visual fashion.

 

In 2014, a group asked what jobs are popular in what parts of the US. Again, a combination of data visualization and prediction supports exploration of the question. A similar approach was explored by a 2014 group that collected data about movie piracy and its relationship to DVD release strategies.

Sadly, not all of the older projects still work (web standards change so fast!). I wish I could provide links to work such as the Reddit AMA visualization pictured here.


 

AMIA trip report


I have been curious about AMIA for some time, and was even invited to be part of a panel submission to it this year. So when I realized it was only a few hours’ drive away, I took advantage of the closeness to plan a last minute trip. It has been an interesting experience and well worth the attendance. Although it is a very large conference, the group of people attending seems to be friendly and open, and I was welcomed in particular by two wonderful women I met, Bonnie Kaplan and Patti Brennan. The sessions are an intriguing combination of computer science, medicine, and clinical practice (with the quality of/knowledge about each varying based on the expertise/presence of appropriate collaborators).

I attended sessions on Monday, Tuesday, and Wednesday. The theme that stood out to me more than any other across my experiences here was the great diversity of stakeholders that are (and especially that should be) considered in the design of effective health IT.

Some very interesting observations came out of the large scale analysis of clinical data that I saw discussed on Monday. For example, there is a lot of attention being paid to data privacy (although one person commented that this is commonly misunderstood: “Uniqueness is not synonymous with being identified”) and particularly how to scrub data so that it can “get past IRB” for further analysis. One interesting approach taken by N. Shah (Learning Practice-based Evidence from unstructured Clinical Notes; Session S22) is to extract the terms (features) and use those instead of the data itself. Of course a limitation here is that you have to think of features ahead of time.
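To make the "use the terms instead of the data" idea concrete, here is a minimal sketch of my own (not the method presented in the session). It assumes a hypothetical, hand-picked vocabulary and a made-up note, and it shows both the appeal (the raw text, including the address, never leaves the function) and the limitation (anything outside the vocabulary is invisible).

```python
import re
from collections import Counter

# Hypothetical vocabulary, chosen ahead of time. Anything not listed here
# is never extracted, which is exactly the limitation noted above.
VOCABULARY = {"diabetes", "hypertension", "metformin", "chest pain"}


def extract_terms(note, vocabulary=VOCABULARY):
    """Return counts of vocabulary terms found in a free-text note."""
    # Normalize case and whitespace so multi-word terms match reliably.
    text = re.sub(r"\s+", " ", note.lower())
    counts = Counter()
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = len(re.findall(pattern, text))
        if hits:
            counts[term] = hits
    return dict(counts)


# Made-up note for illustration only.
note = (
    "Pt reports chest pain. History of diabetes; diabetes managed "
    "with metformin. Lives at 12 Elm St."
)
features = extract_terms(note)
print(features)
```

Only the term counts would be shared for analysis. Whether that is enough to satisfy an IRB is a separate question that this sketch does not address.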

Another interesting topic that came up repeatedly is the importance of defining the timeline of the data as much as the timeline of the person. Questions that need to be answered include what is time zero in the data being analyzed (and what might be missing as a result); what is the exit cause, or end moment for each person in the database (and who is being left out / what is the bias as a result?); and the observation that in general “sick people make more data.” To this I would add that if you attempt to address these biases by collecting information, there is potentially selection bias in the subjects, as well as the impact of the burden of sensing on the data producer. Connected to this are the ongoing questions about the benefits and problems of a single unique identifier as a way of connecting health information.

The last observation from Monday is the question of what public data sets are out there that we should make ourselves aware of. For example, MIT has a big data medical initiative (also see http://groups.csail.mit.edu/medg/) and may have a clinical notes data set associated with it (I am still looking for this).

On Tuesday I started the day with S44: year in review  (D. Masys). I missed the very start of it, but came in when he was talking about studies of IT’s use in improving clinical practice, such as a study showing that reminding clinicians to do their work better improves patient outcomes (“physician alerts” “embedded in EHR systems” etc), or maybe just improves process, with the observation that we should measure both. Interestingly to me, the question of also improving process and outcomes by organizing the work of caregivers (and reminding them of things) was missing from this discussion.

Dr. Masys then moved on to explore unexpected consequences of IT that had been published: adding virtual reality caused “surgeon blindness” to some information; missed lab results in another study and alert fatigue in another (drug-drug interactions suffer from 90% overrides…). Given the difficulty of publishing negative results, it would be interesting to explore this particular set of work for tips. It was also interesting to hear his critique of questionable results, particularly the repeated mentions of Hawthorne effects because so many interventions are compared to care as usual (rather than an equal-intensity control condition). Another way of phrasing this is to ask at what cost does the intervention work (and/or how do we “adjust for the intensity of the intervention”).

Another category Dr. Masys explored that was of interest to me was health applications of mobile electronics. Briefly, one study looked at chronic widespread pain … reduced ‘catastrophizing’; four looked at text messaging vs telephone appointment reminders; others covered the effectiveness of a short message reminder in increasing followup compliance; the text4baby mobile health program; and the Cameroon Mobile Phone SMS (CAMPS) trial (PLoS One).

Dr. Masys then moved on to the practice of clinical informatics and bioinformatics (out of “the world of RCTs”). This focused on new methods that might be interesting. I particularly want to follow up on one of the few studies that looked at multiple stakeholders, which had the goal of reducing unintended negative consequences; the use of registries to do low cost, very large trials; the use of a private key derived from the DNA data being encrypted to encrypt that same data; the creation of a 2D barcode summarizing patient genetic variants that affect the dose or choice of a drug; and a demonstration that diagnostic accuracy was as good on a tiny mobile phone screen as a big screen.

The last category reviewed by Dr. Masys was editors’ choice publications from JAMIA, the Journal of Biomedical Informatics (JBI), and the Diana Forsythe award. Almost all of these seem worth reviewing in more depth, particularly the JAMIA articles: scientific research in the age of omics (explores the need to increase accountability of scientists for the quality of their research); web-scale pharmacovigilance (used public search engine logs to detect novel drug-drug interactions); and CPOEs decrease medication errors (a meta study that basically concluded, without realizing it, that CPOEs would work better if we had only applied basic principles from contextual inquiry!). The JBI articles were by Rothman, who developed a continuous measure of patient condition that predicted hospital re-admission and mortality independent of disease (how does this compare with patient reported health status?); Weiskopf (who documented the relative incompleteness of EHR data across charts he studied); Friedman’s overview of NLP state of the art and prospects for significant progress (summary of a workshop); Post’s article on tools for analytics of EHR data; and Valizadegan’s article on learning classification models from multiple experts who may disagree (given my interest in multiple viewpoints).

Next, I attended a panel about Diana Forsythe (obit; some pubs; edited works), an ethnographer who had a big impact on the field of medical informatics (and others as well) … she has passed away, and perhaps only a small number of people read her work, but it had an enormous influence on those who encountered her writing on methods, research topics, and so on. She was compared by one panelist to Arthur Kleinman (who helped to make the distinction between the abstraction of disease and the human experience of illness; treatment and healing). Some of the most interesting parts of the discussion were focused on how the field is changing over time, prompted by a question of Katie Siek’s: for example, getting data into the computer, then computers into the hospitals, now making them work for people correctly, and what comes after that? Another interesting comment was about the authority of the physician being in part based on their ability to diagnose (which conveys all sorts of societal benefits). This points to the role of the physician (when diagnosis doesn’t exist human creativity is especially needed) versus IT (which can handle more well defined situations). However with respect to healing, maybe the power of physicians is in listening as much as diagnosing (also something computers can’t do, right?). Other topics that came up included the importance of the patient voice and patient empowerment/participation.

After lunch with a friend from high school I attended S66 (User centered design for patients and clinicians). In line with the hopes of the Forsythe panel I saw a mixture of techniques here including qualitative analysis. Unfortunately, what I did not see was technology innovation (something that may point to a difference in vocabulary regarding what “user centered design” means). However the qualitative methods seemed strong. One interesting talk explored the issues in information transfer from the hospital to home health care nurses, a nice example of some of the breakdowns that occur between stakeholders in the caregiver community. More and more, however, I find myself wondering why so much of the work here only focuses on caregivers with degrees of some sort in medicine (as opposed to the full ecology of caregivers). I was pleased to see low-income settings represented, exploring the potential of mobile technology to help with reminders to attend appointments and other reminders; and a series of 3 studies on health routines culminating in a mobile snack application (published at PervasiveHealth) by Katie Siek & collaborators. One nice aspect of this project was that the same application had differing interfaces for different stakeholders (e.g. teenagers vs parents).

I started to attend the crowdsourcing session after the break, but it did not appear to have much in terms of actual crowdsourcing. An open area for health informatics? Instead I went on to S71, family health history & health literacy. The most interesting paper in the session, to me, looked at health literacy in low SES communities (by many co-authors including Suzanne Bakken). In particular, they have data from 4500 households which they would like to visualize back to the participants to support increased health literacy. Their exploration of visualization options was very detailed and user centered and resulted in the website GetHealthyHeights.org (which doesn’t seem to be alive at the moment). However I have concerns about the very general set of goals with respect to what they hope people will get out of the visualizations. It would be interesting to explore whether there’s a higher level narrative that can be provided to help with this. Similarly, does it make sense to present “typical” cases rather than specific data?

On Wednesday I began in S86: Late breaking abstracts on machine learning in relation to EMRs. This session had some interesting exploration of tools as well as some patient focused work. One study looked at prediction of mobility improvements for older adults receiving home health care, by subgrouping 270k patients and looking at factors associated with the subgroups. Steps included de-identifying the data; standardizing it; accounting for confounding factors; dividing patients into subgroups; and then using data mining (clustering and pattern mining) to look at factors that affected individual scores and group scores. It was an interesting take on what is part of the data “pipeline”, one that goes beyond some of the things I’ve been thinking are needed for lowering barriers to data science. Another looked at decision support for pre-operative medication management (an interesting problem when I consider some of the difficulties faced by the many doctors coordinating my mother-in-law’s care recently). This work was heuristic in nature (a surprising amount of work here is still focusing on heuristics over other more statistically based approaches). From this work I noticed another trend, however: the need to connect many different types of information together (such as published work on drugs, clinical notes, and patient history).
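Since the pipeline steps in that mobility study are easy to lose in prose, here is a toy sketch of how I think about them. Everything in it is an assumption for illustration: the records are made up, dropping a name field stands in for real de-identification (which takes far more than that), a fixed age cutoff stands in for the clustering step, and the confounder adjustment is left out entirely.

```python
from collections import defaultdict
from statistics import mean

# Made-up records; field names are hypothetical.
RECORDS = [
    {"name": "A. Smith", "age": 71, "mobility_before": 2, "mobility_after": 4},
    {"name": "B. Jones", "age": 84, "mobility_before": 3, "mobility_after": 3},
    {"name": "C. Lee", "age": 68, "mobility_before": 1, "mobility_after": 3},
    {"name": "D. Patel", "age": 90, "mobility_before": 2, "mobility_after": 3},
]


def deidentify(record):
    """Drop the direct identifier (a stand-in for real de-identification)."""
    return {key: value for key, value in record.items() if key != "name"}


def standardize(record):
    """Derive one comparable outcome: the change in mobility score."""
    out = dict(record)
    out["improvement"] = record["mobility_after"] - record["mobility_before"]
    return out


def subgroup(record):
    """Assign a coarse subgroup; a fixed cutoff stands in for clustering."""
    return "under_80" if record["age"] < 80 else "80_plus"


def run_pipeline(records):
    """De-identify, standardize, group, then summarize each subgroup."""
    groups = defaultdict(list)
    for record in records:
        clean = standardize(deidentify(record))
        groups[subgroup(clean)].append(clean["improvement"])
    return {name: mean(values) for name, values in groups.items()}


summary = run_pipeline(RECORDS)
print(summary)
```

The point of writing it this way is that each stage is a separate, replaceable function, which is where I suspect the barriers to data science could be lowered.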

The last session I attended was S92, one of the few sessions focused specifically on patients (and not very well attended…). The first talk was about creating materials for patient consumption, supporting access to EHRs, 2-way secure messaging, and customized healthcare recommendations. They focused especially on summarizing medication information concisely. The second was about a national network for comparative effectiveness. Maybe this is the crowdsourcing of health IT? This was focus group based research (a surprisingly popular method across AMIA given how little support there is for this method in HCI) exploring user attitudes about data sharing. Interestingly, the work presented here ignored a long history of research on trust in computing, e.g. from Cliff Nass, the e-commerce literature, and so on. However, the data was nicely nuanced in exploring a variety of ethical issues and acknowledging the relative sophistication of the group members with respect to these issues. The issues raised are complex: who benefits, who owns the data, how would the bureaucracy function, how to manage authorization given that studies aren’t always known yet (and opt-in vs opt-out approaches). I wonder how a market for research would function (think kickstarter but I donate my data instead of money…). The next paper looked at what predicted people thinking EHRs are important both for themselves and their providers, and through a disparities lens.

The closing plenary was given by Mary Czerwinski (pubs) from Microsoft Research. I always enjoy her talks and this was no exception. Her focus was on her work with affective systems, relating to stress management. Her presentation included a system for giving clinicians feedback about their empathy in consults with patients, as well as a system for giving parents reminders when they were too stressed to remember the key interactions that could help their ADHD kids. Interestingly, in the parent case, (1) the training itself is helpful and (2) the timing is really important: you need to predict that a stress situation is building in order to intervene successfully (I would love to use this at home :). She ended by talking about a project submitted to CHI 2014 that used machine learning to make stress management suggestions based on things people already do (e.g. visit your favorite social network; take a walk; etc). One of the most interesting questions was whether emotional state could predict mistake making in coding data (or other tasks).

Would I go back a second time? Maybe … It is a potentially valuable setting for networking with physicians; the technical work is deep enough to be of interest (though the data sets are not as broad as I’d like to see). It’s a field that seems willing to accept HCI and to grow and change over time. And the people are really great. The publishing model is problematic (high acceptance rates; archival) and I think had an impact on the phase of the work that was presented at times. What was missing from this conference? Crowdsourcing, quantified self research, patient websites like PatientsLikeMe, patient produced data (such as support group posts), significant interactive technology innovation outside the hospital silo. In the end, the trip was definitely worthwhile.

Some observations about people who might be interesting to other HCI professionals interested in healthcare. For example, I noticed that MITRE had a big presence here, perhaps because of their recent federally funded research center. In no particular order here are some people I spoke with and/or heard about while at AMIA 2013:


Patti Brennan (some pubs) is the person who introduced me to or told me about many of the people below, and generally welcomed me to AMIA. She studies health care in the home and takes a multi-stakeholder perspective on this. A breath of fresh air in a conference that has been very focused on things that happen inside the physician/hospital silo.

Bonnie Kaplan is at the Center for Medical Informatics in the Yale School of Medicine. Her research focuses on “Ethical, legal, social, and organizational issues involving information technologies in health care, including electronic health and medical records, privacy, and changing roles of patients and clinicians.”

Mike Sanders from www.seekersolutions.com, which is providing support for shared information between nurses, caregivers & patients, based in B.C. (Canada).

Amy Franklin from UT Health Sciences Center has done qualitative work exploring unplanned decision making using ethnographic methods. Her focus seems to be primarily on caregivers, though the concepts might well transfer to patients.

Dave Kaufman is a cognitive scientist at ASU who studies, among other things, HCI and health, including “conceptual understanding of biomedical information and decision making by lay people.” His studies of mental models and miscommunication in the context of patient handoff seem particularly relevant to the question of how the multi-stakeholder system involved in dealing with illness functions.

Paul Tang (Palo Alto Medical Foundation) is a national leader in the area of electronic health records and patient-facing access to healthcare information.

Danny Sands (bio; some pubs): doctor; entrepreneur; founded the Society for Participatory Medicine; focus on doctor-patient communication and related tools; studies of ways to improve e.g. patient doctor email communication.

Dave deBronkart (e-patient Dave, whose primary physician was Dr. Sands during his major encounter with the healthcare system), best summarized in his TED talk “Let Patients Help” (here’s his blog post on AMIA 2013).

George Demiris from University of Washington studies “design and evaluation of home based technologies for older adults and patients with chronic conditions and disabilities, smart homes and ambient assisted living applications and the use of telehealth in home care and hospice.” His projects seem focused on elders both healthy and sick. One innovative project explored the use of Skype to bring homebound patients into the discussions by the hospice team.

Mary Goldstein, who worked on temporal visualization of patient data (KNAVE-II) and generally “studies innovative methods of implementing evidence-based clinical practice guidelines for quality improvement” including decision support.

Mark Musen studies “mechanisms by which computers can assist in the development of large, electronic biomedical knowledge bases. Emphasis is placed on new methods for the automated generation of computer-based tools that end-users can use to enter knowledge of specific biomedical content.” and has created the Protégé knowledge base framework and ontology editing system.

Carol Friedman does “both basic and applied research in the area of natural language processing, specializing in the medical domain” including creating the MedLEE system (“a general natural language extraction and encoding system for the clinical domain”). Her overview of NLP paper was mentioned in the year in review above.

Suzanne Bakken (pubs) has been doing very interesting work in low income communities around Columbia in which she is particularly interested in communicating the data back to the data producers rather than just focusing on its use for data consumers.

Henry Feldman (pubs), who was an IT professional prior to becoming a physician, has some very interesting thoughts on open charts, leading to the “Open Notes” project.

Bradley Malin (pubs) is a former CMU student focused on privacy who has moved into the health domain and is currently faculty at Vanderbilt. His work provides a welcome and necessary theoretical dive into exactly how private various approaches to de-identifying patient data are. For example, his 2010 JAMIA article showed that “More than 96% of 2800 patients’ records are shown to be uniquely identified by their diagnosis codes with respect to a population of 1.2 million patients.”


Jina Huh (pubs) studies social media for health. One of her recent publications looked at health video loggers as a source of social support for patients. She shares an interest with me in integrating clinical perspectives into peer-produced data.
Katie Siek (pubs), who recently joined the faculty at Indiana, does a combination of HCI and health research mostly focusing on pervasive computing technologies. One presentation by her group at AMIA this year focused on a mobile snacking advice application that presented different views to different stakeholders.
Madhu Reddy (some pubs) trained at UC Irvine under Paul Dourish and Wanda Pratt and brings a qualitative perspective to AMIA (he was on the Diana Forsythe panel for instance). He studies “collaboration and coordination in healthcare settings.”
Kathy Kim spoke in the last session I attended about her investigations of patient views on a large data sharing network to support research, but also does work that is very patient centered (e.g. mobile platforms for youth).
Steve Downs works in decision support as well as policy around “how families and society value health outcomes in children.”
Chris Gibbons (some pubs) focuses on health disparity (e.g. barriers to inclusion in clinical trials and the potential of eHealth systems).

Data Collection & Analytics Tools?

I have become fascinated recently with the question of the role that data has in supporting analysis, action, and reflection. Actually, it would be more accurate to say that I’ve become aware recently that this is an intrinsic driver in much of the work I do, and thus it has become something I want to reflect on more directly. In this post, I want to explore some of the tools others have already built that might support analytics, machine learning, and so on. If you know of something I’ve missed, feel free to share it in the comments! So, in no particular order:

  • Hazy provides a small handful of key primitives for data analysis. These include Victor, which “uses RDBMS to solve a large class of statistical data analysis problems (supervised machine learning using incremental gradient algorithms)” and WisCi (/ DeepDive, its successor), which is “an encyclopedia powered by machines, for the people.” RapidMiner is a similar tool that has been used by thousands. It is open source and supports data analysis and mining.
  • Protégé is “a suite of tools to construct domain models and knowledge-based applications with ontologies” including visualization and manipulation
  • NELL learns over time from the web. It has been running since 2010 and has “accumulated over 50 million candidate beliefs.”  A similar system is
  • Ohmage and Ushahidi are open source citizen sensing platforms (think citizen based data collection). Both support mobile and web based data entry. This stands in contrast to things like Mechanical Turk which is a for-pay service, and games and other dual-impact systems like PeekaBoom (von Ahn et al.) which can label objects in an image using crowd labor, or systems like Kylin (Hoffmann et al.) which simultaneously accelerates community content creation and information extraction.
  • WEKA and LightSide support GUI based machine learning (WEKA requires some expertise and comes with a whole textbook, while LightSide is built on WEKA but simplifies aspects of it, and specializes in mining textual data). For more text mining support, check out Coh-Metrix, which “calculates the coherence of texts on a wide range of measures. It replaces common readability formulas by applying the latest in computational linguistics and linking this to the latest research in psycholinguistics.” Similarly, LIWC (not free) supports linguistic analysis by providing a dictionary and a way to compare a new text against that dictionary to analyze the presence of 70 language dimensions, from negative emotions to casual words.

Deployed tools and products aside, there is also a bunch of research in this area, including early work such as a CAPpella (Dey et al.) and ScreenCrayons (Olsen et al.). More recently, Gestalt (Patel et al.) “allows developers to implement a classification pipeline” and Kietz et al. use an analysis of RapidMiner’s many data analysis traces to automatically predict optimal KDD-Workflows.

von Ahn, L., Liu, R., & Blum, M. (2006). Peekaboom: A game for locating objects in images. In ACM CHI 2006.

Hoffmann, R., Amershi, S., Patel, K., Wu, F., Fogarty, J., & Weld, D. S. (2009, April). Amplifying community content creation with mixed initiative information extraction. In Proceedings of the 27th international conference on Human factors in computing systems (pp. 1849-1858). ACM.

Dey, A. K., Hamid, R., Beckmann, C., Li, I., & Hsu, D. (2004, April). a CAPpella: programming by demonstration of context-aware applications. In Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 33-40). ACM.

Olsen Jr, Dan R., Trent Taufer, and Jerry Alan Fails. “ScreenCrayons: annotating anything.” Proceedings of the 17th annual ACM symposium on User interface software and technology. ACM, 2004.

Kayur Patel, Naomi Bancroft, Steven M. Drucker, James Fogarty, Andrew J. Ko, James A. Landay: Gestalt: integrated support for implementation and analysis in machine learning. UIST 2010: 37-46

Kietz et al. (2012). Designing KDD-Workflows via HTN-Planning, 1–2. doi:10.3233/978-1-61499-098-7-1011

Search and Rescue and Probability Theory

A man and a dog together belaying down a rock face

Canine Search and Rescue (photo from AMRG website)

I spent a fascinating evening with the Allegheny Mountain Rescue Group today. This is a well run organization that provides free help for search and rescue efforts in the Pittsburgh area and beyond. I was in attendance because my kids and I were looking for a way to give Gryffin (our new puppy) a job in life beyond “pet” and we love to work in the outdoors. Canine search and rescue sounded like a fascinating way to do this and we wanted to learn more. During the meeting, I discovered a team of well-organized, highly trained, passionate and committed individuals that has a healthy influx of new people interested in taking part and a strong core of experienced people who help to run things. The discussions of recent rescues were at times heart rending, and very inspiring.

Later in the evening during a rope training session I started asking questions and soon learned much more about how a search operates. I discovered that about a third of searches end in mystery. Of those for which the outcome is known, there is about an even split between finding people who are injured, fine, or have died. Searches often involve multiple organizations simultaneously, and it is actually preferable to create teams that mix people from different search organizations rather than having a team that always works together. Some searches may involve volunteers as well. A large search may have as many as 500 volunteers, and if the target of the search may still be alive, it may go day and night. Searches can last for days. And this is what led me to one of the most unexpected facts of the evening.

I asked: How do you know when a search is over? And the answer I got was that a combination of statistics and modeling is used to decide this in a fairly sophisticated fashion. A search is broken up into multiple segments, and a probability is associated with each segment (that the person who is lost is in a segment). When a segment is searched, the type of search (human only, canine, helicopter, etc.) and locations searched, along with a field report containing details that may not be available otherwise are used to update the probability that a person is in that segment (but was missed) or absent from that segment. Finally, these probabilities are combined using a spreadsheet(s?) to help support decision making about whether (and how) to proceed. According to the people I was speaking with, a lot of this work is done by hand because it is faster than entering data in and dealing with more sophisticated GIS systems (though typically a computer is available at the search’s base, which may be something like a trailer with a generator). GPS systems may be used as well to help searchers with location information and/or track dogs.
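The update step described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the textbook Bayesian search-theory update; the formulas the group actually uses were not described to me. The segment names, the prior probabilities, and the "probability of detection" (POD) value are all made up for the example.

```python
def update_after_unsuccessful_search(priors, searched, pod):
    """Bayesian update after searching one segment without finding the person.

    priors   -- dict mapping segment name to the probability the person is there
    searched -- name of the segment that was just searched
    pod      -- probability of detection: the chance the search would have found
                the person if they were in that segment (hypothetical value)
    """
    # Overall probability that this search comes up empty.
    p_not_found = 1.0 - priors[searched] * pod
    posteriors = {}
    for segment, p in priors.items():
        if segment == searched:
            # The person is in this segment but was missed.
            posteriors[segment] = p * (1.0 - pod) / p_not_found
        else:
            # Every other segment becomes relatively more likely.
            posteriors[segment] = p / p_not_found
    return posteriors


# Hypothetical priors; "elsewhere" stands for everything outside the search area.
priors = {"ridge": 0.5, "creek": 0.3, "elsewhere": 0.2}

# A canine team searches the ridge with an assumed 80% probability of detection.
posteriors = update_after_unsuccessful_search(priors, "ridge", pod=0.8)
for segment, p in posteriors.items():
    print(f"{segment}: {p:.2f}")
```

With these made-up numbers, an unsuccessful search of the ridge drops it from 50% to roughly 17%, while the creek rises to 50%. A different search type (human only, helicopter, etc.) would simply plug in a different POD.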

Some of the challenges mentioned are the presence of conflicting information, the variability in how reliable different human searchers are, the fact that terrain may not be flat or easily represented in two dimensions, the speed of computer modeling, the difficulty of producing exact estimates of how different searchers affect the probability of finding someone and the variable skill levels of searchers (and the need to organize large numbers of searchers, at times partly untrained). When I raised the possibility of finding technology donations such as more GPS systems, I was also told that it is critical that any technology, especially technology intended for use in the field, be ultra simple to use (there is no time to mess with it), and consistent (i.e. searchers can all be trained once on the same thing).

Although this blog post summarizes what was really just a brief (maybe hour long) conversation with two people, the conversation had me thinking about research opportunities. The need for good human centered design is clear here, as is the value of being able to provide technology that can support probabilistic analysis and decision making. Although it sounds like they are not in use currently, predictive models could be applicable, and apparently a fair amount of data is gathered about each search (and searches are relatively frequent). Certainly visualization opportunities exist as well. Indeed, a recent VAST publication (Malik et al., 2011) looked specifically at visual analytics and its role in maritime resource allocation (across multiple search and rescue operations).

But the thing that especially caught my attention is the need to handle uncertain information in the face of both ignorance and conflict. I have been reading recently about Dempster-Shafer theory, which is useful when fusing multiple sources of data that may not be easily modeled with standard probabilities. Dempster-Shafer theory assigns a probability mass to each piece of evidence, and is able to explicitly model ignorance. It is best interpreted as producing information about the provability of a hypothesis, which means that at times it may produce a high certainty for something that is unlikely (but more provable than the alternatives). For example, suppose two people disagree about something (which disease someone has, for instance), but share a small point of agreement (perhaps both people have a low-confidence hypothesis that the problem is a brain tumor) that is highly improbable from the perspective of both individuals (one of whom suspects migraines, the other a concussion). That small overlap will be the most certain outcome of combining their belief models in Dempster-Shafer theory, so a brain tumor, although both doctors agree it is unlikely, would be considered most probable by the theory.
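To make the two-doctor example concrete, here is a small sketch of Dempster's rule of combination in Python. The mass values (99% and 1%) are invented for illustration, and the `combine` helper is my own minimal version rather than code from any particular library.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        overlap = a & b
        if overlap:  # empty intersections count as conflict and are discarded
            combined[overlap] = combined.get(overlap, 0.0) + mass_a * mass_b
    agreement = sum(combined.values())
    if agreement == 0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the surviving masses sum to 1.
    return {hypothesis: mass / agreement for hypothesis, mass in combined.items()}


# Made-up masses: each doctor is 99% committed to a different diagnosis
# and assigns only 1% to a brain tumor.
doctor_a = {frozenset({"migraine"}): 0.99, frozenset({"tumor"}): 0.01}
doctor_b = {frozenset({"concussion"}): 0.99, frozenset({"tumor"}): 0.01}

result = combine(doctor_a, doctor_b)
for hypothesis, mass in result.items():
    print(sorted(hypothesis), mass)
```

Because the only non-empty intersection is the tumor, the renormalization step hands it all of the combined mass, even though each doctor gave it only 1%. Nearly all of the evidence was conflict, and the rule throws conflict away.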

One obvious next step would be to do some qualitative research and get a better picture of what really goes on in a search and rescue operation. Another possible step would be to collect a data set from one or more willing organizations (assuming the data exists) and explore algorithms that could aid decision making or make predictions using the existing data. Or then again, one could start by collecting GPS devices (I am sure some of you out there must have some sitting in a box that you could donate) and explore whether there are enough similar ones (Android + Google Maps?) to meet the constraint of easy use and trainability. I don’t know yet whether I will pick up this challenge, but I certainly hope I have the opportunity to. It is a fascinating, meaningful, and technically interesting opportunity.

Malik, A., Maciejewski, R., Maule, B., & Ebert, D. S. (2011). A visual analytics process for maritime resource allocation and risk assessment. In the 2011 IEEE Conference on Visual Analytics Science and Technology (VAST), pp. 221-230.

Why study the future?

I have asked myself that question numerous times over the last several years. Why years? Because the paper that I will be presenting at CHI 2013 (Looking Past Yesterday’s Tomorrow: Using Futures Studies Methods to Extend the Research Horizon) is the 5th iteration of an idea that began at CHI 2009, was submitted in its initial form to CHI 2011 and 2012, then DIS 2012, Ubicomp 2012, and finally CHI 2013 (and, I think, winner for most iterations of one paper I’ve ever submitted). Each submission sparked long and informative reviews and led to major revisions (in one case even a new study), excepting the last one (which was accepted).

I am telling this story for two reasons. First, I want to explore what drove me to lead this effort despite the difficulty of succeeding. Second, I want to explore what I learned from the process that might help others publishing difficult papers.

Things we wish we’d known when planning a trip to Zürich

Welcome to Zürich ☺. If you are moving here there are a few things you might want to know, in no particular order :). This is especially geared towards folks living in ETH university housing.

Paperwork

(most of these things take around 30 mins – 60 mins once you find the right place)

  • To register, you will need to go to Kreisbüro 6 first, then Berninaplatz (a stop on the 10 tram; the Kreisbüro will give you an appointment there)
  • To leave, you will need to go to the Migrationsamt (in the city hall, next to the Fraumünster church) and register to leave
  • To get a half-pass (half off all tram and train travel) you can go to Bellevue (on the 9) and go into the office in the building that’s right at the center of the stop OR go to the main train station and go into the “travel agency” (take a ticket and be prepared to wait a bit)
  • To get a monthly pass (free travel all month long) with your half pass you can go to the train station or use any of the newer (“fancy”) electronic machines (like the one at the Winkelriedstr. tram stop). The “fancy” machines have an English button. There is no “fancy” machine at the airport, so don’t expect to renew your monthly pass at the end of a trip back and forth to Zürich

Money & Phones

  • I would get a bank account at PostFinance (in the post office; if you walk from the Winkelriedstr. tram stop to the Rigiblick tram stop, you’ll see it on the left).
  • When you get bills, you typically get a “pink slip” – bring it with some cash to the post office, and you can pay it there.
  • The post office is closed over lunch
  • ETH pay can be picked up between 11 and 2 in the back right corner of the 2nd floor (I think) of the main ETH building
  • Sunrise pre-pay is the simplest mobile phone plan. You can “top it up” at any Co-op grocery store (just ask for, say, “50 CHF on my Sunrise”)
  • Sunrise pre-pay charges you 1 CHF on each day that you make a call, text or use internet (up to 3 CHF per day total). You can get an add-on plan for unlimited internet if you use it a lot, for about 10 CHF a month.

Shopping

We’re not big shoppers, so this is just the basics.

  • H&M has reasonable clothing. The big Co-op and Migros stores have inexpensive clothing options too. There are also lots of sales in the “mall” under the main train station
  • There are a number of farmer’s markets worth checking out.

Things to check out that you might otherwise miss

  • Feminist Zürich: The labyrinth and Feminist Tours of Zürich
  • The rooftop swimming pool & spa (“Thermalbad Zürich”)
  • Dolder ice rink 
  • Swimming in the clear cool clean lake of Zürich (‘nough said)
  • Tour the archeological ruins of Zürich (register at City Hall to get a “key to the city” and a map). Takes time to get the key, so this is really only for folks living here.
  • Lots of wonderful places to walk in the Zürichberg (look for the life-sized elephant fountain in the woods) and the Üetliberg. Enjoy them.
  • There’s lots of festivals in Zürich and Switzerland worth checking out. Basel Fasnacht in the spring, independence day parade in mid August, etc. etc. Google to find them. Don’t necessarily confine yourself to Switzerland – for example Austria has numerous “balls” in dance season (winter).

English speakers

  • The expat forums are a great place to find advice about all sorts of stuff
  • There’s some great meetup groups for childless expats – they do all sorts of sporty stuff in the mountains, if you’re into that. They tend to keep separate from the Swiss
  • If you prefer to mix with the locals, try a yoga class, join an orchestra, etc. Downside is you have to speak some German and it helps if you’re working on understanding Swiss German.
  • The ETH has a tandem-partner program. You can sign up to practice German and offer to help someone with English. We had great experiences with it. They also offer German classes (1x week)
  • If you have kids, the public school has an amazing program for helping them to learn German before shifting them to “regular” school. The teachers are wonderful and for my kids at least, the class worked wonders. Just register with the school system.
  • Be prepared for younger (even 1-3 grade) kids being done with school at noon two days a week or more, and having no school from 12-2. Don’t worry though, Hort will feed them a warm meal and let them play/do crafts during lunch, and as late as you need on weekdays.

Doctors

There’s an English speaking doctor’s office that has long hours at the main train station. There’s also a 24 hour pharmacy there. You should receive accident insurance through ETH, and you know better than I where you get your health insurance.

Garbage

  •  For ETH folks, you just buy regular garbage bags. For everyone else, there’s special taxed bags
  • Recycling: plastic goes inside the co-op in their wall collection unit. For metal and glass, you can find bins around the city. 2-4 times a year (unsure how often) you will find a garbage bag in your mailbox for clothing and shoes. Anything else of quality, if you put it outside, someone is likely to take it.

Other

  • The climate makes gardening easy. The abundance of green space also makes foxes quite common. As a result, you can’t eat greens raw: the foxes can leave a parasite on plants that is deadly in the rare case you catch it.
  • There is a community farm that you can help out at near the botanical gardens, if you want more than that. I’m sure there’s other options if you want an actual garden bed, but a year is short.
  • We were able to get permission to garden in the non-grassy areas of our yard.
  • We went to a Tot Shabbat service at a local liberal temple, the Jüdische Liberale Gemeinde. It’s a bit out of the way in what looks like an apartment building, but the people we met were wonderful and very welcoming. Be prepared for Swiss German though :).

Have fun!
