How will we approach evaluating the CHIRON toolkit? Lessons from other disciplines help guide us toward meaningful assessment

By Shaun Kalweit and Megan Doerr

Published on Sep 05, 2024. DOI 10.21428/4f83582b.7f7da98a



The goal of Community Health Interests for Researchers & Oversight Networks (CHIRON) is to develop and pilot a toolkit that will encourage researchers, ethics boards, and data access committees to consider group interests as they plan, execute, and report on big health data research. As we reported in August, the toolkit has been developed and refined through collaboration with community members and academics. The next stage for the CHIRON toolkit is evaluation by its intended users: biorepository researchers and research oversight committees. To this end, teams of academic and community experts will conduct pilot studies, each lasting three to twelve months, at various partner organizations.

What Are We Trying to Assess?

The primary goals of the pilot studies are to determine whether the CHIRON toolkit has its intended impact (encouraging researchers and oversight committees to consider group interests in big health data research) and to identify areas for toolkit improvement. Specific research objectives include understanding how the toolkit fits into and affects the work of pilot participants (researchers and oversight networks), assessing elements of the user experience, and identifying barriers to adoption.

The Challenges of Toolkit Assessment

Toolkits like CHIRON present unique challenges for researchers who seek to evaluate them: they are by definition composed of various parts intended for use within one’s existing environment, and their manner of use is therefore deeply reliant on both the contexts in which they are deployed and users’ specific interaction choices within those environments. Because of this, toolkits must be evaluated using contextual methods that allow use of the toolkit to unfold realistically, as opposed to relying solely on usability testing in a controlled setting. This is why the primary method we will use in the CHIRON pilots is the diary study, an ethnographic research method well suited to toolkit evaluation.

Long-term adoption is another challenge. Implementing a toolkit often requires considerable time and energy on the part of the user, so designing for ease of use alone is insufficient – users and their organizations must possess a certain degree of interest in incorporating the toolkit into their work. Therefore, gauging user motivation and identifying factors that may hinder adoption are key objectives of the CHIRON pilots.

How Do We Assess Motivation?

Research on domestic robots, another contextually embedded technology, provides insight into how people integrate technology into their lives. One such study notes that we cannot truly determine whether a technology is a success without assessing whether users continue to use it after the initial “novelty effect” has worn off. This observation rings particularly true for toolkits, whose long-term adoption is a major area of concern. Given this, the CHIRON pilots will investigate use over time with the aim of understanding whether CHIRON is able to survive from initial adoption to long-term incorporation (and how we might iterate upon the toolkit to ensure that it does).

Toolkit scholarship also frequently stresses the importance of evaluating how users appropriate individual elements of a toolkit. Toolkits offer exceptional freedom of use due to their modular nature, and the ways that users actually use these “tools” often diverge from designers’ expectations.

Understanding and designing to support personalized modes of use is critical; users prefer toolkits that they can easily apply to their own unique situations, and moreover, the aforementioned robot scholarship notes that the ability to modify and personalize use of a technology is a precursor to long-term adoption. The CHIRON pilots will seek to understand how users use the toolkit, including how they appropriate elements of the toolkit in unexpected ways.

Let’s Not Forget Usability

While this discussion has centered largely on the need for contextual research, usability testing remains important. As toolkit scholars in the field of Human-Computer Interaction explain, different research methods present a “trade-off between realism, precision and control,” and some inquiries require the control of structured usability testing.

During usability testing, a researcher asks a participant to perform a standard set of tasks using a particular interface while the researcher observes. While this cannot assess how a toolkit will perform in the real world, the usability of a toolkit’s interface components must still be evaluated. If the interface is difficult to use, the toolkit itself will be difficult to use, regardless of context—and hard-to-use toolkits get abandoned. Usability testing will complement our contextual methods and enable targeted evaluation of the CHIRON toolkit’s various elements.

Methods for the Pilots

Given all these considerations, we will use a variety of methods in the CHIRON pilots. Not all methods will be used in every study, and the choice of methods will depend on the specific context of each pilot.

| Method | Purpose |
| --- | --- |
| Diary Studies | Collect contextual data over time through periodic “diary entries.” |
| Semi-Structured Interviews | Gather rich data to explore nuance through conversation. |
| Moderated Usability Testing | Identify pain points in the toolkit's interface materials. |
| Meeting Observation | Collect supplementary contextual data via meeting observation. |
| Document Analysis | Review CHIRON form submissions for usage and usability data. |

Stay Tuned!

The first pilot studies of CHIRON will commence this fall, with studies continuing on a rolling basis through 2025. We look forward to reporting back with updates on our progress and insights!

