Our previous post discussed the foundational phase of measure development – particularly, what the Sinclair Compassion Questionnaire (SCQ) is and why we developed it. In short, the reason for developing the SCQ was to address a gap in research and patient care – the lack of a valid and reliable patient-reported compassion measure to advance compassion science, and the lack of a clinical tool to routinely measure and improve patients’ experiences of compassion. Bearing our goal for measure development in mind, our efforts were further informed through literature searches, our empirical models of compassion, and the perspectives of both patients and healthcare providers (HCPs). The topic of this article is strongly tethered to the foundational phase, extending the “what” and “why” to how – how best to measure compassion.
“Measure what is measurable, and make measurable what is not so.” – Galileo Galilei
Compassion is a rather subjective phenomenon that varies between individuals and circumstances, and it is not readily measurable in the same way as objective quantities such as temperature, height, or blood pressure. Thus, we were challenged with the latter half of Galileo’s statement – making measurable what is not so obviously measurable – in a way that honoured both the art and science of compassion. We were further challenged by the fact that there was no manual or recipe book for designing a compassion measure; we needed to be creative, clinically relevant, and scientifically rigorous. We therefore began by using a Table of Specifications (TOS) to guide our item (i.e., question) generation efforts [1]. A TOS is a tool used in developing student assessments and exams to ensure that course content and learning objectives are comprehensively and accurately mapped to specific exam questions, allowing learners to demonstrate their competencies. While an under-utilized resource in measure development, a TOS helps ensure congruence between the overall goal of the proposed measure (assessing patients’ experiences of compassion) and the individual items or questions within it – so that each question in a compassion measure taps into the overarching construct of interest and each of the domains within it.
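At its simplest, a TOS is a coverage table: every candidate item is tagged with the part of the model it targets, and the table reveals over- and under-represented domains. The sketch below illustrates this idea in code; the domain labels and item placeholders are hypothetical, not the actual SCQ or Patient Compassion Model content.

```python
# Minimal sketch of a Table of Specifications used as a coverage check.
# Domain labels and item texts are hypothetical placeholders.
from collections import Counter

domains = ["Domain A", "Domain B", "Domain C"]

# Each candidate item is tagged with the model domain it is meant to assess.
candidate_items = [
    ("Item 1 ...", "Domain A"),
    ("Item 2 ...", "Domain A"),
    ("Item 3 ...", "Domain B"),
    ("Item 4 ...", "Domain C"),
]

def coverage(items, domains):
    """Return per-domain item counts and any domains left uncovered."""
    counts = Counter(domain for _, domain in items)
    missing = [d for d in domains if counts[d] == 0]
    return counts, missing

counts, missing = coverage(candidate_items, domains)
```

Run over a full item pool, such a check makes it immediately visible when one domain has attracted most of the questions while another has none – the exam-writing failure described next.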
Students are likely all too familiar with the experience of spending endless hours preparing for an exam, only to discover that a large portion of the exam questions focus on a single topic, lecture slide, or page of the course textbook – with the rest of the course objectives, learnings, and materials seemingly cast by the wayside. An exam with too narrow a focus does not allow students to demonstrate their competencies across the entire spectrum of class content. In the same way, it was imperative that the questions (or “items”) within the SCQ comprehensively covered all of the domains and themes that constitute compassion, as defined in our Patient Compassion Model [2].
Another indicator of a poorly constructed exam or measure pertains to difficulty – whether the questions can delineate between novice learners and those who possess more advanced knowledge and skills. In relation to developing a compassion measure, difficulty meant including a range of questions assessing aspects of compassion that might be easier or more difficult for HCPs to exhibit. For example, many HCPs would likely score highly on an item such as ‘My Healthcare Providers were supportive when they talked to me’, by virtue of the fact that providing support is part of their job. However, a question like ‘My Healthcare Providers were very supportive when they talked to me’ sets a higher bar, with the ability to separate the good from the great. A measure containing only basic items would be unable to distinguish compassion from standard care, rendering it useless as both a research instrument and a clinical tool. The “specifications” of a TOS provide the “specificity” needed to ensure the measure is comprehensive and sensitive enough to address these foundational issues of difficulty and content coverage, among other factors. In turn, measure developers are provided a framework to guide item development, ensuring that their questions robustly and rigorously assess the topic of interest while mitigating floor and ceiling effects (i.e., increasing response variability).
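Floor and ceiling effects are straightforward to detect once pilot data exist: an item shows a ceiling effect when most respondents pile up at the top response category, and a floor effect at the bottom. The following sketch, with made-up response data and an assumed 1–5 agreement scale and 80% flagging threshold (both illustrative choices, not SCQ specifications), shows the idea.

```python
# Illustrative floor/ceiling check for an item scored on a 1-5 agreement scale.
def floor_ceiling(responses, low=1, high=5, threshold=0.8):
    """Flag an item when most respondents cluster at either end of the scale."""
    n = len(responses)
    floor_rate = sum(r == low for r in responses) / n
    ceiling_rate = sum(r == high for r in responses) / n
    if ceiling_rate >= threshold:
        return "ceiling"
    if floor_rate >= threshold:
        return "floor"
    return "ok"

# Hypothetical pilot responses: an easy "basic care" item vs. a harder item.
easy_item = [5, 5, 5, 4, 5, 5, 5, 5, 5, 5]    # nearly everyone agrees strongly
harder_item = [3, 4, 5, 2, 4, 3, 5, 4, 3, 4]  # responses spread across the scale
```

An item flagged as "ceiling" behaves like the basic ‘supportive’ example above: it cannot separate the good from the great, because almost everyone endorses it fully.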
Another central principle throughout this second study phase and the overall development of the SCQ was fidelity. In our context, fidelity can be understood as “the extent to which delivery of an intervention adheres to the protocol or program model originally developed” [3]. This entailed developing questions that were integral to the domains of the Patient Compassion Model and constructing them in language that both resonated with and was accessible to patients. To accomplish this, we were careful to construct questions using the words of actual patients from our previous qualitative studies and to have each question approved by our patient advisory group.
In addition to the content, difficulty, and fidelity of a measure, there are many other choices to be made when developing a measure, including the response scale, question stems, and the time frame being assessed. First, much like writing a recipe and determining the units of measurement – grams, ounces, cups, or millilitres – we had to establish the “units” of compassion. Should we use a response scale that measures frequency (e.g. the number of times a nurse displayed a compassionate behaviour), satisfaction (patients’ satisfaction with aspects of compassion), or agreement (whether patients felt a certain aspect of compassion)? Next, we needed to decide what period of time patients should be asked to reflect upon in reporting their experiences of compassion. Today? The last 24 hours? The past week or past month? And what level of granularity within the Patient Compassion Model should the measure questions reflect (each individual code, sub-theme, theme, or domain)?

While initially this might seem like mere navel gazing, extraneous questioning, or theoretical musing, these issues require thoughtful and purposeful consideration prior to developing the measure – lest a half-baked, unappetizing, lukewarm final product emerge from the measurement oven. Fortunately, our past foundational research developing the Patient Compassion Model, in addition to providing the key ingredients, also provided cooking instructions that answered many of these questions. While on the surface it may seem trivial and arbitrary to consider whether a compassion scale measures patient experience over the past 24 hours, past week, or past month – as they all measure compassion – identifying the optimal time frame to adequately assess experiences of compassion while avoiding recall bias is crucial. It is like finding the ideal cooking time for a pie: too little and you are eating a soup of dough and raw ingredients, too long and it is burnt and unrecognizable, but just right…perfection!
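One way to keep these design decisions explicit and consistent across every question is to record them alongside each item. The sketch below captures the choices just discussed – scale type, response anchors, and recall period – as an item specification; all field names and default values are hypothetical illustrations, not the actual SCQ format.

```python
# Hypothetical item specification recording the design choices discussed:
# response scale type, response anchors, and recall period.
from dataclasses import dataclass, field

# Assumed 5-point agreement anchors, for illustration only.
AGREEMENT_ANCHORS = ["Strongly disagree", "Disagree", "Neutral",
                     "Agree", "Strongly agree"]

@dataclass
class ItemSpec:
    text: str
    domain: str                             # domain of the compassion model
    scale_type: str = "agreement"           # vs. "frequency" or "satisfaction"
    anchors: list = field(default_factory=lambda: list(AGREEMENT_ANCHORS))
    recall_period: str = "current stay"     # vs. "past 24 hours", "past week"

item = ItemSpec(text="Item text ...", domain="Domain A")
```

Recording the recall period and scale type per item makes it easy to verify, before any data are collected, that every question in the pool uses the same “units” and time frame.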
A Table of Specifications addresses these fundamental development issues in the initial stages of measure development – serving as the blueprint to which measure developers can return as needed, as they traverse the somewhat unpredictable and unruly landscape of measure development, to ensure that the measure questions, recall period, and response scale accurately and comprehensively assess the construct of interest – bon appétit!
Photo by Brandon Cormier on Unsplash
[1] Sinclair, S., Jaggi, P., Hack, T. F., McClement, S. E., & Cuthbertson, L. (2020). A Practical Guide for Item Generation in Measure Development: Insights From the Development of a Patient-Reported Experience Measure of Compassion. Journal of Nursing Measurement, 1, 138–156. https://doi.org/10.1891/JNM-D-19-00020
[2] Sinclair, S., McClement, S., Raffin-Bouchal, S., Hack, T. F., Hagen, N. A., McConnell, S., & Chochinov, H. M. (2016). Compassion in Health Care: An Empirical Model. Journal of Pain and Symptom Management, 51(2), 193–203. https://doi.org/10.1016/j.jpainsymman.2015.10.009
[3] Mowbray, C. T., Holter, M. C., Teague, G. B., & Bybee, D. (2003). Fidelity Criteria: Development, Measurement, and Validation. American Journal of Evaluation, 24(3), 315–340. https://doi.org/10.1177/109821400302400303