
The difference between validity and reliability, and why it matters

Craig Solid, PhD

Healthcare Value Assessment, ROI

Published: August 15, 2018

When discussing the merits of a particular healthcare quality measure, the concepts of “validity” and “reliability” are often used interchangeably (if used at all). This is understandable: they are technical terms (i.e., jargon) and they both refer to how “good” a measure is, so it’s no surprise that the details of what falls under each category can get a little muddy.

However, understanding the differences between these terms and what they truly represent is critical not just for those involved in measure development, but for anyone who uses a quality measure or wants to interpret its results. Just like other statistical concepts, the proper use of a metric requires an understanding of what that metric does - and doesn't - represent. For example, when someone on a hospital staff starts talking about the average length of stay or time to thrombolytics, you probably know that averages can be influenced by outlying data points, and therefore you know to ask some more probing questions about sample size and the spread of the data from which those averages are calculated. The same is true (or should be true) for validity and reliability.
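To make the point about averages concrete, here is a minimal sketch in Python. The length-of-stay values are invented purely for illustration; they are not real hospital data.

```python
from statistics import mean, median

# Hypothetical lengths of stay (in days) for ten patients.
# Nine stays are short; one patient had a 44-day stay.
length_of_stay_days = [2, 3, 3, 4, 4, 4, 5, 5, 6, 44]

average_stay = mean(length_of_stay_days)    # 8 days, pulled up by the single long stay
typical_stay = median(length_of_stay_days)  # 4 days, unaffected by the outlier

# Dropping the one outlier brings the average back in line with the median.
average_without_outlier = mean(length_of_stay_days[:-1])  # 4 days

print(average_stay, typical_stay, average_without_outlier)
```

A single patient doubles the average here, which is exactly why the sample size and spread behind a reported average are worth asking about.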

As it relates to quality measures, validity reflects how accurately a measure represents whatever it is trying to capture. Healthcare "quality" is not often directly measurable, so you're typically forced to measure surrogates you hope reflect the true underlying quality. For example, measuring the time from "door-to-balloon" for an AMI patient is intended to reflect the level of quality of AMI care. That metric is valid as a measure of quality of AMI care as long as shorter times can be considered to consistently represent better quality.

We can demonstrate validity by correlating the measure with other, similar measures (convergent validity), by showing that it discriminates between entities we know to be different by some other metric (known-groups validity), or by having content experts weigh in on its merits (face validity). Threats to validity can be either conceptual or practical in nature. That is, conceptually there may be reasons why “door-to-balloon” time is or is not a good surrogate for AMI care quality; perhaps there are legitimate reasons to delay angioplasty not reflected in the measure, for example. On a practical level we need to be sure that we accurately and consistently acquire the data used to calculate it. Are the timestamps in the electronic health record accurate? Can we be sure certain data fields are always populated?
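As a rough illustration of a convergent validity check, the sketch below correlates a measure with a second, related measure across a handful of hospitals. The hospital values, the "composite AMI process score," and the function name `pearson_r` are all assumptions made up for this example; real testing would use far more entities and a formal analysis plan.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical data for six hospitals (illustration only).
median_door_to_balloon_min = [58, 62, 71, 75, 84, 90]
composite_ami_process_score = [96, 94, 91, 92, 85, 83]  # higher is better

r = pearson_r(median_door_to_balloon_min, composite_ami_process_score)
print(f"Correlation between the two measures: {r:.2f}")
```

If shorter door-to-balloon times really do reflect better AMI care, we would expect a clearly negative correlation with a measure where higher scores are better. A weak or positive correlation would be a reason to question what the measure is capturing.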

Reliability, on the other hand, relates to the ability of the measure to correctly discriminate differing levels of quality (or changes in quality) between entities. That is, if two hospitals or physicians actually differ in their care quality, how likely is it that the measure will detect that difference?

To establish the reliability of a measure, we test things like whether data pulled by multiple abstractors from the same chart agree (inter-rater reliability) or whether observed variability is due mostly to differences in actual performance rather than random fluctuation (signal-to-noise analysis). Challenges to reliability occur when too much of the variation in measure performance is due to reasons other than differences in the underlying quality of care delivery. Small sample sizes, unintended biases, and uncontrolled factors may make it difficult to reliably and consistently differentiate entities that truly differ in the quality of care they deliver.
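Both kinds of reliability testing can be sketched in a few lines of Python. This is a simplified illustration, not a prescribed method: it uses Cohen's kappa as one common statistic for inter-rater agreement, and a basic signal-to-noise ratio for a pass/fail measure in which the between-hospital variance is simply assumed rather than estimated from data. The abstractor codes, the variance, and the function names are all hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Agreement between two abstractors, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected if each abstractor coded at random at their own base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


def signal_to_noise(between_var, rate, n_patients):
    """Share of observed variation attributable to true differences (0 to 1)."""
    # Sampling noise for a proportion shrinks as the number of patients grows.
    noise_var = rate * (1 - rate) / n_patients
    return between_var / (between_var + noise_var)


# Two hypothetical abstractors coding the same ten charts (Y = measure met).
abstractor_1 = ["Y", "Y", "N", "Y", "N", "N", "Y", "Y", "N", "Y"]
abstractor_2 = ["Y", "Y", "N", "Y", "N", "Y", "Y", "Y", "N", "N"]
kappa = cohen_kappa(abstractor_1, abstractor_2)  # about 0.58

# Assumed true between-hospital variance: 0.0025 (a standard deviation of 5 points).
small_hospital = signal_to_noise(0.0025, rate=0.80, n_patients=25)   # about 0.28
large_hospital = signal_to_noise(0.0025, rate=0.80, n_patients=400)  # about 0.86

print(round(kappa, 2), round(small_hospital, 2), round(large_hospital, 2))
```

Note how the same measure, with the same underlying rate, is far less reliable for the hospital with 25 patients than for the one with 400. That is the small-sample problem in numeric form.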

Why does it matter? Because, while related, the concepts of validity and reliability are NOT the same, and a good measure is one that is BOTH valid AND reliable. Having one without the other is not sufficient. When you're developing a measure, it is important to consider how it will be used, where the data will come from, whether the inclusion and exclusion criteria will systematically exclude a certain group of people, etc. Those who end up using the measure will need to have confidence that it reflects true quality (validity) and that when they improve the quality of care they deliver it will be reflected by an improvement in measure performance (reliability). If they don’t, the measure becomes a burden rather than something that can be used to motivate and demonstrate quality improvement.

Once developed and used, in order to properly interpret the relative performance of measured entities one should consider aspects of both validity and reliability. That is, when looking at the relative performance of providers or facilities on a certain measure, try to think about what the potential threats to validity and reliability might be. Does the measure utilize data that seems like it would be difficult to consistently obtain? Are there situations where there might be legitimate reasons why someone would perform poorly on this measure other than their underlying quality? Do the inclusion or exclusion criteria allow for someone to potentially “game” the measure or “cherry pick” the best patients? In many cases, it’s likely that during development there was empirical testing performed to demonstrate adequate validity and reliability, but it’s difficult (if not impossible) to account for all possible scenarios and situations that could occur, so you need to be vigilant when interpreting measures.

When you read or hear about critiques and challenges to measures, it can be helpful to think about whether those critiques relate to the validity or reliability (or both) of the measure. For example, when someone challenges the use of patient outcome measures like 30-day readmission or mortality, are they challenging the validity of the measure itself (e.g., "too much can happen to a patient after discharge that's out of a hospital's control, and therefore the measure doesn't reflect the quality of care it provides")? Or are they unconvinced of its reliability (e.g., "Even with risk-adjustment, comparisons between these facilities are not appropriate or fair")? Viewing the critiques in this light allows you to understand the nature of the criticisms and can help you evaluate whether they have merit and what (if anything) should be done about them. A challenge to the appropriateness of a measure as a surrogate for underlying quality is a vastly different issue than whether the available data are adequate to make appropriate comparisons. A valid measure isn't necessarily a reliable one, and vice versa.

The Future of Validity and Reliability of Quality Measures

Going forward, the ever-expanding amount and availability of data will allow for more empirical testing of quality measures than ever before. Additionally, greater exploration and understanding of factors that influence population health (for example, the role of social determinants of health) will make it possible to identify and specify more appropriate surrogates for care quality and to risk-adjust them more fully when making comparisons. However, incorporating new knowledge into measures takes time, both in the conception of the measures themselves and in the collection and analysis of data related to that new knowledge. Validity and reliability will always be important considerations, and will always be at the heart of important discussions regarding the appropriateness and fairness of quality measures. A full and complete understanding of these concepts is essential for anyone hoping to develop, test, use, or interpret quality measures.

Mike Sacca

Independent consultant driving aligned and targeted growth.

6 years ago

Craig, thanks for sharing and stressing both the importance and difference between validity and reliability of a measure.
