NICE’s severity modifier: a step in the right direction, but still a long way to go

Article by: Edward Oliver & David Mott

This blog post is the fifth in a series on the new National Institute for Health and Care Excellence (NICE) health technology evaluation manual. Each post provides a critical discussion of a particular topic, including the expected implications of the changes (or lack thereof) in the manual, what is still missing, and what further research is needed.

It is broadly accepted that the severity of a disease is an important consideration in health technology assessment (HTA). In many countries, HTA agencies use disease severity as a decision-making ‘modifier’ (Zhang and Garau, 2020). A modifier gives decision makers flexibility when the medicine being appraised treats a severe disease; in some cases this means that a higher cost-effectiveness threshold (or a quality-adjusted life year [QALY] weighting) is formally applied.

NICE first introduced what was essentially a ‘severity modifier’, the ‘end-of-life (EOL) criteria’, in 2009. The EOL criteria were criticised for being too narrow and for not considering improvements in quality of life. Furthermore, UK-based societal preference studies typically did not find evidence in support of an end-of-life premium (Shah et al., 2018).

NICE’s new health technology evaluation manual replaces the EOL criteria with a new, broader severity modifier. This blog outlines the new modifier and discusses the need for further related research.

What is NICE’s new severity modifier?

NICE’s new severity modifier considers two different – but related – measures of disease severity: absolute QALY shortfall (AS) and proportional QALY shortfall (PS).

AS represents the number of future QALYs that are lost by people living with the disease. Younger individuals have, on average, more potential future QALYs to lose, so severe chronic diseases that affect younger patient populations tend to have higher AS scores than severe acute diseases that affect older populations.

The opposite is true for PS. PS represents the proportion of future QALYs that are lost by people living with the disease. Elderly patient populations that are nearer to the end of their lives have relatively fewer potential QALYs left on average and therefore are more likely to lose a higher proportion of their remaining QALYs as a result of a severe disease, leading to higher PS scores on average.
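The contrast between the two measures can be made concrete with a minimal sketch. The function names and the QALY figures below are hypothetical and chosen only for illustration; they are not taken from NICE's manual or from any appraisal.

```python
def absolute_shortfall(qalys_without_disease, qalys_with_disease):
    """Future QALYs lost: expected QALYs without the disease minus those with it."""
    return qalys_without_disease - qalys_with_disease


def proportional_shortfall(qalys_without_disease, qalys_with_disease):
    """Share of expected future QALYs that is lost because of the disease."""
    if qalys_without_disease <= 0:
        raise ValueError("expected QALYs without the disease must be positive")
    lost = absolute_shortfall(qalys_without_disease, qalys_with_disease)
    return lost / qalys_without_disease


# Hypothetical younger population with a severe chronic disease.
print(absolute_shortfall(40.0, 25.0), proportional_shortfall(40.0, 25.0))  # 15.0 0.375

# Hypothetical older population with a severe acute disease.
print(absolute_shortfall(5.0, 0.5), proportional_shortfall(5.0, 0.5))  # 4.5 0.9
```

In this made-up example the younger population has the larger AS but the smaller PS, which is the pattern described in the two paragraphs above.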

Given the above, the two measures can be considered as complementary and, when used together, they provide a relatively broad definition of severity. Indeed, NICE’s new severity modifier considers both measures. If the AS or PS scores are high enough, a QALY weight is applied, effectively increasing the cost-effectiveness threshold. The table below contains the three categories for the QALY weights. If both scores indicate that a QALY weight should be applied, and the weights differ, the higher weight is applied.

| QALY weight | Proportional QALY shortfall | Absolute QALY shortfall |
|---|---|---|
| 1 | Less than 0.85 | Less than 12 |
| x1.2 | 0.85 to 0.95 | 12 to 18 |
| x1.7 | At least 0.95 | At least 18 |
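The categories in the table can be expressed as a small lookup. This is a sketch rather than an official implementation: the function names are hypothetical, and it assumes that a score exactly on a boundary falls into the higher category (so that 12 earns x1.2 and 18 earns x1.7), which is one reading of "12 to 18" and "At least 18".

```python
def weight_for_score(score, lower_cutoff, upper_cutoff):
    """Map one shortfall score to a QALY weight using two cut-offs."""
    if score >= upper_cutoff:
        return 1.7
    if score >= lower_cutoff:
        return 1.2
    return 1.0


def qaly_weight(absolute_shortfall, proportional_shortfall):
    """Apply the higher of the weights implied by the AS and PS scores."""
    weight_from_as = weight_for_score(absolute_shortfall, 12, 18)
    weight_from_ps = weight_for_score(proportional_shortfall, 0.85, 0.95)
    return max(weight_from_as, weight_from_ps)


print(qaly_weight(10, 0.50))  # 1.0: neither score reaches a cut-off
print(qaly_weight(15, 0.40))  # 1.2: AS is in the 12 to 18 band
print(qaly_weight(13, 0.96))  # 1.7: PS implies the higher weight, so it applies
```

Taking the maximum reflects the rule that when the two scores imply different weights, the higher weight is applied.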

NICE’s specific guidance on the calculation of AS and PS scores has two implications that are worth noting. Firstly, they recommend that the scores take into consideration the current standard of care, which effectively means that the severity modifier also accounts for aspects of ‘unmet need’ (Zhang et al., 2021). Secondly, they recommend that the scores are calculated using discounting. Therefore, technically the AS (PS) scores represent the number (proportion) of discounted future QALYs that are lost by people living with the disease and receiving the standard of care.
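To see why discounting matters, the sketch below computes AS and PS from yearly utility streams with and without discounting. Everything here is an assumption made for illustration: the function names, the utility profiles, the 3.5% annual rate, and the convention that the first year is not discounted.

```python
def discounted_qalys(annual_utilities, discount_rate=0.035):
    """Sum yearly utility values, dividing the value in year t by (1 + rate)**t."""
    return sum(u / (1 + discount_rate) ** t for t, u in enumerate(annual_utilities))


def shortfalls(without_disease, with_standard_of_care, discount_rate=0.035):
    """Return (AS, PS) for two utility streams at the given discount rate."""
    healthy = discounted_qalys(without_disease, discount_rate)
    treated = discounted_qalys(with_standard_of_care, discount_rate)
    absolute = healthy - treated
    return absolute, absolute / healthy


# Hypothetical profiles: 30 years at utility 0.85 without the disease,
# versus 10 years at utility 0.50 with the disease under standard of care.
without_disease = [0.85] * 30
with_standard_of_care = [0.50] * 10

undiscounted_as, undiscounted_ps = shortfalls(
    without_disease, with_standard_of_care, discount_rate=0.0
)
discounted_as, discounted_ps = shortfalls(without_disease, with_standard_of_care)

print(round(undiscounted_as, 2), round(undiscounted_ps, 3))
print(round(discounted_as, 2), round(discounted_ps, 3))
```

With these made-up inputs the undiscounted AS is 20.5 QALYs, while the discounted AS is considerably lower, because most of the QALYs that are lost fall in later years that discounting weights less heavily.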

How was the modifier designed and what future research is required?

NICE’s new severity modifier is arguably an improvement over the EOL criteria that it replaces because it defines severity more broadly. However, one of the main criticisms that the EOL criteria (fairly) received also applies here: there is little evidence indicating that the modifier aligns with societal preferences.

To come up with the categories in the earlier table, NICE looked at a selection of past appraisals, and estimated the average QALY weight that was applied previously via the EOL criteria. They then estimated AS and PS scores for the same set of appraisals and identified a combination of cut-offs that would have resulted in an equivalent average QALY weight. They referred to this as an “opportunity cost neutral” approach.
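The logic of an "opportunity cost neutral" calibration can be sketched as a search over candidate cut-offs for the combination whose average QALY weight is closest to a target. This is not NICE's actual procedure or data: the appraisal scores, the target average, the candidate cut-offs and all function names are invented, and the numbers were contrived so that the search has a unique answer.

```python
from itertools import product


def weight(score, lower, upper):
    """Return 1.7 at or above `upper`, 1.2 at or above `lower`, otherwise 1.0."""
    if score >= upper:
        return 1.7
    if score >= lower:
        return 1.2
    return 1.0


def average_weight(appraisals, as_cutoffs, ps_cutoffs):
    """Mean weight over (AS, PS) pairs, taking the higher of the two weights."""
    weights = [
        max(weight(a, *as_cutoffs), weight(p, *ps_cutoffs)) for a, p in appraisals
    ]
    return sum(weights) / len(weights)


def closest_cutoffs(appraisals, target, as_options, ps_options):
    """Pick the cut-off combination whose average weight is nearest the target."""
    return min(
        product(as_options, ps_options),
        key=lambda combo: abs(average_weight(appraisals, *combo) - target),
    )


# Invented (AS, PS) scores for five hypothetical past appraisals.
appraisals = [(11.0, 0.60), (13.0, 0.80), (9.0, 0.90), (19.0, 0.97), (3.0, 0.40)]
# Invented average weight assumed to have been applied under the EOL criteria.
target_average = 1.22
# Candidate (lower, upper) cut-offs to search over.
as_options = [(10, 16), (12, 18), (14, 20)]
ps_options = [(0.80, 0.90), (0.85, 0.95)]

best = closest_cutoffs(appraisals, target_average, as_options, ps_options)
print(best)
```

The point of the sketch is that this kind of calibration anchors the cut-offs to past decisions rather than to evidence on societal preferences.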

Whilst this approach may make sense as a way to arrive at a starting point, it is notable that the shortfall scores required to qualify for a QALY weight are higher than those applied elsewhere in the world. In the Netherlands, PS is used to define severity, and the highest severity category is for PS scores that exceed 0.70 (compared to ≥0.95 with NICE’s modifier). Furthermore, in Norway, AS is used to define severity, and the highest severity category is for undiscounted AS scores that exceed 20 (compared to discounted AS scores of ≥18). Therefore, NICE’s severity modifier may only be relevant in a small number of exceptional cases.

The appropriateness of these severity categories will be better understood when a societal preference study has been conducted, and thankfully NICE have acknowledged the need for such a study and for further refinement of the modifier. However, the scope of such a study is unclear. Should societal preferences be used solely to determine the appropriate cut-offs for the categories, or should they also inform the number of severity categories and the magnitude of the adjustments?

There are also numerous technical challenges associated with conducting a societal preference study in this context. AS and PS are related measures, and it may prove difficult to include both within the same preference elicitation exercise. There is also the issue of using discounting in QALY shortfall calculations. The preference elicitation exercise will most likely consider undiscounted QALY shortfall scores to enable respondents to understand the task – but how will this be translated into categories based on discounted scores? And how will it be determined that the general population actually agree with the value judgements implied by discounting QALY shortfall scores?

NICE’s severity modifier is a step in the right direction, but there is still a long way to go.

OHE look forward to contributing to further research and debate in this area in the near future. Our next masterclass will explore the consideration of severity in value assessment – we hope to see you there!

Citations

Shah, K. K., Tsuchiya, A., & Wailoo, A. J. (2018). Valuing health at the end of life: A review of stated preference studies in the social sciences literature. Social Science & Medicine, 204, 39–50.

Zhang, K., & Garau, M. (2020). International Cost-Effectiveness Thresholds and Modifiers for HTA Decision Making. OHE Consulting Report.

Zhang, K., Kumar, G., & Skedgel, C. (2021). Towards a New Understanding of Unmet Medical Need. Applied Health Economics and Health Policy, 19(6), 785–788.

