BACKGROUND: Several organizations have developed frameworks to systematically assess the value of new drugs.
OBJECTIVE: To evaluate the convergent validity and interrater reliability of 4 value frameworks to understand the extent to which these tools can facilitate value-based treatment decisions in oncology.
METHODS: Eight panelists used the American Society of Clinical Oncology (ASCO), European Society for Medical Oncology (ESMO), Institute for Clinical and Economic Review (ICER), and National Comprehensive Cancer Network (NCCN) frameworks to conduct value assessments of 15 drugs for advanced lung and breast cancers and castration-refractory prostate cancer. Panelists received instructions and published clinical data required to complete the assessments, assigning each drug a numeric or letter score. Kendall's Coefficient of Concordance for Ranks (Kendall's W) was used to measure convergent validity by cancer type among the 4 frameworks. Intraclass correlation coefficients (ICCs) were used to measure interrater reliability for each framework across cancers. Panelists were surveyed on their experiences.
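As an illustration of the concordance statistic named in the methods, the following is a minimal sketch of Kendall's W with the standard tie correction. It assumes each framework's scores have already been oriented so that a higher number means higher value (letter scores would first need a numeric mapping). The function names and the example scores are hypothetical and are not taken from the study.

```python
def average_ranks(scores):
    """Rank scores ascending (1 = lowest); tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(score_table):
    """Kendall's W for a table with one row per framework, one column per drug.

    Assumes at least one framework shows some variation in its scores;
    otherwise the denominator is zero and W is undefined.
    """
    m = len(score_table)        # number of frameworks (judges)
    n = len(score_table[0])     # number of drugs (items)
    rank_table = [average_ranks(row) for row in score_table]
    rank_sums = [sum(row[i] for row in rank_table) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)

    # Tie correction: sum of (t^3 - t) over every group of tied scores.
    tie_term = 0
    for row in score_table:
        counts = {}
        for value in row:
            counts[value] = counts.get(value, 0) + 1
        tie_term += sum(t ** 3 - t for t in counts.values())

    return 12 * s / (m ** 2 * (n ** 3 - n) - m * tie_term)


# Hypothetical scores for five drugs from four frameworks (illustration only).
example_scores = [
    [40, 55, 20, 70, 35],   # framework A
    [3, 4, 2, 5, 3],        # framework B (contains a tie)
    [2, 3, 1, 4, 2],        # framework C (contains a tie)
    [3, 3, 3, 4, 3],        # framework D (narrow band of scores)
]
print(round(kendalls_w(example_scores), 3))
```

W ranges from 0 (no agreement among the rankings) to 1 (identical rankings). The P values reported alongside W would require a separate significance test, which this sketch does not attempt.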
RESULTS: Kendall's W across all 4 frameworks for breast, lung, and prostate cancer drugs was 0.560 (P = 0.010), 0.562 (P = 0.010), and 0.920 (P < 0.001), respectively. Pairwise, Kendall's W for breast cancer drugs was highest for ESMO-ICER and ICER-NCCN (W = 0.950, P = 0.019 for both pairs) and lowest for ASCO-NCCN (W = 0.300, P = 0.748). For lung cancer drugs, W was highest pairwise for ESMO-ICER (W = 0.974, P = 0.007) and lowest for ASCO-NCCN (W = 0.218, P = 0.839); for prostate cancer drugs, pairwise W was highest for ICER-NCCN (W = 1.000, P < 0.001) and lowest for ESMO-ICER and ESMO-NCCN (W = 0.900, P = 0.052 for both pairs). When ranking drugs on distinct framework subdomains, Kendall's W among breast cancer drugs was highest for certainty (ICER, NCCN: W = 0.908, P = 0.046) and lowest for clinical benefit (ASCO, ESMO, NCCN: W = 0.345, P = 0.436). Among lung cancer drugs, W was highest for toxicity (ASCO, ESMO, NCCN: W = 0.944, P < 0.001) and lowest for certainty (ICER, NCCN: W = 0.230, P = 0.827); and among prostate cancer drugs, it was highest for quality of life (ASCO, ESMO: W = 0.986, P = 0.003) and lowest for toxicity (ASCO, ESMO, NCCN: W = 0.200, P = 0.711). ICC (95% CI) for ASCO, ESMO, ICER, and NCCN were 0.800 (0.660-0.913), 0.818 (0.686-0.921), 0.652 (0.466-0.834), and 0.153 (0.045-0.371), respectively. When scores were rescaled to 0-100, NCCN provided the narrowest band of scores. When asked about their experiences using the ASCO, ESMO, ICER, and NCCN frameworks, panelists generally agreed that the frameworks were logically organized and reasonably easy to use, with NCCN rated somewhat easier.
CONCLUSIONS: Convergent validity among the ASCO, ESMO, ICER, and NCCN frameworks was fair to excellent, increasing with clinical benefit subdomain concordance and simplicity of drug trial data. Interrater reliability, highest for ASCO and ESMO, improved with clarity of instructions and specificity of score definitions. Continued use, analyses, and refinements of these frameworks will bring us closer to the ultimate goal of using value-based treatment decisions to improve patient care and outcomes.
DISCLOSURES: This work was funded by Eisai Inc. Copher and Knoth are employees of Eisai Inc. Bentley, Lee, Zambrano, and Broder are employees of Partnership for Health Analytic Research, a health services research company paid by Eisai Inc. to conduct this research. For this study, Cohen, Huynh, and Neville report fees from Partnership for Health Analytic Research. Outside of this study, Cohen receives grants and direct consulting fees from various companies that manufacture and market pharmaceuticals. Mei reports a grant from Eisai Inc. during this study. The other authors have no disclosures to report. Study concept and design were contributed by Bentley and Broder, with assistance from Elkin and Cohen. Bentley took the lead in data collection, along with Elkin, Huynh, Mukherjea, Neville, Mei, Popescu, Lee, and Zambrano. Data interpretation was performed by Bentley and Broder, along with Elkin, Cohen, Copher, and Knoth. The manuscript was written primarily by Bentley, along with Elkin and Broder, and revised by Bentley, Broder, Elkin, Cohen, Copher, and Knoth. Select components of this work's methods were presented at the ISPOR 19th Annual European Congress, held in Vienna, Austria, October 29-November 2, 2016, and at the Society for Medical Decision Making 38th Annual North American Meeting, held in Vancouver, Canada, October 23-26, 2016.