Development of a nomogram for predicting recurrence in breast cancer patients using a machine learning method

Abhishek O. Tibrewal


Background: Current breast cancer (BC) recurrence models do not account for treatment modalities, one of the strongest prognostic factors. This analysis applied a machine learning (ML) algorithm to identify BC patients at higher risk of recurrence.

Methods: The analysis is based on the downloadable Wisconsin BC dataset, which contains 9 independent variables (socio-demographic, tumor-related and treatment-related) and 1 dependent variable (recurrence). Using the training dataset (70% of the sample), a multivariate logistic regression (LR) model was developed from the variables identified in univariate analysis (p<0.2). Model performance was assessed on the test dataset (the remaining 30%) using standard statistical measures. A nomogram was developed from the variables that were significant in the model (p<0.05), and its cut-off score categorized BC patients into high or low recurrence risk.
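The step from a fitted regression model to a nomogram score can be sketched as follows. This is a minimal illustration, not the study's implementation: the level names, the intercept and every coefficient are hypothetical placeholders, and the rule "score at or above the cut-off means high risk" is an assumption, since the abstract reports only the cut-off value. The sketch uses the common nomogram convention in which the predictor with the widest log-odds range spans 0 to 100 points and all other predictors are scaled relative to it.

```python
import math

# Hypothetical log-odds coefficients per level, for illustration only.
# They are NOT the fitted values from the study.
INTERCEPT = -2.0
COEFS = {
    "invasive_nodes": {"0-2": 0.0, "3-5": 0.9, "6+": 1.6},
    "malignancy_degree": {"1": 0.0, "2": 0.5, "3": 1.4},
    "irradiation": {"no": 0.0, "yes": 0.8},
}
CUTOFF = 55  # cut-off score reported in the abstract


def build_point_table(coefs):
    """Scale each level's log-odds contribution to a 0-100 point axis."""
    # The predictor with the widest coefficient range defines 100 points.
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    table = {
        var: {lvl: 100.0 * (b - min(lv.values())) / widest for lvl, b in lv.items()}
        for var, lv in coefs.items()
    }
    return table, widest


def total_points(patient, table):
    """Sum the points of the patient's level on every predictor."""
    return sum(table[var][lvl] for var, lvl in patient.items())


def points_to_probability(points, coefs, intercept, widest):
    """Map total points back to the linear predictor, then to a probability."""
    baseline = intercept + sum(min(lv.values()) for lv in coefs.values())
    linear_predictor = baseline + points * widest / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))


table, widest = build_point_table(COEFS)
patient = {"invasive_nodes": "3-5", "malignancy_degree": "3", "irradiation": "no"}
points = total_points(patient, table)
# Assumed rule: a score at or above the cut-off is classified as high risk.
risk_group = "high" if points >= CUTOFF else "low"
probability = points_to_probability(points, COEFS, INTERCEPT, widest)
```

Because the coefficients here are invented, the resulting point scale is not the study's scale; the cut-off of 55 is meaningful only on the nomogram built from the actual fitted model.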

Results: A total of 277 patients were included, 81 of whom had a recurrence. In univariate analysis, tumor size (p=0.002), number of invasive nodes (p<0.001), node capsule (p<0.001), degree of malignancy (p<0.001) and irradiation (p<0.001) were associated with recurrence. After balancing, the recurrence and non-recurrence groups each included 243 patients. Using the training dataset (n=342), invasive nodes (p<0.05), degree of malignancy (p<0.05) and irradiation (p=0.0009) were significant in the multivariate model. In the test dataset (n=144), the model’s accuracy and area under the curve (AUC) were 74% (66-81%) and 0.74 (0.67-0.81), respectively. The nomogram’s cut-off score of 55 had an AUC of 0.73 (0.66-0.80) for recurrence prediction, indicating fair discriminating ability.
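The performance measures reported above can be illustrated with a short sketch. The scores and labels below are toy values invented for the example, not study data, and the abstract does not state how its AUC values were computed. The sketch shows two standard facts: the AUC of a continuous score equals the probability that a randomly chosen recurrence case scores higher than a randomly chosen non-recurrence case, and the AUC of a single cut-off equals the mean of its sensitivity and specificity.

```python
def auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs ordered correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics_at_cutoff(scores, labels, cutoff):
    """Accuracy, sensitivity, specificity and AUC of one binary cut-off."""
    # Assumed rule: a score at or above the cut-off predicts recurrence.
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # A dichotomized test has one ROC point; the area under the
        # resulting two-segment curve is (sensitivity + specificity) / 2.
        "auc": (sensitivity + specificity) / 2,
    }


# Toy nomogram scores and outcomes (1 = recurrence), invented for illustration.
toy_scores = [10, 30, 40, 55, 60, 70, 80, 90]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]

continuous_auc = auc(toy_scores, toy_labels)
cutoff_metrics = metrics_at_cutoff(toy_scores, toy_labels, cutoff=55)
```

Dichotomizing a score discards ranking information, so the AUC of a single cut-off cannot exceed the AUC of the continuous score on the same data.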

Conclusions: The developed nomogram can be a valuable tool for guiding appropriate treatment based on recurrence risk. ML and data mining methods may shape the future of clinical decision-making.


Keywords: Breast cancer, Machine learning, Nomogram, Recurrence
