Application of latent class analysis to estimate susceptibility to adverse health outcomes based on several risk factors

Ankita Dey, Arun K. Chakraborty, Kunal K. Majumdar, Asok K. Mandal


Background: The study demonstrates the use of latent class analysis (LCA) to segregate a population into two latent classes, namely susceptible or not susceptible to adverse health outcomes, according to observed risk factors, as a method of medical diagnosis.

Methods: The present study uses a secondary data set on 420 patients referred to the University of California, Los Angeles (UCLA) Adult Cardiac Imaging and Hemodynamics Laboratories for dobutamine stress echocardiography (DSE) between March 1991 and March 1996. LCA is used to estimate the item-response probabilities in each latent group and the latent class sizes. The observed variables, or indicators, of the latent subgroups are common risk factors such as history of smoking and history of cardiac issues. The interaction effect of hypertension and diabetes is also included in the analysis.
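For orientation, the standard latent class model that such an analysis typically rests on can be written as below. This is a general textbook form, not a formula quoted from the study: the symbols are assumptions introduced here, with $C$ the number of latent classes ($C = 2$ in this setting), $\gamma_c$ the size of class $c$, $J$ the number of indicators, $R_j$ the number of response categories of indicator $j$, and $\rho_{j, r_j \mid c}$ the item-response probability of category $r_j$ on indicator $j$ given membership in class $c$.

```latex
% Assumed notation (not taken from the study): gamma = class sizes,
% rho = item-response probabilities, I(.) = indicator function.
P(\mathbf{Y} = \mathbf{y})
  = \sum_{c=1}^{C} \gamma_c
    \prod_{j=1}^{J} \prod_{r_j=1}^{R_j}
    \rho_{j, r_j \mid c}^{\, I(y_j = r_j)},
\qquad
\sum_{c=1}^{C} \gamma_c = 1,
\quad
\sum_{r_j=1}^{R_j} \rho_{j, r_j \mid c} = 1 .
```

The inner products encode the local independence assumption: within a class, the indicators are treated as independent, so the class-specific probability of a response pattern is the product of its item-response probabilities.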

Results: Based on the behaviour of the estimated latent class model parameters, the unobserved groups are identified and named. The proportions of individuals falling in the two latent classes are approximately 0.20 and 0.80, respectively. Susceptibility to future adverse health outcomes is highest among male individuals with a positive history of hypertension and/or diabetes, as the corresponding indicators have higher positive item-response probabilities (0.72 and 0.83, respectively) than the rest.

Conclusions: The study briefly explains the application of LCA for identifying subgroups according to susceptibility to adverse health effects in a large population. Assessing common risk factors when predicting latent class sizes provides estimates of the probability of membership in each class. The importance of the combined effect of hypertension and diabetes in predicting future cardiac-related health problems is highlighted. Class assignments of individuals according to their response patterns are also listed.
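Assigning an individual to a class from a response pattern is commonly done with Bayes' rule applied to the estimated class sizes and item-response probabilities. The following is a minimal sketch of that idea for binary indicators, not the study's own procedure. The function names are hypothetical, and the item-response probabilities in the example are made-up illustrative values; only the class sizes 0.20 and 0.80 echo the proportions reported above.

```python
from math import prod


def posterior_class_probs(response, class_sizes, item_probs):
    """Return P(class | response) for binary indicators via Bayes' rule.

    response    : list of 0/1 answers, one per indicator
    class_sizes : prior class proportions (should sum to 1)
    item_probs  : item_probs[c][j] = P(indicator j = 1 | class c)
    """
    joint = []
    for size, probs in zip(class_sizes, item_probs):
        # Local independence: multiply per-indicator probabilities within a class.
        likelihood = prod(p if y == 1 else 1.0 - p for y, p in zip(response, probs))
        joint.append(size * likelihood)
    total = sum(joint)
    return [value / total for value in joint]


def assign_class(response, class_sizes, item_probs):
    """Return the index of the class with the highest posterior probability."""
    post = posterior_class_probs(response, class_sizes, item_probs)
    return max(range(len(post)), key=post.__getitem__)


# Class sizes follow the reported proportions; the item-response
# probabilities below are HYPOTHETICAL values chosen for illustration only.
class_sizes = [0.20, 0.80]
item_probs = [
    [0.8, 0.7, 0.9],  # hypothetical "susceptible" class
    [0.2, 0.3, 0.1],  # hypothetical "not susceptible" class
]

all_positive = posterior_class_probs([1, 1, 1], class_sizes, item_probs)
print(all_positive, assign_class([1, 1, 1], class_sizes, item_probs))
```

With these illustrative parameters, a pattern that is positive on every indicator is assigned to the smaller class even though its prior proportion is only 0.20, because the likelihood of that pattern is far higher there.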


Keywords: Indicators, Interaction effect, Item-response probabilities, Latent class analysis, Medical diagnosis, Risk factors
