Censored data: Evaluation two methods of estimating correlations within R package lava
Document Type
Presentation
Publication Date
2021
Publication Title
American Association of Behavioral and Social Sciences Annual Conference
First page number:
1
Last page number:
4
Abstract
Censored data, which occur when the exact values of a variable are unknown, can distort statistical results and invalidate conclusions. Holst et al.’s (2015) R package lava can estimate bivariate correlations for censored data using either a correlation model or a regression model. To compare these two models, the present study simulated normally distributed data with various correlations and sample sizes, and then censored the data to the required degrees. For both models, lava always converged on an estimate, and bias was minimal unless both variables had 80% censoring and the correlation was large and negative. Moreover, bias decreased when sample size increased from 500 to 1000. Although both models produced unbiased correlation estimates when samples were large and censoring was low to moderate, the regression model ensured that estimates were valid (between -1 and 1). Future research could investigate lava’s performance with non-normal distributions and smaller sample sizes.
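The simulation design summarized in the abstract (draw correlated normal data, then censor both variables) can be illustrated with a minimal sketch. This is not the authors' code and does not call lava: the function names are hypothetical, left-censoring at a sample quantile is an assumption (the abstract does not state the censoring direction), and the naive Pearson correlation is used only to show why censoring-aware models are needed.

```python
import math
import random


def simulate_bivariate_normal(n, rho, rng):
    """Draw n (x, y) pairs from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Standard construction: y shares a fraction rho of its signal with x.
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return pairs


def left_censor(values, proportion):
    """Replace the lowest `proportion` of values with the cutoff value.

    Assumption: left-censoring at a sample quantile, for illustration only.
    """
    ordered = sorted(values)
    k = min(int(proportion * len(values)), len(values) - 1)
    cutoff = ordered[k]
    return [max(v, cutoff) for v in values]


def pearson(xs, ys):
    """Plain Pearson correlation, ignoring the fact that values are censored."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def naive_correlations(n, rho, proportion, seed=2021):
    """Return (uncensored r, naive r after censoring both variables)."""
    rng = random.Random(seed)
    pairs = simulate_bivariate_normal(n, rho, rng)
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    r_full = pearson(xs, ys)
    r_censored = pearson(left_censor(xs, proportion), left_censor(ys, proportion))
    return r_full, r_censored


if __name__ == "__main__":
    # Sample sizes from the abstract; rho = -0.8 is an illustrative value.
    for n in (500, 1000):
        r_full, r_censored = naive_correlations(n, rho=-0.8, proportion=0.8)
        print(f"n={n}: uncensored r={r_full:.3f}, naive censored r={r_censored:.3f}")
```

Under these assumptions, the naive correlation computed on the censored values is pulled toward zero relative to the uncensored one, which is the kind of distortion that the correlation and regression models compared in the study are meant to correct.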
Controlled Subject
Censored observations (Statistics)
Disciplines
Data Science
Repository Citation
Hoffman, C. K., Thiess, E. K., Ayele, F. A., & Barchard, K. A. (2021). Censored data: Evaluation two methods of estimating correlations within R package lava. American Association of Behavioral and Social Sciences Annual Conference, 1-4.