Award Date

12-1-2021

Degree Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science

First Committee Member

Fatma Nasoz

Second Committee Member

Qing Wu

Third Committee Member

Kazem Taghva

Fourth Committee Member

Mingon Kang

Fifth Committee Member

Jee Woong Park

Number of Pages

48

Abstract

Osteoporosis is a debilitating disease in which an individual’s bones weaken, becoming fragile and more susceptible to fracture. Previous studies have found it most commonly among postmenopausal Caucasian and Asian women, whereas people of African descent (African American/Black) have largely been ignored in osteoporosis research, especially in genome-wide association studies (GWAS). GWAS identify single nucleotide polymorphisms (SNPs) that may contribute to certain illnesses, such as osteoporosis. Because low bone mineral density (BMD) is one of the primary markers of potential osteoporosis, research on BMD is needed to help the African American population avoid or mitigate the worst symptoms and complications of the disease. In this thesis, we implemented and applied machine learning algorithms to genetic data from African American women to build models that predict BMD from SNPs. We used the following techniques for this regression task: regularized linear regression (Ridge, Lasso, and ElasticNet), gradient boosted trees (XGBoost and LightGBM), and artificial neural networks. Models were evaluated with the coefficient of determination (R²) and mean squared error (MSE).

With these models, we analyzed three datasets comprising 12,600, 69,476, and 158,444 variants, respectively. The first dataset, SNP-1, achieved its highest test R² of 0.227 with Lasso regression. The second, SNP-5, achieved its highest test R² of 0.437 with LightGBM. The third, SNP-10, achieved its highest test R² of 0.574 with Ridge regression.
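
As a rough illustration of the evaluation described in the abstract, the sketch below computes the coefficient of determination (R²) and mean squared error (MSE) for a ridge regression fitted to a small synthetic genotype matrix. It is not the thesis code: the data are randomly generated stand-ins (genotypes coded 0/1/2), and the helper names `fit_ridge`, `predict`, `r2_score`, and `mse`, the sample sizes, and the penalty `alpha` are all assumptions made for this example.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error: average squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean  # center features so the intercept is not penalized
    yc = y - y_mean
    n_features = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def predict(X, coef, intercept):
    return X @ coef + intercept


# Synthetic stand-in data (an assumption, not the thesis datasets):
# 200 individuals, 50 SNPs coded as 0/1/2 minor-allele counts.
rng = np.random.default_rng(0)
n_samples, n_snps = 200, 50
X = rng.integers(0, 3, size=(n_samples, n_snps)).astype(float)
true_effects = np.zeros(n_snps)
true_effects[:5] = [0.5, -0.4, 0.3, 0.2, -0.2]  # only five SNPs carry signal
y = X @ true_effects + rng.normal(0.0, 0.1, n_samples)  # simulated phenotype

# Simple hold-out split: fit on the first 150 rows, evaluate on the rest.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

coef, intercept = fit_ridge(X_train, y_train, alpha=1.0)
y_pred = predict(X_test, coef, intercept)

test_r2 = r2_score(y_test, y_pred)
test_mse = mse(y_test, y_pred)
print(f"test R^2 = {test_r2:.3f}, test MSE = {test_mse:.4f}")
```

An R² of 1 means the predictions match the held-out values exactly, while an R² of 0 means the model does no better than predicting the mean; the same two helper functions could score any of the other model families named above.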

Keywords

African Americans; Computer Science; Machine Learning; osteoporosis; SNPs

Disciplines

Bioinformatics | Computer Sciences

File Format

pdf

File Size

1516 KB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

