Award Date

5-1-2020

Degree Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science

First Committee Member

Fatma Nasoz

Second Committee Member

Qing Wu

Third Committee Member

Kazem Taghva

Fourth Committee Member

Laxmi Gewali

Fifth Committee Member

Mira Han

Number of Pages

60

Abstract

Osteoporosis is one of the most common diseases seen in postmenopausal women. It decreases bone density and quality and eventually causes bone loss. Generally, bone loss occurs when bone loses its content and becomes porous, a sponge-like substance. In most genome-wide association studies (GWAS), researchers perform experiments on genomic data that contain millions of single nucleotide polymorphisms (SNPs) and check their association with a trait or disease. In this thesis, we performed two separate analyses, one with 2207 instances (of bone loss and bone gain) and one with 645 instances (of bone loss). To predict the SNPs associated with bone loss rate (a regression problem), we considered both genotype and phenotype data from the Women’s Health Initiative (WHI) and performed data processing and analysis as follows. We started with a metadata analysis of the genomic dataset and imputed the datasets with 2207 and 645 instances separately. Next, we performed a linear association analysis between the SNPs and the bone loss rate from the phenotype data, and then applied LASSO regression with narrow-sense heritability using PLINK, which resulted in two sets of SNPs: 680 SNPs for the 2207 instances and 308 SNPs for the 645 instances. Lastly, we merged the phenotype data with the SNPs on Subject-ID for both analyses, trained machine learning models, including ridge regression (RR), support vector machine (SVM), random forest (RF), and multi-layer perceptron (MLP), on the two datasets, and evaluated the mean squared error (MSE) and R^2 for each model. For the 680 SNPs, the RR model outperformed the other models, with an R^2 of 0.858 on the training data and 0.719 on the testing data, whereas for the 308 SNPs, the MLP outperformed the other models, with an R^2 of 0.982 on the training data and 0.894 on the testing data.
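
The last two steps described above, joining phenotype values to SNP genotypes on Subject-ID and scoring predictions with MSE and R^2, can be illustrated with a minimal Python sketch. It is not the thesis code: the function names, the dictionary layout, and the toy subject IDs and values are all assumptions made for illustration, and the model-fitting step is left out.

```python
# Illustrative sketch only; names and data layout are hypothetical.

def merge_by_subject(phenotypes, genotypes):
    """Inner-join two dicts keyed by subject ID.

    phenotypes: {subject_id: bone_loss_rate}
    genotypes:  {subject_id: [allele count per SNP]}
    Returns (ids, X, y) for subjects present in both inputs.
    """
    ids = sorted(set(phenotypes) & set(genotypes))
    X = [genotypes[i] for i in ids]
    y = [phenotypes[i] for i in ids]
    return ids, X, y


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy data (made up): S1 has no genotype, S4 has no phenotype.
    phenotypes = {"S1": 1.0, "S2": 2.0, "S3": 3.0}
    genotypes = {"S2": [0, 1], "S3": [2, 0], "S4": [1, 1]}
    ids, X, y = merge_by_subject(phenotypes, genotypes)
    print(ids, X, y)
```

Only subjects present in both inputs survive the join, which is why the number of instances available for training can be smaller than either source table. An R^2 of 1.0 means the predictions match the targets exactly, and 0.0 means they do no better than always predicting the mean.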

Keywords

Osteoporosis; Bone loss; Single nucleotide polymorphisms (SNPs); Analysis

Disciplines

Computer Sciences

File Format

pdf

File Size

3.081 MB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

