Award Date
5-1-2020
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
First Committee Member
Fatma Nasoz
Second Committee Member
Mira Han
Third Committee Member
Laxmi Gewali
Fourth Committee Member
Yoohwan Kim
Fifth Committee Member
Qing Wu
Number of Pages
64
Abstract
Cancer has become one of the major causes of death worldwide, in part because of late diagnosis and lack of proper treatment. It involves the abnormal and uncontrolled growth of cells inside the body, which may spread from one site to other parts of the body. Ribonucleic acid (RNA) sequencing can detect the changes occurring inside cells and helps to analyze the transcriptome, that is, the gene expression patterns captured in RNA. Machine learning techniques can assist in predicting cancer at an early stage if data is available. The objective of this thesis is to build models that classify different types of cancer. For this purpose, we implemented several machine learning models, namely support vector machine (SVM), random forest (RF), k-nearest neighbors (KNN), and multilayer perceptron (MLP), to classify the samples according to their labels. The datasets for this research were collected from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx). The machine learning models were trained on TCGA data and tested on an independent dataset (GTEx). The data representations obtained using stacked denoising autoencoders were used to train and test the models. The models did not achieve very high performance; however, MLP performed better than the others. The best features selected using SelectKBest were also used to compare performance. It was observed that the k-nearest neighbors classifier gave better results, with an accuracy of 85.12% when tested on independent data and a training accuracy of 98.4%.
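The feature-selection step described in the abstract (keep the best-scoring features, then train KNN on one cohort and test on an independent one) can be illustrated with a minimal sketch. This is not the thesis code: the data are synthetic, the ANOVA F-score is an assumed scoring function standing in for whatever SelectKBest was configured with, and all names (`make_cohort`, `select_k_best`, `knn_predict`) are hypothetical plain-Python stand-ins rather than library calls.

```python
import random
from collections import Counter
from statistics import mean


def make_cohort(rng, n_per_class, n_features=20, n_informative=5):
    """Synthetic 'expression' matrix (assumption): only the first
    n_informative features have a class-dependent mean."""
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label if j < n_informative else 0.0, 1.0)
                      for j in range(n_features)])
            y.append(label)
    return X, y


def f_scores(X, y):
    """One-way ANOVA F statistic for each feature column."""
    classes = sorted(set(y))
    n, k = len(y), len(classes)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        grand = mean(col)
        between = within = 0.0
        for c in classes:
            vals = [v for v, label in zip(col, y) if label == c]
            m = mean(vals)
            between += len(vals) * (m - grand) ** 2
            within += sum((v - m) ** 2 for v in vals)
        scores.append((between / (k - 1)) / (within / (n - k) + 1e-12))
    return scores


def select_k_best(X, y, k):
    """Indices of the k features with the highest F score."""
    scores = f_scores(X, y)
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]


def knn_predict(X_train, y_train, x, n_neighbors=5):
    """Majority vote among the n_neighbors closest training samples."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(row, x)), label)
                   for row, label in zip(X_train, y_train))
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def keep_columns(X, columns):
    """Restrict every sample to the selected feature columns."""
    return [[row[j] for j in columns] for row in X]


rng = random.Random(0)
X_train, y_train = make_cohort(rng, n_per_class=30)  # stands in for the training cohort
X_test, y_test = make_cohort(rng, n_per_class=20)    # stands in for the independent cohort

# Features are chosen on the training cohort only, then applied to both cohorts.
selected = select_k_best(X_train, y_train, k=5)
X_train_sel = keep_columns(X_train, selected)
X_test_sel = keep_columns(X_test, selected)

predictions = [knn_predict(X_train_sel, y_train, x) for x in X_test_sel]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"selected features: {sorted(selected)}, test accuracy: {accuracy:.3f}")
```

Selecting features on the training cohort alone keeps the independent test set from influencing the model, which is what makes the reported test accuracy an honest estimate.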
Disciplines
Computer Sciences
File Format
File Size
2.9 MB
Degree Grantor
University of Nevada, Las Vegas
Language
English
Repository Citation
Maharjan, Aashi, "Machine Learning Approach for Predicting Cancer Using Gene Expression" (2020). UNLV Theses, Dissertations, Professional Papers, and Capstones. 3922.
http://dx.doi.org/10.34917/19412120
Rights
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/