Master of Science (MS)
Cancer is one of the leading causes of death worldwide, in part because of late diagnosis and lack of proper treatment. It involves the abnormal, uncontrolled growth of cells, which may spread from one site to other parts of the body. Ribonucleic acid (RNA) sequencing can detect changes occurring inside cells and helps to analyze gene expression patterns across the transcriptome. Machine learning techniques can assist in predicting cancer at an early stage, provided that data is available. The objective of this thesis is to build models that classify different types of cancer. For this purpose, we implemented several machine learning models, namely support vector machine (SVM), random forest (RF), k-nearest neighbors (KNN), and multilayer perceptron (MLP), to classify the samples according to their labels. The datasets for this research were collected from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx). The models were trained on TCGA data and tested on an independent dataset (GTEx). Data representations obtained using stacked denoising autoencoders were used to train and test the models. The models did not achieve very high performance; however, MLP performed better than the others. The best features selected using SelectKBest were also used to compare performance. It was observed that the KNN classifier gave better results, with an accuracy of 85.12% when tested on the independent data and a training accuracy of 98.4%.
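The feature-selection and KNN step described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the thesis code. The data is synthetic (generated with `make_classification`) rather than TCGA or GTEx expression values. The held-out split only stands in for the independent test set. The scoring function `f_classif`, the number of selected features (`k=20`), and `n_neighbors=5` are all assumptions, since the abstract does not state them.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a samples-by-genes expression matrix (assumption:
# the real inputs would be TCGA samples for training and GTEx for testing).
X, y = make_classification(
    n_samples=600,
    n_features=200,
    n_informative=20,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    class_sep=2.0,
    random_state=0,
)

# A held-out split plays the role of the independent test set here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale features, keep the k highest-scoring ones, then classify with KNN.
# k=20 and n_neighbors=5 are illustrative values, not taken from the thesis.
model = make_pipeline(
    StandardScaler(),
    SelectKBest(score_func=f_classif, k=20),
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
n_selected = int(model.named_steps["selectkbest"].get_support().sum())

print(f"selected features: {n_selected}")
print(f"training accuracy: {train_acc:.3f}")
print(f"held-out accuracy: {test_acc:.3f}")
```

Wrapping the scaler and selector in a pipeline means both are fitted on the training samples only, so no information from the test set leaks into feature selection. A gap between training and held-out accuracy, like the one reported in the abstract, is what one would inspect with the two printed scores.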
University of Nevada, Las Vegas
Maharjan, Aashi, "Machine Learning Approach for Predicting Cancer Using Gene Expression" (2020). UNLV Theses, Dissertations, Professional Papers, and Capstones. 3922.
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/