Master of Science in Computer Science
Large amounts of data are generated every day, so much that it is difficult for humans and machines alike to find the patterns needed to predict outcomes and make decisions. Machine learning techniques can help by simplifying this data. For example, biological cell identity depends on many factors tied to genetic processes, including proteins, gene transcription, and gene methylation. Each of these factors is a highly complex mechanism that produces immense amounts of data, and simplifying them can help reveal patterns. Error-Correcting Output Codes (ECOC) do this for classification by breaking a multiclass problem into multiple binary problems. This thesis proposes a new approach that also splits the feature set into multiple subsets called views. The proposed method is tested on multiple datasets from the University of California, Irvine (UCI) to analyze its performance. It is then applied to genetic data collected from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) in an attempt to improve classification of the tissue of origin for various tumor samples.
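As a rough illustration of the general ECOC idea mentioned in the abstract (not the multi-view method proposed in the thesis), the sketch below shows how a code matrix turns one multiclass problem into several binary problems, and how a vector of binary predictions is decoded back to a class by Hamming distance. The class labels, the code matrix, and the function names are all hypothetical and chosen only for illustration.

```python
# Minimal ECOC sketch in plain Python. Everything here is an assumption made
# for illustration: the labels "A".."D", the 7-bit code matrix, and the helper
# names are not taken from the thesis.

# One codeword (row) per class, one bit (column) per binary classifier.
# Any two rows differ in 4 positions, so a single wrong bit can be corrected.
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}


def binary_relabel(labels, column):
    """Return the 0/1 targets that binary classifier `column` would train on."""
    return [CODE_MATRIX[label][column] for label in labels]


def hamming(a, b):
    """Count the positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits):
    """Pick the class whose codeword is closest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda label: hamming(CODE_MATRIX[label], bits))


if __name__ == "__main__":
    # Targets for the third binary problem (column index 2).
    print(binary_relabel(["A", "B", "C", "D"], 2))
    # Codeword for "B" with its first bit flipped by a classifier error.
    print(decode((1, 0, 0, 0, 1, 1, 1)))
```

In this toy setup each column defines one binary problem, and decoding tolerates one erroneous binary prediction because the codewords are at least four bits apart.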
ECOC; Ensemble learning; Error Correcting Codes; Genetics; Machine Learning; Multiomics
Artificial Intelligence and Robotics | Computer Engineering | Computer Sciences | Genetics
University of Nevada, Las Vegas
Alvarez, Daniel, "An Investigation into Multi-View Error Correcting Output Code Classifiers Applied to Organ Tissue Classification" (2020). UNLV Theses, Dissertations, Professional Papers, and Capstones. 3982.
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/