Award Date
May 2023
Degree Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Life Sciences
First Committee Member
Mira Han
Second Committee Member
Daniel Thompson
Third Committee Member
Kelly Tseng
Fourth Committee Member
Jeffery Shen
Fifth Committee Member
Edwin Oh
Number of Pages
148
Abstract
With the increasing diversity of genomic data types, machine learning provides an opportunity to integrate several omics datasets into one cohesive annotation. In this dissertation, I apply an unsupervised clustering approach to a novel representation of 3D chromosome conformation data and chromatin mark data. Specifically, I use this new method to annotate the regulatory function of human endogenous retrovirus H (HERVH). In chapter 1, I propose a synthesized model of HERVH function as an activating lncRNA based on previously published work. Because HERVH and transposable elements in general are repetitive due to their methods of retrotransposition, in chapter 2 I explore the mappability of transposable elements using traditional short-read approaches. This mappability study validates that most transposable element loci are in fact highly mappable in the human genome. In chapter 3, I present a novel aggregation method that integrates 3D chromatin conformation data and chromatin state labels for downstream clustering. I show that this method provides additional annotation beyond chromatin conformation data or chromatin state data alone. Finally, in chapter 4, I perform a meta-analysis of individual HERVH loci by synthesizing data from over 10 years of past research and applying the method developed in chapter 3. I propose that 5’ and 3’ HERVH LTRs may function as promoters and enhancers, respectively, and that the act of transcription and accompanying chromatin marks at the 5’ LTR are essential for DNA folding. This dissertation aims to present a novel method for condensing multi-omics data into whole genome annotation.
Keywords
Chromatin State; Hi-C; K-means; Multi-Omics; Transposable Elements
Disciplines
Bioinformatics
Degree Grantor
University of Nevada, Las Vegas
Language
English
Repository Citation
Sexton, Corinne, "Applying Unsupervised Multi-Omic Learning to Identify Patterns of Human Genomic Regulatory Regions with an Emphasis in Characterizing HERVH Loci." (2023). UNLV Theses, Dissertations, Professional Papers, and Capstones. 4775.
http://dx.doi.org/10.34917/36114800
Rights
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/