Award Date

12-1-2017

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Life Sciences

First Committee Member

Martin Schiller

Second Committee Member

Nora Caberoy

Third Committee Member

Mira Han

Fourth Committee Member

Ernesto Abel-Santos

Number of Pages

149

Abstract

Detailed knowledge of protein function is critical for both the study of protein interactions and the development of drugs that target specific proteins. Currently, few techniques directly examine protein function. Those that are available are time-consuming and can address only one variant of a protein at a time. Our laboratory has designed three high-throughput protein function screens. We hypothesize that these will address this shortfall.

The first screen is the Chimeric Minimotif Decoy (CMD) Assay. For this screen, we constructed red fluorescent proteins with one or more C-terminal minimotifs, which are short, contiguous amino acid sequences with a known function. These CMD proteins are overexpressed in a cell, which is then infected with HIV. Because HIV relies on host proteins to replicate, the minimotifs of a CMD protein may act as decoys that competitively bind those host proteins and inhibit HIV infection. HIV infection is indicated by the expression of a green fluorescent protein transcribed from an LTR-GFP HIV infection reporter construct.

The second technique we developed is the GigaAssay Driver Mutagenesis Screen. With this screen, the effects of all single mutations on a protein’s ability to function can be comprehensively examined. The screen is constructed by using a CRISPR-Cas9 system to integrate the GigaAssay cassette into a cell. This cassette includes a randomly mutagenized variant of a driver protein and a reporter protein. The sequences of both of these proteins share a DNA barcode sequence that is unique to that cell. In our proof-of-concept experiment, the driver protein is Tat, the transcription factor responsible for the replication of the HIV-1 genome. Here, Tat drives the transcription of the reporter, GFP. The strength of each variant of Tat can be determined by the ratio between the reporter and driver mRNA transcripts.
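The ratio calculation described for this screen can be illustrated with a minimal sketch. It is not the thesis software: it assumes that reads have already been assigned to barcodes and counted, and the class name, method name, and example barcodes are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BarcodeRatio {

    /**
     * Divides the reporter transcript count by the driver transcript count
     * for each barcode. Barcodes without driver reads are skipped because
     * the ratio is undefined for them. (Hypothetical helper, for illustration.)
     */
    public static Map<String, Double> reporterToDriverRatio(
            Map<String, Integer> driverCounts,
            Map<String, Integer> reporterCounts) {
        Map<String, Double> ratios = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : driverCounts.entrySet()) {
            int driver = entry.getValue();
            if (driver <= 0) {
                continue; // no driver reads: ratio undefined
            }
            // A barcode with driver reads but no reporter reads gets ratio 0.
            int reporter = reporterCounts.getOrDefault(entry.getKey(), 0);
            ratios.put(entry.getKey(), (double) reporter / driver);
        }
        return ratios;
    }

    public static void main(String[] args) {
        // Made-up barcodes and counts, not data from the thesis.
        Map<String, Integer> driver = new LinkedHashMap<>();
        driver.put("ACGTAC", 40);
        driver.put("TTGCAA", 50);
        Map<String, Integer> reporter = new LinkedHashMap<>();
        reporter.put("ACGTAC", 120);
        reporterToDriverRatio(driver, reporter).forEach(
                (barcode, ratio) -> System.out.println(barcode + "\t" + ratio));
    }
}
```

Under these assumptions, a higher ratio for a barcode would suggest that the Tat variant carrying that barcode drives reporter transcription more strongly.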

Our third technique is the GigaAssay MD screen. Here, the GigaAssay cassette has a CMD protein sequence and an LTR-GFP HIV infection reporter sequence, and the CMD sequence and reporter sequence share a unique barcode. The CMD protein’s ability to inhibit HIV infection can be determined by the ratio between the CMD and reporter mRNA transcripts.

However, the large amount of nucleic acid sequencing data produced by these screens cannot be interpreted by any existing analysis pipeline. The goal of my thesis was to optimize the construction of proteins for the Chimeric Minimotif Decoy Assay and to write and validate a suite of Java software to interpret and visualize the results of all three screens. The software I wrote correctly interprets the sequence of 100% of the CMD proteins in the CMD Assay. It also correctly interprets 98.6% of synthetic test transcript sequences in the GigaAssay Driver Mutagenesis Screen and 98.9% of synthetic test transcript sequences in the GigaAssay MD Screen.

I also wrote and validated a suite of Java software for the generation of personalized dietary recommendations based on meta-analysis of nutrigenetics studies. Nutrigenetics is the study of the effect of genetic variation on the response of phenotypes and phenotypic risk to diet and dietary changes. The software first composes a dataset by comparing our laboratory’s meta-analysis data with the USDA Nutrient Database, matching foods to recommendations, and calculating recommended food portions. It then builds a MySQL database from this dataset and uses the database to analyze a user’s personal genome file, matching variants in the file to dietary recommendations, which are exported to a personalized dietary suggestion report. The software correctly builds the MySQL database and matches genome file variants accurately in all test cases.
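The variant-matching step can be sketched as follows. This is only an illustration under stated assumptions: the genome file is assumed to be tab-separated with a variant identifier and a genotype on each line, an in-memory map stands in for the MySQL database, and every identifier and recommendation string is a placeholder rather than a result from the thesis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VariantMatcher {

    /**
     * Looks up each "variantId<TAB>genotype" line in a recommendation table
     * keyed by "variantId:genotype" and returns the matches in file order.
     * (Hypothetical file layout and key format, assumed for illustration.)
     */
    public static List<String> match(List<String> genomeLines,
                                     Map<String, String> recommendations) {
        List<String> matched = new ArrayList<>();
        for (String line : genomeLines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comment lines
            }
            String[] fields = line.split("\t");
            if (fields.length < 2) {
                continue; // skip malformed lines
            }
            String key = fields[0] + ":" + fields[1];
            String recommendation = recommendations.get(key);
            if (recommendation != null) {
                matched.add(key + " -> " + recommendation);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Placeholder table standing in for the MySQL database.
        Map<String, String> table = new HashMap<>();
        table.put("variantA:AG", "placeholder recommendation 1");
        table.put("variantB:TT", "placeholder recommendation 2");

        // Placeholder genome file contents.
        List<String> genome = Arrays.asList(
                "# id\tgenotype",
                "variantA\tAG",
                "variantB\tCT",
                "variantC\tGG");

        match(genome, table).forEach(System.out::println);
    }
}
```

In a database-backed version, the map lookup would be replaced by a query against the recommendation table; the matching logic itself would stay the same.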

Keywords

bioinformatics; genomics; nutrigenetics; protein function assay; protein function screen; proteomics

Disciplines

Bioinformatics | Biology | Molecular Biology

Language

English

Available for download on Tuesday, December 15, 2020

