Document Type

Article

Publication Date

8-13-2021

Publication Title

Patterns

Volume

2

Issue

8

First page number:

1

Last page number:

11

Abstract

In this article, we propose a new approach to analyzing large genomics datasets. We treat individual genetic variants as pixels in an image and transform a collection of variants into an artificial image object (AIO), which can then be classified like a regular image by convolutional neural network (CNN) algorithms. Using schizophrenia as a case study, we demonstrate the principles and their applications with 3 datasets. With 4,096 single-nucleotide variants (SNVs), the CNN models achieved an accuracy of 0.678 ± 0.007 and an AUC of 0.738 ± 0.008 for the diagnosis phenotype. With 44,100 SNVs, the models achieved class-specific accuracies of 0.806 ± 0.032 and 0.820 ± 0.049, and AUCs of 0.930 ± 0.017 and 0.867 ± 0.040, for the bottom and top classes stratified by the patients' polygenic risk scores. These results suggest that, once transformed to images, large genomics data can be analyzed effectively with image classification algorithms.
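
The variant-to-pixel idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how variants are ordered, encoded, or laid out, so the 0/1/2 allele-dosage encoding, the row-major ordering, the square layout, and the function name variants_to_image are all assumptions made for illustration. The sketch relies only on the fact that both variant counts quoted in the abstract are perfect squares (4,096 = 64 × 64 and 44,100 = 210 × 210).

```python
import math


def variants_to_image(genotypes, max_dosage=2):
    """Arrange a flat list of genotype dosages into a square grid of pixels.

    Assumptions (not taken from the article): each variant is encoded as an
    allele dosage in 0..max_dosage, variants are placed in row-major order,
    and dosages are scaled to 8-bit grayscale intensities.
    """
    side = math.isqrt(len(genotypes))
    if side * side != len(genotypes):
        raise ValueError("number of variants must be a perfect square")
    scale = 255 // max_dosage  # 127 when max_dosage is 2
    return [
        [genotypes[row * side + col] * scale for col in range(side)]
        for row in range(side)
    ]


# Hypothetical genotype vector for one individual (synthetic, not real data).
example_genotypes = [i % 3 for i in range(4096)]
image = variants_to_image(example_genotypes)
print(len(image), len(image[0]))  # 64 64
```

A grid built this way could then be passed to any standard image classifier. Whether neighbouring pixels should correspond to neighbouring variants is a design choice that this sketch leaves open.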

Keywords

Artificial image objects; Artificial intelligence; Machine learning; Convolutional neural network; Disease risk modeling; GWAS-selected genetic variants; Image classification; Polygenic risk score; Random forest; Schizophrenia classification; Support vector machine

Disciplines

Genetics and Genomics | Genomics | Life Sciences

File Format

pdf

File Size

5204 KB

Language

English

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

Creative Commons License

Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License
