Award Date

12-1-2021

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Committee Member

Fatma Nasoz

Second Committee Member

Mingon Kang

Third Committee Member

Beiyu Lin

Fourth Committee Member

Kazem Taghva

Fifth Committee Member

Brendan Morris

Number of Pages

Abstract

State-of-the-art image captioning models can successfully produce a diverse set of accurate captions. Previous research has focused on improving caption diversity while maintaining a high level of fidelity. We shift the focus from accuracy and diversity to controllability. We use a modified version of the traditional encoder-decoder network that allows the model to produce a meaningful and structured latent space. We then explore the latent space using several latent cartographic methods: lerp, slerp, analogy completion, attribute vector rotation, and interpolation graphs. Additionally, we discuss different categories of latent space and provide modifications for each of the cartographic methods. Finally, we show that it is possible to generate a set of diverse and accurate captions with desired real space semantics by sampling from different areas of the latent space.
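
As an illustration of three of the cartographic methods named in the abstract (lerp, slerp, and analogy completion), the following is a minimal sketch using their standard textbook formulations on plain Python lists. It is not taken from the thesis: the function names, the list-based vector representation, and the fallback threshold `1e-8` are assumptions made for this example, and the thesis's own implementation and its per-category modifications may differ.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between latent vectors a and b at fraction t in [0, 1]."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def slerp(a, b, t):
    """Spherical linear interpolation between non-zero vectors a and b.

    Falls back to lerp when the vectors are (anti)parallel, where the
    spherical weights are numerically undefined (assumed threshold 1e-8).
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)  # angle between a and b
    if math.sin(omega) < 1e-8:
        return lerp(a, b, t)
    weight_a = math.sin((1.0 - t) * omega) / math.sin(omega)
    weight_b = math.sin(t * omega) / math.sin(omega)
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def analogy(a, b, c):
    """Analogy completion 'a is to b as c is to ?' via vector arithmetic: c + (b - a)."""
    return [z + (y - x) for x, y, z in zip(a, b, c)]
```

In this sketch, a sequence of intermediate latent points could be obtained by calling `lerp` or `slerp` for several values of `t` between 0 and 1 and decoding each result into a caption with the model's decoder, which is not shown here.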

Keywords

Computer Vision; Generative Networks; Image Captioning; Latent Space; Machine Learning; Natural Language Processing

Disciplines

Computer Sciences

File Format

pdf

File Size

16200 KB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Repository Citation

Musser, Mikian J., "Exploring the Latent Space of Image Captioning Networks" (2021). UNLV Theses, Dissertations, Professional Papers, and Capstones. 4306.
http://dx.doi.org/10.34917/28340356

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

Included in

Computer Sciences Commons
