Award Date

1-1-2001

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

First Committee Member

Kazem Taghva

Number of Pages

44

Abstract

In this thesis, we report on our experiments on training and categorization of optically recognized documents. In particular, we present a lexicon-based error correction algorithm to improve the categorization process. This algorithm is based on edit distance techniques and information from highly weighted words in the categorizers.
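
This record does not give the details of the thesis's algorithm, so the following is only a minimal sketch of the general idea of lexicon-based correction with edit distance. It assumes a standard Levenshtein distance, a lexicon made up of a categorizer's highly weighted words, and a small distance threshold. The names `edit_distance`, `correct_token`, and `max_distance` are hypothetical and do not come from the thesis.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_token(token: str, lexicon: set, max_distance: int = 1) -> str:
    """Map an OCR token to the closest lexicon word, if it is close enough.

    Assumption: `lexicon` holds the highly weighted words of a categorizer.
    Tokens already in the lexicon, or too far from every entry, are returned
    unchanged. Ties are broken alphabetically.
    """
    if token in lexicon or not lexicon:
        return token
    best = min(lexicon, key=lambda w: (edit_distance(token, w), w))
    if edit_distance(token, best) <= max_distance:
        return best
    return token


if __name__ == "__main__":
    weighted_words = {"computer", "market", "stock"}
    ocr_tokens = ["c0mputer", "market", "weather"]
    print([correct_token(t, weighted_words) for t in ocr_tokens])
```

Restricting candidates to highly weighted words keeps the lexicon small and targets the terms that most influence the category decision. A real system would also have to choose the threshold carefully to avoid false corrections.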

Keywords

Categorization; Effects; Errors; OCR; Text

Controlled Subject

Computer science

File Format

pdf

File Size

1617.92 KB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Permissions

If you are the rightful copyright holder of this dissertation or thesis and wish to have the full text removed from Digital Scholarship@UNLV, please submit a request to digitalscholarship@unlv.edu and include clear identification of the work, preferably with URL.

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

