Award Date
1-1-2001
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
First Committee Member
Kazem Taghva
Number of Pages
44
Abstract
In this thesis, we report on our experiments on training and categorization of optically recognized documents. In particular, we present a lexicon-based error correction algorithm to improve the categorization process. This algorithm is based on edit distance techniques and information from highly weighted words in the categorizers.
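The abstract does not give the details of the correction algorithm, so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes the lexicon is a set of highly weighted category words, that edit distance means Levenshtein distance, and that a token is replaced only when a lexicon word lies within a small distance threshold. The names `edit_distance`, `correct_token`, `correct_text`, and the parameter `max_distance` are hypothetical.

```python
from typing import Iterable, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct_token(token: str, lexicon: Set[str], max_distance: int = 1) -> str:
    """Return the closest lexicon word within max_distance, else the token unchanged.

    Assumption: the lexicon holds highly weighted words taken from the categorizer.
    """
    if token in lexicon:
        return token
    best, best_d = token, max_distance + 1
    for word in sorted(lexicon):  # sorted so that ties resolve deterministically
        d = edit_distance(token, word)
        if d < best_d:
            best, best_d = word, d
    return best


def correct_text(tokens: Iterable[str], lexicon: Set[str], max_distance: int = 1) -> List[str]:
    """Apply the token-level correction to every token of an OCR'd document."""
    return [correct_token(t, lexicon, max_distance) for t in tokens]
```

For example, with the lexicon `{"categorization", "document"}`, the OCR-damaged token `categorizatlon` is one substitution away from `categorization` and would be corrected, while an unrelated token is left unchanged.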
Keywords
Categorization; Effects; Errors; OCR; Text
Controlled Subject
Computer science
File Format
File Size
1617.92 KB
Degree Grantor
University of Nevada, Las Vegas
Language
English
Permissions
If you are the rightful copyright holder of this dissertation or thesis and wish to have the full text removed from Digital Scholarship@UNLV, please submit a request to digitalscholarship@unlv.edu and include clear identification of the work, preferably with URL.
Repository Citation
Mackovski, Lidija K, "Effects of OCR errors on text categorization" (2001). UNLV Retrospective Theses & Dissertations. 1331.
http://dx.doi.org/10.25669/md4f-jxk0
Rights
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/