Award Date
1-1-2000
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
Number of Pages
55
Abstract
The objective of this work is to study different binarization methods and to investigate their effect on the performance of OCR systems. Two sets of document images and four OCR systems were used to study several binarization algorithms. The simplest method, which uses the median gray level (127 of 256 levels) as the global threshold, did not work well unless the scanner characteristics happened to match the nature of the document collection. The best-fixed method uses the global threshold value that minimizes the overall number of errors for a given combination of OCR system and document collection. Both Otsu's global algorithm and Niblack's local algorithm performed, on average, as well as the best-fixed method on the test data sets. The ideal global threshold method selects the best global threshold value for each combination of page and OCR system. Although the ideal method outperformed Niblack's method on average, Niblack's method processed some images better than the ideal method.
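For readers unfamiliar with the two adaptive methods named in the abstract, the following is a minimal sketch of their standard textbook forms. It is not the thesis's implementation. The function names, the list-based image representation, the window size, and the weight k = -0.2 are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def otsu_threshold(pixels, levels=256):
    """Global threshold (Otsu): pick the gray level t that maximizes the
    between-class variance, where pixels <= t form one class."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def niblack_binarize(image, window=15, k=-0.2):
    """Local threshold (Niblack): T(x, y) = mean + k * stddev over a window
    centered on each pixel. `window` and `k` are illustrative assumptions.
    Returns rows of 0 (ink) and 1 (background)."""
    h, w = len(image), len(image[0])
    r = window // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Window is clipped at the image border.
            neigh = [image[j][i]
                     for j in range(max(0, y - r), min(h, y + r + 1))
                     for i in range(max(0, x - r), min(w, x + r + 1))]
            t = mean(neigh) + k * pstdev(neigh)
            row.append(0 if image[y][x] < t else 1)
        out.append(row)
    return out
```

The fixed-threshold baseline described in the abstract corresponds to skipping `otsu_threshold` and comparing every pixel against 127. Otsu's method computes one threshold per page from its histogram, while Niblack's method computes a separate threshold for every pixel, which is one way a local method can handle some pages better than any single global value.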
Keywords
Binarization; Directed; Evaluation; OCR; Techniques
Controlled Subject
Computer science
File Size
2693.12 KB
Degree Grantor
University of Nevada, Las Vegas
Language
English
Permissions
If you are the rightful copyright holder of this dissertation or thesis and wish to have the full text removed from Digital Scholarship@UNLV, please submit a request to digitalscholarship@unlv.edu and include clear identification of the work, preferably with URL.
Repository Citation
Vinas, Diego Antonio, "OCR-directed evaluation of binarization techniques" (2000). UNLV Retrospective Theses & Dissertations. 3137.
http://dx.doi.org/10.25669/9h93-418b
Rights
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/