Master of Science (MS)
The objective of this work is to study different binarization methods and to investigate their effect on the performance of OCR systems. Two sets of document images and four OCR systems were used to study several binarization algorithms. The simplest method, which uses the midpoint of the gray-level range (127 of 256 levels) as the global threshold value, did not work well unless the scanner characteristics happened to match the nature of the document collection. The best-fixed method uses the single global threshold value that minimizes the overall number of errors for a given combination of an OCR system and a collection of documents. Both Otsu's global algorithm and Niblack's local algorithm performed, on average, as well as the best-fixed method on the test data sets. The ideal global threshold method selects the best global threshold value for each combination of a page and an OCR system. Although the ideal method outperformed Niblack's method on average, Niblack's method processed some images better than the ideal method did.
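For readers unfamiliar with the methods named in the abstract, the following is a minimal sketch of a fixed global threshold, Otsu's global threshold, and Niblack's local threshold, using their textbook formulations. It is not the thesis's implementation: the function names, the convention that pixels above the threshold become 1 (white), the window radius, and the weight k = -0.2 are all assumptions chosen for illustration. Images are plain lists of rows of integer gray levels.

```python
from math import sqrt


def fixed_threshold(image, t=127):
    """Binarize with one global threshold: pixels above t become 1, others 0."""
    return [[1 if p > t else 0 for p in row] for row in image]


def otsu_threshold(image, levels=256):
    """Return the global threshold that maximizes between-class variance."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def niblack_binarize(image, radius=1, k=-0.2):
    """Local threshold per pixel: T = mean + k * std over a square window.

    radius and k are illustrative values, not taken from the thesis.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Window is clipped at the image border.
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(window) / len(window)
            std = sqrt(sum((p - mean) ** 2 for p in window) / len(window))
            row.append(1 if image[y][x] > mean + k * std else 0)
        out.append(row)
    return out


# Usage: a two-level test image and a single dark pixel on a bright background.
two_level = [[10, 10, 200, 200], [10, 10, 200, 200]]
t = otsu_threshold(two_level)
binary = fixed_threshold(two_level, t)

spot = [[200, 200, 200], [200, 0, 200], [200, 200, 200]]
local = niblack_binarize(spot)
```

The sketch also shows why the two families can disagree: a global method applies one value of t to the whole page, while Niblack's method recomputes the threshold from the neighborhood of every pixel, so it can adapt to uneven backgrounds that no single global value handles well.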
Binarization; Directed; Evaluation; OCR; Techniques
University of Nevada, Las Vegas
If you are the rightful copyright holder of this dissertation or thesis and wish to have the full text removed from Digital Scholarship@UNLV, please submit a request to email@example.com and include clear identification of the work, preferably with URL.
Vinas, Diego Antonio, "OCR-directed evaluation of binarization techniques" (2000). UNLV Retrospective Theses & Dissertations. 3137.
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/