Award Date

1-1-1996

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

Number of Pages

61

Abstract

A new projection-profile-based skew estimation algorithm was developed. This algorithm extracts fiducial points representing character elements by decoding a JBIG-compressed image without reconstructing the original image. These points are projected along parallel lines into an accumulator array to determine the maximum alignment and the corresponding skew angle. Methods for characterizing the performance of skew estimation techniques were also investigated. In addition to the new skew estimator, three projection-based algorithms were implemented and tested using 1,246 single-column text zones extracted from a sample of 460 page images. Linear regression analyses of the experimental results indicate that our new skew estimation algorithm performs competitively with the other three techniques. These analyses also show that estimators using connected components as a fiducial representation perform worse than the others on the entire set of text zones. All of the algorithms are shown to be sensitive to typographical features. The number of text lines in a zone significantly affects the accuracy of the connected-component-based methods. We also developed two aggregate measures of skew for entire pages. Experiments performed on the 460 unconstrained pages indicate the need to filter non-text features from consideration. Graphic and noise elements from page images contribute a significant amount of the error for the JBIG algorithm.
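
The projection step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the fiducial points are already available as (x, y) pairs (the JBIG decoding stage is not shown), it assumes a fixed list of candidate angles, and it uses the sum of squared accumulator counts as the alignment measure, which is one common choice and may differ from the criterion used in the thesis. The names estimate_skew, angles_deg, and bin_height are hypothetical.

```python
import math


def estimate_skew(points, angles_deg, bin_height=1.0):
    """Return the candidate angle (degrees) with the sharpest projection profile.

    points      -- iterable of (x, y) fiducial points (assumed input)
    angles_deg  -- candidate skew angles to try (assumed search grid)
    bin_height  -- accumulator bin size, in the same units as the points
    """
    best_angle, best_score = None, -1.0
    for angle in angles_deg:
        theta = math.radians(angle)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        bins = {}  # sparse accumulator array: bin index -> point count
        for x, y in points:
            # Offset of the point measured perpendicular to lines at angle
            # theta; points on the same such line share the same offset.
            offset = y * cos_t - x * sin_t
            key = math.floor(offset / bin_height)
            bins[key] = bins.get(key, 0) + 1
        # Assumed alignment measure: the sum of squared counts is largest
        # when points pile up in few bins, i.e. when text lines align.
        score = sum(count * count for count in bins.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

Calling estimate_skew(points, [i * 0.5 for i in range(-10, 11)]) searches from -5 to 5 degrees in half-degree steps. The sign of the returned angle depends on the coordinate convention; image coordinates with y increasing downward flip it relative to the usual mathematical orientation.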

Keywords

Algorithms; Document; Estimation; Image; Skew

Controlled Subject

Computer science

File Format

pdf

File Size

1771.52 KB

Degree Grantor

University of Nevada, Las Vegas

Language

English

Permissions

If you are the rightful copyright holder of this dissertation or thesis and wish to have the full text removed from Digital Scholarship@UNLV, please submit a request to digitalscholarship@unlv.edu and include clear identification of the work, preferably with URL.

Identifier

https://doi.org/10.25669/ao3r-anf1

