Recognition and Identification of Form Document Layouts
International Conference on Information Technology: Coding and Computing
We introduce a hierarchical tree representation for the logical structure of a form document. Because different forms can share the same logical structure, this representation alone is ambiguous. We resolve the ambiguity by augmenting the representation with the physical information of the blocks. A pixel tracing approach extracts the layout structure from form documents and requires less computation than the Hough transform. The algorithm applies to table form documents and has been tested on 50 different table forms.
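The ambiguity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Block` class, both signature functions, and the two example forms are hypothetical, and the choice of a normalized bounding box as the "physical information" is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical tree node: one rectangular block of a form."""
    # Assumed physical information: bounding box (x, y, width, height) in pixels.
    bbox: Tuple[int, int, int, int]
    children: List["Block"] = field(default_factory=list)


def logical_signature(block: Block) -> str:
    """Encode only the tree shape (nesting and child order)."""
    return "(" + "".join(logical_signature(c) for c in block.children) + ")"


def physical_signature(block: Block, root: Block = None):
    """Encode the tree shape plus each box, normalized to the root box."""
    if root is None:
        root = block
    rx, ry, rw, rh = root.bbox
    x, y, w, h = block.bbox
    norm = (
        round((x - rx) / rw, 2),
        round((y - ry) / rh, 2),
        round(w / rw, 2),
        round(h / rh, 2),
    )
    return (norm, tuple(physical_signature(c, root) for c in block.children))


# Form A: the page is split into a left and a right block.
form_a = Block((0, 0, 100, 100), [
    Block((0, 0, 50, 100)),
    Block((50, 0, 50, 100)),
])

# Form B: the page is split into a top and a bottom block.
form_b = Block((0, 0, 100, 100), [
    Block((0, 0, 100, 50)),
    Block((0, 50, 100, 50)),
])

# Same logical structure (a root with two leaf children) ...
same_logical = logical_signature(form_a) == logical_signature(form_b)
# ... but the block geometry tells the two layouts apart.
same_physical = physical_signature(form_a) == physical_signature(form_b)
```

Here both forms reduce to the same tree shape, so a purely logical comparison cannot separate them, while the comparison that also carries block geometry can. Normalizing to the root box is one possible design choice that makes the signature independent of scan resolution.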
Ambiguity problem; Document handling; Electronic records; Form document; Form layout extraction; Frame based representation; Hierarchical tree representation; Hough transforms; Logical structure; Optical character recognition; Optical pattern recognition; Paperwork (Office practice); Pattern recognition; Pixel tracing; Table form document; Tables (Systematic lists)
Computer Engineering | Computer Sciences | Electrical and Computer Engineering | Software Engineering
Use Find in Your Library, contact the author, or request an interlibrary loan to obtain a copy of the item. Publisher policy does not allow archiving the final published version. If a post-print (author's peer-reviewed manuscript) is allowed and available, or publisher policy changes, the item will be deposited.
Recognition and Identification of Form Document Layouts. Presentation at International Conference on Information Technology: Coding and Computing. Available at: http://digitalscholarship.unlv.edu/ece_presentations/31