Recognition and Identification of Form Document Layouts
Meeting Name
International Conference on Information Technology: Coding and Computing
Document Type
Conference Proceeding
Publication Date
April 5, 2004
Abstract
We introduce a hierarchical tree representation for the logical structure of a form document. Because different forms can share the same logical structure, this representation alone can be ambiguous. We propose an improvement that resolves the ambiguity by using the physical information of the blocks. A pixel tracing approach is used to extract form layout structures from form documents; compared with the Hough transform, it requires less computation. The algorithm applies to table form documents and has been tested on 50 different table forms.
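The abstract does not spell out the pixel tracing procedure, so the following is only a minimal sketch of one way such tracing could work, not the authors' algorithm. It assumes a binarized image given as rows of 0/1 values and follows runs of foreground pixels along each row, keeping runs long enough to be table rules. The names `trace_horizontal_lines`, `Segment`, and `min_length` are hypothetical and chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical segment type: (row, start_col, end_col), end inclusive.
Segment = Tuple[int, int, int]


def trace_horizontal_lines(image: List[List[int]], min_length: int = 3) -> List[Segment]:
    """Follow runs of foreground pixels (1) along each row; keep long runs.

    Illustrative assumption: a run of at least `min_length` pixels is
    treated as a candidate horizontal rule of a table form.
    """
    segments: List[Segment] = []
    for row_index, row in enumerate(image):
        col = 0
        while col < len(row):
            if row[col] == 1:
                start = col
                # Trace to the right while pixels stay foreground.
                while col < len(row) and row[col] == 1:
                    col += 1
                if col - start >= min_length:
                    segments.append((row_index, start, col - 1))
            else:
                col += 1
    return segments


# A tiny 3x5 "form": a single rectangular cell drawn with 1-pixel rules.
form = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
segments = trace_horizontal_lines(form)
print(segments)
```

In this sketch each pixel is visited once per scan, with no accumulator voting over line parameters as in a Hough transform, which is one plausible reading of the lower computation cost the abstract mentions. Vertical rules could be traced the same way on the transposed image.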
Keywords
Ambiguity problem; Document handling; Electronic records; Form document; Form layout extraction; Frame based representation; Hierarchical tree representation; Hough transforms; Logical structure; Optical character recognition; Optical pattern recognition; Paperwork (Office practice); Pattern recognition; Pixel tracing; Table form document; Tables (Systematic lists)
Disciplines
Computer Engineering | Computer Sciences | Electrical and Computer Engineering | Software Engineering
Permissions
Use Find in Your Library, contact the author, or use interlibrary loan to obtain a copy of the item. Publisher policy does not allow archiving the final published version. If a post-print (author's peer-reviewed manuscript) is allowed and available, or publisher policy changes, the item will be deposited.
Repository Citation
Luo, K., Latifi, S., Taghva, K., Regentova, E. (2004, April). Recognition and Identification of Form Document Layouts. Presentation at International Conference on Information Technology: Coding and Computing.
Available at: https://digitalscholarship.unlv.edu/ece_presentations/31