Recognition and Identification of Form Document Layouts

Meeting name

International Conference on Information Technology: Coding and Computing

Document Type

Conference Proceeding

Publication Date

4-5-2004

Abstract

We introduce a hierarchical tree representation for the logical structure of a form document. Because different forms can share the same logical structure, this representation alone is ambiguous. We resolve the ambiguity by using the physical information of the blocks. A pixel tracing approach extracts the form layout structure from form documents; it requires less computation than the Hough transform. The algorithm applies to table form documents and has been tested on 50 different table forms.
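The ambiguity described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `Block` class, the `(x, y, width, height)` box convention, and both signature functions are hypothetical names chosen for illustration. The sketch assumes that "physical information" means each block's bounding box, which the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical convention: (x, y, width, height) in arbitrary units.
Box = Tuple[int, int, int, int]


@dataclass
class Block:
    """A node in an illustrative form tree: a box plus nested blocks."""
    box: Box
    children: List["Block"] = field(default_factory=list)


def logical_signature(block: Block) -> tuple:
    """Describe only the parent/child nesting, ignoring geometry."""
    return tuple(logical_signature(child) for child in block.children)


def physical_signature(block: Block) -> tuple:
    """Describe the nesting together with each block's bounding box."""
    return (block.box, tuple(physical_signature(c) for c in block.children))


# Form A: one frame split into left and right cells.
form_a = Block((0, 0, 100, 50), [
    Block((0, 0, 50, 50)),
    Block((50, 0, 50, 50)),
])

# Form B: the same frame split into top and bottom cells.
form_b = Block((0, 0, 100, 50), [
    Block((0, 0, 100, 25)),
    Block((0, 25, 100, 25)),
])

# Both trees are "a root with two leaf children", so the logical
# structure alone cannot tell the two forms apart.
same_logical = logical_signature(form_a) == logical_signature(form_b)

# Including the block geometry separates them.
same_physical = physical_signature(form_a) == physical_signature(form_b)
```

Here `same_logical` is `True` and `same_physical` is `False`: the two layouts collide when only nesting is compared and are distinguished once block geometry is included.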

Keywords

Ambiguity problem; Document handling; Electronic records; Form document; Form layout extraction; Frame based representation; Hierarchical tree representation; Hough transforms; Logical structure; Optical character recognition; Optical pattern recognition; Paperwork (Office practice); Pattern recognition; Pixel tracing; Table form document; Tables (Systematic lists)

Disciplines

Computer Engineering | Computer Sciences | Electrical and Computer Engineering | Software Engineering

Permissions

Use Find in Your Library, contact the author, or use interlibrary loan to obtain a copy of the item. Publisher policy does not allow archiving the final published version. If a post-print (author's peer-reviewed manuscript) is allowed and available, or publisher policy changes, the item will be deposited.
