OCRSpell: An Interactive Spelling Correction System for OCR Errors in Text
Document Type
Article
Publication Date
3-2001
Publication Title
International Journal on Document Analysis and Recognition
Volume
3
Issue
3
First page number
125
Last page number
137
Abstract
In this paper, we describe a spelling correction system designed specifically for OCR-generated text that selects candidate words through the use of information gathered from multiple knowledge sources. This system for text correction is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. Our statistically based, Bayesian system incorporates a learning feature that collects confusion information at the collection and document levels. An evaluation of the new system is presented as well.
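As a rough illustration of how device mappings and a Bayesian score might combine to rank correction candidates, a minimal sketch follows. It is not the OCRSpell implementation: the mapping tables, probabilities, toy lexicon, and the function names `candidates`, `rank`, and `learn` are all hypothetical, and it reverses only one confusion per token, with no approximate string matching or n-gram analysis.

```python
from math import log

# Hypothetical device mappings: (intended string, OCR output) -> assumed probability.
STATIC_MAP = {("m", "rn"): 0.05, ("l", "1"): 0.04, ("o", "0"): 0.03}
dynamic_map = {}  # confusions learned while correcting a document (assumption)

# Toy lexicon with made-up frequency counts, standing in for a word prior.
LEXICON = {"modern": 50, "corner": 40, "hello": 30}


def candidates(token):
    """Return lexicon words reachable by reversing one known confusion."""
    mappings = {**STATIC_MAP, **dynamic_map}
    found = {}
    for (intended, seen), prob in mappings.items():
        start = token.find(seen)
        while start != -1:
            word = token[:start] + intended + token[start + len(seen):]
            if word in LEXICON:
                # Keep the most probable confusion that yields this word.
                found[word] = max(found.get(word, 0.0), prob)
            start = token.find(seen, start + 1)
    return found


def rank(token):
    """Order candidates by log P(confusion) + log P(word), best first."""
    total = sum(LEXICON.values())
    scored = [(log(p) + log(LEXICON[w] / total), w)
              for w, p in candidates(token).items()]
    return [w for _, w in sorted(scored, reverse=True)]


def learn(intended, seen, prob=0.02):
    """Record a confusion, e.g. after a user accepts a correction."""
    dynamic_map[(intended, seen)] = prob
```

In this sketch, `rank("rnodern")` proposes `modern` from the static table alone, while `comer` yields no candidate until `learn("rn", "m")` adds the reverse confusion to the dynamic table.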
Keywords
Error correction; Information retrieval; OCR-Spell checkers; Scanning
Disciplines
Electrical and Computer Engineering | Engineering
Language
English
Permissions
Use Find in Your Library, contact the author, or request an interlibrary loan to obtain a copy of the item. Publisher policy does not allow archiving the final published version. If a post-print (author's peer-reviewed manuscript) is allowed and available, or publisher policy changes, the item will be deposited.
Repository Citation
Taghva, K., Stofsky, E. (2001). OCRSpell: An Interactive Spelling Correction System for OCR Errors in Text. International Journal on Document Analysis and Recognition, 3(3), 125-137.