Identification of Sensitive Unclassified Information

Editors

Shlomo Argamon; Newton Howard

Document Type

Chapter

Publication Date

2009

Publication Title

Computational Methods for Counterterrorism

Publisher

Springer Berlin Heidelberg

First page number:

89

Last page number:

108

Abstract

Sensitive Unclassified information is defined as any unclassified information that may cause adverse consequences against the government facilities. In this chapter, we explore the use of categorization techniques and information extraction to discover this kind of information in scanned documents.

We show here that the combined use of a K-Dependence Bayesian categorization engine and a semi-automated review application reduce by nearly 95% the number of man hours required to redact sensitive unclassified information. We also discuss and provide statistics on how OCR errors can affect the information extraction tasks.

Disciplines

Electrical and Computer Engineering | Engineering

Language

English

Permissions

Use Find in Your Library, contact the author, or interlibrary loan to garner a copy of the item. Publisher policy does not allow archiving the final published version. If a post-print (author's peer-reviewed manuscript) is allowed and available, or publisher policy changes, the item will be deposited.

UNLV article access

Share

COinS