Name Identification and Extraction with Formal Concept Analysis

Document Type

Article

Publication Date

1-1-2017

Publication Title

International Journal of Machine Learning and Cybernetics

Volume

8

Issue

1

First page number

171

Last page number

178

Abstract

One application of formal concept analysis (FCA) is extracting structured information from textual documents. Typically, one defines a set of attributes that characterize the objects of interest, and standard FCA algorithms then extract the objects so defined. In this paper, we describe how FCA identifies and extracts personal names as units of thought, similar to the decoding of text sequences by the Viterbi algorithm as used with hidden Markov models. We further show how FCA mimics the thought process that goes into a rule-based information extraction system. We then observe that the formal approach of FCA, combined with established computational techniques such as the bottom-up intersection algorithm, avoids the difficulties associated with hand coding and maintaining rule-based systems. © 2016, Springer-Verlag Berlin Heidelberg.
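
As an illustration of the idea summarized above, the following is a minimal Python sketch of how formal concepts can be derived from a small object-attribute table by closing the object intents under intersection. It is not the authors' implementation: the tokens, the attribute names (`capitalized`, `title`, `in_name_list`), and the function names are all hypothetical, and the naive fixed-point loop only stands in for the more efficient bottom-up intersection algorithm the abstract mentions.

```python
from itertools import combinations

# Hypothetical toy formal context: tokens are the objects,
# hand-picked lexical features are the attributes.
context = {
    "Dr.": {"capitalized", "title"},
    "Alice": {"capitalized", "in_name_list"},
    "Smith": {"capitalized", "in_name_list"},
    "visited": set(),
    "Paris": {"capitalized"},
}


def concept_intents(context):
    """Return every concept intent: the full attribute set plus all
    intersections of object intents (naive fixed-point closure)."""
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs} | {frozenset(attrs) for attrs in context.values()}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so new intents can be added safely.
        for a, b in combinations(list(intents), 2):
            common = a & b
            if common not in intents:
                intents.add(common)
                changed = True
    return intents


def extent(context, intent):
    """Return all objects that carry every attribute in the intent."""
    return frozenset(obj for obj, attrs in context.items() if intent <= attrs)


def concepts(context):
    """Pair each intent with its extent to form the formal concepts."""
    return {(extent(context, i), i) for i in concept_intents(context)}


if __name__ == "__main__":
    for ext, intent in sorted(concepts(context), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(intent))
```

In this toy context, the concept whose intent is `capitalized` and `in_name_list` groups the tokens `Alice` and `Smith`, which is the kind of cluster a name extractor would look for. A real system would need a far richer attribute set and context window than this sketch assumes.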

Keywords

Data mining; Information extraction; Big data; Entity extraction; Data science; Hidden Markov models; Learning algorithms

Language

English
