Name Identification and Extraction with Formal Concept Analysis
International Journal of Machine Learning and Cybernetics
One application of formal concept analysis (FCA) is the extraction of structured information from textual documents. Typically, one defines a set of attributes that characterize the objects of interest; standard FCA algorithms then extract the objects so defined. In this paper, we describe how FCA identifies and extracts personal names as units of thought, in a manner similar to the decoding of text sequences by the Viterbi algorithm as used with hidden Markov models. We further show how FCA mimics the thought process that goes into a rule-based information extraction system. We then observe that the formal approach of FCA, combined with established computational techniques such as the bottom-up intersection algorithm, avoids the difficulties associated with hand-coding and maintaining rule-based systems. © 2016, Springer-Verlag Berlin Heidelberg.
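As an illustration of the idea in the abstract, the following is a minimal sketch of how formal concepts can be derived from a token/attribute table by repeated intersection of attribute sets. The toy context, the attribute names (`capitalized`, `title`, `given_name`, `surname`, `lowercase`), and the function names are all hypothetical and chosen only for illustration; the procedure shown is a simple incremental intersection method and is not necessarily the algorithm used in the paper.

```python
# Hypothetical toy formal context: objects are tokens, attributes are
# hand-picked features. None of these names come from the paper.
context = {
    "Dr.":    {"capitalized", "title"},
    "John":   {"capitalized", "given_name"},
    "Smith":  {"capitalized", "surname"},
    "walked": {"lowercase"},
}


def intents_by_intersection(context):
    """Return every concept intent of the context.

    Start from the full attribute set, then for each object intersect its
    attribute set with every intent found so far. The result is the set of
    all attribute sets that are closed under intersection of object rows.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}
    for attrs in context.values():
        row = frozenset(attrs)
        intents |= {row & intent for intent in intents}
    return intents


def extent(context, intent):
    """Return the objects that carry every attribute in the intent."""
    return {obj for obj, attrs in context.items() if intent <= attrs}


def concepts(context):
    """Pair each intent with its extent to form (extent, intent) concepts."""
    return [(extent(context, i), i) for i in intents_by_intersection(context)]


if __name__ == "__main__":
    # Print concepts from the most specific intent to the most general.
    for ext, intent in sorted(concepts(context), key=lambda c: -len(c[1])):
        print(sorted(ext), sorted(intent))
```

In this toy context, the concept whose intent is just `capitalized` groups the three tokens that could belong to a personal name, which is the sense in which attributes defined up front let the lattice construction pick out candidate name units without hand-written extraction rules.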
Data mining; Information extraction; Big data; Entity extraction; Data science; Hidden Markov models; Learning algorithms
Name Identification and Extraction with Formal Concept Analysis.
International Journal of Machine Learning and Cybernetics, 8(1),