Master of Science in Computer Science
Technical papers generally include a list of bibliographic citations to support their arguments and the merits of the approach presented. Each citation is made up of parts such as author, journal, volume, and book. This thesis addresses the problem of extracting citations from a written document and separating each one into its parts. We use an Information Extraction (IE) technique based on a Hidden Markov Model (HMM) to solve this problem. The solution consists of the design of an HMM, the training of the HMM with tagged data, and an implementation of the Forward Chaining algorithm for extracting citation parts. In our test on a collection of 150 citations, the approach achieved a recall of 0.8 and a precision of 0.81.
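As a rough illustration of the pipeline described in the abstract (design an HMM, train it on tagged data, then label the tokens of a citation), the following minimal Python sketch may help. It is not the thesis implementation. The tag set, the coarse token features, the add-one smoothing, and the two toy training citations are all invented for this example, and decoding uses the standard Viterbi algorithm as a stand-in, since the abstract does not detail the Forward Chaining algorithm it refers to.

```python
import math
from collections import Counter, defaultdict

# Hypothetical observation alphabet: coarse token shapes, not the thesis's features.
SYMBOLS = ["YEAR", "NUM", "INITIAL", "CAP", "LOWER"]

def features(token):
    """Map a raw token to one of the coarse symbols above (assumed feature set)."""
    t = token.strip(".,;:()\"")
    if t.isdigit():
        return "YEAR" if len(t) == 4 else "NUM"
    if len(t) == 1 and t.isalpha() and t.isupper():
        return "INITIAL"
    return "CAP" if t[:1].isupper() else "LOWER"

def train(tagged_citations, smoothing=1.0):
    """Estimate smoothed log-probabilities from (token, tag) sequences."""
    states = sorted({tag for cit in tagged_citations for _, tag in cit})
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for cit in tagged_citations:
        prev = None
        for tok, tag in cit:
            if prev is None:
                start[tag] += 1          # first tag of the citation
            else:
                trans[prev][tag] += 1    # tag-to-tag transition
            emit[tag][features(tok)] += 1
            prev = tag

    def log_dist(counts, keys):
        total = sum(counts.values()) + smoothing * len(keys)
        return {k: math.log((counts[k] + smoothing) / total) for k in keys}

    log_start = log_dist(start, states)
    log_trans = {s: log_dist(trans[s], states) for s in states}
    log_emit = {s: log_dist(emit[s], SYMBOLS) for s in states}
    return states, log_start, log_trans, log_emit

def viterbi(tokens, model):
    """Return the most likely tag sequence for the tokens (Viterbi decoding)."""
    if not tokens:
        return []
    states, log_start, log_trans, log_emit = model
    obs = [features(t) for t in tokens]
    score = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    back = []
    for o in obs[1:]:
        prev_score, score, pointers = score, {}, {}
        for s in states:
            best = max(states, key=lambda p: prev_score[p] + log_trans[p][s])
            score[s] = prev_score[best] + log_trans[best][s] + log_emit[s][o]
            pointers[s] = best
        back.append(pointers)
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):      # follow back-pointers to the start
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy tagged data, invented for illustration only.
TRAINING = [
    [("Smith,", "author"), ("J.", "author"), ("Parsing", "title"),
     ("citations.", "title"), ("Journal", "journal"), ("of", "journal"),
     ("Examples,", "journal"), ("12,", "volume"), ("2001.", "year")],
    [("Doe,", "author"), ("A.", "author"), ("Hidden", "title"),
     ("models.", "title"), ("Example", "journal"), ("Letters,", "journal"),
     ("7,", "volume"), ("1999.", "year")],
]

model = train(TRAINING)
tokens = "Roe, B. Tagging references. Sample Journal, 3, 2005.".split()
labels = viterbi(tokens, model)
for tok, tag in zip(tokens, labels):
    print(f"{tok}\t{tag}")
```

With only two training citations the predicted tags are not meaningful; the sketch is intended to show the shape of the train-then-decode workflow, in which each token of an unseen citation receives one part label.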
Hidden; Markov; Model; References; Standardization
University of Nevada, Las Vegas
Sambamurthy, Swamynathan, "Standardization of references using Hidden Markov Model" (2008). UNLV Retrospective Theses & Dissertations. 2384.