Award Date
5-2009
Degree Type
Thesis
Degree Name
Master of Science in Computer Science
Department
Computer Science
First Committee Member
Kazem Taghva, Chair
Second Committee Member
Ajoy K. Datta
Third Committee Member
Laxmi P. Gewali
Graduate Faculty Representative
Muthukumar Venkatesan
Number of Pages
71
Abstract
Automatic text categorization is the task of assigning an electronic document to one or more categories based on its contents. Many techniques are known to solve categorization problems efficiently. Typically, these techniques fall into two distinct methodologies: logic-based or probabilistic. In recent years, many researchers have tried approaches that are a hybrid of these two methodologies.
In this thesis, we deal with document categorization using the Apriori algorithm. The Apriori algorithm was initially developed for data mining and basket analysis applications in relational databases. Although the technique is logic-based, it also relies on the statistical characteristics of the data. As part of this work, we will implement all the tools necessary to carry out automatic categorization using the Apriori algorithm. We will also report on the categorization effectiveness by applying this technique to standard collections.
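For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of Apriori frequent-itemset mining, treating each document as a set of terms. It is an illustration only, not the implementation from the thesis; the function name `apriori`, the `min_support` parameter (an absolute document count), and the sample terms are all assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    transactions: iterable of term collections (one per document).
    min_support: minimum number of documents an itemset must appear in
                 (hypothetical parameter; an absolute count, not a ratio).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual terms.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count support of the surviving candidates against all documents.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Illustrative toy "documents" (made-up terms, not data from the thesis).
docs = [
    {"stock", "market", "trade"},
    {"stock", "market"},
    {"stock", "trade"},
    {"market", "rain"},
]
for itemset, support in sorted(apriori(docs, 2).items(),
                               key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In a categorization setting, frequent term sets mined per category could serve as the basis for classification rules; how the thesis actually builds and applies such rules is not stated in this record.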
Keywords
Bayesian statistical decision theory; Computer algorithms; Data mining; Machine learning
Disciplines
Computer Engineering | Systems and Communications
File Format
Degree Grantor
University of Nevada, Las Vegas
Language
English
Repository Citation
Madadi, Prathima, "Text Categorization Based on Apriori Algorithm's Frequent Itemsets" (2009). UNLV Theses, Dissertations, Professional Papers, and Capstones. 1191.
http://dx.doi.org/10.34917/2649978
Rights
IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/
Comments
Signatures have been redacted for privacy and security measures.