Master of Science in Computer Science
First Committee Member
Kazem Taghva, Chair
Second Committee Member
Ajoy K. Datta
Third Committee Member
Laxmi P. Gewali
Graduate Faculty Representative
Number of Pages
Automatic text categorization is the task of assigning an electronic document to one or more categories based on its contents. There are many known techniques for solving categorization problems efficiently. Typically, these techniques fall into two distinct methodologies: logic-based and probabilistic. In recent years, many researchers have tried approaches that are a hybrid of these two methodologies.
In this thesis, we deal with document categorization using the Apriori algorithm. The Apriori algorithm was initially developed for data mining and basket analysis applications in relational databases. Although the technique is logic-based, it also relies on the statistical characteristics of the data. As part of this work, we will implement all the tools necessary to carry out automatic categorization using the Apriori algorithm. We will also report on the categorization effectiveness obtained by applying this technique to standard collections.
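As a rough illustration of the frequent-itemset step that the Apriori algorithm performs, the following minimal sketch treats each document as a "basket" of terms. It is not the implementation developed in the thesis: the function name `frequent_itemsets`, the `min_support` parameter (an absolute document count), and the toy documents are all assumptions made for this example, and the sketch stops at mining itemsets without assigning categories.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in at
    least `min_support` transactions (hypothetical helper, illustration only)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count support of the surviving candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


# Toy "documents", each reduced to its set of terms (assumed data).
docs = [
    {"stock", "market", "trade"},
    {"stock", "market"},
    {"stock", "trade"},
    {"goal", "match"},
]
itemsets = frequent_itemsets(docs, min_support=2)
```

With these toy documents and a minimum support of two documents, the frequent itemsets are the single terms "stock", "market", and "trade" plus the pairs {"stock", "market"} and {"stock", "trade"}. The pair {"market", "trade"} appears in only one document, so the three-term candidate is pruned without being counted.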
Bayesian statistical decision theory; Computer algorithms; Data mining; Machine learning
Computer Engineering | Systems and Communications
Madadi, Prathima, "Text Categorization Based on Apriori Algorithm's Frequent Itemsets" (2009). UNLV Theses, Dissertations, Professional Papers, and Capstones. 1191.