Master of Science in Computer Science
First Committee Member
Kazem Taghva, Chair
Second Committee Member
Ajoy K. Datta
Third Committee Member
Laxmi P. Gewali
Graduate Faculty Representative
Number of Pages
Automated text categorization is a supervised learning task in which category labels are assigned to new documents based on the likelihood suggested by a training set of labeled documents. Two examples of methods for text categorization are Naive Bayes and K-Nearest Neighbor.
In this thesis, we implement two categorization engines, one based on Naive Bayes and one based on K-Nearest Neighbor. We then compare the effectiveness of the two engines by calculating standard precision and recall on a collection of documents. We also report on the time efficiency of the two engines.
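As a rough illustration of the comparison described in the abstract, the following minimal sketch trains a multinomial Naive Bayes classifier and a K-Nearest Neighbor classifier on a toy corpus and computes per-category precision and recall. It is not the thesis implementation. The whitespace tokenizer, the add-one (Laplace) smoothing, the cosine similarity over raw term counts, the value k = 3, and the six training sentences are all assumptions chosen for brevity.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()

def train_naive_bayes(docs):
    """docs is a list of (text, label) pairs; returns counts for multinomial NB."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict_naive_bayes(model, text):
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior plus log likelihoods with add-one smoothing.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words unseen in training are skipped
                count = word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_knn(docs, text, k=3):
    # Rank training documents by cosine similarity, then take a majority vote.
    query = Counter(tokenize(text))
    scored = sorted(
        ((cosine(query, Counter(tokenize(t))), label) for t, label in docs),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

def precision_recall(true_labels, predicted_labels, category):
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical toy corpus, for illustration only.
train = [
    ("the team won the match", "sports"),
    ("a late goal won the game", "sports"),
    ("the striker scored a goal", "sports"),
    ("the senate passed the bill", "politics"),
    ("the president signed the bill", "politics"),
    ("voters elected a new senate", "politics"),
]
test = [
    ("the team scored a goal", "sports"),
    ("the president addressed the senate", "politics"),
]

model = train_naive_bayes(train)
truth = [label for _, label in test]
nb_predictions = [predict_naive_bayes(model, text) for text, _ in test]
knn_predictions = [predict_knn(train, text, k=3) for text, _ in test]

for name, predictions in (("Naive Bayes", nb_predictions), ("KNN", knn_predictions)):
    for category in ("sports", "politics"):
        p, r = precision_recall(truth, predictions, category)
        print(f"{name:12s} {category:9s} precision={p:.2f} recall={r:.2f}")
```

Precision here is the fraction of documents assigned to a category that truly belong to it, and recall is the fraction of documents belonging to a category that were assigned to it. A time-efficiency comparison could be sketched by wrapping the prediction loops in `time.perf_counter()` calls, though timings on a corpus this small would not be meaningful.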
Automatic classification; Automatic indexing; Information Retrieval; Machine learning; Text processing (Computer science)
Computer Sciences | Databases and Information Systems | Library and Information Science
Karamcheti, Aditya Chainulu, "A Comparative study on text categorization" (2010). UNLV Theses, Dissertations, Professional Papers, and Capstones. 322.