Award Date

5-2010

Degree Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science

First Committee Member

Kazem Taghva, Chair

Second Committee Member

Ajoy K. Datta

Third Committee Member

Laxmi P. Gewali

Graduate Faculty Representative

Muthukumar Venkatesan

Number of Pages

Abstract

Automated text categorization is a supervised learning task, defined as assigning category labels to new documents based on the likelihood suggested by a training set of labeled documents. Two examples of text categorization methods are Naive Bayes and K-Nearest Neighbor.

In this thesis, we implement two categorization engines, one based on Naive Bayes and one on K-Nearest Neighbor. We then compare the effectiveness of the two engines by calculating standard precision and recall on a collection of documents. We also report on the time efficiency of the two engines.
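
As an illustration of the evaluation measures named in the abstract, the following is a minimal sketch of per-category precision and recall. It is not taken from the thesis: the function name `precision_recall`, the category names, and the sample labels are hypothetical and chosen only for the example.

```python
def precision_recall(true_labels, predicted_labels, category):
    """Return (precision, recall) for a single category.

    precision = TP / (TP + FP): share of documents assigned to the
                category that really belong to it.
    recall    = TP / (TP + FN): share of documents belonging to the
                category that were assigned to it.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == category and p == category)
    fp = sum(1 for t, p in pairs if t != category and p == category)
    fn = sum(1 for t, p in pairs if t == category and p != category)
    # Guard against empty denominators (category never predicted or never present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for five documents (illustrative only).
true_labels = ["sports", "sports", "politics", "politics", "sports"]
predicted_labels = ["sports", "politics", "politics", "sports", "sports"]

p, r = precision_recall(true_labels, predicted_labels, "sports")
print(f"precision={p:.3f} recall={r:.3f}")
```

The same function could be applied to the output of either engine, so that both are scored on an identical set of test documents.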

Keywords

Automatic classification; Automatic indexing; Information Retrieval; Machine learning; Text processing (Computer science)

Disciplines

Computer Sciences | Databases and Information Systems | Library and Information Science

File Format

pdf

Degree Grantor

University of Nevada, Las Vegas

Language

English

Repository Citation

Karamcheti, Aditya Chainulu, "A Comparative study on text categorization" (2010). UNLV Theses, Dissertations, Professional Papers, and Capstones. 322.
http://dx.doi.org/10.34917/1563704

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

Included in

Databases and Information Systems Commons, Library and Information Science Commons
