Award Date

2009

Degree Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science

Advisor 1

Kazem Taghva

First Committee Member

Ajoy K. Datta

Second Committee Member

Laxmi P. Gewali

Graduate Faculty Representative

Muthukumar Venkatesan

Number of Pages

Abstract

The result set a search engine returns in response to a user query is typically very large, and it is usually the user's responsibility to browse that set and identify the relevant documents. Many tools have been developed to help with this task. One such tool is clustering, in which closely related documents are grouped based on their contents. If one document in a cluster turns out to be relevant, the rest of the documents in that cluster are likely to be relevant as well. Grouping closely related documents and displaying them together therefore makes it easier for a user to sift through the result set and find the related documents.

This thesis deals with the computational overhead that arises when document collections grow very large. It surveys clustering methods that use memory efficiently and overcome the computational problems that large datasets introduce.

Keywords

Clustering; Datasets; DBSCAN; Large datasets; Memory availability; Tree-based data structures

Disciplines

Databases and Information Systems

File Format

pdf

Degree Grantor

University of Nevada, Las Vegas

Language

English

Repository Citation

Nemala, Vasanth, "Efficient clustering techniques for managing large datasets" (2009). UNLV Theses, Dissertations, Professional Papers, and Capstones. 72.
http://dx.doi.org/10.34870/1374219

Rights

IN COPYRIGHT. For more information about this rights statement, please visit http://rightsstatements.org/vocab/InC/1.0/

Included in

Databases and Information Systems Commons

Digital Scholarship@UNLV

UNLV Theses, Dissertations, Professional Papers, and Capstones

Efficient clustering techniques for managing large datasets
