Computer Science Faculty Research

Finding Top-k Dominance on Incomplete Big Data Using Map Reduce Framework

Payam Ezatpoor, University of Nevada, Las Vegas
Justin Zhan, University of Nevada, Las Vegas
Jimmy Ming-Tai Wu, University of Nevada, Las Vegas
Carter Chiu, University of Nevada, Las Vegas

Document Type

Article

Publication Date

1-23-2018

Publication Title

IEEE Access

Volume

6

First page number:

7872

Last page number:

7887

Abstract

Incomplete data is a major kind of multi-dimensional dataset that has randomly distributed missing values in its dimensions. Retrieving information from such a dataset becomes very difficult as it grows large, and finding its top-k dominant values is particularly challenging. Some algorithms exist to improve this process, but most are efficient only on small incomplete datasets. One algorithm that makes the top-k dominating (TKD) query practical is the Bitmap Index Guided (BIG) algorithm. It greatly improves performance on incomplete data, but it is not designed to find top-k dominant values in incomplete big data. Several other algorithms have been proposed to answer the TKD query, such as the Skyband Based and Upper Bound Based algorithms, but their performance is also questionable. These earlier algorithms were among the first attempts to apply the TKD query to incomplete data; however, they suffered from weak performance. This paper proposes the MapReduced Enhanced Bitmap Index Guided Algorithm (MRBIG) to address these issues. MRBIG uses the MapReduce framework to improve the performance of top-k dominance queries on large incomplete datasets. The framework divides the work among multiple computing nodes, which operate independently and simultaneously to find the result. The method achieves up to two times faster processing in answering the TKD query than previously proposed algorithms.
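
To make the TKD query concrete, the following is a minimal brute-force sketch in Python, not the MRBIG or BIG algorithm from the paper. It assumes the dominance definition commonly used for incomplete data (an object dominates another if it is no worse on every dimension observed in both and strictly better on at least one), assumes smaller values are better, and marks missing values with None. The function names are hypothetical, and the map and reduce steps run sequentially here as a stand-in for parallel computing nodes.

```python
from heapq import nlargest
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]  # None marks a missing value (assumption)


def dominates(a: Row, b: Row) -> bool:
    """True if a dominates b on the dimensions observed in both rows.

    Assumes smaller values are better.
    """
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip dimensions missing in either row
        if x > y:
            return False
        if x < y:
            strictly_better = True
    return strictly_better


def map_scores(partition: List[int], data: List[Row]) -> List[Tuple[int, int]]:
    """Map step: for each object id in one partition, count the objects it dominates."""
    return [(sum(dominates(data[i], other) for other in data), i) for i in partition]


def reduce_top_k(mapped: List[List[Tuple[int, int]]], k: int) -> List[Tuple[int, int]]:
    """Reduce step: merge per-partition (score, id) pairs and keep the k highest scores."""
    merged = [pair for part in mapped for pair in part]
    # Ties on score are broken in favor of the smaller object id.
    return nlargest(k, merged, key=lambda p: (p[0], -p[1]))


def top_k_dominating(data: List[Row], k: int, n_partitions: int = 2) -> List[Tuple[int, int]]:
    """Return the k (score, id) pairs with the highest dominance scores."""
    ids = list(range(len(data)))
    partitions = [ids[p::n_partitions] for p in range(n_partitions)]
    # Sequential stand-in for mappers running on separate nodes.
    mapped = [map_scores(part, data) for part in partitions]
    return reduce_top_k(mapped, k)
```

Each mapper in this sketch still scans the full dataset, so it only illustrates how the scoring work can be split across nodes; it does not reproduce the bitmap indexing that the abstract credits for the performance gain.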

Keywords

Top-k dominance; Incomplete data; Big data; Cloud computing; Data analysis; Upper bound

Disciplines

Computer Sciences

Language

English

Repository Citation

Ezatpoor, P., Zhan, J., Wu, J. M., Chiu, C. (2018). Finding Top-k Dominance on Incomplete Big Data Using Map Reduce Framework. IEEE Access, 6, 7872-7887.
http://dx.doi.org/10.1109/ACCESS.2018.2797048

Digital Scholarship@UNLV
