Computer Science Faculty Research

An Efficient Algorithm to Mine High Average-utility Itemsets

Jerry Chun-Wei Lin, Harbin Institute of Technology
Ting Li, Harbin Institute of Technology
Philippe Fournier-Viger, Harbin Institute of Technology
Tzung-Pei Hong, University of Kaohsiung
Justin Zhan, University of Nevada, Las Vegas
Miroslav Voznak, University of Ostrava

Document Type

Article

Publication Date

1-1-2016

Publication Title

Advanced Engineering Informatics

Volume

30

Issue

2

First page number:

233

Last page number:

243

Abstract

With the ever-increasing number of applications of data mining, high-utility itemset mining (HUIM) has become a critical issue in recent decades. In traditional HUIM, the utility of an itemset is defined as the sum of the utilities of its items in the transactions where it appears. An important problem with this definition is that it does not take itemset length into account. Because the utility of a larger itemset is generally greater than the utility of a smaller itemset, traditional HUIM algorithms tend to be biased toward finding large itemsets. Thus, this definition is not a fair measurement of utility. To provide a better assessment of each itemset's utility, the task of high average-utility itemset mining (HAUIM) was proposed. It introduces the average-utility measure, which considers both the length of itemsets and their utilities, and is thus more appropriate in real-world situations. Several algorithms have been designed for this task. They can generally be categorized as either level-wise or pattern-growth approaches. Both, however, require a considerable amount of computation to find the actual high average-utility itemsets (HAUIs). In this paper, we present an average-utility (AU)-list structure to discover the HAUIs more efficiently. A depth-first search algorithm named HAUI-Miner is proposed to explore the search space without candidate generation, and an efficient pruning strategy is developed to reduce the search space and speed up the mining process. Extensive experiments are conducted to compare the performance of HAUI-Miner with state-of-the-art HAUIM algorithms in terms of runtime, number of determining nodes, memory usage, and scalability. © 2016 Elsevier Ltd. All rights reserved.
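
The contrast between the two measures described in the abstract can be illustrated with a minimal sketch. It is not the HAUI-Miner algorithm or the AU-list structure. It assumes the commonly used HAUIM definition, in which an itemset's average utility is its utility divided by the number of items it contains, and that an item's utility in a transaction is its quantity times a unit profit. The transactions, profit values, and function names below are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical toy database: each transaction maps an item to its purchased quantity.
TRANSACTIONS: List[Dict[str, int]] = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
]

# Hypothetical unit profit of each item.
PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset: Iterable[str],
            transactions: List[Dict[str, int]],
            profit: Dict[str, int]) -> int:
    """Sum of the utilities of the itemset's items over every
    transaction that contains the whole itemset."""
    items = set(itemset)
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in items):
            total += sum(transaction[item] * profit[item] for item in items)
    return total


def average_utility(itemset: Iterable[str],
                    transactions: List[Dict[str, int]],
                    profit: Dict[str, int]) -> float:
    """Utility divided by itemset length (assumed standard HAUIM measure)."""
    items = set(itemset)
    if not items:
        raise ValueError("itemset must contain at least one item")
    return utility(items, transactions, profit) / len(items)


if __name__ == "__main__":
    for candidate in (["a"], ["a", "c"], ["a", "b", "c"]):
        print(candidate,
              utility(candidate, TRANSACTIONS, PROFIT),
              average_utility(candidate, TRANSACTIONS, PROFIT))
```

On this toy data, extending {a} to {a, c} raises the plain utility from 15 to 19, while the average utility falls from 15 to 9.5, which is the length bias the average-utility measure is meant to counter.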

Keywords

Data mining; HAUIM; High average-utility itemsets; List structure

Language

English

Repository Citation

Lin, J. C., Li, T., Fournier-Viger, P., Hong, T., Zhan, J., Voznak, M. (2016). An Efficient Algorithm to Mine High Average-utility Itemsets. Advanced Engineering Informatics, 30(2), 233-243.
http://dx.doi.org/10.1016/j.aei.2016.04.002
