Fast algorithms for hiding sensitive high-utility itemsets in privacy-preserving utility mining

High-utility itemset mining (HUIM) is an extension of frequent itemset mining that discovers itemsets yielding a high profit in transaction databases, called high-utility itemsets (HUIs). In recent years, a major issue has arisen: data published or shared by organizations may lead to privacy threats, since sensitive or confidential information may be uncovered by data mining techniques. To address this issue, techniques for privacy-preserving data mining (PPDM) have been proposed. Recently, privacy-preserving utility mining (PPUM) has become an important topic in PPDM. PPUM is the process of hiding the sensitive HUIs (SHUIs) appearing in a database, such that the resulting sanitized database does not reveal these itemsets. In the past, the HHUIF and MSICF algorithms were proposed to hide SHUIs, and they are the state-of-the-art approaches for PPUM. In this paper, two novel algorithms, namely Maximum Sensitive Utility-MAximum item Utility (MSU-MAU) and Maximum Sensitive Utility-MInimum item Utility (MSU-MIU), are proposed to minimize the side effects of the sanitization process for hiding SHUIs. The proposed algorithms are designed to efficiently delete SHUIs or decrease their utilities using the concepts of maximum and minimum utility. A projection mechanism is also adopted in both algorithms to speed up the sanitization process. In addition, because the evaluation criteria proposed for PPDM are insufficient and inappropriate for evaluating the sanitization performed by PPUM algorithms, this paper introduces three similarity measures that assess, respectively, the database structure, database utility, and item utility of a sanitized database. These criteria are proposed as a new evaluation standard for PPUM. © 2016 Elsevier Ltd
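
To make the notions of itemset utility and hiding concrete, the following is a minimal Python sketch. It is not the MSU-MAU or MSU-MIU algorithm from the paper: the function names (`itemset_utility`, `hide_naively`), the toy database, the profit table, and the greedy rule for choosing which item to reduce are all assumptions made for illustration. It only shows how the utility of an itemset is computed from quantities and unit profits, and how lowering a quantity can push a sensitive itemset below the minimum utility threshold.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity (illustrative layout)


def itemset_utility(db: List[Transaction], profit: Dict[str, int],
                    itemset: FrozenSet[str]) -> int:
    """Sum quantity * unit profit over every transaction containing the itemset."""
    total = 0
    for tx in db:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def hide_naively(db: List[Transaction], profit: Dict[str, int],
                 itemset: FrozenSet[str], min_util: int) -> List[Transaction]:
    """Return a sanitized copy in which the itemset's utility is below min_util.

    Naive greedy rule (an assumption, not the paper's method): in each
    supporting transaction, reduce the item that contributes the most utility.
    """
    sanitized = [dict(tx) for tx in db]  # leave the original database untouched
    for tx in sanitized:
        # Utility that must still be removed to fall strictly below min_util.
        excess = itemset_utility(sanitized, profit, itemset) - min_util + 1
        if excess <= 0:
            break
        if not all(item in tx for item in itemset):
            continue
        victim = max(sorted(itemset), key=lambda i: tx[i] * profit[i])
        units = -(-excess // profit[victim])  # ceiling division
        if units >= tx[victim]:
            del tx[victim]  # deleting the item removes this transaction's support
        else:
            tx[victim] -= units
    return sanitized


# Toy data, invented for this example.
db = [{"a": 2, "b": 1}, {"a": 1, "b": 3, "c": 1}, {"b": 2, "c": 4}]
profit = {"a": 5, "b": 2, "c": 1}
sensitive = frozenset({"a", "b"})
min_util = 20

before = itemset_utility(db, profit, sensitive)
after = itemset_utility(hide_naively(db, profit, sensitive, min_util),
                        profit, sensitive)
print(before, after)
```

In this toy run the itemset {a, b} starts above the threshold of 20 and ends below it after one quantity is reduced. A real PPUM algorithm would additionally try to limit side effects on non-sensitive itemsets, which this sketch does not attempt.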
