Efficient Hiding of Confidential High-utility Itemsets with Minimal Side Effects



Publication Title

Journal of Experimental and Theoretical Artificial Intelligence








Privacy-preserving data mining (PPDM) is a research problem that has become critical in recent decades. PPDM consists of hiding sensitive information so that it cannot be discovered by data mining algorithms. Several PPDM algorithms have been developed, most of which are designed to hide sensitive frequent itemsets or association rules. Hiding sensitive information in a database can have several side effects, such as hiding other non-sensitive information and introducing redundant information. Finding the set of itemsets or transactions to be sanitised that minimises these side effects is an NP-hard problem. In this paper, a genetic algorithm (GA) using transaction deletion is designed to hide sensitive high-utility itemsets for privacy-preserving utility mining (PPUM). A flexible fitness function with three adjustable weights is used to evaluate the goodness of each chromosome for hiding sensitive high-utility itemsets. To speed up the evolution process, the pre-large concept is adopted in the designed algorithm, which reduces the number of database scans required to verify the goodness of an evaluated chromosome. Extensive experiments are conducted to compare the performance of the designed GA approach (with and without the pre-large concept) with a GA-based approach relying on transaction insertion and a non-evolutionary algorithm, in terms of execution time, side effects, database integrity and utility integrity. Results demonstrate that the proposed algorithm hides sensitive high-utility itemsets with fewer side effects than previous approaches, while preserving high database and utility integrity. © 2017 Informa UK Limited, trading as Taylor & Francis Group.
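To make the idea of a weighted fitness function concrete, the following is a minimal sketch rather than the paper's actual formulation. It assumes that the three weights apply to three side-effect counts often used in this literature (sensitive itemsets that remain discoverable, non-sensitive itemsets that are lost, and itemsets that newly become high-utility), and that the utility threshold is a ratio of the total utility of the remaining database. The names `itemset_utility`, `fitness`, `ratio` and the toy data are hypothetical. A chromosome is represented as the set of transaction indices to delete, and lower fitness is better.

```python
def itemset_utility(itemset, transactions):
    """Sum the utility of `itemset` over the transactions that contain all of its items."""
    return sum(
        sum(tx[item] for item in itemset)
        for tx in transactions
        if all(item in tx for item in itemset)
    )


def fitness(deleted, transactions, sensitive, non_sensitive, low, ratio,
            weights=(0.5, 0.25, 0.25)):
    """Weighted sum of three side-effect counts (an assumed form, lower is better).

    deleted       -- set of transaction indices removed by this chromosome
    sensitive     -- high-utility itemsets that must be hidden
    non_sensitive -- high-utility itemsets that should stay discoverable
    low           -- itemsets that were not high-utility before sanitisation
    ratio         -- threshold as a fraction of the remaining total utility
    """
    remaining = [tx for i, tx in enumerate(transactions) if i not in deleted]
    threshold = ratio * sum(sum(tx.values()) for tx in remaining)

    def is_high(itemset):
        return itemset_utility(itemset, remaining) >= threshold

    hiding_failures = sum(1 for s in sensitive if is_high(s))
    missing = sum(1 for n in non_sensitive if not is_high(n))
    artificial = sum(1 for a in low if is_high(a))

    w1, w2, w3 = weights
    return w1 * hiding_failures + w2 * missing + w3 * artificial


# Toy database: each transaction maps an item to its utility in that transaction.
transactions = [
    {"a": 5, "b": 10},
    {"a": 5, "c": 3},
    {"b": 10, "c": 6},
    {"c": 2},
]
sensitive = [("a", "b")]
non_sensitive = [("b",), ("b", "c")]
low = [("a",), ("c",), ("a", "c")]

no_deletion = fitness(set(), transactions, sensitive, non_sensitive, low, 0.35)
delete_first = fitness({0}, transactions, sensitive, non_sensitive, low, 0.35)
delete_two = fitness({0, 2}, transactions, sensitive, non_sensitive, low, 0.35)
```

In this toy setting, deleting only the first transaction hides the sensitive itemset at the cost of one newly high-utility itemset, whereas deleting two transactions also loses both non-sensitive itemsets. Adjusting the weights changes which trade-off the search prefers.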


