
Thursday, August 13, 2015

Efficient Algorithms for Mining the Concise and Lossless Representation of Closed+ High Utility Itemsets










Abstract:




Mining high utility itemsets (HUIs) from databases is an important data mining task that refers to the discovery of itemsets with high utilities (e.g., high profits). However, it may present too many HUIs to users, which also degrades the efficiency of the mining process. To achieve high efficiency for the mining task and provide a concise mining result to users, we propose a novel framework in this paper for mining closed+ high utility itemsets (CHUIs), which serves as a compact and lossless representation of HUIs. We propose three efficient algorithms named AprioriHC (Apriori-based algorithm for mining High utility Closed+ itemsets), AprioriHC-D (AprioriHC algorithm with Discarding unpromising and isolated items) and CHUD (Closed+ High Utility itemset Discovery) to find this representation. Further, a method called DAHU (Derive All High Utility itemsets) is proposed to recover all HUIs from the set of CHUIs without accessing the original database. Results on real and synthetic datasets show that the proposed algorithms are very efficient and that our approaches achieve a massive reduction in the number of HUIs. In addition, when all HUIs can be recovered by DAHU, the combination of CHUD and DAHU outperforms the state-of-the-art algorithms for mining HUIs.
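To give a feel for why a closed representation is smaller, the following is a minimal brute-force sketch in Python. It is not AprioriHC, AprioriHC-D, CHUD or DAHU. It assumes the usual definition of a closed itemset (no proper superset has the same support), which this post does not spell out, and it uses a made-up toy database, unit profits and threshold. The function names are hypothetical.

```python
from itertools import combinations

# Toy data (assumed for illustration): unit profit per item and
# transactions mapping each purchased item to its quantity.
unit_profit = {"a": 1, "b": 5, "c": 2}
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "b": 1},
    {"a": 1, "b": 1, "c": 3},
    {"c": 4},
]

def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))

def utility(itemset, db, profit):
    """Sum of quantity * unit profit over the transactions containing the itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )

def candidate_itemsets(db, profit):
    """All non-empty itemsets that occur in at least one transaction."""
    items = sorted(profit)
    found = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            if support(x, db) > 0:
                found.append(x)
    return found

def high_utility_itemsets(db, profit, min_util):
    """Brute force: keep every occurring itemset whose utility reaches min_util."""
    result = {}
    for x in candidate_itemsets(db, profit):
        u = utility(x, db, profit)
        if u >= min_util:
            result[x] = u
    return result

def closed_high_utility_itemsets(db, profit, min_util):
    """Keep only HUIs that have no proper superset with the same support."""
    candidates = candidate_itemsets(db, profit)
    huis = high_utility_itemsets(db, profit, min_util)
    return {
        x: u
        for x, u in huis.items()
        if not any(x < y and support(y, db) == support(x, db) for y in candidates)
    }

huis = high_utility_itemsets(transactions, unit_profit, min_util=10)
chuis = closed_high_utility_itemsets(transactions, unit_profit, min_util=10)
print(len(huis), "HUIs,", len(chuis), "closed HUIs")
```

On this toy database the closed set is smaller than the full set of HUIs. The sketch only filters itemsets. It does not store the extra information needed to recover the pruned HUIs, so it does not demonstrate the lossless property or the recovery step.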







Existing System:

Frequent itemset mining (FIM) is a fundamental research topic in data mining. One of its popular applications is market basket analysis, which refers to the discovery of sets of items (itemsets) that are frequently purchased together by customers.

However, in this application, the traditional model of FIM may discover a large number of frequent but low-revenue itemsets and lose the information on valuable itemsets that have low selling frequencies.




These problems arise because (1) FIM treats all items as having the same importance/unit profit/weight and (2) it assumes that every item in a transaction appears in a binary form, i.e., an item is either present or absent in a transaction, which does not indicate its purchase quantity in the transaction.

Hence, FIM cannot satisfy the requirement of users who desire to discover itemsets with high utilities such as high profits.
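The difference between the two measures can be sketched on a small example. The items, unit profits and quantities below are invented for illustration, and utility is assumed to be purchase quantity multiplied by unit profit, summed over the transactions that contain the itemset.

```python
# Hypothetical unit profits and transactions (item -> purchased quantity).
unit_profit = {"bread": 1, "milk": 1, "wine": 20}
transactions = [
    {"bread": 1, "milk": 1},
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 1, "wine": 3},
]

def support(itemset, db):
    """FIM view: count the transactions containing the itemset (presence only)."""
    return sum(1 for t in db if all(i in t for i in itemset))

def utility(itemset, db, profit):
    """Utility view: quantity * unit profit, summed over matching transactions."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )

for itemset in (("bread", "milk"), ("bread", "wine")):
    print(itemset,
          "support =", support(itemset, transactions),
          "utility =", utility(itemset, transactions, unit_profit))
```

Here the bread and milk itemset is frequent but earns little, while the bread and wine itemset is bought once and earns far more. A support threshold alone would keep the first and drop the second.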




Proposed System:





HUI mining is not an easy task since the downward closure property in FIM does not hold in utility mining. In other words, the search space for mining HUIs cannot be directly reduced as it is done in FIM because a superset of a low utility itemset can be a high utility itemset.
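A small counterexample makes the point concrete. The data, the threshold MIN_UTIL and the helper names below are assumptions chosen for illustration only.

```python
# Toy data (assumed): unit profits and transactions (item -> quantity).
unit_profit = {"a": 1, "b": 5}
transactions = [
    {"a": 1, "b": 2},
    {"a": 1},
    {"a": 2, "b": 3},
]
MIN_UTIL = 10  # hypothetical minimum utility threshold

def support(itemset):
    """Number of transactions containing every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))

def utility(itemset):
    """Quantity * unit profit, summed over transactions containing the itemset."""
    return sum(
        sum(t[i] * unit_profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )

small = {"a"}
large = {"a", "b"}  # proper superset of small

# Support never increases when items are added, so FIM can prune supersets.
print("support:", support(small), ">=", support(large))

# Utility can increase when items are added, so the same pruning is unsafe.
print("utility:", utility(small), "vs", utility(large), "with threshold", MIN_UTIL)
```

With this data the itemset {a} falls below the threshold while its superset {a, b} exceeds it, so discarding every superset of a low utility itemset would miss a valid result.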

Several algorithms have been proposed for mining HUIs, but they often present a large number of high utility itemsets to users. A very large number of high utility itemsets makes it difficult for users to comprehend the results. It may also cause the algorithms to become inefficient in terms of time and memory requirements, or even to run out of memory.

It is widely recognized that the more high utility itemsets the algorithms generate, the more processing resources they consume. The performance of the mining task decreases greatly for low minimum utility thresholds or when dealing with dense databases.




Hardware Requirements:



•       System              : Pentium IV 2.4 GHz.

•       Hard Disk        : 40 GB.

•       Floppy Drive   : 1.44 MB.

•       Monitor            : 15-inch VGA Colour.

•       Mouse               : Logitech.

•       RAM                 : 256 MB.




Software Requirements:







•       Operating system    : Windows XP.

•       Front End                  : JSP.

•       Back End                   : SQL Server.

Software Requirements:


•       Operating system    : Windows XP.

•       Front End                  : .NET.

•       Back End                   : SQL Server.
