Knowledge-Based Interactive Postmining
of Association Rules Using Ontologies (2010)
ABSTRACT
In Data Mining, the usefulness of
association rules is strongly limited by the huge amount of delivered rules. To
overcome this drawback, several methods were proposed in the literature such as
itemset concise representations, redundancy reduction, and postprocessing.
Thus, it is crucial to help the decision-maker with an efficient postprocessing
step in order to reduce the number of rules. This paper proposes a new
interactive approach to prune and filter discovered rules. First, we propose to
use ontologies in order to improve the integration of user knowledge in the
postprocessing task. Second, we propose the Rule Schema formalism extending the
specification language proposed by Liu et al. for user expectations.
Furthermore, an interactive framework is designed to assist the user throughout
the analyzing task. Applying our new approach over voluminous sets of rules, we
were able, by integrating domain expert knowledge in the postprocessing step,
to reduce the number of rules to several dozen or fewer. Moreover, the quality
of the filtered rules was validated by the domain expert at various points in
the interactive process.
Existing System:
In Data Mining, the usefulness of
association rules is strongly limited by the huge amount of delivered rules. It
is very well known that mining algorithms can discover a prohibitive amount of
association rules; for instance, thousands of rules are extracted from
a database of several dozen attributes and several hundred transactions.
To overcome this drawback, several methods were proposed in the literature such
as itemset concise representations, redundancy reduction, and postprocessing.
However, being generally based on statistical information, most of these
methods do not guarantee that the extracted rules are interesting for the user.
Disadvantages or Demerits of Existing System:
As a result, it is necessary to bring
the support threshold low enough in order to extract valuable
information. Unfortunately, the lower the support is, the larger the volume of
rules becomes, making it intractable for a decision-maker to analyze the mining
result. Experiments show that rules become almost impossible to use when the
number of rules exceeds 100. Thus, it is crucial to help the decision-maker
with an efficient technique for reducing the number of rules.
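The effect of the support threshold on the volume of rules can be illustrated with a small sketch. The transactions, the brute-force enumeration, and the function names below are assumptions made for illustration; they are not taken from the paper, and a real miner would use Apriori or a similar algorithm.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative only, not from the paper).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force enumeration of association rules; toy-sized data only."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in map(frozenset, combinations(items, size)):
            s = support(itemset, transactions)
            if s == 0 or s < min_support:
                continue  # itemset is not frequent enough
            for k in range(1, size):
                for lhs in map(frozenset, combinations(sorted(itemset), k)):
                    confidence = s / support(lhs, transactions)
                    if confidence >= min_confidence:
                        rules.append((lhs, itemset - lhs, s, confidence))
    return rules


def rule_counts(thresholds, min_confidence=0.5):
    """Number of rules found at each support threshold."""
    return {
        t: len(mine_rules(TRANSACTIONS, t, min_confidence)) for t in thresholds
    }
```

Calling `rule_counts([0.2, 0.6])` returns one count per threshold. Every rule that passes the higher threshold also passes the lower one, so the count can only grow as the threshold drops.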
Proposed System:
This paper proposes a new interactive
postprocessing approach, ARIPSO (Association Rule Interactive post-Processing
using Schemas and Ontologies), to prune and filter discovered rules. First, we
propose to use Domain Ontologies in order to strengthen the integration of user
knowledge in the postprocessing task. Second, we introduce the Rule Schema
formalism by extending the specification language proposed by Liu et al.
for user beliefs and expectations toward the use of ontology concepts.
Furthermore, an interactive and iterative framework is designed to assist the
user throughout the analyzing task. The interactivity of our approach relies on
a set of rule mining operators defined over the Rule Schemas in order to
describe the actions that the user can perform.
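A rough sketch of how a Rule Schema over ontology concepts might select or prune rules is given below. The mini-ontology, the schema representation (a pair of concept lists), and the two operator functions are all hypothetical simplifications; the source does not define the formalism or the operators in this much detail.

```python
# Hypothetical mini-ontology: concept name -> items it covers (an assumption).
ONTOLOGY = {
    "Dairy": {"milk", "butter", "cheese"},
    "Bakery": {"bread", "croissant"},
}


def covers(concepts, items, ontology):
    """True if every concept is matched by at least one item in `items`."""
    return all(ontology[c] & items for c in concepts)


def conforms(rule, schema, ontology=ONTOLOGY):
    """A rule (antecedent, consequent) conforms to a schema
    (antecedent concepts, consequent concepts) if each side is covered."""
    antecedent, consequent = rule
    schema_lhs, schema_rhs = schema
    return covers(schema_lhs, antecedent, ontology) and covers(
        schema_rhs, consequent, ontology
    )


def keep_conforming(rules, schema):
    """Illustrative filtering operator: keep only rules matching the schema."""
    return [r for r in rules if conforms(r, schema)]


def prune_conforming(rules, schema):
    """Illustrative pruning operator: drop rules matching the schema."""
    return [r for r in rules if not conforms(r, schema)]
```

With the schema `(["Bakery"], ["Dairy"])`, a rule such as bread → milk conforms because "bread" is a Bakery item and "milk" is a Dairy item. The user can then either focus on such rules or discard them as already known.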
Advantages or Merits of Proposed System:
An interactive and iterative framework is designed to
assist the user throughout the analyzing task. The interactivity of our
approach relies on a set of rule mining operators defined over the Rule Schemas
in order to describe the actions that the user can perform.
Modules:
1. Design of Dataset
2. Clustering process
3. Distance-based projected clustering algorithm
MODULES DESCRIPTION:
1. Design of Dataset
Create a dataset containing fields
such as location, server id, and service. Assign constraints to the columns
in the dataset.
These constraints are used to avoid
duplicate rows in the table. Here we use constraints such as NOT NULL and
PRIMARY KEY.
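The constraints described above can be tried out with a short sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table name `service_log` and the column types are assumptions; only the three fields and the two constraint kinds come from the description.

```python
import sqlite3


def build_dataset():
    """Create an in-memory table with NOT NULL and PRIMARY KEY constraints."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE service_log (
            server_id INTEGER PRIMARY KEY,  -- rejects duplicate server ids
            location  TEXT NOT NULL,        -- rejects missing locations
            service   TEXT NOT NULL         -- rejects missing services
        )
        """
    )
    return conn


def try_insert(conn, row):
    """Insert one (server_id, location, service) row.
    Returns False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO service_log (server_id, location, service) "
                "VALUES (?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Inserting a second row with an existing `server_id`, or a row whose `location` is `None`, makes `try_insert` return `False`, which is how the constraints keep duplicate or incomplete rows out of the table.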
2. Clustering process
The clustering process is based on the
k-means algorithm, with the computation of distance restricted to subsets of
attributes where object values are dense. Our algorithm is capable of detecting
projected clusters of low dimensionality embedded in a high-dimensional space
and avoids the computation of the distance in the full-dimensional space. The
suitability of our proposal has been demonstrated through an empirical study
using synthetic and real datasets.
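The idea of restricting the distance computation to a subset of attributes can be sketched as a k-means style assignment step. The function names, the per-cluster lists of relevant attribute indices, and the normalisation by subspace size are assumptions for illustration, not the algorithm's published definition.

```python
def restricted_distance(point, center, dims):
    """Euclidean distance computed only over the attribute indices in `dims`."""
    return sum((point[d] - center[d]) ** 2 for d in dims) ** 0.5


def assign(points, centers, relevant_dims):
    """Assign each point to the nearest center, measuring distance only on
    that center's relevant attributes (one list of indices per center)."""
    labels = []
    for p in points:
        # Dividing by sqrt(len(dims)) is an assumed normalisation so that
        # clusters with subspaces of different sizes remain comparable.
        scores = [
            restricted_distance(p, c, dims) / len(dims) ** 0.5
            for c, dims in zip(centers, relevant_dims)
        ]
        labels.append(min(range(len(centers)), key=scores.__getitem__))
    return labels
```

For example, with centers `(0.0, 0.0)` and `(0.0, 10.0)` whose relevant attributes are `[0]` and `[1]` respectively, the point `(0.0, 50.0)` is assigned to the first cluster because it matches on attribute 0. A full-dimensional distance would send it to the second cluster instead.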
3. Distance-based projected clustering algorithm
The algorithm consists of three phases.
The first phase performs attribute relevance analysis by detecting dense and
sparse regions and their location in each attribute. Starting from the results
of the first phase, the goal of the second phase is to eliminate outliers,
while the third phase aims to discover clusters in different subspaces.
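The first phase, detecting dense regions in each attribute, could be approximated as follows. The description does not say how dense regions are detected, so the equal-width histogram, the `n_bins` and `factor` parameters, and the function name are all assumptions made for this sketch.

```python
def dense_bins(values, n_bins=5, factor=1.5):
    """Return indices of equal-width bins holding clearly more values than
    a uniform spread would put there (count > factor * expected)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant data
    counts = [0] * n_bins
    for v in values:
        # The maximum value falls on the upper edge; clamp it to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    expected = len(values) / n_bins
    return [i for i, c in enumerate(counts) if c > factor * expected]
```

An attribute with at least one dense bin would be a candidate relevant dimension, while an attribute with no dense bins looks uniformly spread and could be ignored when forming clusters.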
SOFTWARE REQUIREMENTS
Operating System : Windows XP Professional
Front End : Microsoft Visual Studio .NET 2005
Coding Language : Visual C# .NET, ASP.NET 2.0
Back End : SQL Server 2000
HARDWARE REQUIREMENTS
SYSTEM : Pentium IV 2.4 GHz
HARD DISK : 40 GB
FLOPPY DRIVE : 1.44 MB
MONITOR : 15 VGA colour
MOUSE : Logitech
RAM : 256 MB
KEYBOARD : 110 keys enhanced