Association Rules Viewer
This free software was developed in collaboration with two students of the
DESS IAGL of Lille: D. Delautre and S. Demay.
Software
Several representations of association rules exist in the literature.
They aim to make it easier for the user to exploit the results produced
by rule mining algorithms. However, the main representations are only
available in commercial data mining software.
Our software, "Association Rules Viewer", offers three representations.
The idea of such an application is to be able to compare rules obtained
according to different criteria: in our case, the optimization criterion
J1, but also other widely used measures such as support (supp),
confidence (conf) and completeness (compl). This application is independent
of the rule mining algorithm and takes an XML file as input.
Input Format
To use Association Rules Viewer, you need an XML file that contains the
association rules. To open a file, simply choose "Open" in the
"File" menu.
The rules DTD
The XML file must use this DTD:
<?xml version="1.0" encoding="iso-8859-15" ?>
<!ELEMENT RULES (RULE*)>
<!ELEMENT RULE (LHS, RHS, CRITERE*, CARD?)>
<!ELEMENT LHS (ITEM*)>
<!ELEMENT RHS (ITEM*)>
<!ELEMENT CARD (ATTR)>
<!ELEMENT ATTR (VALUE+)>
<!ELEMENT VALUE (ATTR*, DIMENSION?)>
<!ELEMENT CRITERE EMPTY>
<!ELEMENT ITEM EMPTY>
<!ELEMENT DIMENSION EMPTY>
<!ATTLIST CRITERE name CDATA #REQUIRED value CDATA #REQUIRED>
<!ATTLIST ITEM name CDATA #REQUIRED value CDATA #REQUIRED>
<!ATTLIST ATTR name CDATA #REQUIRED>
<!ATTLIST VALUE value CDATA #REQUIRED>
<!ATTLIST DIMENSION confidence CDATA #REQUIRED frequency CDATA #REQUIRED>
You can find some XML examples in the "xml" directory inside your ARViewer directory.
The file consists of a root element RULES, which contains all the association rules you want to visualize with the software.
For each rule (element RULE), you must give the description of the rule itself and the values of the criteria associated with it.
If you want to use the third visualisation (Double Decker Plot), you must also provide the information specific to this visualisation in the element CARD, as described below.
Description of the rules:
A rule is divided into two parts, the left-hand side and the right-hand
side, represented by the elements LHS and RHS respectively. The LHS element
may contain several ITEM elements, which represent the attributes of the
condition of the rule. The RHS element may contain only one ITEM element.
An ITEM element has a name and a value. In the value, you must specify
whether the attribute is equal to, greater than, or less than the value by
using the appropriate sign ('=', '<', '>', '<=', '>=').
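The comparison signs can be illustrated with a minimal Python sketch. It assumes that the sign is written as a prefix of the value attribute, which the text suggests but does not show explicitly. The attribute names age and dose and the helper names split_operator and read_items are hypothetical. Note that XML itself requires '<' to be written as &lt; inside an attribute value.

```python
import xml.etree.ElementTree as ET

# Hypothetical condition items; the sign is assumed to prefix the value.
SNIPPET = """
<LHS>
  <ITEM name="SNP7" value="=ab" />
  <ITEM name="age" value="&gt;=30" />
  <ITEM name="dose" value="&lt;5" />
</LHS>
"""

# Two-character signs come first so that '<=' is not read as '<'.
OPERATORS = ("<=", ">=", "=", "<", ">")


def split_operator(value):
    """Split an ITEM value such as '>=30' into ('>=', '30')."""
    for op in OPERATORS:
        if value.startswith(op):
            return op, value[len(op):]
    raise ValueError(f"no comparison sign in {value!r}")


def read_items(xml_text):
    """Return (name, sign, operand) for every ITEM in the fragment."""
    root = ET.fromstring(xml_text.strip())
    return [(item.get("name"), *split_operator(item.get("value")))
            for item in root.iter("ITEM")]


for name, op, operand in read_items(SNIPPET):
    print(name, op, operand)
```

The parser turns &lt; and &gt; back into the plain signs, so only the file on disk needs the escaped form.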
Criteria:
The values of the criteria for this rule are represented by CRITERE
elements. Each one has a name and a value.
Specific information for the Double Decker Plot:
As mentioned above, the information for the Double Decker Plot visualisation
is stored in the element named CARD. The content of this element is the XML
representation of a tree built from the different values of each attribute
involved in the rule. For example, if a rule involves 3 attributes named
attr1, attr2 and attr3, and each one has 2 possible values, val1 and val2,
the tree has one ATTR level per attribute and one VALUE branch per value,
which gives 2 x 2 x 2 = 8 leaves. Each leaf corresponds to one combination
of values, that is, to a new rule, and carries in a DIMENSION element the
frequency of this new rule and its confidence. The frequency is the
percentage of records in the database that satisfy the same condition, and
the confidence is the percentage of these records that also satisfy the
conclusion.
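The tree for the three-attribute case described above could look like the CARD element embedded in the following Python sketch, which also walks the tree to list the eight leaves. The layout follows the DTD (CARD, ATTR, VALUE, DIMENSION). All confidence and frequency figures are invented for illustration, and the function name leaves is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative CARD element; every confidence/frequency figure is made up.
CARD_XML = """
<CARD>
  <ATTR name="attr1">
    <VALUE value="val1">
      <ATTR name="attr2">
        <VALUE value="val1">
          <ATTR name="attr3">
            <VALUE value="val1"><DIMENSION confidence="80" frequency="20" /></VALUE>
            <VALUE value="val2"><DIMENSION confidence="40" frequency="10" /></VALUE>
          </ATTR>
        </VALUE>
        <VALUE value="val2">
          <ATTR name="attr3">
            <VALUE value="val1"><DIMENSION confidence="55" frequency="15" /></VALUE>
            <VALUE value="val2"><DIMENSION confidence="30" frequency="5" /></VALUE>
          </ATTR>
        </VALUE>
      </ATTR>
    </VALUE>
    <VALUE value="val2">
      <ATTR name="attr2">
        <VALUE value="val1">
          <ATTR name="attr3">
            <VALUE value="val1"><DIMENSION confidence="60" frequency="10" /></VALUE>
            <VALUE value="val2"><DIMENSION confidence="25" frequency="15" /></VALUE>
          </ATTR>
        </VALUE>
        <VALUE value="val2">
          <ATTR name="attr3">
            <VALUE value="val1"><DIMENSION confidence="10" frequency="5" /></VALUE>
            <VALUE value="val2"><DIMENSION confidence="50" frequency="20" /></VALUE>
          </ATTR>
        </VALUE>
      </ATTR>
    </VALUE>
  </ATTR>
</CARD>
"""


def leaves(attr, path=()):
    """Yield (condition, frequency, confidence) for each leaf under an ATTR."""
    for value in attr.findall("VALUE"):
        condition = path + ((attr.get("name"), value.get("value")),)
        dimension = value.find("DIMENSION")
        if dimension is not None:
            yield (condition,
                   float(dimension.get("frequency")),
                   float(dimension.get("confidence")))
        for child in value.findall("ATTR"):
            yield from leaves(child, condition)


card = ET.fromstring(CARD_XML.strip())
cells = list(leaves(card.find("ATTR")))
for condition, frequency, confidence in cells:
    label = " AND ".join(f"{name}={value}" for name, value in condition)
    print(f"{label}: frequency {frequency}%, confidence {confidence}%")
```

In this made-up data the eight frequencies add up to 100, since every record falls into exactly one combination of values.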
An example
The rule IF SNP7=ab AND SNP16=aa AND SNP46=bb THEN
Status=Affected is encoded in XML as follows:
<?xml version="1.0" encoding="ISO-8859-15"?>
<!DOCTYPE RULES SYSTEM "rules.dtd">
<RULES>
<RULE>
<LHS>
<ITEM name="SNP7" value="=ab" />
<ITEM name="SNP16" value="=aa" />
<ITEM name="SNP46" value="=bb" />
</LHS>
<RHS>
<ITEM name="Status" value="=Affected" />
</RHS>
<CRITERE name="J1" value="0.0366" />
<CRITERE name="supp" value="0.09" />
<CRITERE name="compl" value="0.25" />
<CRITERE name="conf" value="0.9167" />
</RULE>
</RULES>
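To check a rules file before opening it in the viewer, it can be read back with a few lines of Python. This is only a sketch using the standard xml.etree.ElementTree module. It is not part of ARViewer, the helper names items, load_rules and describe are hypothetical, and the value attributes are assumed to carry the '=' sign as described under "Description of the rules".

```python
import xml.etree.ElementTree as ET

# Compact copy of the example rule, without the XML prolog.
RULES_XML = """
<RULES>
  <RULE>
    <LHS>
      <ITEM name="SNP7" value="=ab" />
      <ITEM name="SNP16" value="=aa" />
      <ITEM name="SNP46" value="=bb" />
    </LHS>
    <RHS><ITEM name="Status" value="=Affected" /></RHS>
    <CRITERE name="J1" value="0.0366" />
    <CRITERE name="supp" value="0.09" />
    <CRITERE name="compl" value="0.25" />
    <CRITERE name="conf" value="0.9167" />
  </RULE>
</RULES>
"""


def items(parent):
    """Return each ITEM of an LHS or RHS element as text, e.g. 'SNP7=ab'."""
    return [item.get("name") + item.get("value")
            for item in parent.findall("ITEM")]


def load_rules(xml_text):
    """Read every RULE into a dict with its condition, conclusion and criteria."""
    rules = []
    for rule in ET.fromstring(xml_text.strip()).findall("RULE"):
        rules.append({
            "condition": items(rule.find("LHS")),
            "conclusion": items(rule.find("RHS")),
            "criteria": {c.get("name"): float(c.get("value"))
                         for c in rule.findall("CRITERE")},
        })
    return rules


def describe(rule):
    """Rebuild the IF ... THEN ... form of a rule."""
    return ("IF " + " AND ".join(rule["condition"])
            + " THEN " + " AND ".join(rule["conclusion"]))


rules = load_rules(RULES_XML)
print(describe(rules[0]))
print(rules[0]["criteria"])
```

For a file on disk, ET.parse(path).getroot() can replace ET.fromstring(...).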
Interface
The selection pane is divided into three parts, one for each visualisation.
Its goal is to let you select the rules you want to see in each visualisation.
It also lets you sort the rules by different criteria, such as the number
of attributes involved in the condition of the rule, the conclusion (a
lexicographic sort on the conclusion), or the criteria specified in the
input file.
Figure 1: Example of the selection interface.
As you can see in the screenshot, for the 3D Representation and the
N Dimensional Line you can select several rules, and two buttons let you
select or unselect all the rules. You can sort the rules by
choosing the sort order under the list of rules. It is a combined sort, so in
case of a tie on the first sort, the second is used to separate the rules.
For the 3D Representation, you must also select the criteria you want to
see in the visualisation, because it is limited to two criteria.
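The combined sort described above amounts to sorting on a pair of keys. The following Python sketch shows the idea. It is not ARViewer's code: the Rule class, the key functions and the sample rules are hypothetical, and ascending order is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical in-memory form of a rule."""
    condition: list
    conclusion: str
    criteria: dict = field(default_factory=dict)


def combined_sort(rules, first, second):
    """Sort by the key `first`; ties are separated by the key `second`."""
    return sorted(rules, key=lambda rule: (first(rule), second(rule)))


# Sort keys mirroring the options named in the text.
def by_condition_size(rule):
    return len(rule.condition)


def by_conclusion(rule):
    return rule.conclusion


def by_criterion(name):
    return lambda rule: rule.criteria[name]


# Made-up rules, only for the demonstration.
r1 = Rule(["SNP7=ab", "SNP16=aa"], "Status=Affected", {"conf": 0.80})
r2 = Rule(["SNP46=bb"], "Status=Unaffected", {"conf": 0.60})
r3 = Rule(["SNP7=ab", "SNP46=bb"], "Status=Affected", {"conf": 0.90})

ordered = combined_sort([r1, r2, r3], by_condition_size, by_criterion("conf"))
for rule in ordered:
    print(len(rule.condition), rule.criteria["conf"], rule.conclusion)
```

Here r1 and r3 tie on the number of attributes in the condition, so the confidence decides their relative order.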
The Visualizations
The 3D Visualisation
We have based our 3D representation on the one presented by Wong, which
can visualize "many-to-one" association rules. The rows
of the matrix floor represent items and the columns represent item associations.
The green and red blocks of each column (rule) represent the Condition
(or antecedent) and the Prediction (or consequent). Identities of the items
are shown along the right side of the matrix. As we want to be able to
evaluate a rule using more than just the support and the confidence, we add
to the 3D representation the possibility for the user to see the different
quality measures computed by ASGARD. In the 3D representation, the blue
and cyan colours represent the two chosen criteria. Our display system supports
several sort functions to improve the readability of the result. The
display has mouse-controlled zooming and rotation.
Figure 2: Example of 3D visualization of some rules discovered by
ASGARD.
The line visualization
The second visualization makes it possible to compare several rules using
all the criteria computed, for example, by ASGARD. Each line represents a rule
and each criterion is a point on the X-axis. The value of the criterion is
plotted on the Y-axis.

Figure 3: Example of line visualization of the different criteria.
If you have selected more than 14 rules, all the lines are drawn in black
and the plot shows only the general behaviour of your rules.
Figure 3b: Example of line visualization of the different criteria
for more than 14 rules.
As in the 3D Representation, there is also a list of rules under the visualisation.
From this list, you can select a single rule to visualize it in the Double Decker Plot.
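The mapping used by the line visualization can be summarised in a short Python sketch: each criterion gets a position on the X-axis, its value gives the Y coordinate, and the colouring switches to black above 14 rules. The function names and the 'colour-N' labels are placeholders, not ARViewer's actual palette or code.

```python
# Above this number of rules, the text says every line is drawn in black.
MAX_COLOURED_RULES = 14


def polyline(rule_criteria, criterion_names):
    """Map one rule to (x, y) points: x is the criterion's position, y its value."""
    return [(x, rule_criteria[name]) for x, name in enumerate(criterion_names)]


def line_colours(rule_count):
    """Return one colour label per rule; 'colour-N' labels are placeholders."""
    if rule_count > MAX_COLOURED_RULES:
        return ["black"] * rule_count
    return [f"colour-{index}" for index in range(rule_count)]


# Criteria values taken from the example rule of the input file.
criteria = {"J1": 0.0366, "supp": 0.09, "compl": 0.25, "conf": 0.9167}
points = polyline(criteria, ["J1", "supp", "compl", "conf"])
print(points)
print(line_colours(3))
print(line_colours(15)[:2])
```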
The double decker plot Visualization
The third representation is a double-decker plot. It
visualizes an association rule IF C THEN P by combining all the
attributes involved in the left-hand-side condition C as explanatory variables,
drawing them within one mosaic plot, and showing the response
P by highlighting the corresponding categories in a bar chart.
Figure 4: Example of Double Decker visualization of some rules discovered
by ASGARD.
Interesting features
Printing
You can print the visualisation you want by clicking the "Print" item
in the "File" menu. It opens a print dialog that depends on your printer.
For the N Dimensional Line, the list of the rules represented is also printed,
so that you can see which line each rule corresponds to. This is done only
if fewer than 15 rules are represented.
Saving
ARViewer also lets you save a visualisation as a JPEG image. To do so,
simply choose "Save as" in the "File" menu. You can
then select where you want to save your image. For the N Dimensional Line,
the list of the rules represented is also drawn on the image, so that you
can see which line each rule corresponds to. This is done only if fewer than
15 rules are represented.