Class AssociationRule
Let I = {i1, i2, ..., in} be a set of n binary attributes called items. Let D = {t1, t2, ..., tm} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. An association rule is defined as an implication of the form X ⇒ Y, where X, Y ⊆ I and X ∩ Y = ∅.
The item sets X and Y are called the antecedent (left-hand side, or LHS) and the consequent (right-hand side, or RHS) of the rule, respectively.
The support supp(X) of an item set X is defined as the proportion of transactions in the database which contain the item set. Note that the support of an association rule X ⇒ Y is supp(X ∪ Y).
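As a minimal sketch of this definition, the helper below counts the transactions that contain every item of an item set. The class name SupportExample, its methods, and the toy database are illustrative assumptions and not part of this class. Items are encoded as ints, and each transaction is assumed to be a sorted array.

```java
import java.util.Arrays;

/** Illustrative helper, not part of the documented class. */
class SupportExample {
    /** Returns true if the sorted transaction contains every item of the itemset. */
    static boolean containsAll(int[] transaction, int[] itemset) {
        for (int item : itemset) {
            // binarySearch requires the transaction to be sorted ascending.
            if (Arrays.binarySearch(transaction, item) < 0) return false;
        }
        return true;
    }

    /** supp(X): fraction of transactions that contain every item of X. */
    static double support(int[][] database, int[] itemset) {
        int count = 0;
        for (int[] t : database) {
            if (containsAll(t, itemset)) count++;
        }
        return (double) count / database.length;
    }

    public static void main(String[] args) {
        // Toy database (assumed encoding): 0 = milk, 1 = bread, 2 = butter.
        int[][] db = { {0, 1}, {0, 1, 2}, {1, 2}, {0}, {0, 1, 2} };
        // supp({milk}): 4 of 5 transactions contain item 0.
        System.out.println(support(db, new int[] {0}));
        // Support of the rule {milk} => {bread} is supp({milk, bread}): 3 of 5.
        System.out.println(support(db, new int[] {0, 1}));
    }
}
```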
The confidence of a rule is defined as conf(X ⇒ Y) = supp(X ∪ Y) / supp(X). Confidence can be interpreted as an estimate of the probability P(Y | X), the probability of finding the RHS of the rule in transactions under the condition that these transactions also contain the LHS.
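The ratio can be computed directly from counts, since the database size cancels out of supp(X ∪ Y) / supp(X). The sketch below does this on a toy database. ConfidenceExample and its methods are hypothetical names used only for illustration.

```java
/** Illustrative helper, not part of the documented class. */
class ConfidenceExample {
    /** Returns true if the transaction contains every item of the itemset. */
    static boolean containsAll(int[] transaction, int[] itemset) {
        for (int item : itemset) {
            boolean found = false;
            for (int t : transaction) {
                if (t == item) { found = true; break; }
            }
            if (!found) return false;
        }
        return true;
    }

    /** conf(X => Y) = supp(X union Y) / supp(X), computed from raw counts. */
    static double confidence(int[][] database, int[] x, int[] y) {
        int countX = 0;   // transactions containing the antecedent
        int countXY = 0;  // transactions containing antecedent and consequent
        for (int[] t : database) {
            if (containsAll(t, x)) {
                countX++;
                if (containsAll(t, y)) countXY++;
            }
        }
        if (countX == 0) {
            throw new IllegalArgumentException("antecedent never occurs");
        }
        return (double) countXY / countX;
    }

    public static void main(String[] args) {
        // Toy database (assumed encoding): 0 = milk, 1 = bread, 2 = butter.
        int[][] db = { {0, 1}, {0, 1, 2}, {1, 2}, {0}, {0, 1, 2} };
        // Estimate of P(bread | milk): 3 of the 4 milk transactions contain bread.
        System.out.println(confidence(db, new int[] {0}, new int[] {1}));
    }
}
```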
Lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model. A targeting model is doing a good job if the response within the target is much better than the average for the population as a whole. Lift is simply the ratio of these values: target response divided by average response. For an association rule X ⇒ Y, this ratio is lift(X ⇒ Y) = supp(X ∪ Y) / (supp(X) · supp(Y)). A lift equal to 1 means that X and Y are independent, a lift higher than 1 means that X and Y are positively correlated, and a lift lower than 1 means that X and Y are negatively correlated.
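The three cases can be illustrated with a short sketch. It assumes the standard formula lift(X ⇒ Y) = supp(X ∪ Y) / (supp(X) · supp(Y)), which matches the description of lift on this page. LiftExample, its methods, and the tolerance parameter are hypothetical and not part of this class.

```java
/** Illustrative helper, not part of the documented class. */
class LiftExample {
    /** Observed joint support divided by the support expected under independence. */
    static double lift(double suppXY, double suppX, double suppY) {
        return suppXY / (suppX * suppY);
    }

    /** Maps a lift value to the three cases, treating values near 1 as independence. */
    static String interpret(double lift, double tolerance) {
        if (Math.abs(lift - 1.0) <= tolerance) return "independent";
        return lift > 1.0 ? "positively correlated" : "negatively correlated";
    }

    public static void main(String[] args) {
        // supp(X union Y) = 0.6 with supp(X) = supp(Y) = 0.8:
        // independence would predict 0.64, so the lift falls slightly below 1.
        double l = lift(0.6, 0.8, 0.8);
        System.out.println(l + " -> " + interpret(l, 1e-9));
    }
}
```

In practice an exact lift of 1 is rare on real data, which is why the sketch compares against a tolerance rather than testing for equality.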
Field Summary
Modifier and Type    Field         Description
final int[]          antecedent    Antecedent itemset.
final double         confidence    The confidence value.
final int[]          consequent    Consequent itemset.
final double         leverage      The difference between the probability of the rule and the expected probability if the items were statistically independent.
final double         lift          How many times more often antecedent and consequent occur together than expected if they were statistically independent.
final double         support       The support value.
Constructor Summary
Constructor    Description
AssociationRule(int[] antecedent, int[] consequent, double support, double confidence, double lift, double leverage)    Constructor.
Method Summary
Field Details
antecedent
public final int[] antecedent
Antecedent itemset.
consequent
public final int[] consequent
Consequent itemset.
support
public final double support
The support value. The support supp(X) of an itemset X is defined as the proportion of transactions in the database which contain the itemset.
confidence
public final double confidence
The confidence value. The confidence of a rule is defined as conf(X ⇒ Y) = supp(X ∪ Y) / supp(X). Confidence can be interpreted as an estimate of the probability P(Y | X), the probability of finding the RHS of the rule in transactions under the condition that these transactions also contain the LHS.
lift
public final double lift
How many times more often antecedent and consequent occur together than expected if they were statistically independent. Lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model. A targeting model is doing a good job if the response within the target is much better than the average for the population as a whole. Lift is simply the ratio of these values: target response divided by average response. For an association rule X ⇒ Y, if the lift is equal to 1, it means that X and Y are independent. If the lift is higher than 1, it means that X and Y are positively correlated. If the lift is lower than 1, it means that X and Y are negatively correlated.
leverage
public final double leverage
The difference between the probability of the rule and the expected probability if the items were statistically independent.
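Read as a formula, this description is usually written leverage(X ⇒ Y) = supp(X ∪ Y) − supp(X) · supp(Y). The sketch below assumes that reading. LeverageExample is a hypothetical helper and not part of this class.

```java
/** Illustrative helper, not part of the documented class. */
class LeverageExample {
    /** Observed joint support minus the joint support expected under independence. */
    static double leverage(double suppXY, double suppX, double suppY) {
        return suppXY - suppX * suppY;
    }

    public static void main(String[] args) {
        // supp(X union Y) = 0.6 with supp(X) = supp(Y) = 0.8: independence would
        // predict 0.64, so the leverage is slightly negative (about -0.04).
        System.out.println(leverage(0.6, 0.8, 0.8));
        // Exactly independent item sets give a leverage of zero.
        System.out.println(leverage(0.25, 0.5, 0.5));
    }
}
```

Where lift reports the deviation from independence as a ratio, leverage reports it as an absolute difference in support, so it stays small for rare item sets even when their lift is large.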
Constructor Details
AssociationRule
public AssociationRule(int[] antecedent, int[] consequent, double support, double confidence, double lift, double leverage)
Constructor.
Parameters:
antecedent - the antecedent itemset (LHS) of the association rule.
consequent - the consequent itemset (RHS) of the association rule.
support - the proportion of instances in the dataset that contain an itemset.
confidence - the proportion of instances containing the antecedent that also contain the consequent.
lift - how many times more often antecedent and consequent occur together than expected if they were statistically independent.
leverage - the difference between the probability of the rule and the expected probability if the items were statistically independent.
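The sketch below shows how the six arguments relate to raw transaction counts. To stay runnable without the library, it uses RuleSketch, a hypothetical stand-in whose constructor and fields mirror the ones documented above. The fromCounts factory and the toy counts are assumptions for illustration only. With the library on the classpath, the same six values would be passed to the documented constructor.

```java
import java.util.Arrays;

/** Hypothetical stand-in that mirrors the documented constructor and fields. */
final class RuleSketch {
    final int[] antecedent;
    final int[] consequent;
    final double support;
    final double confidence;
    final double lift;
    final double leverage;

    RuleSketch(int[] antecedent, int[] consequent, double support,
               double confidence, double lift, double leverage) {
        this.antecedent = antecedent;
        this.consequent = consequent;
        this.support = support;
        this.confidence = confidence;
        this.lift = lift;
        this.leverage = leverage;
    }

    /** Derives all four measures from counts over n transactions (illustrative). */
    static RuleSketch fromCounts(int[] x, int[] y, int n,
                                 int countX, int countY, int countXY) {
        double suppX = (double) countX / n;
        double suppY = (double) countY / n;
        double suppXY = (double) countXY / n;   // support of the rule
        return new RuleSketch(x, y,
                suppXY,
                suppXY / suppX,                 // confidence
                suppXY / (suppX * suppY),       // lift
                suppXY - suppX * suppY);        // leverage
    }

    public static void main(String[] args) {
        // 5 transactions: X in 4, Y in 4, X and Y together in 3.
        RuleSketch r = fromCounts(new int[] {0}, new int[] {1}, 5, 4, 4, 3);
        System.out.println(Arrays.toString(r.antecedent) + " => "
                + Arrays.toString(r.consequent)
                + " support=" + r.support + " confidence=" + r.confidence
                + " lift=" + r.lift + " leverage=" + r.leverage);
    }
}
```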
Method Details