Market basket analysis (association rules mining)
Question
Market basket analysis (association rules mining). Please review the following sales data from a small grocery store. The data in the diagram below shows 8 shopping carts (baskets) containing different products (A, B, C, etc.) that customers checked out.
Cart 1 (A, C, E, F), Cart 2 (A, F, E), Cart 3 (D, G), Cart 4 (A, B, C), Cart 5 (C, E, F), Cart 6 (G), Cart 7 (C, F), Cart 8 (A, B).
The store manager conducted market basket analysis using the above data and is now considering one of the following two rules to help the store cross-sell products and increase sales. That is, when a customer buys a product, the store would like to recommend to the customer another product that she/he would be most likely to buy as well.
Rule 1: F -> E
Rule 2: E -> F
On the basis of given data, which of the above two rules would be a better predictor of cross-sale than the other? Please support your answer with these two metrics: support and confidence, which are typically used to determine how good a rule is.
Explanation / Answer
Support/confidence
For a rule A -> B (here A and B stand for the rule's left-hand and right-hand sides, not for products A and B in the carts):
Support measures how frequently the pattern in the rule occurs; it is the percentage of transactions that contain both A and B, i.e.
Support = P(A and B)
Support = (# of transactions containing both A and B) / (total number of transactions).
Confidence measures the strength of the rule's implication; it is the percentage of transactions containing A that also contain B, i.e.
Confidence = P(B if A) = P(B | A)
Confidence = (# of transactions containing both A and B) / (# of transactions containing A).
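The two definitions translate almost directly into code. The following is a minimal sketch, assuming each transaction is represented as a Python set of item labels; the function names `support` and `confidence` and the `toy` baskets are illustrative only and are not part of the question.

```python
from fractions import Fraction

def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return Fraction(hits, len(transactions))

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Only transactions that contain the left-hand side go in the denominator.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        raise ValueError("antecedent never occurs; confidence is undefined")
    both = sum(1 for t in with_antecedent if consequent <= set(t))
    return Fraction(both, len(with_antecedent))

# Made-up baskets for illustration only (not the store's data).
toy = [{"x", "y"}, {"x"}, {"y", "z"}, {"x", "y", "z"}]
print(support(toy, {"x", "y"}))        # 1/2
print(confidence(toy, {"x"}, {"y"}))   # 2/3
```

Note that support is the same whichever way the rule points, while confidence depends on which side is the antecedent, because the denominator changes.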
Cart 1: A, C, E, F
Cart 2: A, F, E
Cart 3: D, G
Cart 4: A, B, C
Cart 5: C, E, F
Cart 6: G
Cart 7: C, F
Cart 8: A, B
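The counts that feed into support and confidence can be tallied directly from the eight carts. This is a small sketch, assuming the carts are stored as a dictionary of sets; the variable names `carts`, `item_counts` and `pair_count` are chosen here for illustration.

```python
from collections import Counter

# The eight carts from the question, keyed by cart number.
carts = {
    1: {"A", "C", "E", "F"},
    2: {"A", "F", "E"},
    3: {"D", "G"},
    4: {"A", "B", "C"},
    5: {"C", "E", "F"},
    6: {"G"},
    7: {"C", "F"},
    8: {"A", "B"},
}

# Number of carts in which each individual item appears.
item_counts = Counter(item for items in carts.values() for item in items)

# Number of carts that contain both E and F.
pair_count = sum(1 for items in carts.values() if {"E", "F"} <= items)

for item in sorted(item_counts):
    print(item, item_counts[item])
print("E and F together:", pair_count)
```

For the two items in the candidate rules, this gives 3 carts containing E, 4 carts containing F, and 3 carts containing both.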
From the given table we have
Item E appears in 3 carts (1, 2 and 5), item F appears in 4 carts (1, 2, 5 and 7), and E and F appear together in 3 carts (1, 2 and 5).
Support of the rule E -> F is P(E and F) = 3/8 = 37.5% (the fraction of all transactions that contain both E and F).
Confidence of the rule E -> F is P(F | E) = 3/3 = 100% (how often item F appears in transactions that contain E).
Support of the rule F -> E is P(F and E) = 3/8 = 37.5% (support is symmetric, so it is the same for both rules).
Confidence of the rule F -> E is P(E | F) = 3/4 = 75% (how often item E appears in transactions that contain F).
Both rules have the same support, so confidence decides: Rule 2 (E -> F), with 100% confidence versus 75% for Rule 1 (F -> E), is the better predictor of cross-sale.
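As a cross-check, both metrics can be recomputed from the eight carts. The sketch below assumes that the confidence of a rule X -> Y divides the number of carts containing both X and Y by the number of carts containing X, as in the definition given earlier; the helper names `count` and `rule_metrics` are hypothetical.

```python
from fractions import Fraction

# The eight carts from the question.
carts = [
    {"A", "C", "E", "F"},
    {"A", "F", "E"},
    {"D", "G"},
    {"A", "B", "C"},
    {"C", "E", "F"},
    {"G"},
    {"C", "F"},
    {"A", "B"},
]

def count(items):
    """Number of carts that contain every item in `items`."""
    return sum(1 for cart in carts if items <= cart)

def rule_metrics(antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    both = count(antecedent | consequent)
    return Fraction(both, len(carts)), Fraction(both, count(antecedent))

for lhs, rhs in [("F", "E"), ("E", "F")]:
    sup, conf = rule_metrics({lhs}, {rhs})
    print(f"{lhs} -> {rhs}: support = {float(sup):.1%}, confidence = {float(conf):.1%}")
# F -> E: support = 37.5%, confidence = 75.0%
# E -> F: support = 37.5%, confidence = 100.0%
```

Every cart that contains E also contains F, whereas cart 7 contains F without E, which is why the confidence differs between the two directions even though the support is identical.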