Questions about Correlated and Statistically-Significant Patterns
(from the Pattern Mining Course)
Transaction id | Items |
t1 | {a, c, d} |
t2 | {b, c, e} |
t3 | {a, b, c, e} |
t4 | {b, e} |
t5 | {a, b, c, e} |
What is the bond of itemset {c}?
What is the bond of itemset {c,e}?
What is the bond of itemset {a,b,c}?
The bond of {c} is sup({c}) / dsup({c}) = 4 / 4 = 1
The bond of {c,e} is sup({c,e}) / dsup({c,e}) = 3 / 5 = 0.6
The bond of {a,b,c} is sup({a,b,c}) / dsup({a,b,c}) = 2 / 5 = 0.4
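The bond values can be rechecked with a short script. This is only a minimal sketch: the function names sup, dsup and bond are chosen here for illustration, and the code assumes that every queried itemset has at least one item occurring in the database (otherwise dsup is 0 and the division fails).

```python
from fractions import Fraction

# Transaction database from the table above (transaction id -> items).
DATABASE = {
    "t1": {"a", "c", "d"},
    "t2": {"b", "c", "e"},
    "t3": {"a", "b", "c", "e"},
    "t4": {"b", "e"},
    "t5": {"a", "b", "c", "e"},
}


def sup(itemset, database=DATABASE):
    """Support: number of transactions that contain every item of the itemset."""
    return sum(1 for items in database.values() if itemset <= items)


def dsup(itemset, database=DATABASE):
    """Disjunctive support: number of transactions that contain at least one item."""
    return sum(1 for items in database.values() if itemset & items)


def bond(itemset, database=DATABASE):
    """Bond = sup / dsup, kept as an exact fraction."""
    return Fraction(sup(itemset, database), dsup(itemset, database))


print(bond({"c"}))            # 1
print(bond({"c", "e"}))       # 3/5
print(bond({"a", "b", "c"}))  # 2/5
```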
What is the all-confidence of itemset {c}?
What is the all-confidence of itemset {c,e}?
What is the all-confidence of itemset {a,b,c}?
The all-confidence of {c} is allconf({c}) = 4 / 4 = 1
The all-confidence of {c,e} is allconf({c,e}) = 3 / 4 = 0.75
The all-confidence of {a,b,c} is allconf({a,b,c}) = 2 / 4 = 0.5
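The all-confidence values can be rechecked in the same way. The sketch below assumes the usual definition, which is consistent with the answers above: the support of the itemset divided by the largest support of any single item it contains. The function names are chosen here for illustration.

```python
from fractions import Fraction

# Transaction database from the table above.
DATABASE = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def sup(itemset, database=DATABASE):
    """Support: number of transactions that contain every item of the itemset."""
    return sum(1 for items in database if itemset <= items)


def allconf(itemset, database=DATABASE):
    """All-confidence = sup(X) / max(sup({i}) for each item i in X)."""
    max_item_support = max(sup({item}, database) for item in itemset)
    return Fraction(sup(itemset, database), max_item_support)


print(allconf({"c"}))            # 1
print(allconf({"c", "e"}))       # 3/4
print(allconf({"a", "b", "c"}))  # 1/2
```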
What is the TIDLIST of itemset Z = X ∪ Y?
The TIDLIST of itemset Z = X ∪ Y is TIDLIST(Z) = TIDLIST(X) ∩ TIDLIST(Y) = {t1, t2, t4} ∩ {t2, t3, t4, t5} = {t2, t4}
What is the DTIDLIST of itemset Z = X ∪ Y?
The DTIDLIST of itemset Z = X ∪ Y is DTIDLIST(Z) = DTIDLIST(X) ∪ DTIDLIST(Y) = {t1, t2, t4} ∪ {t2, t3, t4, t5} = {t1, t2, t3, t4, t5}
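With Python sets, the two operations are a set intersection and a set union. The sketch below reuses the two lists from the answers above for both computations, as the answers do. The variable names are illustrative only.

```python
# Lists of transaction ids taken from the answers above.
list_x = {"t1", "t2", "t4"}
list_y = {"t2", "t3", "t4", "t5"}

# TIDLIST(Z) = TIDLIST(X) ∩ TIDLIST(Y): transactions containing both X and Y.
tidlist_z = list_x & list_y

# DTIDLIST(Z) = DTIDLIST(X) ∪ DTIDLIST(Y): transactions containing
# at least one item of X or of Y.
dtidlist_z = list_x | list_y

print(sorted(tidlist_z))   # ['t2', 't4']
print(sorted(dtidlist_z))  # ['t1', 't2', 't3', 't4', 't5']

# The sizes of the two lists give sup(Z) and dsup(Z).
print(len(tidlist_z), len(dtidlist_z))  # 2 5
```

Because sup(Z) and dsup(Z) are the sizes of these two lists, the bond of Z can be obtained without scanning the database again.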
If we design an algorithm to mine rare correlated itemsets using the all-confidence, what are the main properties that we can use to reduce the search space?
If we design an algorithm to mine frequent correlated itemsets using the all-confidence, we can use two properties to reduce the search space:
- The Apriori (anti-monotonicity) property of the support, i.e. the support of an itemset cannot be more than that of its subsets
- The Apriori (anti-monotonicity) property of the all-confidence, i.e. the all-confidence of an itemset cannot be more than that of its subsets
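If we design an algorithm to mine rare correlated itemsets using the all-confidence, we can use only one property to reduce the search space, because the support cannot be used for pruning (a frequent itemset may still have rare supersets):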
- The Apriori property of the all-confidence, i.e. the all-confidence of an itemset cannot be more than that of its subsets
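The pruning described above can be sketched as a small level-wise (Apriori-style) search. This is a hypothetical illustration rather than a specific published algorithm: the function mine_correlated, its parameters min_allconf and min_sup, and the thresholds in the usage lines are all assumptions made for the example.

```python
from itertools import combinations

# Transaction database from the table above.
DATABASE = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def sup(itemset, database):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for items in database if itemset <= items)


def allconf(itemset, database):
    """sup(X) divided by the largest support of a single item of X."""
    return sup(itemset, database) / max(sup(frozenset([i]), database) for i in itemset)


def mine_correlated(database, min_allconf, min_sup=1):
    """Return {itemset: support} for itemsets passing both thresholds.

    With min_sup=1 only the all-confidence prunes the search, which is the
    setting needed when rare itemsets must not be discarded.
    """
    level = [frozenset([i]) for i in sorted(set().union(*database))]
    result = {}
    while level:
        kept = []
        for candidate in level:
            support = sup(candidate, database)
            if support < min_sup:
                continue  # pruned by the anti-monotonicity of the support
            if allconf(candidate, database) < min_allconf:
                continue  # pruned by the anti-monotonicity of the all-confidence
            result[candidate] = support
            kept.append(candidate)
        # Build candidates one item larger whose subsets were all kept.
        kept_set = set(kept)
        next_level = set()
        for x, y in combinations(kept, 2):
            union = x | y
            if len(union) == len(x) + 1 and all(union - {i} in kept_set for i in union):
                next_level.add(union)
        level = list(next_level)
    return result


# Frequent correlated itemsets: both properties prune the search.
frequent = mine_correlated(DATABASE, min_allconf=0.7, min_sup=3)

# Rare correlated itemsets: prune with the all-confidence only, then keep
# the itemsets whose support is below a (hypothetical) threshold of 4.
correlated = mine_correlated(DATABASE, min_allconf=0.7)
rare = {itemset: s for itemset, s in correlated.items() if s < 4}
```

In the rare case the support filter is applied only after the search, since discarding a frequent itemset early would also discard its possibly rare supersets.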
A measure is null-invariant if its value for an itemset X is not influenced by transactions that contain none of the items of X (null transactions). In other words, if you add such transactions to a database or remove them from it, the value of the measure for X does not change. This is desirable because it makes the measure behave in a more stable way. For example, for the itemset {apple, orange}, the bond of {apple, orange} is not influenced by customers who buy neither apples nor oranges.
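This can be observed on the small database used in the earlier questions. The sketch below adds 100 hypothetical transactions containing only item d, which are null transactions for the itemset {a, c}. The relative support is included only as a contrasting measure chosen for this example: it changes, while the bond does not.

```python
from fractions import Fraction

# Transaction database from the earlier questions.
DATABASE = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def sup(itemset, database):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for items in database if itemset <= items)


def dsup(itemset, database):
    """Number of transactions that contain at least one item of the itemset."""
    return sum(1 for items in database if itemset & items)


def bond(itemset, database):
    return Fraction(sup(itemset, database), dsup(itemset, database))


def relative_support(itemset, database):
    """Support divided by the database size (not null-invariant)."""
    return Fraction(sup(itemset, database), len(database))


X = {"a", "c"}

# Hypothetical null transactions: they contain neither a nor c.
padded = DATABASE + [{"d"}] * 100

print(bond(X, DATABASE), bond(X, padded))                          # 3/4 3/4
print(relative_support(X, DATABASE), relative_support(X, padded))  # 3/5 1/35
```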