Questions about Frequent Itemsets and Association Rules
(from the Pattern Mining Course)
Click on a question to see the answer.
Transaction id | Items |
t1 | {a, c, d} |
t2 | {b, c, e} |
t3 | {a, b, c, e} |
t4 | {b, e} |
t5 | {a, b, c, e} |
If minsup = 2, write down all the frequent itemsets and their support.
The frequent itemsets are :
itemsets | support |
{a} | 3 |
{b} | 4 |
{c} | 4 |
{e} | 4 |
{a, b} | 2 |
{a, c} | 3 |
{a, e} | 2 |
{b, c} | 3 |
{b, e} | 4 |
{c, e} | 3 |
{a, b, c} | 2 |
{a, b, e} | 2 |
{a, c, e} | 2 |
{b, c, e} | 3 |
{a, b, c, e} | 2 |
Transaction id | Items |
t1 | {a, b, d, e} |
t2 | {b, c, e} |
t3 | {a, b, d, e} |
t4 | {a, b, c, e} |
t5 | {a, b, c, d, e} |
t6 | {b, c, d} |
What is the confidence and support of the rule {a} ==>{b, d, e}?
What is the confidence and support of the rule {b, d, e} ==>{a}?
The support and confidence of the rule {a} ==>{b, d, e} are 3 transactions and 0.75 (75%), respectively.
The support and confidence of the rule {b, d, e} ==>{a} are 3 transaction and 1 (100%), respectively
Let there be two itemsets X and Y. If X ⊂ Y, the support of Y is less than or equal to the support of X.
This property is important in frequent itemset mining because it allows to reduce the search space, that is to find all the frequent itemsets without having to explore all the possibilities. If we know that an itemset is infrequent, it is not necessary to explore its supersets.
The candidates will be {a,b,c} and {c,d,e}.
The vertical database is:
Item | Transaction list |
a | t1, t3, t5 |
b | t2, t3, t5 |
c | t1, t2, t3, t5 |
d | t2, t5 |
e | t1, t2, t3, t5 |
What is the minimum number of frequent itemsets in that database?
What is the maximum number of frequent itemsets in that database?
Generally, if there are n items in a database, there are at most 2^n - 1 frequent itemsets (excluding the empty itemset).
Since infrequent items cannot be part of a frequent itemset, we can ignore them for this question.
Thus, if n = 3, we have a maximum of 2^3 - 1 = 7 frequent itemsets.
Now what about the minimum? According to the question, we know that three items are frequent, so the minimum number of frequent itemsets in that database is 3.