Questions about Concise Representations of Itemsets
(from the Pattern Mining Course)
Click on a question to see the answer.
Transaction id | Items |
t1 | {a, c, d} |
t2 | {b, c, e} |
t3 | {a, b, c, e} |
t4 | {b, e} |
t5 | {a, b, c, e} |
If minsup = 2, write down all the frequent closed itemsets.
The frequent closed itemsets are:
frequent closed itemsets | support |
{c} | 4 |
{a, c} | 3 |
{b, e} | 4 |
{b, c, e} | 3 |
{a, b, c, e} | 2 |
The frequent maximal itemsets are:
frequent maximal itemsets | support |
{a, b, c, e} | 2 |
The frequent generator itemsets are:
frequent generator itemsets | support |
{} the empty set | 5 |
{a} | 3 |
{a,b} | 2 |
{a,e} | 2 |
{b} | 4 |
{b,c} | 3 |
{c} | 4 |
{c, e} | 3 |
{e} | 4 |
It can be observed that the empty set is also a frequent generator if we are very strict about the definition of generator.
Question 4: a) What is the advantage of discovering closed itemsets instead of maximal itemsets?
b) What is the advantage of discovering maximal itemsets instead of closed itemsets?
Both maximal and closed itemsets are concise representations of all itemsets.
The advantage of closed itemsets over maximal itemsets is that closed itemsets are lossless. This means that from the frequent closed itemsets, we can recover all the frequent itemsets and their support without having to read the database. Thus, we don't loose any information when we mine the closed itemsets. Differently, the maximal itemsets are not lossess. From the maximal itemsets, we can recover all the frequent itemsets but we don't know their support values. To find the support values, we need to read the database to calculate the support values. This is a disadvantage of maximal itemsets.
The main advantage of maximal itemsets over closed itemsets is that the set of maximal itemsets is a subset of that of closed itemsets. Thus, the number of maximal itemsets can be smaller than that of closed itemsets.
{a}, {b}, {c}, {a,b}, {b,c}, {a,c}, {a, b, c}
Write down all the frequent itemsets and their support.
{a}, support = 4
{b}, support = 4
{c}, support = 3
{a,b}, support = 4
{b,c}, support = 3
{a,c}, support = 3
{a, b, c} support = 3
An itemset can only be a generator if all its subsets are also generators.