Questions about Rare Patterns
(from the Pattern Mining Course)

Question 1: To find rare itemsets in data, we need to define what "rare" means. The simplest definition is that rare itemsets are the infrequent itemsets, that is, those with a support less than minsup.

What are the problems with this definition?

There are generally too many infrequent itemsets. Moreover, some infrequent itemsets may not even appear in the data (they may have a support of 0).
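To make this concrete, here is a minimal Python sketch that enumerates every itemset over a small, made-up transaction database and counts the infrequent ones. The database, the minsup value (an absolute count), and the helper name support are assumptions chosen for illustration only.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "d"},
    {"b", "c"},
]
minsup = 2  # assumed absolute support threshold


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


items = sorted(set().union(*transactions))
# Every non-empty itemset that can be built from the items.
all_itemsets = [
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
]
infrequent = [x for x in all_itemsets if support(x, transactions) < minsup]
absent = [x for x in infrequent if support(x, transactions) == 0]

print(len(all_itemsets), len(infrequent), len(absent))  # prints: 15 9 4
```

Even with only four items, 9 of the 15 possible itemsets are infrequent, and 4 of those never occur in any transaction.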

Question 2: What is a minimal rare itemset?

An itemset X is a minimal rare itemset if sup(X) < minsup and all the proper subsets of X are frequent itemsets.
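As an illustration, the definition can be checked directly in Python. This is a sketch on a made-up database; the names support and is_minimal_rare are hypothetical, minsup is taken as an absolute count, and only the non-empty proper subsets are tested.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "d"},
    {"b", "c"},
]
minsup = 2  # assumed absolute support threshold


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def is_minimal_rare(itemset, transactions, minsup):
    """True if the itemset is infrequent and all its non-empty proper subsets are frequent."""
    if support(itemset, transactions) >= minsup:
        return False
    for k in range(1, len(itemset)):
        for sub in combinations(itemset, k):
            if support(frozenset(sub), transactions) < minsup:
                return False
    return True


items = sorted(set().union(*transactions))
all_itemsets = [
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
]
minimal_rare = [x for x in all_itemsets if is_minimal_rare(x, transactions, minsup)]
print([sorted(x) for x in minimal_rare])  # prints: [['d'], ['a', 'b', 'c']]
```

Here {a, d} is infrequent but not minimal, because its subset {d} is already infrequent.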

Question 3: What is a perfectly rare itemset?

An itemset X is a perfectly rare itemset if sup(X) ≥ minsup, sup(X) < maxsup, and for any non-empty subset Y ⊂ X, sup(Y) ≤ maxsup.
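The following Python sketch applies this definition literally to a made-up database. The data, the threshold values (absolute counts), and the names support and is_perfectly_rare are assumptions for illustration; the toy data is chosen so that no support equals maxsup exactly.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b"},
    {"a", "b"},
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d"},
    {"a", "b", "e"},
    {"a", "c"},
]
minsup, maxsup = 2, 4  # assumed absolute thresholds


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def is_perfectly_rare(itemset, transactions, minsup, maxsup):
    """Check minsup <= sup(X) < maxsup and sup(Y) <= maxsup for each non-empty proper subset Y."""
    if not (minsup <= support(itemset, transactions) < maxsup):
        return False
    for k in range(1, len(itemset)):
        for sub in combinations(itemset, k):
            if support(frozenset(sub), transactions) > maxsup:
                return False
    return True


items = sorted(set().union(*transactions))
all_itemsets = [
    frozenset(c)
    for k in range(1, len(items) + 1)
    for c in combinations(items, k)
]
perfectly_rare = [
    x for x in all_itemsets if is_perfectly_rare(x, transactions, minsup, maxsup)
]
print([sorted(x) for x in perfectly_rare])  # prints: [['c'], ['d'], ['c', 'd']]
```

In this data, {a, c} has a support of 3, which lies between the two thresholds, but it is not perfectly rare because its subset {a} has a support of 6.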

Question 4: Why is it challenging to design an algorithm to find the minimal rare itemsets compared to finding the frequent itemsets?

Generally, frequent itemset mining algorithms start from single items and combine them to find larger itemsets.

As itemsets become larger, their support can only stay the same or decrease.

Searching for minimal rare itemsets is challenging because, to reach them, we must "pass through" the frequent itemsets. In other words, we cannot search for minimal rare itemsets directly; the search has to traverse the frequent itemsets first.
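One way to picture this is a level-wise, Apriori-style search that grows frequent itemsets and reports a candidate as minimal rare the first time it turns out to be infrequent. The Python sketch below is only an illustration under assumptions: the database, the function name minimal_rare_levelwise, and the absolute minsup value are made up, and it is not presented as any particular published algorithm.

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "d"},
    {"b", "c"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def minimal_rare_levelwise(transactions, minsup):
    """Walk through the frequent itemsets level by level and collect the
    infrequent candidates met along the way."""
    items = sorted(set().union(*transactions))
    candidates = [frozenset([i]) for i in items]
    minimal_rare = []
    k = 1
    while candidates:
        frequent = set()
        for c in candidates:
            if support(c, transactions) >= minsup:
                frequent.add(c)         # keep extending this itemset
            else:
                minimal_rare.append(c)  # all its (k-1)-subsets were frequent
        k += 1
        # A k-itemset becomes a candidate only if all its (k-1)-subsets are frequent.
        next_candidates = set()
        for x in frequent:
            for y in frequent:
                union = x | y
                if len(union) == k and all(
                    frozenset(s) in frequent for s in combinations(union, k - 1)
                ):
                    next_candidates.add(union)
        candidates = sorted(next_candidates, key=sorted)
    return minimal_rare


result = minimal_rare_levelwise(transactions, minsup=2)
print([sorted(x) for x in result])  # prints: [['d'], ['a', 'b', 'c']]
```

The itemset {a, b, c} is only discovered after the frequent itemsets {a, b}, {a, c} and {b, c} have been generated and counted, which is the cost the answer above describes.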

Question 5: What is the main difference between Apriori and AprioriInverse?

Unlike Apriori, AprioriInverse has two parameters: minsup and maxsup.

From an algorithmic perspective, the main difference is that AprioriInverse first discards all items that have a support greater than maxsup. Then, AprioriInverse searches for frequent itemsets among the remaining items using the same procedure as Apriori.
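The two steps can be sketched in Python as follows. This is a simplified illustration, not the reference implementation of AprioriInverse: the database and the function name apriori_inverse_sketch are made up, thresholds are absolute counts, and the sketch keeps only items with a support strictly below maxsup (how to treat a support exactly equal to maxsup is an assumption; the toy data avoids that case).

```python
from itertools import combinations

# Hypothetical toy database: each transaction is a set of items.
transactions = [
    {"a", "b"},
    {"a", "b"},
    {"a", "b", "c", "d"},
    {"a", "b", "c", "d"},
    {"a", "b", "e"},
    {"a", "c"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def apriori_inverse_sketch(transactions, minsup, maxsup):
    items = sorted(set().union(*transactions))
    # Step 1: discard the items that are too frequent.
    kept = [i for i in items if support(frozenset([i]), transactions) < maxsup]
    # Step 2: Apriori-style level-wise search with minsup on the remaining items.
    level = [
        frozenset([i]) for i in kept
        if support(frozenset([i]), transactions) >= minsup
    ]
    result = []
    k = 1
    while level:
        result.extend(level)
        k += 1
        current = set(level)
        # Join step: combine k-1 itemsets into candidate k-itemsets.
        candidates = {x | y for x in current for y in current if len(x | y) == k}
        # Prune step: every (k-1)-subset of a candidate must have been kept.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        level = sorted(
            (c for c in candidates if support(c, transactions) >= minsup),
            key=sorted,
        )
    return result


found = apriori_inverse_sketch(transactions, minsup=2, maxsup=4)
print([sorted(x) for x in found])  # prints: [['c'], ['d'], ['c', 'd']]
```

Because a and b are removed in the first step, no itemset containing them is ever generated or counted.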