Questions about Periodic Pattern Mining
(from the Pattern Mining Course)

Click on a question to see the answer.

Question 1: Explain a limitation of traditional of periodic pattern mining using only the maxper threshold.

A limitation of the traditional problem of periodic itemset mining is that it is too strict. If an itemset has only one period that is greater than maxper, it will be discarded as non periodic. This can eliminate some itemsets that are almost always periodic.

Question 2: Consider the following transaction database:
Transaction id Items
t1 {a, c}
t2 {e}
t3 {a, b, c, d, e}
t4 {b, c, d, e}
t5 {a, c, d}
t6 {a,c,e}
t7 {b, c, e}

What are the periods of itemset {b,c}?

What is the maximum periodicity of itemset {b,c}?

What is the average periodicity of itemset {b,c}?

The periods of {b,c} are ps({b,c}) = {3, 1, 3}

The maximum periodicity of itemset {b,c} is max({b,c}) = max({3, 1, 3}) = 3

The average periodicity of itemset {b,c} is avgper({b,c}) = (3 +1 +3) / 3 = 1.75

Question 3: What properties can be used in periodic pattern mining to reduce the search space?

Two important properties related to periodic patterns that can be used to reduce the search space are the following:

  • The maximum periodicity is monotonic. In other words, the maximum periodicity of an itemset Y cannot be less than the maximum periodicity of its subsets. This means that if an itemset X has a maximum periodicity that is greater than maxper, all the supersets of X will also have a maximum periodicity that is greater than maxper.
  • The average periodicity is monotonic. In other words, the average periodicity of an itemset Y cannot be less than the average periodicity of its subsets. This means that if an itemset X has an averageperiodicity that is greater than maxAvg, all the supersets of X will also have an average periodicity that is greater than maxAvg.
Question 4: Is there a relationship between the support of an itemset and its average periodicity?

Yes, the support of an itemset and its average periodicity are related to each other.

Let there be an itemset X and a database D. The relationship is: avgper(X) = |D|/(sup(X) + 1) where |D| is the number of transactions in the database and sup(X) is the number of transactions containing X.

For instance, if we know that the itemset {b,c} from question 1 appears in 3 transactions and that the database contains 7 transactions, we can derive the average periodicity as avgper({b,c}) = 7 / (3 + 1) = 1.75.

This property was shown and proven in the following paper:

Fournier-Viger, P., Lin, C.-W., Duong, Q.-H., Dam, T.-L., Sevcic, L., Uhrin, D., Voznak, M. (2016). PFPM: Discovering Periodic Frequent Patterns with Novel Periodicity Measures. Proc. 2nd Czech-China Scientific Conference 2016, Elsevier, 10 pages