# The Pattern Mining Course (BETA)

Philippe Fournier-Viger
Distinguished professor, Ph.D.
https://www.philippe-fournier-viger.com

## Introduction

This is a free online course about pattern mining. It is designed to introduce students and researchers to the main topics of pattern mining and to explain the key algorithms and concepts.

Pattern mining is a subfield of data mining that aims to apply algorithms to discover interesting patterns in data. These patterns can be used to understand the data or to support decision-making and tasks such as prediction.
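To make the idea of "discovering patterns" concrete, here is a minimal sketch in Python of one classic task, frequent itemset mining: finding the sets of items that appear together in at least a given number of transactions. The toy transactions, the function name `frequent_itemsets` and the `minsup` parameter are illustrative assumptions, not material from the lectures. The code uses naive enumeration rather than Apriori, Eclat or any other algorithm taught in this course.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for itemsets found in at least `minsup` transactions.

    Naive level-by-level enumeration, for illustration only (hypothetical helper).
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            # Support = number of transactions containing every item of the candidate.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= minsup:
                result[candidate] = support
                found_any = True
        if not found_any:
            # If no itemset of this size is frequent, no larger itemset can be,
            # because a superset never has a higher support than its subsets.
            break
    return result


# Toy transaction database (made up for this example).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

patterns = frequent_itemsets(transactions, minsup=2)
for itemset, support in sorted(patterns.items(), key=lambda kv: (len(kv[0]), kv[0])):
    print(itemset, support)
```

With a minimum support of 2, every single item and every pair of items is frequent in this toy database, while the set of all three items is not, since it appears in only one transaction.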

This course consists of multiple lectures, each accompanied by one or more videos. It is not necessary to watch all the content; you can skip topics as needed.

Note: this is a beta version of the course. Thus, this page will evolve over time with more content.

If you have any comments or suggestions, you may send me an e-mail or post a message in the data mining forum.

## Lectures

**1. Introduction**
- Introduction to data mining (pdf / ppt / video - 45 min)

**2. Frequent itemset mining and association rule mining**
- Frequent itemset mining and the Apriori Algorithm (pdf / ppt / video - 63 min)
- The Eclat algorithm (pdf / ppt / video - 37 min)
- Association analysis (pdf / ppt / video - 48 min)
- Additional resource(s):
  - Fournier-Viger, P., Lin, J. C.-W., Vo, B., Chi, T. T., Zhang, J., Le, H. B. (2017). A Survey of Itemset Mining. WIREs Data Mining and Knowledge Discovery, Wiley, e1207, 18 pages. DOI: 10.1002/widm.1207
  - Luna, J. M., Fournier-Viger, P., Ventura, S. (2019). Frequent Itemset Mining: a 25 Years Review. WIREs Data Mining and Knowledge Discovery, Wiley, 9(6):e1329. DOI: 10.1002/widm.1329

**3. Concise representations of patterns**
- Maximal, closed and generator itemsets (pdf / ppt / video - 50 min)

**4. Rare Pattern Mining**
- Rare itemset mining (the AprioriRare and AprioriInverse algorithms) (pdf / ppt / video - 38 min)

**5. Correlated and statistically significant patterns**
- Correlated and statistically significant itemsets (pdf / ppt / video - 49 min)

**6. High Utility Itemset Mining**
- High utility itemset mining (pdf / video - 18 min)
- The HUI-Miner and FHM algorithms (pdf / video - 45 min)

**7. Sequential pattern mining**
- An Introduction to Sequential Pattern Mining (pdf / ppt / video - 23 min)
- The PrefixSpan algorithm (pdf / ppt / video - 32 min)
- Additional resource(s):
  - Fournier-Viger, P., Lin, J. C.-W., Kiran, R. U., Koh, Y. S., Thomas, R. (2017). A Survey of Sequential Pattern Mining. Data Science and Pattern Recognition (DSPR), vol. 1(1), pp. 54-77.
- Tutorial: Using the SPMF software to discover frequent patterns in text documents (tutorial)

**8. Sequential rule mining**
- An Introduction to Sequential Rule Mining (pdf / ppt / video - 33 min)
- The CMRules algorithm (pdf / ppt / video - 32 min)

**9. Episode Mining**
- ...
- Exercises: Questions about episode mining

**11. Other topics**
- Periodic pattern mining (pdf / ppt / video - 34 min)
- Approximate pattern mining
- Frequent subgraph mining (pdf / ppt / video - 11 min)
- Interactive pattern mining
- Classification using patterns
- ...
- Exercises: Questions about periodic pattern mining, Questions about frequent subgraph mining

**10. ...**
- ...

## Software, source code and datasets

To try the pattern mining algorithms discussed in this course, you can download the SPMF data mining software. SPMF is open-source software offering over 230 algorithms. It is implemented in Java, and unofficial wrappers exist for some other languages. You can also find several public datasets for trying the algorithms on the datasets page of the SPMF website.
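As a rough sketch of how one might prepare a run of SPMF from a script, the Python code below writes a small transaction database in the plain-text format used by SPMF's itemset mining algorithms (one transaction per line, items as positive integers separated by single spaces) and builds a command line for the SPMF jar. Several things here are assumptions to check against the SPMF documentation: that the jar is named `spmf.jar` and sits in the working directory, and that Apriori takes the minimum support as a percentage such as `40%`. The helper names are hypothetical, and the code only prints the command rather than running it.

```python
import tempfile
from pathlib import Path


def write_transactions(transactions, path):
    """Write one transaction per line, items sorted and separated by single spaces."""
    lines = [" ".join(str(item) for item in sorted(set(t))) for t in transactions]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def build_spmf_command(algorithm, input_path, output_path, *params, jar="spmf.jar"):
    """Build the argument list for running an algorithm from the SPMF jar.

    The jar name and location are assumptions; adjust them to your installation.
    """
    return ["java", "-jar", jar, "run", algorithm,
            str(input_path), str(output_path), *map(str, params)]


# Toy database (made up for this example), written to a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    input_path = Path(tmp) / "input.txt"
    write_transactions([[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]], input_path)
    file_text = input_path.read_text(encoding="utf-8")

command = build_spmf_command("Apriori", "input.txt", "output.txt", "40%")
print(file_text)
print(" ".join(command))
# To actually run it (requires Java and the SPMF jar), one could pass `command`
# to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when file paths contain spaces.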

## More videos on pattern mining

If you want to see more videos on pattern mining, you may also check:
- The video page on the SPMF website: SPMF: A Java Open-Source Data Mining Library (philippe-fournier-viger.com)

• How can I contact you if I find an error in the course?
Please send me an e-mail and I will try to fix the error. You will then be listed as a contributor on this webpage.
You may consult the resources indicated on this page, as well as other videos on my YouTube channel. You can also try the algorithms discussed in this course by using the SPMF software, which is free and open-source. If you have questions, you can post them in the data mining forum. I check this forum every few days and will try to answer your questions.
• Can I use and modify your PowerPoints to teach a course at my university?
Yes, I will be very happy about this! The goal of this free course is to share knowledge. If you reuse my PowerPoints, I ask that you cite this website in your modified PPT and indicate that your PowerPoint is based on my content.

## Bibliography

This course is based on content from the research articles mentioned in the PPTs and PDFs, as well as information from the following books:

1. Fournier-Viger, P., Lin, J. C.-W., Vo, B., Nkambou, R., Tseng, V. S. (editors) (2019). High-Utility Pattern Mining: Theory, Algorithms and Applications. Springer.
2. Han and Kamber (2011). Data Mining: Concepts and Techniques, 3rd edition. Morgan Kaufmann Publishers.
3. Tan, Steinbach & Kumar (2006). Introduction to Data Mining. Pearson Education, ISBN-10: 0321321367.
4. Aggarwal (2015). Data Mining: The Textbook.
5. Zaki & Meira (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms.

## Contributors

Several people have provided feedback or ideas, or reported errors, related to this course:

• Chongsheng Zhang
• Wensheng Gan
• Tai Dinh
• ...
