## Algorithms

**SPMF** offers **implementations of the following data mining algorithms**.

### Sequential Pattern Mining

These algorithms discover sequential patterns in a set
of sequences. For a good overview of **sequential pattern mining
algorithms**, please read this survey
paper.
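To make the notion of a sequential pattern concrete, here is a minimal sketch (not SPMF code) that counts the support of one pattern, i.e. the number of sequences that contain it as a subsequence. The class name `SequentialPatternSupport`, the representation of a sequence as a list of itemsets, and the toy database are assumptions made for illustration; Java 9 or later is assumed for `List.of` and `Set.of`.

```java
import java.util.List;
import java.util.Set;

/** Illustrative sketch only: counts the support of one sequential pattern. Not SPMF code. */
final class SequentialPatternSupport {

    /** True if every itemset of the pattern is included, in order, in some itemset of the sequence. */
    static boolean contains(List<Set<String>> sequence, List<Set<String>> pattern) {
        int next = 0; // index of the next pattern itemset to match
        for (Set<String> itemset : sequence) {
            if (next < pattern.size() && itemset.containsAll(pattern.get(next))) {
                next++; // greedy earliest match is sufficient for subsequence containment
            }
        }
        return next == pattern.size();
    }

    /** Support = number of sequences of the database that contain the pattern. */
    static int support(List<List<Set<String>>> database, List<Set<String>> pattern) {
        int count = 0;
        for (List<Set<String>> sequence : database) {
            if (contains(sequence, pattern)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical toy database of three sequences
        List<List<Set<String>>> db = List.of(
            List.of(Set.of("a"), Set.of("b", "c"), Set.of("d")),
            List.of(Set.of("a", "b"), Set.of("c")),
            List.of(Set.of("b"), Set.of("a"), Set.of("c", "d")));
        List<Set<String>> pattern = List.of(Set.of("a"), Set.of("d"));
        System.out.println(support(db, pattern)); // 2
    }
}
```

A mining algorithm would report this pattern only if its support reaches a user-chosen minimum support threshold; the algorithms listed in this section differ mainly in how they avoid enumerating and testing every candidate this naively.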

- algorithms for **mining sequential patterns** (subsequences that appear in many sequences) of a sequence database
  - the **CM-SPADE** algorithm (Fournier-Viger et al., 2014, powerpoint)
  - the **CM-SPAM** algorithm (Fournier-Viger et al., 2014, powerpoint)
  - the **FAST** algorithm (Salvemini et al., 2011)
  - the **GSP** algorithm (Srikant et al., 1996)
  - the **LAPIN** (aka LAPIN-SPAM) algorithm (Yang et al., 2005)
  - the **PrefixSpan** algorithm (Pei et al., 2004, powerpoint, video)
  - the **SPADE** algorithm (Zaki et al., 2001)
  - the **SPAM** algorithm (Ayres et al., 2002)
- algorithms for **mining closed sequential patterns** in a sequence database
  - the **ClaSP** algorithm (Gomariz et al., 2013)
  - the **CM-ClaSP** algorithm (Fournier-Viger et al., 2014, powerpoint)
  - the **CloFAST** algorithm (Fumarola et al., 2016)
  - the **CloSpan** algorithm (Yan et al., 2003)
  - the **BIDE+** algorithm (Wang et al., 2007)
- algorithms for **mining maximal sequential patterns** in a sequence database
  - the **VMSP** algorithm (Fournier-Viger et al., 2014, powerpoint)
  - the **MaxSP** algorithm (Fournier-Viger et al., 2013, powerpoint)
- algorithms for **mining the top-k sequential patterns** in a sequence database
  - the **TKS** algorithm (Fournier-Viger et al., 2013, powerpoint)
  - the **TSP** algorithm (Tzvetkov et al., 2003)
  - the **Skopus** algorithm for mining the top-k sequential patterns using leverage and significance (Petitjean et al., 2016)
- algorithms for **mining sequential generator patterns** in a sequence database
  - the **VGEN** algorithm (Fournier-Viger et al., 2014)
  - the **FEAT** algorithm (Gao et al., 2008)
  - the **FSGP** algorithm (Yi et al., 2011)
- algorithms for mining **nonoverlapping sequential patterns** in one or many sequences of symbols/characters (can count multiple occurrences of a pattern in each sequence)
  - the **NOSEP** algorithm (Wu et al., 2018)
- algorithms for mining **compressing sequential patterns**
  - the **GoKrimp** and **SeqKrimp** algorithms (Lam et al., 2012; Lam et al., 2014)
- algorithm for identifying the **top-k quantile-based cohesive sequential patterns** in a single sequence or in multiple sequences
  - the **QCSP** algorithm (Feremans et al., 2019)
- algorithms for **mining multidimensional sequential patterns** in a multidimensional sequence database
  - the **SeqDIM** algorithm for mining **frequent multidimensional sequential patterns** in a multi-dimensional sequence database (Pinto et al., 2001)
  - the **Songram et al.** algorithm for mining **frequent closed multidimensional sequential patterns** in a multi-dimensional sequence database (Songram et al., 2006)
- the **Fournier-Viger et al. algorithm**, a **sequential pattern mining algorithm** that combines several features from well-known sequential pattern mining algorithms and also proposes some original features (Fournier-Viger et al., 2008):
  - mining sequences with minimum support by database projection (based on PrefixSpan, Pei et al., 2004)
  - mining sequences with min/max time interval between events and min/max time length of a sequence (based on Hirate-Yamana, 2006)
  - mining closed sequences (based on the BIDE+ algorithm by Wang et al., 2007)
  - mining multi-dimensional sequences (based on Pinto et al., 2001)
  - mining closed multi-dimensional sequences (based on Songram et al., 2006 and Pasquier et al., 1999)
  - mining sequences with items having integer values and performing automatic clustering of these values (original extension described in Fournier-Viger et al., 2008)

- algorithm for **mining high-utility sequential patterns** in a **sequence database**
  - the **USPAN** algorithm (Yin et al., 2012)
- algorithm for mining **cost-efficient sequential patterns** (a.k.a. **low-cost high utility sequential patterns**)
  - the **CorCEPB** algorithm for mining **cost-efficient patterns** in sequences with binary utility information and cost values (Fournier-Viger et al., 2020, ppt, video)
  - the **CEPB** algorithm for mining **cost-efficient patterns** in sequences with binary utility information and cost values, considering only sequences with positive utility (Fournier-Viger et al., 2020, ppt, video)
  - the **CEPN** algorithm for mining **cost-efficient patterns** in sequences with numeric utility information and cost values (Fournier-Viger et al., 2020, ppt, video)
- algorithm for **mining high-utility probability sequential patterns** in a **sequence database**
  - the **PHUSPM** algorithm (Zhang et al., 2018)
  - the **UHUSPM** algorithm (Zhang et al., 2018)
- algorithm for **progressive sequential pattern mining** with **convergence guarantees**
  - the **ProSecCo** algorithm (Servan-Schreiber et al., 2018)
- algorithms for **mining sequential patterns with flexible constraints** in a **time-extended sequence database** (e.g. MOOC data)
  - the **SPM-FC-L** algorithm (Song et al., 2022)
  - the **SPM-FC-P** algorithm (Song et al., 2022)
- the **Occur** algorithm for finding all occurrences of some sequential patterns in sequences by post-processing
- algorithms for mining **patterns** in **sequences of events** described by **time intervals** (i.e. Time Interval Related Pattern (TIRP) mining)
  - the **FastTIRP** algorithm (Fournier-Viger et al., 2022)
  - the **VertTIRP** algorithm (Mordvanyuk et al., 2021)

### Sequential Rule Mining

These algorithms discover sequential rules in a set of sequences.

- algorithms for **mining sequential rules in a sequence database**
  - the **ERMiner** algorithm (Fournier-Viger et al., 2014)
  - the **RuleGrowth** algorithm (Fournier-Viger et al., 2011; Fournier-Viger et al., 2015, powerpoint, video)
  - the **CMRules** algorithm
  - the **CMDeo** algorithm (Fournier-Viger et al., 2010)
  - the **RuleGen** algorithm
- algorithms for **mining sequential rules** in a **sequence database** with the **window size constraint**
  - the **TRuleGrowth** algorithm
- algorithms for **mining top-k sequential rules** in a **sequence database**
  - the **TopSeqRules** algorithm for mining the **top-k sequential rules** (Fournier-Viger et al., 2011, powerpoint)
  - the **TopSeqClassRules** algorithm for mining the **top-k class sequential rules** (a variation of Fournier-Viger et al., 2011)
  - the **TNS** algorithm for mining **top-k non-redundant sequential rules** (Fournier-Viger, 2013)
- algorithm for **mining high-utility sequential rules** in a **sequence database**
  - the **HUSRM** algorithm (Zida et al., 2015)

### Sequence Prediction

These algorithms **predict the next symbol of a sequence based on a set of training sequences**.

- the **Compact Prediction Tree+ (CPT+)** algorithm (Gueniche et al., 2015, powerpoint)
- the **Compact Prediction Tree (CPT)** algorithm (Gueniche et al., 2013)
- the **First order Markov Chains (PPM - order 1)** (Cleary et al., 1984)
- the **Dependency Graph (DG)** (Padmanabhan, 1996)
- the **All-k-Order Markov Chains (AKOM)** (Pitkow, 1999)
- the **TDAG** (Laird & Saul, 1994)
- the **LZ78** (Ziv, 1978)
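As a rough illustration of the simplest model in this list, a first-order Markov chain can be sketched as a table of transition counts: training counts how often each symbol directly follows another, and prediction returns the most frequent successor of the last observed symbol. The class `FirstOrderMarkovPredictor` and its method names are hypothetical and do not correspond to the SPMF API; ties between successors are broken arbitrarily here.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of first-order Markov next-symbol prediction. Not SPMF code. */
final class FirstOrderMarkovPredictor {

    // transitions.get(x).get(y) = number of times y directly followed x in the training data
    private final Map<String, Map<String, Integer>> transitions = new HashMap<>();

    /** Counts every pair of consecutive symbols in the training sequences. */
    void train(List<List<String>> sequences) {
        for (List<String> seq : sequences) {
            for (int i = 0; i + 1 < seq.size(); i++) {
                transitions.computeIfAbsent(seq.get(i), k -> new HashMap<>())
                           .merge(seq.get(i + 1), 1, Integer::sum);
            }
        }
    }

    /** Returns the most frequent successor of the last symbol, or null if it was never seen. */
    String predictNext(List<String> prefix) {
        if (prefix.isEmpty()) {
            return null;
        }
        Map<String, Integer> counts = transitions.get(prefix.get(prefix.size() - 1));
        if (counts == null) {
            return null;
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        FirstOrderMarkovPredictor predictor = new FirstOrderMarkovPredictor();
        predictor.train(List.of(
            List.of("a", "b", "c"),
            List.of("a", "b", "d"),
            List.of("x", "b", "c")));
        System.out.println(predictor.predictNext(List.of("a", "b"))); // c
    }
}
```

Because only the last symbol is used, this model ignores longer contexts; several of the other models listed above keep more of the history, at the cost of larger structures.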

### Itemset Mining

These algorithms discover interesting itemsets (sets of
values) that appear in a transaction database (database records containing
symbolic data). For a good overview of **itemset mining**,
please read this survey
paper.
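For readers new to the task, the following is a minimal, deliberately naive sketch of level-wise frequent itemset discovery in the spirit of Apriori. It is not the SPMF implementation: the class `NaiveApriori`, the in-memory representation of transactions as sets of strings, and the toy database are assumptions, and the sketch omits the subset-pruning and counting optimizations that real implementations rely on.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative level-wise frequent itemset miner. Not SPMF code and not optimized. */
final class NaiveApriori {

    /** Support = number of transactions that contain every item of the itemset. */
    static int support(List<Set<String>> db, Set<String> itemset) {
        int count = 0;
        for (Set<String> transaction : db) {
            if (transaction.containsAll(itemset)) {
                count++;
            }
        }
        return count;
    }

    /** Returns every itemset whose support is at least minsup, mapped to its support. */
    static Map<Set<String>, Integer> frequentItemsets(List<Set<String>> db, int minsup) {
        Map<Set<String>, Integer> frequent = new LinkedHashMap<>();
        // Level 1: every distinct item is a candidate
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> transaction : db) {
            for (String item : transaction) {
                candidates.add(new TreeSet<>(Set.of(item)));
            }
        }
        while (!candidates.isEmpty()) {
            // Keep the candidates of the current size k that are frequent
            List<Set<String>> level = new ArrayList<>();
            for (Set<String> candidate : candidates) {
                int sup = support(db, candidate);
                if (sup >= minsup) {
                    frequent.put(candidate, sup);
                    level.add(candidate);
                }
            }
            // Join pairs of frequent k-itemsets that differ by one item to get (k+1)-candidates
            candidates = new HashSet<>();
            for (int i = 0; i < level.size(); i++) {
                for (int j = i + 1; j < level.size(); j++) {
                    Set<String> union = new TreeSet<>(level.get(i));
                    union.addAll(level.get(j));
                    if (union.size() == level.get(i).size() + 1) {
                        candidates.add(union);
                    }
                }
            }
        }
        return frequent;
    }

    public static void main(String[] args) {
        // Hypothetical toy transaction database
        List<Set<String>> db = List.of(
            Set.of("a", "b", "c"), Set.of("a", "b"), Set.of("a", "c"),
            Set.of("b", "c"), Set.of("a", "b", "c"));
        Map<Set<String>, Integer> result = frequentItemsets(db, 3);
        System.out.println(result.size());              // 6
        System.out.println(result.get(Set.of("a", "b"))); // 3
    }
}
```

With a minimum support of 3, the three single items and the three pairs are frequent, while `{a, b, c}` (support 2) is not. The sketch rescans the whole database for every candidate, which is exactly the cost that the algorithms listed in this section are designed to reduce.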

- algorithms for discovering **frequent itemsets** in a transaction database
  - the **Apriori** algorithm (Agrawal & Srikant, 1994, video)
  - the **AprioriTID** algorithm (Agrawal & Srikant, 1994)
  - the **FP-Growth** algorithm (Han et al., 2004)
  - the **Eclat** algorithm (Zaki, 2000)
  - the **dEclat** algorithm (Zaki and Gouda, 2001, 2003)
  - the **Relim** algorithm (Borgelt, 2005)
  - the **H-Mine** algorithm (Pei et al., 2007)
  - the **LCMFreq** algorithm (Uno et al., 2004)
  - the **PrePost** and **PrePost+** algorithms (Deng et al., 2012; Deng & Lv, 2015)
  - the **FIN** algorithm (Deng et al., 2014)
  - the **DFIN** algorithm (Deng et al., 2016)
  - the **NegFIN** algorithm (Aryabarzan et al., 2018)
- algorithms for discovering **frequent closed itemsets** in a transaction database
  - the **FPClose** algorithm (Grahne and Zhu, 2005)
  - the **Charm** algorithm (Zaki and Hsiao, 2002)
  - the **dCharm** algorithm (Zaki and Gouda, 2001)
  - the **DCI_Closed** algorithm (Lucchese et al., 2004)
  - the **LCM** algorithm (Uno et al., 2004)
  - the **AprioriClose** aka **Close** algorithm (Pasquier et al., 1999)
  - the **AprioriTID Close** algorithm (Pasquier et al., 1999; Agrawal & Srikant, 1994)
  - the **NAFCP** algorithm (Le et al., 2015)
  - the **NEclatClosed** algorithm (Aryabarzan et al., 2021)
- algorithms for recovering all **frequent itemsets** from **frequent closed itemsets**:
  - the **LevelWise** algorithm (Pasquier et al., 1999)
  - the **DFI-Growth** algorithm (Huang et al., 2019)
  - the **DFI-List** algorithm (Wu et al., 2020)
- algorithms for discovering **frequent maximal itemsets** in a transaction database
  - the **FPMax** algorithm (Grahne and Zhu, 2003)
  - the **Charm-MFI** algorithm for discovering **frequent closed itemsets** and **maximal frequent itemsets** by post-processing in a transaction database (Szathmary et al., 2006)
- algorithms for **mining frequent itemsets** with **multiple minimum supports**
  - the **MSApriori** algorithm (Liu et al., 1999)
  - the **CFPGrowth++** algorithm (Uday & Reddy, 2011; Hu & Chen, 2006)
- algorithms for **mining generator itemsets** in a transaction database
  - the **DefMe** algorithm for mining **frequent generator itemsets** in a transaction database (Soulet & Rioult, 2014)
  - the **Pascal** algorithm for **mining frequent itemsets** and identifying at the same time which ones are **generators** (Bastide et al., 2002)
  - the **Zart** algorithm for discovering **frequent closed itemsets** and their **generators** in a transaction database (Szathmary et al., 2007)
- algorithms for **mining rare itemsets** and/or **correlated itemsets** in a transaction database
  - the **AprioriInverse** algorithm for mining **perfectly rare itemsets** (Koh & Rountree, 2005)
  - the **AprioriRare** algorithm for mining **minimal rare itemsets** and **frequent itemsets** (Szathmary et al., 2007b)
  - the **CORI** algorithm for mining **minimal rare correlated itemsets** using the **support** and **bond measures** (Bouasker et al., 2015)
  - the **RP-Growth** algorithm for mining rare itemsets (Tsang et al., 2011)
- algorithms for performing **targeted and dynamic queries about association rules and frequent itemsets**
  - the **Itemset-Tree**, a data structure that can be updated incrementally, and algorithms for querying it (Kubat et al., 2003)
  - the **Memory-Efficient Itemset-Tree**
- algorithms **to discover frequent itemsets** in a **stream**
  - the **estDec** algorithm for mining **recent frequent itemsets** in a data stream (Chang & Lee, 2003)
  - the **estDec+** algorithm for mining **recent frequent itemsets** in a data stream (Shin et al., 2014)
  - the **CloStream** algorithm for mining **frequent closed itemsets** in a data stream (Yen et al., 2009)
- the **U-Apriori** algorithm for mining **frequent itemsets** in **uncertain data** (Chui et al., 2007)
- the **VME** algorithm for mining **erasable itemsets** (Deng & Xu, 2010)
- algorithms to discover **fuzzy frequent itemsets** in a **quantitative transaction database**
  - the **FFI-Miner** algorithm for mining **fuzzy itemsets** (Lin et al., 2015)
  - the **MFFI-Miner** algorithm for mining **multiple fuzzy itemsets** (Lin et al., 2016)
- the **OPUS-Miner** algorithm for mining self-sufficient itemsets (Webb et al., 2014)
- algorithms to **discover compressing itemsets**
  - the **KRIMP** algorithm (Vreeken et al., 2011)
  - the **SLIM** algorithm (Smets et al., 2012)
- algorithms to discover the **top-k most frequent itemsets**
  - the **Apriori(top-k)** algorithm, which is a modified version of **Apriori**
  - the **FPGrowth(top-k)** algorithm, which is a modified version of **FP-Growth**

### Episode Mining

These algorithms discover **patterns** (episodes) that appear in a **single sequence of events**. For a good overview of **episode mining**, please read this survey paper.

- algorithms for mining **frequent episodes**
  - the **EMMA** algorithm, which finds the frequent episodes and counts the support based on the head frequency (Kuo-Yu et al., 2008)
  - the **AFEM** algorithm, which finds the frequent episodes and counts the support based on the head frequency (Fournier-Viger et al., 2022)
  - the **MINEPI+** algorithm, which finds the frequent episodes and counts the support based on the head frequency (Kuo-Yu et al., 2008)
  - the **MINEPI** algorithm, which finds the frequent episodes, counts the support based on minimal occurrences, and does not allow simultaneous events (Mannila & Toivonen, 1997)
  - the **TKE** algorithm, which finds the **top-k most frequent episodes** based on the head frequency (Fournier-Viger et al., 2020)
  - the **MaxFEM** algorithm, which finds the **maximal frequent episodes** and counts the support based on the head frequency (Fournier-Viger et al., 2022, powerpoint)
- algorithms for mining **episode rules**
  - the **POERM** algorithm for discovering **partially-ordered episode rules** in a sequence of events, using non-overlapping support (Fournier-Viger et al., 2021, powerpoint)
  - the **POERM-ALL** algorithm for discovering **partially-ordered episode rules** in a sequence of events, using non-overlapping support (Fournier-Viger et al., 2021, powerpoint)
  - the **POERMH** algorithm for discovering **partially-ordered episode rules** in a sequence of events, using the head support (Fournier-Viger et al., 2021, powerpoint)
  - the **NONEPI** algorithm for discovering episode rules using the non-overlapping frequency (Ouarem et al., 2021)
  - algorithms to generate **episode rules** (Mannila & Toivonen, 1997) using the output of **TKE**, **AFEM**, **EMMA** or **MINEPI+**
- algorithms for mining **high utility episodes** in a sequence of complex events (a transaction database) with utility information
  - the **HUE-SPAN** algorithm (Fournier-Viger et al., 2019, powerpoint) for mining **high utility episodes** in a sequence of complex events (a transaction database) with utility information
  - the **UP-SPAN** algorithm (Wu et al., 2013) for mining **high utility episodes** in a sequence of complex events (a transaction database) with utility information
- the **TUP** algorithm (Rathore et al., 2016) for mining **the top-k high utility episodes** in a sequence of complex events (a transaction database) with utility information
- algorithms for mining **nonoverlapping sequential patterns** in one or many sequences of symbols
  - the **NOSEP** algorithm (Wu et al., 2018)
- algorithms for **mining frequent sequential patterns with periodic wildcard gaps** in a sequence of characters
  - the **MAPD** algorithm (Wu, Y. et al., 2014)
- algorithms for **mining self-adaptive one-off weak-gap strong sequential patterns** in a sequence of characters
  - the **OWSP-Miner** algorithm (Wu, Y. et al., 2022)

### Periodic Pattern Mining

These algorithms discover **patterns** that periodically appear in a **sequence of records** (e.g. transactions).

- Algorithms for **finding periodic patterns** in a **single sequence of events** (also called a transaction database)
  - the **PFPM** algorithm (Fournier-Viger et al., 2016a, powerpoint, video) for mining **frequent periodic patterns** in a sequence of transactions (a transaction database)
- Algorithms for **mining stable periodic itemsets** in a sequence of events (also called a transaction database) with or without timestamps
  - the **SPP-Growth** algorithm for mining **stable periodic patterns** (Fournier-Viger et al., 2019, powerpoint, video)
  - the **TSPIN** algorithm to discover the **Top-k Stable Periodic frequent itemsets** (Fournier-Viger et al., 2021)
- Algorithms for **mining locally periodic patterns** in a transaction database with or without timestamps
  - the **LPP-Growth** algorithm (Fournier-Viger, 2020, powerpoint)
  - the **LPPM_breadth** algorithm (Fournier-Viger, 2020, powerpoint)
  - the **LPPM_depth** algorithm (Fournier-Viger, 2020, powerpoint)
- Algorithms for discovering **periodic patterns that are significant** or **non-redundant**
  - the **NPFPM** algorithm to discover **non-redundant periodic frequent itemsets** in a transaction database (Afriyie et al., 2020, 2021)
  - the **PPFP** algorithm to discover **productive periodic frequent itemsets** in a transaction database (Nofong, V. M., 2016)
  - the **SRPFPM** algorithm to discover **self-reliant periodic frequent patterns** in a transaction database (Nofong et al., 2021)
- Algorithms for **mining periodic high utility itemsets** in a sequence of transactions (a transaction database) containing utility information
  - the **PHM** algorithm (Fournier-Viger et al., 2016b, powerpoint)
  - the **PHMN** algorithm (2023) for **mining periodic high utility itemsets with positive or negative utility**
  - the **PHMN+** algorithm (2023) for **mining periodic high utility itemsets with positive or negative utility**
  - the **PHM_irregular** algorithm for **mining irregular high utility itemsets** (the opposite of periodic), which is a simple variation of the **PHM algorithm**
- Algorithms for finding **periodic patterns** in **multiple sequences of events**
  - the **MPFPS_BFS** algorithm (Fournier-Viger, P., Li, Z., et al., 2019, powerpoint) for mining **periodic patterns that are common to multiple sequences**
  - the **MPFPS_DFS** algorithm (Fournier-Viger, P., Li, Z., et al., 2019, powerpoint) for mining **periodic patterns that are common to multiple sequences**
- Algorithms for **mining rare correlated periodic patterns common to multiple sequences**
  - the **MRCPPS** algorithm (Fournier-Viger et al., 2020)

### Graph Pattern Mining

These algorithms discover patterns in **graphs**.

- Algorithms for **mining frequent subgraphs**
  - the **TKG** algorithm for mining the **top-k frequent subgraphs** in a graph database (Fournier-Viger, 2019, powerpoint)
  - the **gSpan** algorithm for mining all the **frequent subgraphs** in a graph database (Yan et al., 2002)
  - the **cgSpan** algorithm for mining all the **frequent closed subgraphs in a graph database or single graph** (using the traditional support or MNI support) (Shaul et al., 2021)
- Algorithms for **mining patterns in a dynamic attributed graph**
  - the **TSeqMiner** algorithm (Fournier-Viger et al., 2019)
  - the **AER-Miner** algorithm (Fournier-Viger et al., 2020, PPT)

### High-Utility Pattern Mining

These algorithms discover patterns having a high utility
(importance) in different kinds of data. For a good overview of
**high utility itemset mining**, you may read this survey
paper, and the high-utility pattern mining book.
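As an illustration of what "utility" can mean for an itemset, the sketch below uses the common profit-based definition, which is an assumption here since this page does not define it: the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, and its utility in the database is the sum over the transactions that contain all of its items. The class `ItemsetUtility` and the toy data are hypothetical and are not part of SPMF.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch: utility of an itemset under an assumed quantity x unit-profit model. */
final class ItemsetUtility {

    /**
     * Sums quantity * unit profit of the itemset's items over every transaction
     * that contains the whole itemset. Assumes every item has a unit profit.
     */
    static int utility(List<Map<String, Integer>> db,
                       Map<String, Integer> unitProfit,
                       Set<String> itemset) {
        int total = 0;
        for (Map<String, Integer> transaction : db) {
            if (!transaction.keySet().containsAll(itemset)) {
                continue; // the itemset does not appear in this transaction
            }
            for (String item : itemset) {
                total += transaction.get(item) * unitProfit.get(item);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical data: each transaction maps an item to its purchased quantity
        List<Map<String, Integer>> db = List.of(
            Map.of("a", 1, "b", 5, "c", 1),
            Map.of("b", 4, "c", 3),
            Map.of("a", 2, "b", 1));
        Map<String, Integer> unitProfit = Map.of("a", 5, "b", 2, "c", 1);

        System.out.println(utility(db, unitProfit, Set.of("a", "b"))); // 27
        System.out.println(utility(db, unitProfit, Set.of("b")));      // 20
    }
}
```

In this toy example `{a, b}` has a higher utility than its subset `{b}`, so a superset can be more useful than its subsets. This is why support-style pruning does not carry over directly to this setting.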

- algorithms for mining **high-utility itemsets** in a transaction database having profit information
  - the **EFIM** algorithm (Zida et al., 2016; Zida et al., 2015, powerpoint)
  - the **FHM** algorithm (Fournier-Viger et al., 2014, powerpoint, video)
  - the **HUI-Miner** algorithm (Liu & Qu, 2012, video)
  - the **HUP-Miner** algorithm (Krishnamoorthy, 2014)
  - the **mHUIMiner** algorithm (Peng et al., 2017)
  - the **UFH** algorithm (Dawar et al., 2017)
  - the **HMiner** algorithm (Krishnamoorthy, 2017)
  - the **ULB-Miner** algorithm (Duong et al., 2018)
  - the **IHUP** algorithm (Ahmed et al., 2009)
  - the **Two-Phase** algorithm (Liu et al., 2005)
  - the **UP-Growth** algorithm (Tseng et al., 2011)
  - the **UP-Growth+** algorithm (Tseng et al., 2013)
  - the **UP-Hist** algorithm (Dawar et al., 2015)
  - the **d2HUP** algorithm (Liu et al., 2012)
  - the **FHIM** algorithm (Sahoo et al., 2015)
- algorithm for efficiently mining **high-utility itemsets with length constraints** in a transaction database
  - the **FHM+** algorithm (Fournier-Viger et al., 2016, powerpoint)
- algorithm for **mining correlated high-utility itemsets** in a transaction database
  - the **FCHM_bond** algorithm, to use the **bond** measure (Fournier-Viger et al., 2016, powerpoint; Fournier-Viger et al., 2018, to appear, video)
  - the **FCHM_allconfidence** algorithm, to use the **all-confidence** measure (Fournier-Viger et al., 2016, powerpoint; Fournier-Viger et al., 2018, to appear)
  - the **ECHUM** algorithm (Ramesth et al., 2022, @aman955 under the GPL license)
- algorithm for mining **high-utility itemsets** in a transaction database containing **negative unit profit values**
  - the **FHN** algorithm (Fournier-Viger et al., 2014, powerpoint)
  - the **HUINIV-Mine** algorithm (Chu et al., 2009)
- algorithms for mining **multi-level or cross-level high utility itemsets** in a transaction database with a **taxonomy**:
  - **CLH-Miner** for discovering cross-level high-utility itemsets (Fournier-Viger et al., 2020, ppt)
  - **FEACP** for discovering cross-level high-utility itemsets (Tung et al., 2022)
  - **TKC** for discovering the top-k cross-level high-utility itemsets (Nouioua et al., 2020, video, ppt); will be in a next release of SPMF
  - **MLHUI-Miner** for discovering the multi-level high utility itemsets (Cagliero et al., 2017)

- algorithm for mining **low-cost high utility itemsets** in a transaction database with cost and utility information
  - the **LCIM** algorithm (Fournier-Viger et al., 2022)
- algorithm for mining **frequent high-utility itemsets** in a transaction database
  - the **FHMFreq** algorithm, a variation of the **FHM** algorithm (Fournier-Viger et al., 2014)
- algorithm for **mining on-shelf high-utility itemsets** in a transaction database containing information about time periods of items
  - the **FOSHU** algorithm (Fournier-Viger et al., 2015, powerpoint)
  - the **TS-HOUN** algorithm (Lan et al., 2014)
- algorithm for **incremental high-utility itemset mining**
  - the **EIHI** algorithm (Fournier-Viger et al., 2015, powerpoint)
  - the **HUI-LIST-INS** algorithm (Lin et al., 2014)
- algorithm for **incremental closed high-utility itemset mining**
  - the **IncCHUI** algorithm for **incrementally discovering the closed high utility itemsets** (Dam et al., 2018, code obtained from GitHub under the GPL license)
- algorithm for **mining** concise representations of **high-utility itemsets** in a transaction database
  - the **HUG-Miner** algorithm (Fournier-Viger et al., 2014, powerpoint) for mining high-utility generators
  - the **GHUI-Miner** algorithm (Fournier-Viger et al., 2014, powerpoint) for mining generators of high-utility itemsets
  - the **MinFHM** algorithm (Fournier-Viger et al., 2016, powerpoint, video) for **mining minimal high-utility itemsets**
  - the **EFIM-Closed** algorithm (Fournier-Viger et al., 2016, powerpoint) for mining **closed high-utility itemsets**
  - the **CHUI-Miner** algorithm (Wu et al., 2015) for mining closed high-utility itemsets
  - the **CLS-Miner** algorithm (Dam et al., 2019) for mining closed high-utility itemsets
  - the **HMiner_Closed** algorithm (Nguyen et al., 2019) for mining closed high-utility itemsets
  - the **CHUD** algorithm for mining closed high-utility itemsets (Tseng et al., 2011/2015)
  - the **CHUI-Miner(Max)** algorithm for mining **maximal high utility itemsets** (Wu et al., 2019)
  - the **HUCI_Miner** algorithm for simultaneously mining **closed high utility itemsets** and **high utility generators** (Sahoo et al., 2015)
- algorithm for mining the **skyline high-utility itemsets**
  - the **SkyMine** algorithm (Goyal et al., 2015)
  - the **SFUI_UF** algorithm for mining **skyline frequent high utility itemsets** (Song et al., 2021)
  - the **SFU_CE** algorithm for mining **skyline frequent high utility itemsets** (Song et al., 2021, ppt)
  - the **SFUPMinerUemax** algorithm for mining **skyline frequent high utility itemsets** (Lin et al., 2016)
  - the **EMSFUI_D** algorithm for mining **skyline frequent high utility itemsets** (Liu et al., 2022)
  - the **EMSFUI_B** algorithm for mining **skyline frequent high utility itemsets** (Liu et al., 2022)
- algorithm for mining the **top-k high-utility itemsets**
  - the **TKU** algorithm (Tseng et al., 2015), obtained from UP-Miner under the GPL license
  - the **TKO-Basic** algorithm (Tseng et al., 2015)
  - the **THUI** algorithm (Krishnamoorthy, 2019)
- algorithms for mining the **top-k high utility itemsets** from a **data stream** with a window
  - the **FHMDS** and **FHMDS-Naive** algorithms (Dawar et al., 2017)
- algorithm for **mining quantitative high utility itemsets** in a transaction database:
  - the **FHUQI-Miner** algorithm (Nouioua et al., 2021, powerpoint)
  - the **VHUQI** algorithm (Wu et al., 2014)
  - the **TKQ** algorithm for mining the quantitative high utility itemsets (Nouioua et al., 2021, powerpoint, video)
  - the **CHUQI-Miner** algorithm for mining the correlated quantitative high utility itemsets (Nouioua et al., 2021, powerpoint, video)
- algorithm for **mining high-utility sequential rules** in a **sequence database**
  - the **HUSRM** algorithm (Zida et al., 2015)
- algorithm for **mining high-utility sequential patterns** in a **sequence database**
  - the **USPAN** algorithm (Yin et al., 2012)
- algorithm for mining **cost-efficient sequential patterns** (a.k.a. **low-cost high utility sequential patterns**)
  - the **CorCEPB** algorithm for mining **cost-efficient patterns** in sequences with binary utility information and cost values (Fournier-Viger et al., 2020, ppt, video)
  - the **CEPB** algorithm for mining **cost-efficient patterns** in sequences with binary utility information and cost values, considering only sequences with positive utility (Fournier-Viger et al., 2020, ppt)
  - the **CEPN** algorithm for mining **cost-efficient patterns** in sequences with numeric utility information and cost values (Fournier-Viger et al., 2020, ppt, video)
- algorithm for **mining high-utility probability sequential patterns** in a **sequence database**
  - the **PHUSPM** algorithm (Zhang et al., 2018)
  - the **UHUSPM** algorithm (Zhang et al., 2018)
- algorithm for **heuristically mining the top-k high-utility itemsets**
  - the **TKU-CE** algorithm (Song et al., 2021)
  - the **TKU-CE+** algorithm (Song et al., 2021)
- algorithm for **mining high-utility itemsets** using **evolutionary algorithms, swarm intelligence techniques** or other **meta-heuristics**
  - the **HUIM-AF** algorithm (Song et al., 2021)
  - the **HUIM-HC** algorithm (Fournier-Viger et al., 2021)
  - the **HUIM-SA** algorithm (Fournier-Viger et al., 2021)
  - the **HUIM-ACO** algorithm (Song et al., 2020)
  - the **HUIM-SPSO** algorithm (Song et al., 2020)
  - the **HUIF-PSO** algorithm (Song et al., 2018)
  - the **HUIF-GA** algorithm (Song et al., 2018)
  - the **HUIF-BA** algorithm (Song et al., 2018)
  - the **HUIM-ABC** algorithm (Song et al., 2018)
  - the **HUIM-GA** algorithm (Kannimuthu et al., 2014)
  - the **HUIM-BPSO** algorithm (Lin et al., 2016)
  - the **HUIM-GA-tree** algorithm (Lin et al., 2016)
  - the **HUIM-BPSO-tree** algorithm (Lin et al., 2016)
- algorithm for **mining high average-utility itemsets** in a transaction database
  - the **HAUI-Miner** algorithm for mining **high average-utility itemsets** (Lin et al., 2016)
  - the **EHAUPM** algorithm for mining **high average-utility itemsets** (Lin et al., 2017)
  - the **HAUIM-GMU** algorithm for mining **high average-utility itemsets** (Song et al., 2021)
  - the **HAUI-MMAU** algorithm for mining **high average-utility itemsets with multiple thresholds** (Lin et al., 2016)
  - the **MEMU** algorithm for mining **high average-utility itemsets with multiple thresholds** (Lin et al., 2018)
- algorithms for mining the **top-k high average utility itemsets**
  - the **ETAUIM** algorithm (2023) for mining the **top-k high average utility itemsets** using a breadth-first search (obtained from GitHub liuxuan615 under the GPL license because it contains GPL code)
- algorithms for mining **high utility episodes** in a **sequence of complex events** (a transaction database)
  - the **HUE-SPAN** algorithm (Fournier-Viger et al., 2019, powerpoint) for mining **high utility episodes** in a sequence of complex events (a transaction database) with utility information
  - the **TUP** algorithm (Rathore et al., 2016) for mining **top-k high utility episodes** in a sequence of transactions (a transaction database)
  - the **UP-SPAN** algorithm (Wu et al., 2013) for mining **high-utility episodes** (patterns that yield a high profit) in a sequence of transactions (a transaction database) containing utility information
- algorithms for mining **periodic high-utility patterns** (periodic patterns that yield a high profit) in a sequence of transactions (a transaction database) containing utility information
  - the **PHM** algorithm (Fournier-Viger et al., 2016b, powerpoint)
  - the **PHMN** and **PHMN+** algorithms (2023) for **mining periodic high utility itemsets with positive or negative utility** (obtained from GitHub @laughing1999 under the GPL license)
- algorithms for discovering **irregular high utility itemsets** (non-periodic patterns) in a transaction database with utility information
  - the **PHM_irregular** algorithm, which is a simple variation of the **PHM algorithm**
- algorithm for discovering **local high utility itemsets** in a database with utility information and timestamps
  - the **LHUI-Miner** algorithm (Fournier-Viger et al., 2019, powerpoint)
- algorithm for discovering **peak high utility itemsets** in a database with utility information and timestamps
  - the **PHUI-Miner** algorithm (Fournier-Viger et al., 2019, powerpoint)
- algorithm for discovering **locally trending high utility itemsets** in a database with utility information and timestamps
  - the **LTHUI-Miner** algorithm (Fournier-Viger et al., 2020, video, ppt)
- algorithm for discovering **high utility association rules**
  - the **HGB_all** algorithm to derive **all high utility association rules** or just the non-redundant high utility association rules (Sahoo et al., 2015)
  - the **HGB** algorithm to derive **all non-redundant high utility association rules** (Sahoo et al., 2015)

### Association Rule Mining

These algorithms discover interesting associations between symbols (values) in a transaction database (database records with binary attributes).

- an algorithm for mining **all association rules** with the **confidence measure** in a transaction database (Agrawal & Srikant, 1994)
- an algorithm for mining **all association rules** with the **lift measure** in a transaction database (adapted from Agrawal & Srikant, 1994)
- an algorithm for mining the **IGB informative and generic basis of association rules** in a transaction database (Gasmi et al., 2005)
- an algorithm for mining **perfectly sporadic association rules** (Koh & Rountree, 2005)
- an algorithm for mining **closed association rules** (Szathmary et al., 2006)
- an algorithm for mining **minimal non-redundant association rules** (Kryszkiewicz, 1998)
- the **Indirect** algorithm for mining **indirect association rules** (Tan et al., 2000; Tan et al., 2006)
- the **FHSAR** algorithm for **hiding sensitive association rules** (Weng et al., 2008)
- the **TopKRules** algorithm for mining the **top-k association rules** (Fournier-Viger, 2012b, powerpoint)
- the **TopKClassRules** algorithm for mining the **top-k class association rules** (a variation of TopKRules, which is described in Fournier-Viger, 2012b, powerpoint)
- the **TNR** algorithm for mining **top-k non-redundant association rules** (Fournier-Viger, 2012d, powerpoint)
- the **HGB** and **HGB_All** algorithms to find **high utility association rules** or **non-redundant high utility association rules** (Sahoo et al., 2015)
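The confidence and lift measures mentioned above can be computed directly from support counts. The sketch below uses the standard definitions, confidence(X → Y) = sup(X ∪ Y) / sup(X) and lift(X → Y) = sup(X ∪ Y) · N / (sup(X) · sup(Y)) for a database of N transactions. The class `RuleMeasures` and the toy database are assumptions for illustration and are not part of SPMF.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of the confidence and lift of an association rule X -> Y. Not SPMF code. */
final class RuleMeasures {

    /** Number of transactions that contain every item of the itemset. */
    static int support(List<Set<String>> db, Set<String> itemset) {
        int count = 0;
        for (Set<String> transaction : db) {
            if (transaction.containsAll(itemset)) {
                count++;
            }
        }
        return count;
    }

    private static Set<String> union(Set<String> x, Set<String> y) {
        Set<String> both = new HashSet<>(x);
        both.addAll(y);
        return both;
    }

    /** confidence(X -> Y) = sup(X u Y) / sup(X); NaN if X never occurs. */
    static double confidence(List<Set<String>> db, Set<String> x, Set<String> y) {
        return (double) support(db, union(x, y)) / support(db, x);
    }

    /** lift(X -> Y) = sup(X u Y) * N / (sup(X) * sup(Y)); a value near 1 suggests independence. */
    static double lift(List<Set<String>> db, Set<String> x, Set<String> y) {
        double supXY = support(db, union(x, y));
        return supXY * db.size() / ((double) support(db, x) * support(db, y));
    }

    public static void main(String[] args) {
        // Hypothetical toy transaction database
        List<Set<String>> db = List.of(
            Set.of("a", "b", "c"), Set.of("a", "b"), Set.of("a", "c"),
            Set.of("b", "c"), Set.of("a", "b", "c"));
        System.out.println(confidence(db, Set.of("a"), Set.of("b"))); // 0.75
        System.out.println(lift(db, Set.of("a"), Set.of("b")));       // 0.9375
    }
}
```

Here the rule `{a} → {b}` holds in 3 of the 4 transactions containing `a`, but its lift is below 1 because `b` is already present in 4 of the 5 transactions.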

### Stream pattern mining

These algorithms discover various kinds of patterns in a stream (an infinite sequence of database records, i.e. transactions).

- the **estDec** algorithm for mining **recent frequent itemsets** in a data stream (Chang & Lee, 2003)
- the **estDec+** algorithm for mining **recent frequent itemsets** in a data stream (Shin et al., 2014)
- the **CloStream** algorithm for mining **frequent closed itemsets** in a data stream (Yen et al., 2009)
- algorithms for mining the **top-k high utility itemsets** from a **data stream** with a window
  - the **FHMDS** and **FHMDS-Naive** algorithms (Dawar et al., 2017)

### Clustering

These algorithms automatically find clusters in different kinds of data.

- the original **K-Means** algorithm (MacQueen, 1967)
- the **Bisecting K-Means** algorithm (Steinbach et al., 2000)
- the **K-Means++** algorithm (Arthur et al., 2007)
- algorithms for **density-based clustering**
  - the **DBScan** algorithm (Ester et al., 1996)
  - the **Optics** algorithm to extract a **cluster ordering of points**, which can then be used to generate DBScan-style clusters and more (Ankerst et al., 1999)
- a **hierarchical clustering** algorithm
- a tool called **Cluster Viewer** for **visualizing clusters**
- a tool called **Instance Viewer** for **visualizing the input of clustering algorithms**
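
As an illustration of what a K-Means style algorithm does, here is a minimal sketch of the common batch (Lloyd-style) formulation: assign each point to its nearest centroid, then move each centroid to the mean of its cluster, and repeat until nothing changes. This is not necessarily how SPMF implements K-Means; the function names, the choice to pass initial centroids explicitly, and the handling of empty clusters are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty collection of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points: List[Point], initial_centroids: List[Point],
            max_iter: int = 100) -> Tuple[List[Point], List[List[Point]]]:
    """Batch K-Means starting from the given centroids (an assumption of this sketch)."""
    centroids = list(initial_centroids)
    clusters: List[List[Point]] = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous centroid.
        new_centroids = [mean(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # converged: the centroids no longer move
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, clusters = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

Variants such as K-Means++ differ mainly in how the initial centroids are chosen, which is why the sketch takes them as a parameter.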

### Time series mining

These algorithms perform various tasks to analyze time series data.

- an algorithm for **converting a time series to a sequence** of symbols using the **SAX representation of time series**. Note that converting a set of time series with SAX yields a **sequence database**, which makes it possible to apply traditional algorithms for sequential rule mining and sequential pattern mining to time series (SAX, 2007).
- algorithms for calculating the **prior moving average** of a **time series** (to remove noise)
- algorithms for calculating the **cumulative moving average** of a **time series** (to remove noise)
- algorithms for calculating the **central moving average** of a **time series** (to remove noise)
- an algorithm for calculating the **median smoothing** of a **time series** (to remove noise)
- an algorithm for calculating the **exponential smoothing** of a **time series** (to remove noise)
- an algorithm for calculating the **min max normalization** of a **time series**
- an algorithm for calculating the **autocorrelation function** of a **time series**
- an algorithm for calculating the **standardization** of a **time series**
- an algorithm for calculating the **first and second order differencing** of a **time series**
- an algorithm for calculating the **piecewise aggregate approximation** of a **time series** (to reduce the number of data points of a time series)
- an algorithm for calculating the **linear regression of a time series** (using the least squares method)
- an algorithm for **splitting a time series** into **segments of a given length**
- an algorithm for **splitting a time series** into a **given number of segments**
- algorithms to **cluster time series** (group time series according to their similarities). This can be done by applying the clustering algorithms offered in SPMF (**K-Means, Bisecting K-Means, DBScan, OPTICS, Hierarchical clustering**) to **time series**.
- a tool called **Time Series Viewer** for **visualizing time series**
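
Three of the transformations listed above can be sketched in a few lines each. The code below is only an illustration, not SPMF's implementation: the function names are hypothetical, and details such as how the first points of a moving average are handled (here, a shorter window is used) or whether the series length must be divisible by the number of segments are assumptions of this sketch.

```python
from typing import List


def prior_moving_average(series: List[float], window: int) -> List[float]:
    """Mean of each point and up to `window - 1` points before it."""
    result = []
    for i in range(len(series)):
        start = max(0, i - window + 1)  # shorter window at the beginning
        chunk = series[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


def min_max_normalize(series: List[float]) -> List[float]:
    """Rescale values linearly to the [0, 1] interval."""
    lowest, highest = min(series), max(series)
    if highest == lowest:
        return [0.0 for _ in series]  # constant series: avoid dividing by zero
    return [(x - lowest) / (highest - lowest) for x in series]


def piecewise_aggregate_approximation(series: List[float],
                                      segments: int) -> List[float]:
    """Replace each of `segments` equal-length chunks by its mean."""
    if segments <= 0 or len(series) % segments != 0:
        raise ValueError("this sketch needs a length divisible by `segments`")
    size = len(series) // segments
    return [sum(series[i:i + size]) / size
            for i in range(0, len(series), size)]


# Hypothetical time series with six data points.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(prior_moving_average(series, 3))
print(min_max_normalize(series))
print(piecewise_aggregate_approximation(series, 3))
```

The piecewise aggregate approximation is also the usual first step of SAX, which then maps each segment mean to a symbol.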

### Classification

- the **ID3** algorithm for building decision trees (Quinlan, 1986)
- the **KNN** (K-Nearest Neighbor) algorithm
- Classification based on **association rule mining**
  - the **ACAC** algorithm (Huang et al., 2011)
  - the **ACCF** algorithm (Li et al., 2008)
  - the **ACN** algorithm (Kundu et al., 2008)
  - the **ADT** algorithm (Wang et al., 2000)
  - the **CBA** algorithm (Liu et al., 1998)
  - the **CBA2** algorithm (Liu et al., 2001)
  - the **CMAR** algorithm (Li et al., 2001)
  - the **L3** algorithm (Baralis et al., 2002)
  - the **MAC** algorithm (Abdelhamid et al., 2012)
- A framework for comparing multiple classifiers using holdout and k-fold cross-validation
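
To clarify the difference between holdout and k-fold cross-validation, the following sketch shows how the records of a dataset could be split in each case. It only produces index lists and is not taken from SPMF; the function names, the absence of shuffling, and the way leftover records are spread over the first folds are assumptions made for this example.

```python
from typing import Iterator, List, Tuple


def holdout_indices(n: int, train_ratio: float) -> Tuple[List[int], List[int]]:
    """Single split: the first part is used for training, the rest for testing."""
    cut = int(n * train_ratio)
    return list(range(cut)), list(range(cut, n))


def k_fold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists so that each record is tested exactly once."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of records")
    # The first n % k folds receive one extra record.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Hypothetical dataset of 5 records split into 2 folds.
folds = list(k_fold_indices(5, 2))
for train, test in folds:
    print("train:", train, "test:", test)
```

To compare classifiers, each one would be trained and evaluated on the same splits, and the per-fold results averaged.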

### Text mining

- an algorithm for **classifying text documents** using a Naive Bayes classifier approach (S. Raghu, 2015)
- an algorithm for **clustering texts** using the **tf*idf** measure (S. Raghu, 2015)

### Dataset generation tools

- A tool for generating a synthetic transaction database
- A tool for generating a synthetic sequence database
- A tool for generating a synthetic sequence database with timestamps
- A tool for generating datasets for clustering

### Dataset transformation tools

- A tool for converting a sequence database to a transaction database
- A tool for converting a transaction database to a sequence database
- A tool for converting a text file to a sequence database (each sentence becomes a sequence)
- A tool for converting a sequence database in various formats (CSV, KOSARAK, BMS, IBM...) to a sequence database in SPMF format
- A tool for converting a transaction database in various formats (CSV...) to a transaction database in SPMF format
- A tool for converting time-series to a sequence database
- A tool to generate utility values for a transaction database
- A tool to add timestamps to a sequence database
- A tool to fix a transaction database having some problems (with or without utility/time information)
- A tool for removing utility information from a database having utility information
- A tool to resize a database in SPMF format (a text file) using a percentage of lines of data from an original database.

### Dataset statistics tools

- A tool for calculating statistics about a transaction database
- A tool for calculating statistics about a transaction database with utility information
- A tool for calculating statistics about a sequence database
- A tool for calculating statistics about a graph database
- A tool for calculating statistics about a product transaction database.
- A tool for calculating statistics about a sequence database with cost and binary utility.
- A tool for calculating statistics about a sequence database with cost and numeric utility.
- A tool for calculating statistics about a sequence database with utility.
- A tool for calculating statistics about a time-extended sequence database.
- A tool for calculating statistics about a transaction database with cost and utility.
- A tool for calculating statistics about a transaction database with utility and period information.
- A tool for calculating statistics about a transaction database with utility and timestamps.
- A tool for calculating statistics about an event sequence.
- A tool for calculating statistics about an interval sequence database.
- A tool for calculating statistics about a multi-dimensional sequence database.
- A tool for calculating statistics about a multi-dimensional sequence database with timestamps.
- A tool for calculating statistics about an uncertain transaction database.
- A tool for calculating statistics about a file with double vectors (instances) for clustering.
- A tool for calculating statistics about time series.

### Dataset viewer tools

- A time series viewer to visualize time series.
- A cluster viewer to visualize clusters produced by clustering algorithms
- A graph viewer to view files containing graphs or subgraphs taken as input or produced as output by algorithms such as TKG, gSpan and cgSpan.
- A simple tool to view the content of an ARFF file
- A tool to view the content of an event sequence file
- A tool to view a sequence database cost binary utility file
- A tool to view a sequence database cost numeric utility file
- A tool to view a sequence database file
- A tool to view a time-extended sequence database
- A tool to view a multi-dimensional sequence database
- A tool to view a multi-dimensional time sequence database
- A tool to view a sequence utility database file
- A tool to view a cost utility transaction database file
- A tool to view a transaction database file
- A tool to view an uncertain transaction database file
- A tool to view a utility transaction database file
- A tool to view a utility time transaction database file
- A tool to view a utility period transaction database file
- A tool to view a product transaction database file
- A tool to view a graph database file
- A tool to view a sequence database file with time intervals

### GUI tools

- The **Algorithm Explorer** tool to explore the algorithms offered in SPMF
- The **Memory Viewer** tool to observe the memory usage of algorithms in real-time
- The **Pattern Viewer** tool to view patterns found by algorithms and their frequency distributions
- The **Workflow Editor** tool to create a workflow with several algorithms and run it.
- A tool to run experiments where one or more algorithms are run and a parameter is varied.
- The SPMF text editor
- A tool to download an offline copy of the SPMF documentation

### Data structures

- red-black tree,
- itemset-tree,
- binary tree,
- KD-tree,
- triangular matrix.
- a collection of optimized primitive type data structures to replace hashmaps, lists, sets, etc.

### Visual map of algorithms

You can visualize the relationships between the various **data mining algorithms** offered in **SPMF** by clicking on this map **(last updated: 2015/09/12 - SPMF 0.97)**: