Documentation
This section provides examples of how to use the SPMF open-source data mining library to perform various data mining tasks.
If you have any questions or want to report a bug, you can check the FAQ, post in the forum, or contact me. You can also look at the articles referenced on the algorithms page of this website to learn more about each algorithm.
List of examples
Itemset Mining (Frequent Itemsets, Rare Itemsets, etc.)
- Example 1 : Mining Frequent Itemsets by Using the Apriori Algorithm
- Example 2 : Mining Frequent Itemsets by Using the AprioriTID Algorithm
- Example 3 : Mining Frequent Itemsets by Using the FP-Growth Algorithm
- Example 4 : Mining Frequent Itemsets by Using the Relim Algorithm
- Example 5 : Mining Frequent Itemsets by Using the Eclat / dEclat Algorithm
- Example 6 : Mining Frequent Itemsets by Using the H-Mine Algorithm
- Example 7 : Mining Frequent Itemsets by Using the FIN Algorithm
- Example 8 : Mining Frequent Itemsets by Using the DFIN Algorithm
- Example 9 : Mining Frequent Itemsets by Using the NegFIN Algorithm
- Example 10 : Mining Frequent Itemsets by Using the PrePost / PrePost+ Algorithm
- Example 11 : Mining Frequent Itemsets by Using the LCMFreq Algorithm
- Example 12 : Mining Frequent Closed Itemsets Using the AprioriClose Algorithm
- Example 13 : Mining Frequent Closed Itemsets Using the DCI_Closed Algorithm
- Example 14 : Mining Frequent Closed Itemsets Using the Charm / dCharm Algorithm
- Example 15 : Mining Frequent Closed Itemsets Using the LCM Algorithm
- Example 16 : Mining Frequent Closed Itemsets Using the FPClose Algorithm
- Example 17 : Mining Frequent Closed Itemsets Using the NAFCP Algorithm
- Example 18 : Mining Frequent Maximal Itemsets Using the FPMax Algorithm
- Example 19 : Mining Frequent Maximal Itemsets Using the Charm-MFI Algorithm
- Example 20 : Mining Frequent Generator Itemsets Using the DefMe Algorithm
- Example 21 : Mining Frequent Itemsets and Identify the Generators Using the Pascal Algorithm
- Example 22 : Mining Frequent Closed Itemsets and Minimal Generators Using the Zart Algorithm
- Example 23 : Mining Minimal Rare Itemsets Using the AprioriRare Algorithm
- Example 24 : Mining Perfectly Rare Itemsets Using the AprioriInverse Algorithm
- Example 25 : Mining Rare Correlated Itemsets Using the CORI Algorithm
- Example 26 : Mining Rare Itemsets Using the RP-Growth Algorithm
- Example 27 : Mining Closed Itemsets from a Data Stream Using the CloStream Algorithm (source code version only)
- Example 28 : Mining Recent Frequent Itemsets from a Data Stream Using the estDec Algorithm (source code version only)
- Example 29 : Mining Recent Frequent Itemsets from a Data Stream Using the estDec+ Algorithm (source code version only)
- Example 30 : Mining Frequent Itemsets from Uncertain Data with the UApriori Algorithm
- Example 31 : Mining Erasable Itemsets from a Product Database with the VME algorithm
- Example 32 : Building, updating incrementally and using an Itemset-Tree to generate targeted frequent itemsets and association rules (source code version only)
- Example 33 : Building, updating incrementally and using a Memory-Efficient Itemset-Tree to generate targeted frequent itemsets and association rules (source code version only)
- Example 34 : Mining Frequent Itemsets with Multiple Support Thresholds Using the MSApriori Algorithm
- Example 35 : Mining Frequent Itemsets with Multiple Support Thresholds Using the CFPGrowth++ Algorithm
- Example 36 : Mining Fuzzy Frequent Itemsets in a quantitative transaction database using the FFI-Miner algorithm
- Example 37 : Mining Multiple Fuzzy Frequent Itemsets in a quantitative transaction database using the MFFI-Miner algorithm
- Example 38 : Deriving Frequent Itemsets from Frequent Closed Itemsets using the LevelWise algorithm
- Example 39 : Deriving Frequent Itemsets from Frequent Closed Itemsets using the DFI-Growth algorithm
- Example 40 : Mining Self-Sufficient Itemsets using the Opus-Miner algorithm
High-Utility Pattern Mining
- Example 41 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the Two-Phase Algorithm
- Example 42 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the FHM Algorithm
- Example 43 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the EFIM Algorithm
- Example 44 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the HUI-Miner Algorithm
- Example 45 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the HUP-Miner Algorithm
- Example 46 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the UP-Growth / UP-Growth+ Algorithm
- Example 47 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the IHUP Algorithm
- Example 48 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the mHUIMiner Algorithm
- Example 49 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the HMiner Algorithm
- Example 50 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the ULB-Miner Algorithm
- Example 51 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the UFH Algorithm
- Example 52 : Mining High-Utility Itemsets from a Transaction Database with Utility Information using the d2HUP Algorithm
- Example 53 : Mining High-Utility Itemsets from a Transaction Database with Utility Information while considering Length Constraints, using the FHM+ algorithm
- Example 54 : Mining Correlated High-Utility Itemsets in a Transaction Database with Utility Information using the FCHM_bond algorithm
- Example 55 : Mining Correlated High-Utility Itemsets in a Transaction Database with Utility Information using the FCHM_allconfidence algorithm
- Example 56 : Mining Frequent High-Utility Itemsets from a Transaction Database with Utility Information using the FHMFreq Algorithm
- Example 57 : Mining High-Utility Itemsets from a Transaction Database with Positive or Negative Unit Profit using the FHN Algorithm
- Example 58 : Mining High-Utility Itemsets from a Transaction Database with Positive or Negative Unit Profit using the HUINIV-Mine Algorithm
- Example 59 : Mining On-Shelf High-Utility Itemsets from a Transaction Database using the FOSHU Algorithm
- Example 60 : Mining On-Shelf High-Utility Itemsets from a Transaction Database using the TS-HOUN Algorithm
- Example 61 : Incremental High-Utility Itemset Mining in a Transaction Database with utility information using the EIHI Algorithm (source code version only)
- Example 62 : Incremental High-Utility Itemset Mining in a Transaction Database with utility information using the HUI-LIST-INS Algorithm (source code version only)
- Example 63 : Mining Closed High-Utility Itemsets from a transaction database with utility information using the EFIM-Closed Algorithm
- Example 64 : Mining Closed High-Utility Itemsets from a transaction database with utility information using the CHUI-Miner Algorithm
- Example 65 : Mining Closed High-Utility Itemsets from a transaction database with utility information using the CHUI-Miner(Max) Algorithm
- Example 66 : Mining Closed High-Utility Itemsets from a transaction database with utility information using the CHUD Algorithm
- Example 67 : Mining Generators of High-Utility Itemsets from a transaction database with utility information using the GHUI-Miner Algorithm
- Example 68 : Mining High-Utility Generator Itemsets from a transaction database with utility information using the HUG-Miner Algorithm
- Example 69 : Mining Minimal High-Utility Itemsets from a transaction database with utility information using the MinFHM Algorithm
- Example 70 : Mining Skyline High-Utility Itemsets in a transaction database with utility information using the SkyMine Algorithm
- Example 71 : Mining High-Utility Sequential Rules from a Sequence Database with utility information using the HUSRM Algorithm
- Example 72 : Mining High-Utility Sequential Patterns from a Sequence Database with utility information using the USPAN Algorithm
- Example 73 : Mining High-Utility Probability Sequential Patterns from a Sequence Database with utility and probability information using the PHUSPM Algorithm
- Example 74 : Mining High-Utility Probability Sequential Patterns from a Sequence Database with utility information and probability using the UHUSPM Algorithm
- Example 75 : Mining High-Utility Itemsets based on Particle Swarm Optimization with the HUIM-BPSO algorithm
- Example 76 : Mining High Utility Itemsets Using a Genetic Algorithm with the HUIM-GA algorithm
- Example 77 : Mining High-Utility Itemsets based on Particle Swarm Optimization with the HUIM-BPSO-tree algorithm
- Example 78 : Discovery of High Utility Itemsets Using a Genetic Algorithm with the HUIM-GA-tree algorithm
- Example 79 : Mining High-Utility Itemsets based on Particle Swarm Optimization with the HUIF-PSO algorithm
- Example 80 : Mining High-Utility Itemsets Using a Genetic Algorithm with the HUIF-GA algorithm
- Example 81 : Mining High-Utility Itemsets Using a Bat Algorithm with the HUIF-BA algorithm
- Example 82 : Mining High-Utility Itemsets Using an Artificial Bee Colony Algorithm with the HUIF-ABC algorithm
- Example 83 : Mining Skyline Frequent-Utility Patterns using the SFUPMinerUemax algorithm
- Example 84 : Mining the Top-k high-utility itemsets using the TKU algorithm
- Example 85 : Mining the Top-k high-utility itemsets using the TKO (basic) algorithm
- Example 86 : Mining the Top-k high-utility itemsets in a data stream using the FHMDS algorithm
- Example 87 : Mining High Average-Utility Itemsets in a Transaction Database with Utility Information using the HAUI-Miner Algorithm
- Example 88 : Mining High Average-Utility Itemsets in a Transaction Database with Utility Information using the EHAUPM Algorithm
- Example 89 : Mining High Average-Utility Itemsets with Multiple Thresholds in a Transaction Database using the HAUI-MMAU Algorithm
- Example 90 : Mining High Average-Utility Itemsets with Multiple Thresholds in a Transaction Database using the MEMU Algorithm
- Example 91 : Mining Quantitative High Utility Itemsets in a Transaction Database using the VHUQI Algorithm
- Example 92 : Mining Irregular High-Utility Itemsets using the PHM_irregular algorithm
- Example 93 : Mining Local High Utility Itemsets in a Transaction Database using the LHUI-Miner Algorithm
- Example 94 : Mining Peak High Utility Itemsets in a Transaction Database using the PHUI-Miner Algorithm
Association Rule Mining
- Example 95 : Mining All Association Rules
- Example 96 : Mining All Association Rules with the lift measure
- Example 97 : Mining All Association Rules using the GCD algorithm
- Example 98 : Mining the IGB basis of Association Rules
- Example 99 : Mining Perfectly Sporadic Association Rules
- Example 100 : Mining Closed Association Rules
- Example 101 : Mining Minimal Non Redundant Association Rules
- Example 102 : Mining Indirect Association Rules with the INDIRECT algorithm
- Example 103 : Hiding Sensitive Association Rules with the FHSAR algorithm
- Example 104 : Mining the Top-K Association Rules
- Example 105 : Mining the Top-K Class Association Rules (association rules with a fixed consequent)
- Example 106 : Mining the Top-K Non-Redundant Association Rules
Clustering
- Example 107 : Clustering using the K-Means algorithm
- Example 108 : Clustering using the DBScan algorithm
- Example 109 : Using OPTICS to extract a cluster-ordering of points and DBScan-style clusters
- Example 110 : Clustering using the Bisecting K-Means algorithm
- Example 111 : Clustering using a Hierarchical Clustering algorithm
- Example 112 : Visualizing clusters using the Cluster Viewer
- Example 113 : Visualizing instances using the Instance Viewer
Sequential Pattern Mining
- Example 114 : Mining Frequent Sequential Patterns Using the PrefixSpan Algorithm
- Example 115 : Mining Frequent Sequential Patterns Using the GSP Algorithm
- Example 116 : Mining Frequent Sequential Patterns Using the SPADE Algorithm
- Example 117 : Mining Frequent Sequential Patterns Using the CM-SPADE Algorithm
- Example 118 : Mining Frequent Sequential Patterns Using the SPAM Algorithm
- Example 119 : Mining Frequent Sequential Patterns Using the CM-SPAM Algorithm
- Example 120 : Mining Frequent Sequential Patterns Using the FAST Algorithm
- Example 121 : Mining Frequent Sequential Patterns Using the LAPIN Algorithm
- Example 122 : Mining Frequent Closed Sequential Patterns Using the ClaSP Algorithm
- Example 123 : Mining Frequent Closed Sequential Patterns Using the CM-ClaSP Algorithm
- Example 124 : Mining Frequent Closed Sequential Patterns Using the CloFAST Algorithm
- Example 125 : Mining Frequent Closed Sequential Patterns Using the CloSpan Algorithm
- Example 126 : Mining Frequent Closed Sequential Patterns Using the BIDE+ Algorithm
- Example 127 : Mining Frequent Closed Sequential Patterns by Post-Processing using SPAM or PrefixSpan
- Example 128 : Mining Frequent Maximal Sequential Patterns Using the MaxSP Algorithm
- Example 129 : Mining Frequent Maximal Sequential Patterns using the VMSP Algorithm
- Example 130 : Mining Frequent Sequential Generator Patterns Using the FEAT Algorithm
- Example 131 : Mining Frequent Sequential Generator Patterns Using the FSGP Algorithm
- Example 132 : Mining Frequent Sequential Generator Patterns Using the VGEN Algorithm
- Example 133 : Mining Compressing Sequential Patterns Using the GoKrimp Algorithm
- Example 134 : Mining Frequent Top-K Sequential Patterns Using the TKS Algorithm
- Example 135 : Mining Frequent Top-K Sequential Patterns Using the TSP Algorithm
- Example 136 : Mining Frequent Multi-dimensional Sequential Patterns Using SeqDIM (with PrefixSpan and Apriori)
- Example 137 : Mining Frequent Closed Multi-dimensional Sequential Patterns Using SeqDIM/Songram (with Bide+ and AprioriClose)
- Example 138 : Mining Sequential Patterns with Time Constraints from a Time-Extended Sequence Database
- Example 139 : Mining Closed Sequential Patterns with Time Constraints from a Time-Extended Sequence Database
- Example 140 : Mining Sequential Patterns with Time Constraints from a Time-Extended Sequence Database containing Valued Items (source code version only)
- Example 141 : Mining Closed Multi-dimensional Sequential Patterns from a Time-Extended Sequence Database
- Example 142 : Mining Progressive Sequential Patterns using the ProSecCo algorithm
- Example 143 : Finding all occurrences of some sequential pattern(s) by post-processing using the Occur algorithm
- Example 144 : Mining the Top-K Quantile-based Cohesive Sequential Patterns in a single sequence or in multiple sequences using the QCSP algorithm (thanks to Len Feremans et al.)
- Example 145 : Mining Cost-Efficient Sequential Patterns Using the CEPB Algorithm
- Example 146 : Mining Cost-Efficient Sequential Patterns Using the CorCEPB Algorithm
- Example 147 : Mining Cost-Efficient Sequential Patterns Using the CEPN Algorithm
Sequential Rule Mining
- Example 148 : Mining Sequential Rules Common to Several Sequences with the CMRules algorithm
- Example 149 : Mining Sequential Rules Common to Several Sequences with the CMDeo algorithm
- Example 150 : Mining Sequential Rules Common to Several Sequences with the RuleGrowth algorithm
- Example 151 : Mining Sequential Rules Common to Several Sequences with the ERMiner algorithm
- Example 152 : Mining Sequential Rules between Sequential Patterns with the RuleGen algorithm
- Example 153 : Mining Sequential Rules Common to Several Sequences with the Window Size Constraint using TRuleGrowth
- Example 154 : Mining the Top-K Sequential rules
- Example 155 : Mining the Top-K Non-Redundant Sequential rules
Sequence Prediction (source code version only)
- Example 156 : Perform Sequence Prediction using the CPT+ Sequence Prediction Model
- Example 157 : Perform Sequence Prediction using the CPT Sequence Prediction Model
- Example 158 : Perform Sequence Prediction using the PPM Sequence Prediction Model
- Example 159 : Perform Sequence Prediction using the DG Sequence Prediction Model
- Example 160 : Perform Sequence Prediction using the AKOM Sequence Prediction Model
- Example 161 : Perform Sequence Prediction using the TDAG Sequence Prediction Model
- Example 162 : Perform Sequence Prediction using the LZ78 Sequence Prediction Model
- Example 163 : Comparing Several Sequence Prediction Models
Periodic Pattern Mining
- Example 164 : Mining Periodic Frequent Patterns using the PFPM algorithm
- Example 165 : Mining Stable Periodic Frequent Patterns using the SPP-Growth algorithm
- Example 166 : Mining Periodic High-Utility Itemsets using the PHM algorithm
- Example 167 : Mining Periodic Patterns in Multiple Sequences using the MPFPS-BFS or MPFPS-DFS algorithms
- Example 168 : Mining Rare Correlated Periodic Patterns Common to Multiple Sequences using the MRCPPS algorithm
Episode Mining
- Example 169 : Mining the Top-K Frequent Episodes in a Complex Sequence using the TKE algorithm
- Example 170 : Mining Frequent Episodes in a Complex Sequence using the EMMA algorithm, which counts the support based on the head frequency
- Example 171 : Mining Frequent Episodes in a Complex Sequence using the MINEPI+ algorithm, which counts the support based on the head frequency
- Example 172 : Mining Frequent Episodes in a Complex Sequence using the MINEPI algorithm, which counts the support based on minimal occurrences
- Example 173 : Mining High Utility Episodes using the HUE-SPAN algorithm
- Example 174 : Mining High Utility Episodes using the UP-SPAN algorithm
- Example 175 : Mining the Top-K High Utility Episodes using the TUP algorithm
Graph Pattern Mining
- Example 176 : Mining the Top-K Frequent Subgraphs in a Labeled Graph Database using the TKG algorithm
- Example 177 : Mining Frequent Subgraphs in a Labeled Graph Database using the gSpan algorithm
- Example 178 : Mining Significant Trend Sequences in a Dynamic Attributed Graph using the TSeqMiner algorithm
Text Mining
- Example 180 : Clustering Texts with a text clusterer
- Example 181 : Classifying Text documents using a Naive Bayes approach (source code version only)
Time Series Mining
- Example 182 : Visualize time series using the time series viewer
- Example 183 : Calculate the prior moving average of time series
- Example 184 : Calculate the cumulative moving average of time series
- Example 185 : Calculate the central moving average of time series
- Example 186 : Calculate the min max normalization of a time series
- Example 187 : Calculate the standardization of a time series
- Example 188 : Calculate the median smoothing of a time series
- Example 189 : Calculate the exponential smoothing of a time series
- Example 190 : Calculate the first order differencing of a time series
- Example 191 : Calculate the second order differencing of a time series
- Example 192 : Calculate the piecewise aggregate approximation of time series
- Example 193 : Calculate the autocorrelation function of a time series
- Example 194 : Calculate the regression line of a time series using the least squares method, and perform time series forecasting
- Example 195 : Split time series by length
- Example 196 : Split time series by number of segments
- Example 197 : Convert time series to sequences using the SAX algorithm (useful to then apply sequential pattern mining/rule algorithms)
Besides the above examples for time series mining, clustering algorithms such as K-Means can also be applied to time series.
Classification
- Example 198 : Creating a decision tree with the ID3 algorithm to predict the value of a target attribute (source code version only)
Tools
- Example 199 : Converting a sequence database to SPMF format (CSV, KOSARAK, IBM, BMS, Snake...)
- Example 200 : Converting a transaction database to SPMF format (CSV...)
- Example 201 : Converting a sequence database to a transaction database
- Example 202 : Converting a transaction database to a sequence database
- Example 203 : Generating a synthetic sequence database
- Example 204 : Generating a synthetic sequence database with timestamps
- Example 205 : Generating a synthetic transaction database
- Example 206 : Generating synthetic utility values for a transaction database without utility values
- Example 207 : Calculating statistics for a sequence database
- Example 208 : Calculating statistics for a transaction database
- Example 209 : Calculating statistics for a transaction database with utility information
- Example 210 : Add consecutive timestamps to a sequence database without timestamps
- Example 211 : Using the ARFF format in the source code version of SPMF
- Example 212 : Using a TEXT file as input in the source code version of SPMF
- Example 213 : Fix a transaction database
- Example 214 : Fix item ids in a transaction database
- Example 215 : Remove utility information from a transaction database
- Example 216 : Resize a database in SPMF format (a text file)
Copyright © 2008-2024 Philippe Fournier-Viger. All rights reserved.