SPMF: An Open-Source Data Mining Library





Documentation

This section provides examples of how to use SPMF to perform various data mining tasks.

If you have any questions or want to report a bug, you can check the FAQ, post in the forum, or contact me. You can also have a look at the articles referenced on the algorithms page of this website to learn more about each algorithm.

List of examples

Itemset Mining (Frequent Itemsets, Rare Itemsets, etc.)

High-Utility Pattern Mining

Association Rule Mining

Clustering

Sequential Pattern Mining

Sequential Rule Mining

Sequence Prediction (source code version only)

Periodic pattern mining

Text Mining

Time Series Mining

Classification

Tools

Example 1 : Mining Frequent Itemsets by Using the Apriori Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Apriori" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, execute the following command in a folder containing spmf.jar and the example input file contextPasquier99.txt:
    java -jar spmf.jar run Apriori contextPasquier99.txt output.txt 40%
  • If you are using the source code version of SPMF, launch the file "MainTestApriori_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is Apriori?

Apriori is an algorithm for discovering frequent itemsets in transaction databases. It was proposed by Agrawal & Srikant (1994).

What is the input of the Apriori algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For instance, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the Apriori algorithm?

Apriori is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions from the transaction database, where minsup is a parameter given by the user.

For example, if Apriori is run on the previous transaction database with a minsup of 40 % (2 transactions), Apriori produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which it appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.
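
The level-wise search described above can be sketched in a few lines of Python. This is an illustrative sketch of the Apriori principle, not SPMF's actual Java implementation:

```python
from itertools import combinations

# The example database from contextPasquier99.txt
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}]

def apriori(transactions, minsup_count):
    """Naive level-wise Apriori: generate candidates of size k from
    frequent itemsets of size k-1 and keep those with enough support."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = []
    for i in items:  # level 1: frequent single items
        sup = sum(1 for t in transactions if i in t)
        if sup >= minsup_count:
            frequent[(i,)] = sup
            level.append((i,))
    k = 2
    while level:
        # Join step: combine frequent (k-1)-itemsets sharing a prefix
        candidates = {a + (b[-1],) for a in level for b in level
                      if a[:-1] == b[:-1] and a[-1] < b[-1]}
        level = []
        for c in sorted(candidates):
            # Prune step (Apriori property): all (k-1)-subsets must be frequent
            if all(s in frequent for s in combinations(c, k - 1)):
                sup = sum(1 for t in transactions if set(c) <= t)
                if sup >= minsup_count:
                    frequent[c] = sup
                    level.append(c)
        k += 1
    return frequent

result = apriori(transactions, 2)  # minsup = 40% of 5 transactions
```

On the example database this yields the 15 itemsets of the table above, e.g. a support of 3 for {2, 3, 5} and a support of 2 for {1, 2, 3, 5}.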

Input file format

The input file format for Apriori is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5
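
A minimal reader for this format can be written as follows (a sketch for scripting around SPMF; the function name is just for illustration):

```python
def parse_spmf_transactions(text):
    """Parse SPMF's default transaction format: one transaction per line,
    items as positive integers separated by single spaces."""
    transactions = []
    for line in text.strip().splitlines():
        items = [int(tok) for tok in line.split()]
        transactions.append(set(items))
    return transactions

data = """1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5"""
transactions = parse_spmf_transactions(data)
```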

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before the algorithm is launched and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2
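
If you want to post-process the results in a script, the output lines are easy to parse (an illustrative sketch, not part of SPMF):

```python
def parse_spmf_itemsets(text):
    """Parse SPMF's frequent-itemset output: items, then '#SUP:' and support."""
    itemsets = {}
    for line in text.strip().splitlines():
        left, sup = line.split("#SUP:")
        items = frozenset(int(tok) for tok in left.split())
        itemsets[items] = int(sup)
    return itemsets

output = """1 #SUP: 3
2 5 #SUP: 4
1 2 3 5 #SUP: 2"""
itemsets = parse_spmf_itemsets(output)
```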

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The Apriori algorithm is important for historical reasons and also because it is a simple algorithm that is easy to learn. However, faster and more memory-efficient algorithms have since been proposed. If efficiency is required, it is recommended to use a more efficient algorithm such as FPGrowth instead of Apriori. You can see a performance comparison of Apriori, FPGrowth, and other frequent itemset mining algorithms by clicking on the "performance" section of this website.

Implementation details

In SPMF, there is also an implementation of Apriori that uses a hash-tree as an internal structure to store candidates. This structure provides a more efficient way to count the support of itemsets. This version of Apriori is named "Apriori_with_hash_tree" in the GUI of SPMF and the command line. For the source code version, it can be run by executing the test file MainTestAprioriHT_saveToFile.java. This version of Apriori can be up to twice as fast as the regular version in some cases, but it uses more memory. It has two parameters: (1) minsup and (2) the number of child nodes that each node in the hash-tree should have. For the second parameter, we suggest using the value 30.

Where can I get more information about the Apriori algorithm?

This is the technical report published in 1994 describing Apriori.

R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. Research Report RJ 9839, IBM Almaden Research Center, San Jose, California, June 1994.

You can also read chapter 6 of the book "Introduction to Data Mining", which provides a nice and easy-to-understand introduction to Apriori.

Example 2 : Mining Frequent Itemsets by Using the AprioriTid Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Apriori_TID" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, execute the following command in a folder containing spmf.jar and the example input file contextPasquier99.txt:
    java -jar spmf.jar run Apriori_TID contextPasquier99.txt output.txt 40%
  • If you are using the source code version of SPMF, launch the file "MainTestAprioriTID_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is AprioriTID?

AprioriTID is an algorithm for discovering frequent itemsets (groups of items appearing frequently) in a transaction database. It was proposed by Agrawal & Srikant (1994).

AprioriTID is a variation of the Apriori algorithm, proposed in the same article as an alternative implementation. It produces the same output as Apriori but uses a different mechanism for counting the support of itemsets.
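
The difference can be illustrated as follows: instead of rescanning the database at every level, AprioriTID keeps, for each itemset, the set of transactions that contain it, so the support of a larger candidate can be derived from the sets computed at the previous level. A simplified sketch of this idea (not SPMF's actual data structures):

```python
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}]

# Level 1: record, for each item, the ids of the transactions containing it
tidsets = {}
for tid, t in enumerate(transactions):
    for item in t:
        tidsets.setdefault(item, set()).add(tid)

# Support of the candidate {2, 5} without rescanning the database:
# intersect the transaction-id sets computed at the previous level.
tids_25 = tidsets[2] & tidsets[5]
support_25 = len(tids_25)
```

Here support_25 is 4, matching the table above ({2, 5} appears in transactions t2, t3, t4 and t5).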

What is the input of the AprioriTID algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For instance, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the AprioriTID algorithm?

AprioriTID is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions from the transaction database, where minsup is a parameter given by the user.

For example, if AprioriTID is run on the previous transaction database with a minsup of 40 % (2 transactions), AprioriTID produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which it appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Input file format

The input file format used by AprioriTID is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before the algorithm is launched and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The Apriori and AprioriTID algorithms are important for historical reasons and also because they are simple algorithms that are easy to learn. However, faster and more memory-efficient algorithms have since been proposed. If efficiency is required, it is recommended to use a more efficient algorithm such as FPGrowth instead of AprioriTID or Apriori. You can see a performance comparison of Apriori, AprioriTID, FPGrowth, and other frequent itemset mining algorithms by clicking on the "performance" section of this website.

Implementation details

There are two versions of AprioriTID in SPMF. The first one is called AprioriTID and is the regular AprioriTID algorithm. The second one is called AprioriTID_Bitset and uses bitsets instead of HashSets of Integers as the internal structure for representing sets of transaction IDs. The advantage of the bitset version is that bitsets are more memory-efficient for representing sets of transaction IDs, and intersecting two such sets is faster (it is done with a single logical AND operation).
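
The bitset idea can be illustrated in a few lines, here using Python integers as bit vectors (SPMF's Java implementation uses java.util.BitSet; this is only an illustration of the principle):

```python
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}]

# One bit vector per item: bit i is set if transaction i contains the item
bitsets = {}
for tid, t in enumerate(transactions):
    for item in t:
        bitsets[item] = bitsets.get(item, 0) | (1 << tid)

# Intersecting two sets of transaction ids is a single logical AND;
# the support is the number of set bits in the result.
common = bitsets[2] & bitsets[3] & bitsets[5]
support_235 = bin(common).count("1")
```

This yields a support of 3 for {2, 3, 5}, matching the earlier examples.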

Optional parameter(s)

This implementation allows specifying additional optional parameter(s):

  • "show transaction ids?" (true/false) This parameter allows to specify that transaction ids of transactions containing a pattern should be output for each pattern found. For example, if the parameter is set to true, each pattern in the output file will be followed by the keyword #TID followed by a list of transaction ids (integers separated by space). For example, a line terminated by "#TID: 0 2" means that the pattern on this line appears in the first and the third transactions of the transaction database (transactions with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestAprioriTID_..._saveToFile.java" provided in the source code of SPMF.

The parameter(s) can also be used from the command line with the jar file. For example:
java -jar spmf.jar run Apriori_TID contextPasquier99.txt output.txt 40% true
This command applies the algorithm to the file "contextPasquier99.txt" with minsup = 40%, outputs the results to "output.txt", and specifies that transaction ids should be output for each pattern found.

Where can I get more information about the AprioriTID algorithm?

This is the technical report published in 1994 describing Apriori and AprioriTID.

R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. Research Report RJ 9839, IBM Almaden Research Center, San Jose, California, June 1994.

You can also read chapter 6 of the book "Introduction to Data Mining", which provides a nice and easy-to-understand introduction to Apriori.

Example 3 : Mining Frequent Itemsets by Using the FP-Growth Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FPGrowth_itemsets" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, execute the following command in a folder containing spmf.jar and the example input file contextPasquier99.txt:
    java -jar spmf.jar run FPGrowth_itemsets contextPasquier99.txt output.txt 40%
  • If you are using the source code version of SPMF, launch the file "MainTestFPGrowth_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is FPGrowth?

FPGrowth is an algorithm for discovering frequent itemsets in a transaction database. It was proposed by Han et al. (2000). FPGrowth is a very fast and memory efficient algorithm. It uses a special internal structure called an FP-Tree.
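
The FP-Tree is a prefix tree in which transactions are inserted with their items reordered by descending support, so that frequent items share path prefixes. A simplified sketch of tree construction follows (FP-Growth's mining phase, which recursively builds conditional trees, is omitted; this is not SPMF's Java code):

```python
class FPNode:
    """A node of an FP-Tree: an item, a count, and child nodes."""
    def __init__(self, item=None):
        self.item, self.count, self.children = item, 0, {}

def build_fptree(transactions, minsup_count):
    # First database pass: count the support of each item
    support = {}
    for t in transactions:
        for i in t:
            support[i] = support.get(i, 0) + 1

    def reorder(t):
        # Keep frequent items only, sorted by descending support
        items = [i for i in t if support[i] >= minsup_count]
        return sorted(items, key=lambda i: (-support[i], i))

    root = FPNode()
    for t in transactions:  # second pass: insert transactions into the tree
        node = root
        for item in reorder(t):
            # Transactions sharing a prefix share a path; counts accumulate
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root

transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}]
tree = build_fptree(transactions, 2)
```

On the example database, item 2 (support 4) begins the reordered form of four of the five transactions, so those four paths share a single child of the root.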

What is the input of the FPGrowth algorithm?

The input of FPGrowth is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For instance, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the FPGrowth algorithm?

FPGrowth is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions from the transaction database, where minsup is a parameter given by the user.

For example, if FPGrowth is run on the previous transaction database with a minsup of 40 % (2 transactions), FPGrowth produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which it appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Input file format

The input file format used by FPGrowth is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before the algorithm is launched and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

Several algorithms exist for mining frequent itemsets. In SPMF, you can try, for example, Apriori, AprioriTID, Eclat, HMine, Relim and more. Among all these algorithms, FPGrowth is generally the fastest and most memory-efficient. You can see a performance comparison by clicking on the "performance" section of this website.

Where can I get more information about the FPGrowth algorithm?

This is the journal article describing FPGrowth:

Jiawei Han, Jian Pei, Yiwen Yin, Runying Mao: Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Min. Knowl. Discov. 8(1): 53-87 (2004)

You can also read chapter 6 of the book "Introduction to Data Mining", which provides an easy-to-understand introduction to FPGrowth (but does not give all the details).

Example 4 : Mining Frequent Itemsets by Using the Relim Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Relim" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, execute the following command in a folder containing spmf.jar and the example input file contextPasquier99.txt:
    java -jar spmf.jar run Relim contextPasquier99.txt output.txt 40%
  • If you are using the source code version of SPMF, launch the file "MainTestRelim.java" in the package ca.pfv.SPMF.tests.

What is Relim?

Relim is an algorithm for discovering frequent itemsets in a transaction database. Relim was proposed by Borgelt (2005). It is not a very efficient algorithm. It is included in SPMF for comparison purposes.

What is the input of the Relim algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For instance, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the Relim algorithm?

Relim is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions from the transaction database, where minsup is a parameter given by the user.

For example, if Relim is run on the previous transaction database with a minsup of 40 % (2 transactions), Relim produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which it appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Input file format

The input file format used by Relim is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before the algorithm is launched and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

Several algorithms exist for mining frequent itemsets. Relim is not a very efficient one; for better performance, it is recommended to use FPGrowth instead. You can see a performance comparison by clicking on the "performance" section of this website.

Where can I get more information about the Relim algorithm?

This is the conference article describing Relim:

Christian Borgelt. Keeping Things Simple: Finding Frequent Item Sets by Recursive Elimination. Proceedings of the Workshop on Open Source Data Mining Software (OSDM '05, Chicago, IL), pp. 66-70. ACM Press, New York, NY, USA, 2005.

Note that the author of Relim and collaborators have proposed extensions and additional optimizations of Relim that I have not implemented.

Example 5 : Mining Frequent Itemsets by Using the Eclat / dEclat Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Eclat" or "dEclat" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, execute the following command in a folder containing spmf.jar and the example input file contextPasquier99.txt:
    java -jar spmf.jar run Eclat contextPasquier99.txt output.txt 40%
  • If you are using the source code version of SPMF, to run Eclat or dEclat respectively, launch the file "MainTestEclat_saveToMemory.java" or "MainTestDEclat_bitset_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is Eclat ?

Eclat is an algorithm for discovering frequent itemsets in a transaction database. It was proposed by Zaki (2001). Unlike algorithms such as Apriori, Eclat uses a depth-first search to discover frequent itemsets instead of a breadth-first search.

dEclat is a variation of the Eclat algorithm that is implemented using a structure called "diffsets" rather than "tidsets".
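
The two representations can be illustrated as follows: with tidsets, the support of an itemset is the size of the intersection of the tidsets of its generators; with diffsets, only the difference between an itemset's tidset and that of its prefix is stored, and support is obtained by subtraction. An illustrative sketch (not SPMF's code):

```python
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}]

def tidset(itemset):
    """Ids of the transactions containing every item of the itemset."""
    return {tid for tid, t in enumerate(transactions) if itemset <= t}

# Eclat: the support of {2, 3} is the size of the tidset intersection
sup_23 = len(tidset({2}) & tidset({3}))

# dEclat: the diffset of {2, 3} w.r.t. its prefix {2} is t({2}) \ t({2, 3});
# the support follows by subtraction instead of intersection.
diffset_23 = tidset({2}) - tidset({3})
sup_23_via_diffset = len(tidset({2})) - len(diffset_23)
```

On the example database, both computations give a support of 3 for {2, 3}, but the diffset contains only one transaction id, which is what makes diffsets memory-efficient on dense databases.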

What is the input of the Eclat algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For instance, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the Eclat algorithm?

Eclat is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions from the transaction database, where minsup is a parameter given by the user.

For example, if Eclat is run on the previous transaction database with a minsup of 40 % (2 transactions), Eclat produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

Each frequent itemset is annotated with its support. The support of an itemset is the number of transactions in which it appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Input file format

The input file format used by ECLAT is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before the algorithm is launched and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.
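If you want to post-process the result in your own code, a line of this output format can be split into its itemset and its support. Here is a minimal sketch (not part of the SPMF API; the class name is hypothetical):

```java
import java.util.Arrays;

public class OutputFormatExample {

    // Extract the support from a line such as "2 3 5 #SUP: 3"
    static int parseSupport(String line) {
        return Integer.parseInt(line.split("#SUP:")[1].trim());
    }

    // Extract the items from a line such as "2 3 5 #SUP: 3"
    static int[] parseItems(String line) {
        String[] tokens = line.split("#SUP:")[0].trim().split(" ");
        int[] items = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            items[i] = Integer.parseInt(tokens[i]);
        }
        return items;
    }

    public static void main(String[] args) {
        String line = "2 3 5 #SUP: 3";
        System.out.println(Arrays.toString(parseItems(line))); // prints [2, 3, 5]
        System.out.println(parseSupport(line));                // prints 3
    }
}
```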

Performance

Several algorithms exist for mining frequent itemsets. Eclat is one of the best, although FPGrowth generally performs better. You can see a performance comparison by clicking on the "performance" section of this website. Note that recently (SPMF v0.96e), the Eclat implementation was optimized and is now sometimes faster than FPGrowth.

Nevertheless, the Eclat algorithm is interesting because it uses a depth-first search. For some extensions of the problem of itemset mining such as mining high utility itemsets (see the HUI-Miner algorithm), the search procedure of Eclat works very well.

Implementation details

In SPMF, there are four versions of Eclat. The first one is named "Eclat" and uses HashSets of Integers for representing sets of transaction IDs (tidsets). The second version is named "Eclat_bitset" and uses bitsets for representing tidsets. Using bitsets has the advantage of generally being more memory-efficient, and can also make the algorithm faster depending on the dataset.

There are also two versions of dEclat, which uses a structure called diffsets instead of tidsets. The version having diffsets implemented as HashSets of Integers is named "dEclat", and the version having diffsets implemented as bitsets is named "dEclat_bitset".
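The central operation of Eclat can be sketched in a few lines: the tidset of a candidate itemset is the intersection of the tidsets it is built from, and its support is the cardinality of that intersection. Here is a minimal illustration using java.util.BitSet (a standard-library sketch for explanation only, not SPMF's internal code):

```java
import java.util.BitSet;

public class TidsetExample {

    // The tidset of the join of two itemsets is the intersection of their
    // tidsets; the support is the cardinality of that intersection
    static int supportOfJoin(BitSet tidsetA, BitSet tidsetB) {
        BitSet intersection = (BitSet) tidsetA.clone();
        intersection.and(tidsetB);
        return intersection.cardinality();
    }

    public static void main(String[] args) {
        // Tidsets in the example database, with transactions numbered 0 (t1) to 4 (t5)
        BitSet tidset2 = new BitSet();  // item 2 appears in t2, t3, t4, t5
        tidset2.set(1); tidset2.set(2); tidset2.set(3); tidset2.set(4);
        BitSet tidset3 = new BitSet();  // item 3 appears in t1, t2, t3, t5
        tidset3.set(0); tidset3.set(1); tidset3.set(2); tidset3.set(4);

        // Support of {2, 3} = |tidset(2) ∩ tidset(3)|
        System.out.println(supportOfJoin(tidset2, tidset3)); // prints 3
    }
}
```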

Optional parameter(s)

This implementation allows specifying additional optional parameter(s):

  • "show transaction ids?" (true/false) This parameter specifies that the ids of the transactions containing a pattern should be output for each pattern found. If the parameter is set to true, each pattern in the output file will be followed by the keyword "#TID:" and a list of transaction ids (integers separated by single spaces). For example, a line terminated by "#TID: 0 2" means that the pattern on this line appears in the first and the third transactions of the transaction database (the transactions with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestEclat_..._saveToFile.java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the Jar file, as in the following example:

java -jar spmf.jar run Eclat contextPasquier99.txt output.txt 40% true
This command applies the algorithm to the file "contextPasquier99.txt" and outputs the results to "output.txt". It specifies that the user wants to find patterns for minsup = 40%, and that the transaction ids should be output for each pattern found.

Where can I get more information about the Eclat algorithm?

Here is an article describing the Eclat algorithm:

Mohammed Javeed Zaki: Scalable Algorithms for Association Mining. IEEE Trans. Knowl. Data Eng. 12(3): 372-390 (2000)

Here is an article describing the dEclat variation:

Zaki, M. J., Gouda, K.: Fast vertical mining using diffsets. Technical Report 01-1, Computer Science Dept., Rensselaer Polytechnic Institute (March 2001)

Example 6 : Mining Frequent Itemsets by Using the HMine Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HMine" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 0.4 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HMine contextPasquier99.txt output.txt 0.4 in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHMine.java" in the package ca.pfv.SPMF.tests.

What is H-Mine ?

H-Mine is an algorithm for discovering frequent itemsets in transaction databases, proposed by Pei et al. (2001). Unlike earlier algorithms such as Apriori, H-Mine uses a pattern-growth approach to discover frequent itemsets.

What is the input of the H-Mine algorithm?

The input of H-Mine is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction, and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the H-Mine algorithm?

H-Mine is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions of the transaction database, where minsup is a parameter given by the user.

For example, if H-Mine is run on the previous transaction database with a minsup of 40 % (2 transactions), H-Mine produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which the itemset appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Performance

Several algorithms exist for mining frequent itemsets. H-Mine is claimed by its authors to be one of the best. The implementation offered in SPMF is well-optimized.

Input file format

The input file format used by H-Mine is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are listed first. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and indicates that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about the H-Mine algorithm?

Here is an article describing the H-Mine algorithm:

J. Pei, J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang. "H-Mine: Fast and space-preserving frequent pattern mining in large databases". IIE Transactions, Volume 39, Issue 6, pages 593-605, June 2007, Taylor & Francis.

Example 7 : Mining Frequent Itemsets by Using the FIN Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FIN" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FIN contextPasquier99.txt output.txt 0.4 in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFIN.java" in the package ca.pfv.SPMF.tests.

What is FIN?

FIN is an algorithm for discovering frequent itemsets in transaction databases, proposed by Deng et al. (2014). It is reported to be very fast.

This implementation is very faithful to the original. It was converted from the original C++ source code provided by Deng et al. and contains only minor modifications.

What is the input of the FIN algorithm?

The input of FIN is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction, and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the FIN algorithm?

FIN is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions of the transaction database, where minsup is a parameter given by the user.

For example, if FIN is run on the previous transaction database with a minsup of 40 % (2 transactions), FIN produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which the itemset appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Performance

Several algorithms exist for mining frequent itemsets. FIN is claimed to be one of the best, and it is certainly one of the top algorithms available in SPMF. The implementation is well optimized and faithful to the original version (it was converted from C++ to Java with only minor modifications).

Input file format

The input file format used by FIN is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are listed first. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and indicates that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about the FIN algorithm?

Here is an article describing the FIN algorithm:

Zhi-Hong Deng, Sheng-Long Lv: Fast mining frequent itemsets using Nodesets. Expert Syst. Appl. 41(10): 4505-4512 (2014)

Example 8 : Mining Frequent Itemsets by Using the PrePost / PrePost+ Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "PrePost" or "PrePost+" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run PrePost contextPasquier99.txt output.txt 0.4 in a folder containing spmf.jar and the example input file contextPasquier99.txt
    or
    java -jar spmf.jar run PrePost+ contextPasquier99.txt output.txt 0.4 in a folder containing spmf.jar and the example input file contextPasquier99.txt
  • If you are using the source code version of SPMF, launch the file "MainTestPrePost.java" or "MainTestPrePost+.java" in the package ca.pfv.SPMF.tests.

What is PrePost / PrePost+?

PrePost is an algorithm for discovering frequent itemsets in transaction databases, proposed by Deng et al. (2012).

PrePost+ is a variation designed by Deng et al. (2015). It is reported to be faster than PrePost. Both implementations are offered in SPMF.

These implementations are faithful to the originals. They were converted from the original C++ source code provided by Deng et al. and contain only minor modifications.

What is the input of the PrePost / PrePost+ algorithms?

The input of PrePost and PrePost+ is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction, and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the PrePost / PrePost+ algorithms?

PrePost and PrePost+ are algorithms for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions of the transaction database, where minsup is a parameter given by the user.

For example, if PrePost or PrePost+ are run on the previous transaction database with a minsup of 40 % (2 transactions), they produce the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which the itemset appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Performance

Several algorithms exist for mining frequent itemsets. PrePost is claimed by its authors to be one of the best. The PrePost+ algorithm, by the same authors and also offered in SPMF, is reported to be even faster.

Input file format

The input file format used by PrePost is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are listed first. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and indicates that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about the PrePost / PrePost+ algorithm?

Here is an article describing the PrePost algorithm:

Zhihong Deng, Zhonghui Wang, Jia-Jian Jiang: A new algorithm for fast mining frequent itemsets using N-lists. SCIENCE CHINA Information Sciences 55(9): 2008-2030 (2012)

And another describing PrePost+:

Zhi-Hong Deng, Sheng-Long Lv: PrePost+: An efficient N-lists-based algorithm for mining frequent itemsets via Children-Parent Equivalence pruning. Expert Systems with Applications, 42: 5424-5432 (2015)

 

Example 9 : Mining Frequent Itemsets by Using the LCMFreq Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "LCMFreq" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run LCMFreq contextPasquier99.txt output.txt 0.4 in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestLCMFreq_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is LCMFreq?

LCMFreq is an algorithm of the LCM family of algorithms for mining frequent itemsets. LCM was the winner of the FIMI 2004 competition and is reputed to be one of the fastest itemset mining algorithms.

In this implementation, we have attempted to replicate LCM v2 as used in FIMI 2004. Most of the key features of LCM have been replicated in this implementation (anytime database reduction, occurrence delivery, etc.). However, a few optimizations have been left out for now (transaction merging, removing locally infrequent items). They may be added in a future version of SPMF.

What is the input of the LCMFreq algorithm?

The input of LCMFreq is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction, and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the LCMFreq algorithm?

LCMFreq is an algorithm for discovering itemsets (groups of items) occurring frequently in a transaction database (frequent itemsets). A frequent itemset is an itemset appearing in at least minsup transactions of the transaction database, where minsup is a parameter given by the user.

For example, if LCMFreq is run on the previous transaction database with a minsup of 40 % (2 transactions), LCMFreq produces the following result:

itemsets support
{1} 3
{2} 4
{3} 4
{5} 4
{1, 2} 2
{1, 3} 3
{1, 5} 2
{2, 3} 3
{2, 5} 4
{3, 5} 3
{1, 2, 3} 2
{1, 2, 5} 2
{1, 3, 5} 2
{2, 3, 5} 3
{1, 2, 3, 5} 2

How should I interpret the results?

In the results, each itemset is annotated with its support. The support of an itemset is the number of transactions in which the itemset appears. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter.

Performance

Several algorithms exist for mining frequent itemsets. LCMFreq belongs to the LCM family, which won the FIMI 2004 competition, so it is probably one of the best. In this implementation, we have attempted to replicate v2 of the algorithm, but some optimizations have been left out (transaction merging and removing locally infrequent items). The algorithm seems to perform well on sparse datasets.

Implementation details

In the source code version of SPMF, there are two versions of LCMFreq. The version "MainTestLCMFreq.java" keeps the result in memory. The version named "MainTestLCMFreq_saveToFile.java" saves the result to a file. In the graphical user interface and the command line interface, only the second version is offered.

Input file format

The input file format is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are listed first. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, here is the output file for this example. The first line indicates the frequent itemset consisting of the item 1, and indicates that this itemset has a support of 3 transactions.

1 #SUP: 3
2 #SUP: 4
3 #SUP: 4
5 #SUP: 4
1 2 #SUP: 2
1 3 #SUP: 3
1 5 #SUP: 2
2 3 #SUP: 3
2 5 #SUP: 4
3 5 #SUP: 3
1 2 3 #SUP: 2
1 2 5 #SUP: 2
1 3 5 #SUP: 2
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about the LCMFreq algorithm?

Here is an article describing the LCM v2 family of algorithms:

Takeaki Uno, Masashi Kiyomi, Hiroki Arimura: LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets. Proc. IEEE ICDM Workshop on Frequent Itemset Mining Implementations, Brighton, UK, November 1, 2004.

Example 10 : Mining Frequent Closed Itemsets Using the AprioriClose Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "AprioriClose" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run AprioriClose contextPasquier99.txt output.txt 40% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestAprioriClose1.java" in the package ca.pfv.SPMF.tests.

What is AprioriClose?

AprioriClose (aka Close) is an algorithm for discovering frequent closed itemsets in a transaction database. It was proposed by Pasquier et al. (1999).

What is the input of the AprioriClose algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction, and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the AprioriClose algorithm?

AprioriClose outputs frequent closed itemsets. To explain what is a frequent closed itemset, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, consider the itemset {1, 3}. It has a support of 3 because it appears in three transactions (t1, t3 and t5) of the transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions from the transaction database, where minsup is a threshold set by the user. A frequent closed itemset is a frequent itemset that is not included in a proper superset having exactly the same support. The set of frequent closed itemsets is thus a subset of the set of frequent itemsets. Why is it interesting to discover frequent closed itemsets? The reason is that the set of frequent closed itemsets is usually much smaller than the set of frequent itemsets, and it can be shown that no information is lost: all the frequent itemsets can be regenerated from the set of frequent closed itemsets (see Pasquier et al. (1999) for more details).
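The closedness condition can be checked directly from its definition: an itemset is closed if extending it with any single item strictly lowers its support. The following brute-force Java sketch (for illustration only; this is not how AprioriClose actually works, and the class name is hypothetical) verifies this on the example database:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ClosedCheckExample {

    // Number of transactions containing all items of the itemset
    static int support(List<Set<Integer>> db, Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> transaction : db) {
            if (transaction.containsAll(itemset)) count++;
        }
        return count;
    }

    // An itemset is closed if adding any other item strictly lowers the support
    static boolean isClosed(List<Set<Integer>> db, Set<Integer> itemset, Set<Integer> allItems) {
        int sup = support(db, itemset);
        for (int item : allItems) {
            if (itemset.contains(item)) continue;
            Set<Integer> extended = new HashSet<>(itemset);
            extended.add(item);
            if (support(db, extended) == sup) return false; // a proper superset has the same support
        }
        return true;
    }

    public static void main(String[] args) {
        // The example database from contextPasquier99.txt
        List<Set<Integer>> db = Arrays.asList(
                new HashSet<>(Arrays.asList(1, 3, 4)),
                new HashSet<>(Arrays.asList(2, 3, 5)),
                new HashSet<>(Arrays.asList(1, 2, 3, 5)),
                new HashSet<>(Arrays.asList(2, 5)),
                new HashSet<>(Arrays.asList(1, 2, 3, 5)));
        Set<Integer> allItems = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));

        System.out.println(isClosed(db, new HashSet<>(Arrays.asList(2, 3, 5)), allItems)); // prints true
        System.out.println(isClosed(db, new HashSet<>(Arrays.asList(2, 3)), allItems));    // prints false
    }
}
```

Here {2, 3} is not closed because its proper superset {2, 3, 5} has the same support (3), which is why {2, 3} does not appear in the list of closed itemsets above.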

If we apply AprioriClose on the previous transaction database with a minsup of 40 % (2 transactions), we get the following five frequent closed itemsets:

frequent closed itemsets support
{3} 4
{1, 3} 3
{2, 5} 4
{2, 3, 5} 3
{1, 2, 3, 5} 2
If you applied the regular Apriori algorithm instead of AprioriClose, you would get 15 itemsets instead of 5, which shows that the set of frequent closed itemsets can be much smaller than the set of frequent itemsets.

How should I interpret the results?

In the results, each frequent closed itemset is annotated with its support. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. The itemset {2, 3, 5} is a frequent itemset because its support is greater than or equal to the minsup parameter. Furthermore, it is a closed itemset because it has no proper superset with exactly the same support.

Input file format

The input file format used by AprioriClose is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent closed itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, the output file for this example is shown below. The second line represents the frequent itemset consisting of the items 1 and 3, and indicates that this itemset has a support of 3 transactions.

3 #SUP: 4
1 3 #SUP: 3
2 5 #SUP: 4
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2
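
As an illustration, the following hypothetical Java sketch (not part of SPMF) parses one line of this output format into its itemset and support:

```java
import java.util.*;

public class SpmfOutputDemo {
    // Parse one output line such as "1 3 #SUP: 3" into its itemset and support.
    static Map.Entry<List<Integer>, Integer> parseLine(String line) {
        String[] parts = line.split("#SUP:");
        List<Integer> itemset = new ArrayList<>();
        for (String token : parts[0].trim().split(" ")) {
            itemset.add(Integer.parseInt(token));
        }
        int support = Integer.parseInt(parts[1].trim());
        return new AbstractMap.SimpleEntry<>(itemset, support);
    }

    public static void main(String[] args) {
        Map.Entry<List<Integer>, Integer> entry = parseLine("1 3 #SUP: 3");
        System.out.println(entry.getKey() + " " + entry.getValue());
    }
}
```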

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The AprioriClose algorithm is important for historical reasons because it was the first algorithm for mining frequent closed itemsets. However, several other algorithms for mining frequent closed itemsets have since been proposed. In SPMF, it is recommended to use DCI_Closed or Charm instead of AprioriClose, because they are more efficient.

Implementation details

In SPMF, there are two versions of AprioriClose. The first version is named "AprioriClose" and is based on the "Apriori" algorithm. The second version is named "Apriori_TIDClose" and is based on the AprioriTID algorithm instead of Apriori (it uses tidsets to calculate support, which reduces the number of database scans). Both versions are available in the graphical user interface of SPMF. In the source code, the files "MainTestAprioriClose1.java" and "MainTestAprioriTIDClose.java" respectively correspond to these two versions.

Where can I get more information about the AprioriClose algorithm?

The following article describes the AprioriClose algorithm:

Nicolas Pasquier, Yves Bastide, Rafik Taouil, Lotfi Lakhal: Discovering Frequent Closed Itemsets for Association Rules. ICDT 1999: 398-416

Example 11 : Mining Frequent Closed Itemsets Using the DCI_Closed Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "DCI_Closed" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 2 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run DCI_Closed contextPasquier99.txt output.txt 2 in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestDCI_Closed.java" in the package ca.pfv.SPMF.tests.

What is DCI_Closed?

DCI_Closed is an algorithm for discovering frequent closed itemsets in a transaction database. DCI_Closed was proposed by Lucchese et al. (2004).

What is the input of the DCI_Closed algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the DCI_Closed algorithm?

DCI_Closed outputs frequent closed itemsets. To explain what a frequent closed itemset is, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, consider the itemset {1, 3}. It has a support of 3 because it appears in three transactions (t1, t3, t5) of the transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions of the transaction database. A frequent closed itemset is a frequent itemset that has no proper superset with exactly the same support. The set of frequent closed itemsets is thus a subset of the set of frequent itemsets. Why is it interesting to discover frequent closed itemsets? The reason is that the set of frequent closed itemsets is usually much smaller than the set of frequent itemsets, and no information is lost (all the frequent itemsets can be regenerated from the set of frequent closed itemsets - see Lucchese (2004) for more details).
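
The definition above can be checked directly: an itemset is closed exactly when no single-item extension has the same support (if a larger superset had the same support, so would the one-item extensions it contains). Here is a small, hypothetical Java sketch of this check, not taken from SPMF:

```java
import java.util.*;

public class ClosedItemsetDemo {
    static int support(List<Set<Integer>> db, Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> t : db) if (t.containsAll(itemset)) count++;
        return count;
    }

    // An itemset is closed if no proper superset has exactly the same support.
    // Checking single-item extensions suffices, because support can only
    // decrease (or stay equal) as items are added.
    static boolean isClosed(List<Set<Integer>> db, Set<Integer> itemset,
                            Set<Integer> allItems) {
        int sup = support(db, itemset);
        for (Integer item : allItems) {
            if (itemset.contains(item)) continue;
            Set<Integer> superset = new HashSet<>(itemset);
            superset.add(item);
            if (support(db, superset) == sup) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = Arrays.asList(
            new HashSet<>(Arrays.asList(1, 3, 4)),
            new HashSet<>(Arrays.asList(2, 3, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)),
            new HashSet<>(Arrays.asList(2, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)));
        Set<Integer> allItems = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5));
        // {2, 3} is not closed: adding 5 gives {2, 3, 5} with the same support (3).
        System.out.println(isClosed(db, new HashSet<>(Arrays.asList(2, 3)), allItems));
        // {2, 3, 5} is closed: no superset has support 3.
        System.out.println(isClosed(db, new HashSet<>(Arrays.asList(2, 3, 5)), allItems));
    }
}
```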

If we apply DCI_Closed on the transaction database with a minsup of 2 transactions, we get the following result:

frequent closed itemsets support
{3} 4
{1, 3} 3
{2, 5} 4
{2, 3, 5} 3
{1, 2, 3, 5} 2
If you compare this result with the output of a frequent itemset mining algorithm such as Apriori, you will notice that only 5 closed itemsets are found by DCI_Closed instead of 15 itemsets by Apriori, which shows that the set of frequent closed itemsets can be much smaller than the set of frequent itemsets.

How should I interpret the results?

In the results, each frequent closed itemset is annotated with its support. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter. It is a closed itemset because it has no proper superset having exactly the same support.

Input file format

The input file format used by DCI_Closed is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent closed itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, the output file for this example is shown below. The second line represents the frequent itemset consisting of the items 1 and 3, and indicates that this itemset has a support of 3 transactions.

3 #SUP: 4
1 3 #SUP: 3
2 5 #SUP: 4
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The DCI_Closed algorithm is one of the fastest algorithms for frequent closed itemset mining. The version in SPMF is optimized and very efficient. SPMF also offers other algorithms for frequent closed itemset mining such as Charm and AprioriClose. DCI_Closed and Charm are more efficient than AprioriClose.

Implementation details

In the source code version of SPMF, there are two versions of DCI_Closed. The first one uses HashSet to store the transaction ids. The second one is an optimized version that uses a bit matrix to store transaction ids, and also includes additional optimizations. The first version can be tested by running MainTestDCI_Closed.java and the second version by running MainTestDCI_Closed_Optimized.java. In the release version of SPMF, only the optimized version of DCI_Closed is available in the graphical user interface and command line interface.

Optional parameter(s)

This implementation lets you specify additional optional parameter(s):

  • "show transaction ids?" (true/false) If this parameter is set to true, the ids of the transactions containing each pattern are output: each pattern in the output file is followed by the keyword "#TID:" and a list of transaction ids (integers separated by single spaces). For example, a line terminated by "#TID: 0 2" means that the pattern on this line appears in the first and the third transactions of the transaction database (the transactions with ids 0 and 2).
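
As an illustration, here is a hypothetical Java sketch (not part of SPMF) that parses an output line carrying both the "#SUP:" and "#TID:" fields; the example line is made up, but follows the documented format (transaction ids are 0-based):

```java
import java.util.*;

public class TidOutputDemo {
    public static void main(String[] args) {
        // Hypothetical output line for the itemset {2, 3, 5}, which appears
        // in transactions t2, t3 and t5 (0-based ids 1, 2 and 4).
        String line = "2 3 5 #SUP: 3 #TID: 1 2 4";
        String[] bySup = line.split("#SUP:");
        String[] supThenTid = bySup[1].split("#TID:");
        String itemset = bySup[0].trim();                        // "2 3 5"
        int support = Integer.parseInt(supThenTid[0].trim());    // 3
        List<Integer> tids = new ArrayList<>();
        for (String token : supThenTid[1].trim().split(" ")) {
            tids.add(Integer.parseInt(token));
        }
        System.out.println(itemset + " sup=" + support + " tids=" + tids);
    }
}
```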

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestDCI_Closed_Optimized.java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the Jar file. For example:

java -jar spmf.jar run DCI_Closed contextPasquier99.txt output.txt 2 true
This command applies the algorithm to the file "contextPasquier99.txt" and outputs the results to "output.txt". Moreover, it specifies that the user wants to find patterns with minsup = 2 transactions, and that transaction ids should be output for each pattern found.

Where can I get more information about the DCI_Closed algorithm?

Here is an article describing the DCI_Closed algorithm:

Claudio Lucchese, Salvatore Orlando, Raffaele Perego: DCI Closed: A Fast and Memory Efficient Algorithm to Mine Frequent Closed Itemsets. FIMI 2004

Example 12 : Mining Frequent Closed Itemsets Using the Charm / dCharm Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Charm_bitset" or "dCharm_bitset" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Charm_bitset contextPasquier99.txt output.txt 40% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF and want to respectively launch Charm or dCharm, then launch the file "MainTestCharm_bitset_saveToMemory.java" or "MainTestDCharm_bitset_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is Charm?

Charm is an algorithm for discovering frequent closed itemsets in a transaction database. It was proposed by Zaki (2002).

dCharm is a variation of the Charm algorithm that is implemented with diffsets rather than tidsets. It has the same output and input as Charm.
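
A diffset stores the difference between the tidset (set of ids of transactions containing an itemset) of a prefix itemset P and the tidset of its extension PX, so that support(PX) = support(P) - |diffset(PX, P)|; on dense data, diffsets are often much smaller than tidsets. A minimal sketch of this idea (not the SPMF implementation):

```java
import java.util.*;

public class DiffsetDemo {
    // diffset(PX, P) = tidset(P) \ tidset(PX): the transactions that contain
    // the prefix P but not the extension PX.
    static Set<Integer> diffset(Set<Integer> tidsetP, Set<Integer> tidsetPX) {
        Set<Integer> diff = new TreeSet<>(tidsetP);
        diff.removeAll(tidsetPX);
        return diff;
    }

    public static void main(String[] args) {
        // In the example database (0-based ids): item 2 appears in t2..t5
        // (ids 1, 2, 3, 4), and itemset {2, 3} appears in ids 1, 2, 4.
        Set<Integer> tidset2 = new TreeSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> tidset23 = new TreeSet<>(Arrays.asList(1, 2, 4));
        Set<Integer> diff = diffset(tidset2, tidset23);
        System.out.println(diff);                         // transactions with 2 but not 3
        System.out.println(tidset2.size() - diff.size()); // support of {2, 3}
    }
}
```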

What is the input of the Charm / dCharm algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the Charm / dCharm algorithm?

Charm outputs frequent closed itemsets. To explain what a frequent closed itemset is, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions (t1, t3, t5) of the previous transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions of the transaction database. A frequent closed itemset is a frequent itemset that has no proper superset with exactly the same support. The set of frequent closed itemsets is thus a subset of the set of frequent itemsets. Why is it interesting to discover frequent closed itemsets? The reason is that the set of frequent closed itemsets is usually much smaller than the set of frequent itemsets, and no information is lost by discovering only frequent closed itemsets (all the frequent itemsets can be regenerated from the set of frequent closed itemsets - see Zaki (2002) for more details).

If we apply Charm on the previous transaction database with a minsup of 40 % (2 transactions), we get the following result:

frequent closed itemsets support
{3} 4
{1, 3} 3
{2, 5} 4
{2, 3, 5} 3
{1, 2, 3, 5} 2
If you compare this result with the output of a frequent itemset mining algorithm such as Apriori, you will notice that only 5 closed itemsets are found by Charm instead of 15 itemsets by Apriori, which shows that the set of frequent closed itemsets can be much smaller than the set of frequent itemsets.

How should I interpret the results?

In the results, each frequent closed itemset is annotated with its support. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter. It is a closed itemset because it has no proper superset having exactly the same support.

Input file format

The input file format used by CHARM is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent closed itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, the output file for this example is shown below. The second line represents the frequent itemset consisting of the items 1 and 3, and indicates that this itemset has a support of 3 transactions.

3 #SUP: 4
1 3 #SUP: 3
2 5 #SUP: 4
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The Charm algorithm is historically important because it is one of the first depth-first algorithms for mining frequent closed itemsets. In SPMF, Charm and DCI_Closed are the two most efficient algorithms for frequent closed itemset mining.

Optional parameter(s)

This implementation of Charm lets you specify additional optional parameter(s):

  • "show transaction ids?" (true/false) If this parameter is set to true, the ids of the transactions containing each pattern are output: each pattern in the output file is followed by the keyword "#TID:" and a list of transaction ids (integers separated by single spaces). For example, a line terminated by "#TID: 0 2" means that the pattern on this line appears in the first and the third transactions of the transaction database (the transactions with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestCharm..._SaveToFile.java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the Jar file. For example:

java -jar spmf.jar run Charm_bitset contextPasquier99.txt output.txt 40% true
This command applies the algorithm to the file "contextPasquier99.txt" and outputs the results to "output.txt". Moreover, it specifies that the user wants to find patterns with minsup = 40% of the transactions, and that transaction ids should be output for each pattern found.

Where can I get more information about the Charm algorithm?

This article describes the Charm algorithm:

Mohammed Javeed Zaki, Ching-Jiu Hsiao: CHARM: An Efficient Algorithm for Closed Itemset Mining. SDM 2002.

Here is an article describing the dCharm variation:

Zaki, M.J., Gouda, K.: Fast vertical mining using diffsets. Technical Report 01-1, Computer Science Dept., Rensselaer Polytechnic Institute (March 2001)

Example 13 : Mining Frequent Closed Itemsets Using the LCM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "LCM" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run LCM contextPasquier99.txt output.txt 40% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF and want to launch LCM on the file contextPasquier99.txt, then launch the file "MainTestLCM_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is LCM?

LCM is an algorithm for mining frequent closed itemsets, from the LCM family of algorithms. LCM is the winner of the FIMI 2004 competition and is considered one of the fastest closed itemset mining algorithms.

In this implementation, we have attempted to replicate LCM v2, as presented at FIMI 2004. Most of the key features of LCM have been replicated in this implementation (anytime database reduction, occurrence delivery, etc.). However, a few optimizations have been left out for now (transaction merging, removing locally infrequent items). They may be added in a future version of SPMF.

What is the input of the LCM algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the LCM algorithm?

LCM outputs frequent closed itemsets. To explain what a frequent closed itemset is, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions (t1, t3, t5) of the previous transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions of the transaction database. A frequent closed itemset is a frequent itemset that has no proper superset with exactly the same support. The set of frequent closed itemsets is thus a subset of the set of frequent itemsets. Why is it interesting to discover frequent closed itemsets? The reason is that the set of frequent closed itemsets is usually much smaller than the set of frequent itemsets, and no information is lost by discovering only frequent closed itemsets (all the frequent itemsets can be regenerated from the set of frequent closed itemsets - see Zaki (2002) for more details).

If we apply LCM on the previous transaction database with a minsup of 40 % (2 transactions), we get the following result:

frequent closed itemsets support
{3} 4
{1, 3} 3
{2, 5} 4
{2, 3, 5} 3
{1, 2, 3, 5} 2
If you compare this result with the output of a frequent itemset mining algorithm such as Apriori, you will notice that only 5 closed itemsets are found by LCM instead of 15 itemsets by Apriori, which shows that the set of frequent closed itemsets can be much smaller than the set of frequent itemsets.

How should I interpret the results?

In the results, each frequent closed itemset is annotated with its support. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter. It is a closed itemset because it has no proper superset having exactly the same support.

Input file format

The input file format used by LCM is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent closed itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, the output file for this example is shown below. The second line represents the frequent itemset consisting of the items 1 and 3, and indicates that this itemset has a support of 3 transactions.

3 #SUP: 4
1 3 #SUP: 3
2 5 #SUP: 4
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

Several algorithms exist for mining closed itemsets. LCM is the winner of the FIMI 2004 competition, so it is probably one of the best. In this implementation, we have attempted to replicate v2 of the algorithm, but some optimizations have been left out (transaction merging and removing locally infrequent items). The algorithm seems to perform very well on sparse datasets. According to some preliminary experiments, it can be faster than Charm, dCharm and DCI_Closed on sparse datasets, but may perform less well on dense datasets.

Implementation details

In the source code version of SPMF, there are two versions of LCM. The version "MainTestLCM.java" keeps the result in memory. The version named "MainTestLCM_saveToFile.java" saves the result to a file. In the graphical user interface and command line interface, only the second version is offered.

Where can I get more information about the LCM algorithm?

The following article describes the LCM v2 family of algorithms:

Takeaki Uno, Masashi Kiyomi and Hiroki Arimura (2004). LCM ver. 2: Efficient Mining Algorithms for Frequent/Closed/Maximal Itemsets. Proc. IEEE ICDM Workshop on Frequent Itemset Mining Implementations Brighton, UK, November 1, 2004

Example 14 : Mining Frequent Closed Itemsets Using the FPClose Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FPClose" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FPClose contextPasquier99.txt output.txt 40% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF and want to launch FPClose on the example input file contextPasquier99.txt, then launch the file "MainTestFPClose_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is FPClose?

FPClose is an algorithm of the FPGrowth family of algorithms, designed for mining frequent closed itemsets. FPClose is considered one of the fastest closed itemset mining algorithms.

In this implementation, we have attempted to implement most of the optimizations proposed in the FPClose paper, except that we did not implement the triangular matrix from FPGrowth* and the local CFI trees. These optimizations may be added in a future version of SPMF.
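
Like FPGrowth, FPClose stores the database in a compact prefix tree (FP-tree), where transactions sharing a prefix share a path and each node carries a count. The following minimal Java sketch (not the SPMF implementation; real FP-trees also order items by descending support and keep header links) shows only the insertion step:

```java
import java.util.*;

public class FPTreeDemo {
    static class Node {
        int item;
        int count;
        Map<Integer, Node> children = new HashMap<>();
        Node(int item) { this.item = item; }
    }

    // Insert a transaction: follow or create the path from the root,
    // incrementing the count of each node along the way.
    static void insert(Node root, int[] transaction) {
        Node current = root;
        for (int item : transaction) {
            Node child = current.children.get(item);
            if (child == null) {
                child = new Node(item);
                current.children.put(item, child);
            }
            child.count++;
            current = child;
        }
    }

    public static void main(String[] args) {
        Node root = new Node(-1); // the root holds no item
        // The example database; for simplicity, items are kept in their
        // given order rather than re-sorted by descending support.
        int[][] db = {{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}, {1, 2, 3, 5}};
        for (int[] t : db) insert(root, t);
        // Transactions starting with item 1 (t1, t3, t5) share one first node.
        System.out.println(root.children.get(1).count);
    }
}
```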

What is the input of the FPClose algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the FPClose algorithm?

FPClose outputs frequent closed itemsets. To explain what a frequent closed itemset is, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions (t1, t3, t5) of the previous transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions of the transaction database. A frequent closed itemset is a frequent itemset that has no proper superset with exactly the same support. The set of frequent closed itemsets is thus a subset of the set of frequent itemsets. Why is it interesting to discover frequent closed itemsets? The reason is that the set of frequent closed itemsets is usually much smaller than the set of frequent itemsets, and no information is lost by discovering only frequent closed itemsets (all the frequent itemsets can be regenerated from the set of frequent closed itemsets - see Zaki (2002) for more details).

If we apply FPClose on the previous transaction database with a minsup of 40 % (2 transactions), we get the following result:

frequent closed itemsets support
{3} 4
{1, 3} 3
{2, 5} 4
{2, 3, 5} 3
{1, 2, 3, 5} 2
If you compare this result with the output of a frequent itemset mining algorithm such as Apriori, you will notice that only 5 closed itemsets are found by FPClose instead of 15 itemsets by Apriori, which shows that the set of frequent closed itemsets can be much smaller than the set of frequent itemsets.

How should I interpret the results?

In the results, each frequent closed itemset is annotated with its support. For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a frequent itemset because its support is greater than or equal to the minsup parameter. It is a closed itemset because it has no proper superset having exactly the same support.

Input file format

The input file format used by FPClose is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent closed itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, the output file for this example is shown below. The second line represents the frequent itemset consisting of the items 1 and 3, and indicates that this itemset has a support of 3 transactions.

3 #SUP: 4
1 3 #SUP: 3
2 5 #SUP: 4
2 3 5 #SUP: 3
1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.
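Since the default output format is line-based, the results are easy to read back into a program. The short Java sketch below is not part of SPMF (the class name OutputLineParser is our own); it simply parses one line of the output format described above:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not part of SPMF): parse one line of the
// default output format, where space-separated item integers are
// followed by the keyword "#SUP:" and the support count.
public class OutputLineParser {

    // Holds one parsed itemset and its support.
    public static class ItemsetWithSupport {
        public final List<Integer> items;
        public final int support;
        public ItemsetWithSupport(List<Integer> items, int support) {
            this.items = items;
            this.support = support;
        }
    }

    public static ItemsetWithSupport parse(String line) {
        // Split the line into the itemset part and the support part
        String[] parts = line.split("#SUP:");
        List<Integer> items = new ArrayList<>();
        for (String token : parts[0].trim().split("\\s+")) {
            if (!token.isEmpty()) {
                items.add(Integer.parseInt(token));
            }
        }
        int support = Integer.parseInt(parts[1].trim());
        return new ItemsetWithSupport(items, support);
    }

    public static void main(String[] args) {
        ItemsetWithSupport r = parse("2 3 5 #SUP: 3");
        System.out.println(r.items + " has support " + r.support);
        // → [2, 3, 5] has support 3
    }
}
```

The same sketch also handles the empty-itemset line "#SUP: 5" produced by some algorithms (the item list is then empty).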

Performance

Several algorithms exist for mining closed itemsets. FPClose was one of the fastest in the FIMI 2004 competition, so it is probably one of the best. In this implementation, we have included most of the optimizations, but some have been left out (local CFI trees and the triangular matrix of FPGrowth*). The algorithm nevertheless seems to perform very well.

Implementation details

In the source code version of SPMF, there are two versions of FPClose. The version "MainTestFPClose_saveToMemory.java" keeps the result in memory. The version named "MainTestFPClose_saveToFile.java" saves the result to a file. In the graphical user interface and command line interface, only the second version is offered.

Where can I get more information about the FPClose algorithm?

This article describes the FPClose algorithm:

Grahne, G., & Zhu, J. (2005). Fast algorithms for frequent itemset mining using FP-trees. IEEE Transactions on Knowledge and Data Engineering, 17(10), 1347-1362.

Example 15 : Mining Frequent Maximal Itemsets by Using the FPMax Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FPMax" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FPMax contextPasquier99.txt output.txt 40% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestFPMax.java" in the package ca.pfv.SPMF.tests.

What is FPMax?

FPMax is an algorithm for discovering frequent maximal itemsets in a transaction database.

FPMax is based on the famous FPGrowth algorithm and includes several strategies for mining maximal itemsets efficiently while pruning the search space.

What is the input of the FPMax algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the FPMax algorithm?

FPMax outputs frequent maximal itemsets. To explain what a frequent maximal itemset is, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions (t1, t3 and t5) from the previous transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions from the transaction database. A frequent closed itemset is a frequent itemset that is not included in a proper superset having the same support. A frequent maximal itemset is a frequent itemset that is not included in a proper superset that is a frequent itemset. The set of frequent maximal itemsets is thus a subset of the set of frequent closed itemsets, which is itself a subset of the set of frequent itemsets. Why is it interesting to discover frequent maximal itemsets? The reason is that the set of frequent maximal itemsets is usually much smaller than the set of frequent itemsets and also smaller than the set of frequent closed itemsets. However, unlike frequent closed itemsets, frequent maximal itemsets are not a lossless representation of the set of frequent itemsets: it is possible to regenerate all frequent itemsets from the set of frequent maximal itemsets, but it is not possible to recover their support without scanning the database.
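To make these definitions concrete, here is a small self-contained Java sketch (not SPMF code; the class MaximalCheck is our own) that checks the maximality of an itemset on the example database by brute force. It relies on the fact that an itemset has a frequent proper superset if and only if one of its one-item extensions is frequent:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch (not SPMF code): check the definitions of
// support and maximal itemsets on the database contextPasquier99.txt.
public class MaximalCheck {

    // The five transactions of contextPasquier99.txt
    static final List<Set<Integer>> DB = Arrays.asList(
            new HashSet<>(Arrays.asList(1, 3, 4)),
            new HashSet<>(Arrays.asList(2, 3, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)),
            new HashSet<>(Arrays.asList(2, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)));

    // Support = number of transactions containing the itemset
    public static int support(Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> t : DB) {
            if (t.containsAll(itemset)) count++;
        }
        return count;
    }

    // A frequent itemset is maximal if adding any single item makes it
    // infrequent: since support is anti-monotone, a frequent proper
    // superset exists iff some one-item extension is frequent.
    public static boolean isMaximal(Set<Integer> itemset, int minsup) {
        for (int item : Arrays.asList(1, 2, 3, 4, 5)) {
            if (!itemset.contains(item)) {
                Set<Integer> bigger = new HashSet<>(itemset);
                bigger.add(item);
                if (support(bigger) >= minsup) return false;
            }
        }
        return support(itemset) >= minsup;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3, 5));
        Set<Integer> b = new HashSet<>(Arrays.asList(2, 5));
        System.out.println(isMaximal(a, 2)); // true: {1,2,3,5} is maximal
        System.out.println(isMaximal(b, 2)); // false: {2,3,5} is frequent
    }
}
```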

If we apply FPMax on the previous transaction database with a minsup of 40 % (2 transactions), we get the following result:

frequent maximal itemsets support
{1, 2, 3, 5} 2

This itemset is the only maximal itemset and it has a support of 2 because it appears in two transactions.

How should I interpret the results?

In the results, each frequent maximal itemset is annotated with its support. For example, the itemset {1, 2, 3, 5} is a maximal itemset having a support of 2 because it appears in transactions t3 and t5. The itemset {2, 5} has a support of 4 and is not a maximal itemset because it is included in {2, 3, 5}, which is a frequent itemset.

Input file format

The input file format used by FPMax is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice in the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a maximal itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, we show below the output file for this example, consisting of a single line. This line indicates the maximal itemset consisting of items 1, 2, 3 and 5, and that this itemset has a support of 2 transactions.

1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The FPMax algorithm is a very efficient algorithm for maximal itemset mining. I have tried to implement all the optimizations described in the paper and to optimize the implementation. However, it may still be possible to optimize it a little further.

Where can I get more information about the FPMax algorithm?

The FPMax algorithm is described in this paper:

Grahne, G., & Zhu, J. (2003, May). High performance mining of maximal frequent itemsets. In 6th International Workshop on High Performance Data Mining.

Example 16 : Mining Frequent Maximal Itemsets Using the Charm-MFI Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Charm_MFI" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Charm_MFI contextPasquier99.txt output.txt 40% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestCharmMFI.java" in the package ca.pfv.SPMF.tests.

What is Charm-MFI?

Charm-MFI is an algorithm for discovering frequent maximal itemsets in a transaction database.

Charm-MFI is not an efficient algorithm because it discovers maximal itemsets by performing post-processing after discovering frequent closed itemsets with the Charm algorithm (hence the name: Charm-MFI). A more efficient algorithm for mining maximal itemsets named FPMax is provided in SPMF.

Moreover, note that the original Charm-MFI algorithm is not correct. In SPMF, it has been fixed so that it generates the correct result.

What is the input of the Charm-MFI algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the Charm-MFI algorithm?

Charm-MFI outputs frequent maximal itemsets. To explain what a frequent maximal itemset is, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions (t1, t3 and t5) from the previous transaction database.

A frequent itemset is an itemset that appears in at least minsup transactions from the transaction database. A frequent closed itemset is a frequent itemset that is not included in a proper superset having the same support. A frequent maximal itemset is a frequent itemset that is not included in a proper superset that is a frequent itemset. The set of frequent maximal itemsets is thus a subset of the set of frequent closed itemsets, which is itself a subset of the set of frequent itemsets. Why is it interesting to discover frequent maximal itemsets? The reason is that the set of frequent maximal itemsets is usually much smaller than the set of frequent itemsets and also smaller than the set of frequent closed itemsets. However, unlike frequent closed itemsets, frequent maximal itemsets are not a lossless representation of the set of frequent itemsets: it is possible to regenerate all frequent itemsets from the set of frequent maximal itemsets, but it is not possible to recover their support without scanning the database.

If we apply Charm-MFI on the previous transaction database with a minsup of 40 % (2 transactions), we get the following result:

frequent maximal itemsets support
{1, 2, 3, 5} 2

This itemset is the only maximal itemset and it has a support of 2 because it appears in two transactions.

How should I interpret the results?

In the results, each frequent maximal itemset is annotated with its support. For example, the itemset {1, 2, 3, 5} is a maximal itemset having a support of 2 because it appears in transactions t3 and t5. The itemset {2, 5} has a support of 4 and is not a maximal itemset because it is included in {2, 3, 5}, which is a frequent itemset.

Input file format

The input file format used by Charm-MFI is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice in the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a maximal itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, we show below the output file for this example, consisting of a single line. This line indicates the maximal itemset consisting of items 1, 2, 3 and 5, and that this itemset has a support of 2 transactions.

1 2 3 5 #SUP: 2

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Optional parameter(s)

This implementation of Charm_MFI allows specifying additional optional parameter(s):

  • "show transaction ids?" (true/false) This parameter specifies whether the ids of the transactions containing each pattern should be output. If the parameter is set to true, each pattern in the output file will be followed by the keyword #TID and a list of transaction ids (integers separated by spaces). For example, a line terminated by "#TID: 0 2" means that the pattern on this line appears in the first and the third transactions of the transaction database (transactions with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example "MainTestCharmMFI_SaveToFile.java" provided in the source code of SPMF.

The parameter(s) can be also used in the command line with the Jar file. If you want to use these optional parameter(s) in the command line, it can be done as follows. Consider this example:

java -jar spmf.jar run Charm_MFI contextPasquier99.txt output.txt 40% true
This command means to apply the algorithm on the file "contextPasquier99.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns with minsup = 40%, and that transaction ids should be output for each pattern found.

Performance

The Charm-MFI algorithm is not a very efficient algorithm because it finds frequent maximal itemsets by post-processing instead of finding them directly.

A more efficient algorithm for mining maximal itemsets named FPMax is provided in SPMF.

Where can I get more information about the Charm-MFI algorithm?

The Charm-MFI algorithm is described in this thesis (in French language only):

L. Szathmary (2006). Symbolic Data Mining Methods with the Coron Platform.

Example 17 : Mining Frequent Generator Itemsets Using the DefMe Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "DefMe" algorithm, (2) select the input file "contextZart.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run DefMe contextZart.txt output.txt 40% in a folder containing spmf.jar and the example input file contextZart.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestDefMe_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is DefMe?

DefMe is an algorithm proposed at PAKDD 2014 for discovering minimal patterns in set systems. When applied to itemset mining, it discovers frequent generator itemsets. In SPMF, we have implemented it for this purpose.

DefMe is, to our knowledge, the only true depth-first search algorithm for mining generator itemsets (it does not need to use a hash table or store candidates). This is interesting because depth-first search algorithms are generally faster than Apriori-based algorithms.

Another important point about DefMe is that, unlike Pascal, DefMe only finds the frequent generator itemsets rather than generating all frequent itemsets and identifying which ones are generators.

What is the input of the DefMe algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextZart.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3, 5}
t5 {1, 2, 3, 5}

What is the output of the DefMe algorithm?

The output of the DefMe algorithm for a transaction database and a minimum support threshold minsup is the set of all frequent generator itemsets and their support.

To explain what a frequent itemset and a generator are, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions from the database (t2, t3 and t5). A frequent itemset is an itemset that appears in at least minsup transactions from the transaction database. A generator is an itemset X such that there does not exist an itemset Y strictly included in X that has the same support.
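The generator property can be checked with a small brute-force sketch. The following Java class is not part of SPMF (GeneratorCheck is our own name); it computes supports on the example database contextZart.txt and tests whether removing any single item leaves the support unchanged, which suffices because support is anti-monotone:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch (not SPMF code): test the generator property
// on the example database contextZart.txt.
public class GeneratorCheck {

    // The five transactions of contextZart.txt
    static final List<Set<Integer>> DB = Arrays.asList(
            new HashSet<>(Arrays.asList(1, 2, 4, 5)),
            new HashSet<>(Arrays.asList(1, 3)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)),
            new HashSet<>(Arrays.asList(2, 3, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)));

    public static int support(Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> t : DB) {
            if (t.containsAll(itemset)) count++;
        }
        return count;
    }

    // X is a generator if no strict subset has the same support.
    // Because support is anti-monotone, checking only the subsets
    // obtained by removing one item is sufficient.
    public static boolean isGenerator(Set<Integer> itemset) {
        int sup = support(itemset);
        for (int item : itemset) {
            Set<Integer> smaller = new HashSet<>(itemset);
            smaller.remove(item);
            if (support(smaller) == sup) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isGenerator(new HashSet<>(Arrays.asList(1, 3))));
        // → true: {1} and {3} both have support 4, while {1,3} has support 3
        System.out.println(isGenerator(new HashSet<>(Arrays.asList(2, 5))));
        // → false: {2} has the same support (4) as {2,5}
    }
}
```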

By running DefMe with the previous transaction database and a minsup of 40% (2 transactions), we obtain the following result:

generator itemsets support
{} 5
{1} 4
{2} 4
{3} 4
{5} 4
{1, 2} 3
{1, 3} 3
{1, 5} 3
{2, 3} 3
{3, 5} 3
{1, 2, 3} 2
{1, 3, 5} 2

How should I interpret the results?

In the results, for each generator itemset found, its support is indicated. For example, the itemset {1,2,3} has a support of 2 because it appears in 2 transactions (t3 and t5).

Input file format

The input file format used by DefMe is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice in the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent generator itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. For instance, the first line indicates that the empty set is a generator having a support of 5 transactions. The second line indicates that the itemset {1} has a support of 4 transactions.

#SUP: 5
1 #SUP: 4
1 2 #SUP: 3
1 2 3 #SUP: 2
1 3 #SUP: 3
1 3 5 #SUP: 2
1 5 #SUP: 3
2 #SUP: 4
2 3 #SUP: 3
3 #SUP: 4
3 5 #SUP: 3
5 #SUP: 4

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The DefMe algorithm should be more efficient than Apriori-based algorithms such as Zart or Pascal. However, no performance comparison has been done by the authors of DefMe.

Where can I get more information about the DefMe algorithm?

The DefMe algorithm is described in this paper:

Arnaud Soulet, François Rioult (2014). Efficiently Depth-First Minimal Pattern Mining. PAKDD (1) 2014: 28-39

Example 18 : Mining Frequent Closed Itemsets and Identify Generators Using the Pascal Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Pascal" algorithm, (2) select the input file "contextZart.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Pascal contextZart.txt output.txt 40% in a folder containing spmf.jar and the example input file contextZart.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestPascal.java" in the package ca.pfv.SPMF.tests.

What is Pascal?

Pascal is an algorithm for discovering frequent itemsets in a transaction database while at the same time identifying which ones are generators.

Pascal is an Apriori-based algorithm. It uses a special pruning property that can avoid counting the support of some candidate itemsets. This property is based on the fact that if an itemset of size k is not a generator, then its support is equal to the minimum of the supports of its subsets of size k-1.
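This counting inference property can be verified on the example database. The Java sketch below is illustrative only (CountingInference is our own class, not SPMF code): for the non-generator {1, 2, 5}, the minimum of the supports of its size-2 subsets equals its actual support, so the support could be deduced without scanning the database:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch (not SPMF code) of Pascal's counting inference
// on the example database contextZart.txt: if an itemset of size k is
// not a generator, its support equals the minimum of the supports of
// its subsets of size k-1, so no database scan is needed to count it.
public class CountingInference {

    // The five transactions of contextZart.txt
    static final List<Set<Integer>> DB = Arrays.asList(
            new HashSet<>(Arrays.asList(1, 2, 4, 5)),
            new HashSet<>(Arrays.asList(1, 3)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)),
            new HashSet<>(Arrays.asList(2, 3, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)));

    public static int support(Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> t : DB) {
            if (t.containsAll(itemset)) count++;
        }
        return count;
    }

    // Minimum support among the subsets obtained by removing one item
    public static int minSubsetSupport(Set<Integer> itemset) {
        int min = Integer.MAX_VALUE;
        for (int item : itemset) {
            Set<Integer> subset = new HashSet<>(itemset);
            subset.remove(item);
            min = Math.min(min, support(subset));
        }
        return min;
    }

    public static void main(String[] args) {
        // {1, 2, 5} is not a generator: its support can be inferred
        // from its subsets {1,2}, {1,5} and {2,5}
        Set<Integer> x = new HashSet<>(Arrays.asList(1, 2, 5));
        System.out.println(support(x));           // → 3
        System.out.println(minSubsetSupport(x));  // → 3
    }
}
```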

What is the input of the Pascal algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextZart.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3, 5}
t5 {1, 2, 3, 5}

What is the output of the Pascal algorithm?

The output of the Pascal algorithm for a transaction database and a minimum support threshold minsup is the set of all frequent itemsets and their support, with a flag indicating which itemsets are generators.

To explain what a frequent itemset and a generator are, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions from the database (t2, t3 and t5). A frequent itemset is an itemset that appears in at least minsup transactions from the transaction database. A generator is an itemset X such that there does not exist an itemset Y strictly included in X that has the same support.

By running Pascal with the previous transaction database and a minsup of 40% (2 transactions), we obtain the following result:

itemsets is a generator? support
{} yes 5
{1} yes 4
{2} yes 4
{3} yes 4
{5} yes 4
{1, 2} yes 3
{1, 3} yes 3
{1, 5} yes 3
{2, 3} yes 3
{2, 5} no 4
{3, 5} yes 3
{1, 2, 3} yes 2
{1, 2, 5} no 3
{1, 3, 5} yes 2
{2, 3, 5} no 3
{1, 2, 3, 5} no 2

How should I interpret the results?

In the results, all frequent itemsets are shown. Each frequent itemset that is a generator is marked as such ("yes"). For each itemset, its support is indicated. For example, the itemset {1,2,3,5} has a support of 2 because it appears in 2 transactions (t3 and t5) and it is not a generator because it has a subset {1,2,3} that has the same support.

Input file format

The input file format used by Pascal is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice in the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset, expressed as a number of transactions. Then, the keyword "#IS_GENERATOR" appears, followed by a boolean indicating whether the itemset is a generator (true) or not (false). For example, here is the output file for this example. The first line indicates the frequent itemset consisting of item 1, and that this itemset has a support of 4 transactions and is a generator.

1 #SUP: 4 #IS_GENERATOR true
2 #SUP: 4 #IS_GENERATOR true
3 #SUP: 4 #IS_GENERATOR true
5 #SUP: 4 #IS_GENERATOR true
1 2 #SUP: 3 #IS_GENERATOR true
1 3 #SUP: 3 #IS_GENERATOR true
1 5 #SUP: 3 #IS_GENERATOR true
2 3 #SUP: 3 #IS_GENERATOR true
2 5 #SUP: 4 #IS_GENERATOR false
3 5 #SUP: 3 #IS_GENERATOR true
1 2 3 #SUP: 2 #IS_GENERATOR true
1 2 5 #SUP: 3 #IS_GENERATOR false
1 3 5 #SUP: 2 #IS_GENERATOR true
2 3 5 #SUP: 3 #IS_GENERATOR false
1 2 3 5 #SUP: 2 #IS_GENERATOR false

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

The Pascal algorithm should be more or less as efficient as Apriori since it is an Apriori-based algorithm. Pascal uses a pruning strategy that is supposed to make it faster by avoiding counting the support of some candidates. But to really determine which one is faster, an experimental comparison would need to be done.

Where can I get more information about the Pascal algorithm?

The Pascal algorithm is described in this paper:

Bastide, Y., Taouil, R., Pasquier, N., Stumme, G., & Lakhal, L. (2000). Mining frequent patterns with counting inference. ACM SIGKDD Explorations Newsletter, 2(2), 66-75.

Example 19 : Mining Frequent Closed Itemsets and Minimal Generators Using the Zart Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Zart" algorithm, (2) select the input file "contextZart.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 40% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Zart contextZart.txt output.txt 40% in a folder containing spmf.jar and the example input file contextZart.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestZart.java" in the package ca.pfv.SPMF.tests.

What is Zart?

Zart is an algorithm for discovering frequent closed itemsets and their corresponding generators in a transaction database.

Zart is an Apriori-based algorithm. Why is it useful to discover closed itemsets and their generators at the same time? One reason is that this information is necessary to generate some special kind of association rules such as the IGB basis of association rules (see the example for IGB for more information about IGB association rules).

What is the input of the Zart algorithm?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextZart.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3, 5}
t5 {1, 2, 3, 5}

What is the output of the Zart algorithm?

The output of the Zart algorithm for a transaction database and a minimum support threshold minsup is the set of all frequent closed itemsets and their support, and the associated generator(s) for each closed frequent itemset.

To explain what a frequent closed itemset and a generator are, it is necessary to review a few definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3} has a support of 3 because it appears in three transactions from the database (t2, t3 and t5). A frequent itemset is an itemset that appears in at least minsup transactions from the transaction database. A frequent closed itemset is a frequent itemset that is not included in a proper superset having the same support. A generator Y of a closed itemset X is an itemset such that (1) it is a subset of X, (2) it has the same support as X, and (3) it does not have any proper subset having the same support.
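A convenient way to test closedness is through the closure operator: the closure of an itemset is the intersection of all transactions that contain it, and an itemset is closed if and only if it equals its own closure. The Java sketch below is illustrative only (ClosureCheck is our own class, not SPMF code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch (not SPMF code): an itemset is closed if and
// only if it equals the intersection of all transactions that contain
// it (its closure). Database: contextZart.txt.
public class ClosureCheck {

    // The five transactions of contextZart.txt
    static final List<Set<Integer>> DB = Arrays.asList(
            new HashSet<>(Arrays.asList(1, 2, 4, 5)),
            new HashSet<>(Arrays.asList(1, 3)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)),
            new HashSet<>(Arrays.asList(2, 3, 5)),
            new HashSet<>(Arrays.asList(1, 2, 3, 5)));

    // Closure = intersection of the transactions containing the itemset
    // (returns null if no transaction contains it)
    public static Set<Integer> closure(Set<Integer> itemset) {
        Set<Integer> result = null;
        for (Set<Integer> t : DB) {
            if (t.containsAll(itemset)) {
                if (result == null) {
                    result = new HashSet<>(t);
                } else {
                    result.retainAll(t);
                }
            }
        }
        return result;
    }

    public static boolean isClosed(Set<Integer> itemset) {
        return itemset.equals(closure(itemset));
    }

    public static void main(String[] args) {
        // {1, 3, 5} is not closed: its closure is {1, 2, 3, 5}
        System.out.println(closure(new HashSet<>(Arrays.asList(1, 3, 5))));
        System.out.println(isClosed(new HashSet<>(Arrays.asList(2, 5)))); // → true
    }
}
```

For example, the transactions containing {1, 3, 5} are t3 and t5, whose intersection is {1, 2, 3, 5}, so {1, 3, 5} is not closed, which matches the table below.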

By running Zart with the previous transaction database and a minsup of 40% (2 transactions), we obtain the following result:

itemsets support is closed? minimal generators
{} 5 yes {}
{1} 4 yes {1}
{2} 4 no
{3} 4 yes {3}
{5} 4 no
{1, 2} 3 no
{1, 3} 3 yes {1,3}
{1, 5} 3 no
{2, 3} 3 no
{2, 5} 4 yes {2}, {5}
{3, 5} 3 no
{1, 2, 3} 2 no
{1, 2, 5} 3 yes {1, 2}, {1, 5}
{1, 3, 5} 2 no
{2, 3, 5} 3 yes {2, 3}, {3, 5}
{1, 2, 3, 5} 2 yes {1, 2, 3}, {1, 3, 5}

How should I interpret the results?

In the results, all frequent itemsets are shown. Each frequent itemset that is a closed itemset is marked as such ("yes"). For each closed itemset, its support and its list of generators are indicated. For example, the itemset {1,2,3,5} has a support of 2 because it appears in 2 transactions (t3 and t5). It is a closed itemset because it has no proper superset having the same support. Moreover, it has two generators: {1, 2, 3} and {1, 3, 5}. By definition, these generators have the same support as {1, 2, 3, 5}.

Another example. The itemset {1, 3, 5} is not closed and it has a support of 2. It is not closed because it has a proper superset {1, 2, 3, 5} that has the same support.

Input file format

The input file format used by Zart is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.
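As an illustration of the default format, here is a small self-contained Java sketch (not SPMF code; the class and method names are invented for this example) that parses such file content into a list of transactions:

```java
import java.util.*;

public class TransactionReader {
    // Parse the SPMF default format: one transaction per line,
    // items as space-separated positive integers.
    static List<List<Integer>> parse(String content) {
        List<List<Integer>> db = new ArrayList<>();
        for (String line : content.split("\n")) {
            line = line.trim();
            if (line.isEmpty()) continue;           // skip blank lines
            List<Integer> transaction = new ArrayList<>();
            for (String token : line.split("\\s+"))
                transaction.add(Integer.parseInt(token));
            db.add(transaction);
        }
        return db;
    }

    public static void main(String[] args) {
        String file = "1 2 4 5\n1 3\n1 2 3 5\n2 3 5\n1 2 3 5\n";
        List<List<Integer>> db = parse(file);
        System.out.println(db.size());    // 5
        System.out.println(db.get(0));    // [1, 2, 4, 5]
    }
}
```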

Output file format

The output file format is defined as follows. It is a text file, containing two sections.

The first section starts with "======= List of closed itemsets and their generators ============" on the first line of the file. Then, each closed itemset is indicated as follows. A line containing the keyword "CLOSED :" is followed by a line containing the itemset itself and its support. An itemset is represented by a list of integers, where each integer represents an item and where integers (items) are separated by single spaces. The support of a closed itemset is indicated by an integer immediately following the special keyword "#SUP:" on the same line. The support is expressed as a number of transactions. On the next line, the keyword "GENERATOR(S) :" appears. Then, the generators of the closed itemset are listed on the following lines, one per line, each represented by its items separated by single spaces. If a generator or a closed itemset is the empty set, it is represented by the keyword EMPTYSET.

The second section starts with "======= List of frequent itemsets ============" on a single line. Then all frequent itemsets are listed on the following lines, one per line. On each line, the keyword "ITEMSET :" appears, followed by the items of the itemset. Each item is represented by an integer and it is followed by a single space. After all the items, the special keyword "#SUP:" appears, which is followed by an integer indicating the support of the itemset, expressed as a number of transactions.

For example, we show below the output file for the previous example.

======= List of closed itemsets and their generators ============
CLOSED :
EMPTYSET #SUP: 5
GENERATOR(S) :
EMPTYSET
CLOSED :
1 #SUP: 4
GENERATOR(S) :
1
CLOSED :
3 #SUP: 4
GENERATOR(S) :
3
CLOSED :
1 3 #SUP: 3
GENERATOR(S) :
1 3
CLOSED :
2 5 #SUP: 4
GENERATOR(S) :
2
5
CLOSED :
1 2 5 #SUP: 3
GENERATOR(S) :
1 2
1 5
CLOSED :
2 3 5 #SUP: 3
GENERATOR(S) :
2 3
3 5
CLOSED :
1 2 3 5 #SUP: 2
GENERATOR(S) :
1 2 3
1 3 5
======= List of frequent itemsets ============
ITEMSET : EMPTYSET #SUP: 5
ITEMSET : 1 #SUP: 4
ITEMSET : 2 #SUP: 4
ITEMSET : 3 #SUP: 4
ITEMSET : 5 #SUP: 4
ITEMSET : 1 2 #SUP: 3
ITEMSET : 1 3 #SUP: 3
ITEMSET : 2 3 #SUP: 3
ITEMSET : 1 5 #SUP: 3
ITEMSET : 2 5 #SUP: 4
ITEMSET : 3 5 #SUP: 3
ITEMSET : 1 2 3 #SUP: 2
ITEMSET : 1 2 5 #SUP: 3
ITEMSET : 1 3 5 #SUP: 2
ITEMSET : 2 3 5 #SUP: 3
ITEMSET : 1 2 3 5 #SUP: 2

In this example, the first lines of the first section indicate that the empty set is a closed itemset with a support of 5 and that it is its own generator. The next lines indicate that the itemset {1} is closed, has a support of 4 and is its own generator. Then, the itemset {3} is indicated as closed, with a support of 4 and itself as generator, followed by the itemset {1, 3}, which is closed, has a support of 3 and is also its own generator. The remaining lines of this section indicate in the same way the other closed itemsets and their associated generators.

In the same example, the first lines of the second section indicate that the empty set is a frequent itemset with a support of 5 transactions, that the itemset {1} is frequent with a support of 4 transactions and that the itemset {2} is frequent with a support of 4 transactions. In the same way, the following lines indicate all the other frequent itemsets.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.
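If you need to post-process the result in your own program, a line of the frequent-itemset section can be parsed with a sketch like the following (illustrative code, not part of SPMF; the EMPTYSET case is not handled):

```java
public class OutputLineParser {
    // Parse the items of a line such as "ITEMSET : 1 2 5 #SUP: 3"
    static int[] parseItems(String line) {
        // Take the text between the first ':' and the "#SUP:" keyword
        String itemPart = line.substring(line.indexOf(':') + 1, line.indexOf("#SUP:")).trim();
        String[] tokens = itemPart.split("\\s+");
        int[] items = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) items[i] = Integer.parseInt(tokens[i]);
        return items;
    }

    // Parse the support value that follows "#SUP:"
    static int parseSupport(String line) {
        return Integer.parseInt(line.substring(line.indexOf("#SUP:") + 5).trim());
    }

    public static void main(String[] args) {
        String line = "ITEMSET : 1 2 5 #SUP: 3";
        System.out.println(java.util.Arrays.toString(parseItems(line))); // [1, 2, 5]
        System.out.println(parseSupport(line));                          // 3
    }
}
```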

Implementation details

In the source code version of SPMF, there are two versions of Zart. The version "MainTestZart.java" keeps the result in memory. The version named "MainTestZart_saveToFile.java" saves the result to a file. In the graphical user interface and command line interface, only the second version is offered.

Performance

The Zart algorithm is not a very efficient algorithm because it is based on Apriori. If you only want to discover closed itemsets and do not need the information about generators, you should instead use DCI_Closed or Charm, which are more efficient for closed itemset mining. However, in some cases it is desirable to discover closed itemsets and their corresponding generators (for example, to generate IGB association rules). For these cases, Zart is an appropriate algorithm.

Where can I get more information about the Zart algorithm?

The Zart algorithm is described in this paper:

Laszlo Szathmary, Amedeo Napoli, Sergei O. Kuznetsov: ZART: A Multifunctional Itemset Mining Algorithm. Proc. of CLA 2007.

Example 20 : Mining Minimal Rare Itemsets

How to run this example?

  • If you are using the graphical interface, (1) choose the "AprioriRare" algorithm, (2) select the input file "contextZart.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup to 60% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run AprioriRare contextZart.txt output.txt 60% in a folder containing spmf.jar and the example input file contextZart.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestAprioriRare.java" in the package ca.pfv.SPMF.tests.

What is AprioriRare?

AprioriRare is an algorithm for mining minimal rare itemsets from a transaction database. It is an Apriori-based algorithm. It was proposed by Szathmary et al. (2007).

What is the input ?

The input is a transaction database (aka binary context) and a threshold named minsup (a value between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextZart.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3, 5}
t5 {1, 2, 3, 5}

What is the output?

The output of AprioriRare is the set of minimal rare itemsets. To explain what a minimal rare itemset is, it is necessary to review a few definitions. An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset divided by the total number of transactions. For example, the itemset {1, 2} has a support of 60% because it appears in 3 transactions out of 5 in the previous database (it appears in t1, t3 and t5). A frequent itemset is an itemset that has a support no less than the minsup parameter. A minimal rare itemset is an itemset that is not a frequent itemset and such that all its proper subsets are frequent itemsets.

For example, if we run the AprioriRare algorithm with minsup = 60 % on the previous transaction database, we obtain the following set of minimal rare itemsets:

Minimal Rare Itemsets Support
{4} 20 %
{1, 3, 5} 40 %
{1, 2, 3} 40 %
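The definition above can be checked programmatically. The following self-contained Java sketch (illustrative, not SPMF code; names are invented for this example) tests whether an itemset of the example database is a minimal rare itemset; by the anti-monotonicity of the support, it is enough to verify the subsets obtained by removing a single item:

```java
import java.util.*;

public class MinimalRareCheck {
    // Transactions of the example database contextZart.txt
    static List<Set<Integer>> db = List.of(
        Set.of(1, 2, 4, 5), Set.of(1, 3), Set.of(1, 2, 3, 5),
        Set.of(2, 3, 5), Set.of(1, 2, 3, 5));

    // Relative support: fraction of transactions containing the itemset
    static double support(Set<Integer> itemset) {
        long count = db.stream().filter(t -> t.containsAll(itemset)).count();
        return (double) count / db.size();
    }

    // Minimal rare: the itemset itself is not frequent, but every subset
    // obtained by removing one item is frequent.
    static boolean isMinimalRare(Set<Integer> itemset, double minsup) {
        if (support(itemset) >= minsup) return false;   // not rare
        for (int item : itemset) {
            Set<Integer> subset = new HashSet<>(itemset);
            subset.remove(item);
            if (support(subset) < minsup) return false; // a subset is also rare
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isMinimalRare(Set.of(1, 3, 5), 0.6)); // true
        System.out.println(isMinimalRare(Set.of(4), 0.6));       // true
        System.out.println(isMinimalRare(Set.of(1, 4), 0.6));    // false: {4} is already rare
    }
}
```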

Input file format

The input file format of AprioriRARE is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format of AprioriRare is defined as follows. It is a text file, where each line represents a minimal rare itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and it is followed by a single space. After all the items, the keyword "#SUP:" appears, which is followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, we show below the output file for this example.

4 #SUP: 1
1 2 3 #SUP: 2
1 3 5 #SUP: 2

The output file here consists of three lines, which indicate that the itemsets {4}, {1, 2, 3} and {1, 3, 5} are minimal rare itemsets having respectively a support of 1 transaction, 2 transactions and 2 transactions.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

AprioriRare is the only algorithm for minimal rare itemset mining offered in SPMF. Since it is based on Apriori, it suffers from the same fundamental limitations (it may generate too many candidates, and it may generate candidates that do not appear in the database).

Where can I get more information about this algorithm?

The AprioriRare algorithm is described in this paper:

Laszlo Szathmary, Amedeo Napoli, Petko Valtchev: Towards Rare Itemset Mining. ICTAI (1) 2007: 305-312

Example 21 : Mining Perfectly Rare Itemsets Using the AprioriInverse Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "AprioriInverse" algorithm, (2) select the input file "contextInverse.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 0.1% and maxsup to 60% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run AprioriInverse contextInverse.txt output.txt 0.1% 60% in a folder containing spmf.jar and the example input file contextInverse.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestAprioriInverse.java" in the package ca.pfv.SPMF.tests.

What is AprioriInverse?

AprioriInverse is an algorithm for mining perfectly rare itemsets. Why mine perfectly rare itemsets? One reason is that they are useful for generating the set of sporadic association rules.

What is the input?

The input is a transaction database (aka binary context) and two thresholds named minsup and maxsup (values between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextInverse.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3}
t5 {1, 2, 4, 5}

What is the output?

The output of AprioriInverse is the set of all perfectly rare itemsets in the database whose support is lower than maxsup and higher than minsup. To explain what a perfectly rare itemset is, it is necessary to review a few definitions. An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset divided by the total number of transactions. For example, the itemset {1, 2} has a support of 60% because it appears in 3 transactions out of 5 in the previous database (it appears in t1, t3 and t5). A frequent itemset is an itemset that has a support no less than the maxsup parameter. A perfectly rare itemset (aka sporadic itemset) is an itemset that is not a frequent itemset and such that all its proper subsets are also not frequent itemsets. Moreover, it must have a support higher than or equal to the minsup threshold.

By running the AprioriInverse algorithm with minsup = 0.1 % and maxsup = 60 % on this transaction database, we obtain the following set of perfectly rare itemsets (see Koh & Rountree 2005 for further details):

Perfectly Rare Itemsets Support
{3} 60 %
{4} 40 %
{5} 60 %
{4, 5} 40 %
{3, 5} 20 %
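For illustration, the following self-contained Java sketch (not SPMF code; names are invented for this example) tests the two conditions on the example database; itemsets with support at or below maxsup are treated as rare here, which matches the results above:

```java
import java.util.*;

public class PerfectlyRareCheck {
    // Transactions of the contextInverse example database
    static List<Set<Integer>> db = List.of(
        Set.of(1, 2, 4, 5), Set.of(1, 3), Set.of(1, 2, 3, 5),
        Set.of(2, 3), Set.of(1, 2, 4, 5));

    static double support(Set<Integer> itemset) {
        long count = db.stream().filter(t -> t.containsAll(itemset)).count();
        return (double) count / db.size();
    }

    // Perfectly rare: the itemset and every subset obtained by removing
    // one item are rare, and the itemset meets the minsup floor.
    static boolean isPerfectlyRare(Set<Integer> itemset, double minsup, double maxsup) {
        double sup = support(itemset);
        if (sup < minsup || sup > maxsup) return false;
        for (int item : itemset) {
            Set<Integer> subset = new HashSet<>(itemset);
            subset.remove(item);
            if (!subset.isEmpty() && support(subset) > maxsup) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPerfectlyRare(Set.of(4, 5), 0.001, 0.6)); // true
        System.out.println(isPerfectlyRare(Set.of(1, 3), 0.001, 0.6)); // false: {1} is frequent
    }
}
```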

Input file format

The input file format of AprioriInverse is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3
1 2 4 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format of AprioriInverse is defined as follows. It is a text file, where each line represents a perfectly rare itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and it is followed by a single space. After all the items, the keyword "#SUP:" appears, which is followed by an integer indicating the support of the itemset, expressed as a number of transactions. For example, we show below the output file for this example.

3 #SUP: 3
4 #SUP: 2
5 #SUP: 3
3 5 #SUP: 1
4 5 #SUP: 2

The output file here consists of five lines, which indicate that the itemsets {3}, {4}, {5}, {3, 5} and {4, 5} are perfectly rare itemsets having respectively a support of 3, 2, 3, 1 and 2 transactions.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

AprioriInverse is the only algorithm for perfectly rare itemset mining offered in SPMF. Since it is based on Apriori, it suffers from the same fundamental limitations (it may generate too many candidates and may generate candidates that do not appear in the database).

Where can I get more information about this algorithm?

The AprioriInverse algorithm is described in this paper:

Yun Sing Koh, Nathan Rountree: Finding Sporadic Rules Using Apriori-Inverse. PAKDD 2005: 97-106

Example 22 : Mining Rare Correlated Itemsets Using the CORI Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CORI" algorithm, (2) select the input file "contextPasquier99.txt", (3) set the output file name (e.g. "output.txt"), (4) set maxsup to 80% and minbond to 20% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CORI contextPasquier99.txt output.txt 80% 20% in a folder containing spmf.jar and the example input file contextPasquier99.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCORI_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is CORI?

CORI is an algorithm for mining rare correlated itemsets.

It is an extension of the ECLAT algorithm. It uses two measures called the support and the bond to evaluate if an itemset is interesting and should be output.

What is the input of the CORI algorithm?

The input is a transaction database (aka binary context) and two thresholds named maxsup and minbond (values between 0 and 100 %).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. This database is provided as the file contextPasquier99.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of the CORI algorithm?

CORI is an algorithm for discovering itemsets (groups of items) that are rare and correlated in a transaction database (rare correlated itemsets). A rare itemset is an itemset whose support is no greater than a maxsup threshold set by the user. The support of an itemset is the number of transactions containing the itemset.

A correlated itemset is an itemset such that its bond is no less than a minbond threshold set by the user. The bond of an itemset is the number of transactions containing the itemset divided by the number of transactions containing any of its items. The bond is a value in the [0,1] interval. A high value means a highly correlated itemset. Note that single items have a bond of 1 by definition.
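Because CORI extends ECLAT, the bond can be computed from tidsets (for each item, the set of ids of the transactions containing it): the numerator is the size of the intersection of the tidsets of the items, and the denominator is the size of their union. Here is a self-contained Java sketch (illustrative, not SPMF code; names are invented for this example) using the tidsets of the example database:

```java
import java.util.*;

public class BondMeasure {
    // Tidsets of the contextPasquier99 database (vertical representation):
    // for each item, the ids of the transactions that contain it
    static Map<Integer, Set<Integer>> tids = Map.of(
        1, Set.of(1, 3, 5),
        2, Set.of(2, 3, 4, 5),
        3, Set.of(1, 2, 3, 5),
        4, Set.of(1),
        5, Set.of(2, 3, 4, 5));

    // bond(X) = |transactions containing all of X| / |transactions containing any of X|
    static double bond(Set<Integer> itemset) {
        Set<Integer> inter = null;
        Set<Integer> union = new HashSet<>();
        for (int item : itemset) {
            Set<Integer> t = tids.get(item);
            union.addAll(t);
            if (inter == null) inter = new HashSet<>(t);
            else inter.retainAll(t);
        }
        return (double) inter.size() / union.size();
    }

    public static void main(String[] args) {
        System.out.println(bond(Set.of(3, 5))); // 0.6
        System.out.println(bond(Set.of(1, 4))); // intersection {1}, union {1, 3, 5}: 1/3
    }
}
```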

For example, if CORI is run on the previous transaction database with maxsup = 80% and minbond = 20%, CORI outputs the following rare correlated itemsets:

itemsets bond support
{1} 1 3
{4} 1 1
{1, 4} 0.33 1
{3, 4} 0.25 1
{1, 3, 4} 0.25 1
{1, 2} 0.4 2
{1, 2, 3} 0.4 2
{1, 2, 5} 0.4 2
{1, 2, 3, 5} 0.4 2
{1, 3} 0.75 3
{1, 3, 5} 0.4 2
{1, 5} 0.4 2
{2, 3} 0.6 3
{2, 3, 5} 0.6 3
{3, 5} 0.6 3

Input file format

The input file format used by CORI is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 3 4
2 3 5
1 2 3 5
2 5
1 2 3 5

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format of CORI is defined as follows. It is a text file, where each line represents a rare correlated itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and it is followed by a single space. After all the items, the keyword "#SUP:" appears, which is followed by an integer indicating the support of the itemset, expressed as a number of transactions. Then, the keyword "#BOND:" appears, which is followed by a double value indicating the bond of the itemset. For example, we show below the output file for this example.

1 #SUP: 3 #BOND: 1.0
4 #SUP: 1 #BOND: 1.0
4 1 #SUP: 1 #BOND: 0.3333333333333333
4 3 #SUP: 1 #BOND: 0.25
4 1 3 #SUP: 1 #BOND: 0.25
1 2 #SUP: 2 #BOND: 0.4
1 2 3 #SUP: 2 #BOND: 0.4
1 2 5 #SUP: 2 #BOND: 0.4
1 2 3 5 #SUP: 2 #BOND: 0.4
1 3 #SUP: 3 #BOND: 0.75
1 3 5 #SUP: 2 #BOND: 0.4
1 5 #SUP: 2 #BOND: 0.4
2 3 #SUP: 3 #BOND: 0.6
2 3 5 #SUP: 3 #BOND: 0.6
3 5 #SUP: 3 #BOND: 0.6

The output file here consists of 15 lines. Consider the last line. It indicates that the itemset {3, 5} is a rare correlated itemset having a support and bond of respectively 3 and 0.6.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Optional parameter(s)

This implementation allows specifying additional optional parameter(s):

  • "show transaction ids?" (true/false) This parameter specifies whether the ids of the transactions containing each pattern should be output. For example, if the parameter is set to true, each pattern in the output file will be followed by the keyword #TID and a list of transaction ids (integers separated by spaces). For example, a line terminated by "#TID: 0 2" means that the pattern on this line appears in the first and the third transactions of the transaction database (transactions with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example "MainTestCORI_saveToFile.java" provided in the source code of SPMF.

The parameter(s) can be also used in the command line with the Jar file. If you want to use these optional parameter(s) in the command line, it can be done as follows. Consider this example:

java -jar spmf.jar run CORI contextPasquier99.txt output.txt 80% 20% true
This command applies the algorithm to the file "contextPasquier99.txt" and outputs the results to "output.txt". Moreover, it specifies that the user wants to find patterns with maxsup = 80% and minbond = 20%, and that transaction ids should be output for each pattern found.

Performance

CORI is the only algorithm for mining correlated rare itemsets offered in SPMF. The implementation is well optimized. It is a fairly simple extension of the ECLAT algorithm.

Where can I get more information about this algorithm?

The CORI algorithm is described in this paper:

Bouasker, S., Yahia, S. B. (2015). Key correlation mining by simultaneous monotone and anti-monotone constraints checking. Proc. of the 2015 ACM Symposium on Applied Computing (SAC 2015), pp. 851-856.

Example 23 : Mining Closed Itemsets from a Data Stream Using the CloStream Algorithm

How to run this example?

  • This example is not available in the release version of SPMF.
  • To run this example with the source code version of SPMF, launch the file "MainTestCloStream.java" in the package ca.pfv.SPMF.tests.

What is CloStream?

CloStream is an algorithm for incrementally mining closed itemsets from a data stream. It was proposed by Yen et al. (2009).

Why is it useful? Because most closed itemset mining algorithms such as Charm, DCI_Closed and AprioriClose are batch algorithms. This means that if the transaction database is updated, we need to run the algorithms again to update the set of closed itemsets. If new transactions are constantly inserted and the results need to be updated often, it may become very costly to use these algorithms. A stream mining algorithm like CloStream is specially designed to handle this situation. It assumes that each transaction in the database can only be read once and that new transactions appear regularly. Every time a new transaction appears, CloStream updates its result.

What is the input of CloStream?

The input of CloStream is a stream of transactions. Each transaction is a set of items (symbols). For example, consider the following five transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 3 and 4. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction. CloStream is an algorithm for processing a stream. This means that CloStream is allowed to read each transaction only once, because a stream is assumed to be potentially infinite and arriving at high speed.

Transaction ID Items
t1 {1, 3, 4}
t2 {2, 3, 5}
t3 {1, 2, 3, 5}
t4 {2, 5}
t5 {1, 2, 3, 5}

What is the output of CloStream?

CloStream produces as output the set of closed itemsets contained in the transactions that it has seen until now. An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3, 4} has a support of 1 because it only appears in t1. A closed itemset is an itemset that is not included in a proper superset having the same support. For example, if we apply CloStream to the five transactions above, the final result is:
closed itemsets support
{} 5
{3} 4
{1, 3} 3
{1, 3, 4} 1
{2, 5} 4
{2, 3, 5} 3
{1, 2, 3, 5} 2

For example, the itemset {2, 3, 5} has a support of 3 because it appears in transactions t2, t3 and t5. It is a closed itemset because it has no proper superset having the same support.

Input and output file format

This is not applicable for this algorithm since it is designed for a stream of data (see the source code example referenced above to understand how to use this algorithm).

Performance

CloStream is a reasonably efficient algorithm. A limitation of this algorithm is that it is not possible to set a minimum support threshold. Therefore, if the number of closed itemsets is large, this algorithm may use too much memory. However, CloStream has the advantage of being very simple and easy to implement.

Where can I get more information about this algorithm?

The CloStream algorithm is described in this paper:

Show-Jane Yen, Yue-Shi Lee, Cheng-Wei Wu, Chin-Lin Lin: An Efficient Algorithm for Maintaining Frequent Closed Itemsets over Data Stream. IEA/AIE 2009: 767-776.

Example 24 : Mining Recent Frequent Itemsets from a Data Stream Using the estDec Algorithm

How to run this example?

  • This example is not available in the release version of SPMF.
  • To run this example with the source code version of SPMF, launch the file "MainTestEstDec_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is estDec?

estDec is an algorithm for mining recent frequent itemsets from a data stream. It was proposed by Chang et al. (2003).

Why is it useful? Because most itemset mining algorithms such as Apriori, FPGrowth and Eclat are batch algorithms. This means that if the input transaction database is updated, those algorithms need to be run again from scratch to update the result, which is inefficient. Stream mining algorithms such as estDec are designed for discovering patterns in a stream (a potentially infinite sequence of transactions) and for updating the results incrementally after each new transaction. Stream mining algorithms assume that each transaction in a database can only be read once. The estDec algorithm is also interesting because it mines recent frequent itemsets, which means that it puts more weight on recent transactions than on older transactions when searching for frequent itemsets. This allows estDec to learn new trends and to forget older trends.

What is the input of estDec?

The input of estDec is a stream of transactions and a support threshold minsup. Each transaction is a set of items (symbols). For example, consider the following six transactions (t1, t2, ..., t6) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction. estDec is an algorithm for processing a stream. This means that estDec is allowed to read each transaction only once because a stream is assumed to be potentially infinite and coming at high speed.

Transaction ID Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 5}

What is the output of estDec?

estDec produces as output the set of recent frequent itemsets contained in the transactions that estDec has seen until now. It is said that estDec mines recent frequent itemsets because estDec utilizes a decay function so that it puts more weight on recent transactions than on older ones when computing the frequency of an itemset. This allows estDec to learn new trends and to forget older trends.

The output is a set of recent frequent itemsets. The support count of an itemset is the number of transactions that contain the itemset. For example, the itemset {3, 4} has a support count of 1 because it only appears in t5. The support of an itemset is the number of transactions where the itemset appears divided by the total number of transactions seen until now. A frequent itemset is an itemset that has a support higher than or equal to minsup.

The estDec algorithm is an approximate algorithm. It approximates the support of itemsets and returns the itemsets whose estimated support is higher than minsup.
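To give an intuition of the decay mechanism, here is a minimal self-contained Java sketch (illustrative only; this is not SPMF's estDec implementation, and the decay rate 0.99 is an arbitrary choice). Each time a transaction arrives, the decayed counts are multiplied by the decay rate before the new occurrence is added, so older transactions weigh exponentially less:

```java
public class DecayedSupport {
    // Update a decayed count for one arriving transaction:
    // multiply by the decay rate, then add 1 if the itemset occurs.
    static double decay(double count, double d, boolean occurs) {
        return count * d + (occurs ? 1 : 0);
    }

    public static void main(String[] args) {
        double d = 0.99;              // hypothetical decay rate
        double count = 0, total = 0;  // decayed occurrence count / transaction count
        boolean[] stream = {true, false, true, true}; // does each arriving
                                      // transaction contain the itemset?
        for (boolean occurs : stream) {
            count = decay(count, d, occurs);
            total = decay(total, d, true); // every transaction counts toward the total
        }
        // Estimated (recent) support of the itemset
        System.out.printf("estimated support = %.4f%n", count / total);
    }
}
```

With a plain (undecayed) count the support would be 3/4 = 0.75; the decay slightly favors the recent occurrences, so the estimate comes out marginally above 0.75.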

For example, consider the example MainTestEstDec_saveToFile.java. This example loads the transactions from a file named "contextIGB.txt" provided in the SPMF distribution and then shows how to save the result to a file. Here is the output:

3 5 #SUP: 0.5000519860383547
2 #SUP: 0.8333622131312072
1 2 3 #SUP: 0.33335643690463074
3 #SUP: 0.5000519860383547
1 4 #SUP: 0.3333448844517001
3 4 #SUP: 0.19334881331065332
1 3 5 #SUP: 0.33335643690463074
1 2 5 #SUP: 0.5000173262771588
2 5 #SUP: 0.8333622131312072
1 #SUP: 0.5000173262771588
2 3 5 #SUP: 0.5000519860383547
1 5 #SUP: 0.5000173262771588
2 3 #SUP: 0.5000519860383547
4 #SUP: 0.3333448844517001
1 4 5 #SUP: 0.3333448844517001
2 4 5 #SUP: 0.3333448844517001
1 2 #SUP: 0.5000173262771588
5 #SUP: 0.8333622131312072
1 3 #SUP: 0.33335643690463074
2 4 #SUP: 0.3333448844517001
1 2 4 #SUP: 0.3333448844517001
4 5 #SUP: 0.3333448844517001

For example, consider the line "1 2 5 #SUP: 0.5000173262771588". It indicates that the pattern {1, 2, 5} is a recent frequent itemset with an estimated support of approximately 50%.

Note that we also provide a second example named MainTestEstDec_saveToMemory.java. It shows how to process a set of transactions from memory instead of from a file and to keep the result in memory instead of saving it to a file. This is especially useful if you wish to integrate estDec into another Java program. The example also shows how to set the decay rate.
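The decay mechanism mentioned above can be illustrated with a small sketch. This is not the actual estDec implementation (which maintains counts in a prefix-tree and also decays a global transaction weight); it only shows the core idea that multiplying existing counts by a decay rate d < 1 before each new transaction makes recent occurrences weigh more than old ones:

```java
import java.util.HashMap;
import java.util.Map;

public class DecayExample {
    // Hypothetical illustration of decay weighting: before a new transaction
    // is counted, every existing count is multiplied by the decay rate d < 1,
    // so occurrences in older transactions contribute less to the total.
    static Map<Integer, Double> decayedCounts(int[][] stream, double d) {
        Map<Integer, Double> counts = new HashMap<>();
        for (int[] transaction : stream) {
            // Age all existing counts before counting the new transaction.
            for (Map.Entry<Integer, Double> e : counts.entrySet()) {
                e.setValue(e.getValue() * d);
            }
            // Count the items of the new transaction with full weight 1.
            for (int item : transaction) {
                counts.merge(item, 1.0, Double::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Item 1 appears only in the oldest transaction; item 3 only in the
        // newest. Both occur once, yet item 3 ends up with a larger count.
        int[][] stream = { {1, 2}, {2}, {2, 3} };
        System.out.println(decayedCounts(stream, 0.9));
    }
}
```

With a decay rate of 0.9, item 1's single old occurrence decays to 0.81 while item 3's recent occurrence still counts 1.0, which is how estDec favors recent trends.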

Input file format

The estDec algorithm can either take as input a stream in memory or read transactions from a file. The input file format of estDec is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 5
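Parsing one line of this space-separated format can be sketched as follows (a hypothetical helper, not part of SPMF):

```java
import java.util.Arrays;

public class ParseTransaction {
    // Parse one line of the input format: item integers separated by
    // single spaces, e.g. "1 2 4 5".
    static int[] parseLine(String line) {
        String[] tokens = line.trim().split(" ");
        int[] items = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            items[i] = Integer.parseInt(tokens[i]);
        }
        return items;
    }

    public static void main(String[] args) {
        // The first transaction of the example file.
        System.out.println(Arrays.toString(parseLine("1 2 4 5")));
    }
}
```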

Output file format

The output file format of estDec is defined as follows. It is a text file, where each line represents a recent frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by a double value indicating the estimated support of the itemset, expressed as a ratio. For example, here are a few lines of the output file for this example.

3 5 #SUP: 0.5000519860383547
2 #SUP: 0.8333622131312072
1 2 3 #SUP: 0.33335643690463074

For example, the first line indicates that the itemset {3, 5} has an estimated support of approximately 50%.

Performance

estDec is a reasonably efficient algorithm.

Where can I get more information about this algorithm?

The estDec algorithm is described in this paper:

Joong Hyuk Chang, Won Suk Lee: Finding recent frequent itemsets adaptively over online data streams. KDD 2003: 487-492

Example 25 : Mining Recent Frequent Itemsets from a Data Stream Using the estDec+ Algorithm

How to run this example?

  • This example is not available in the release version of SPMF.
  • To run this example with the source code version of SPMF, launch the file "MainTestEstDecPlus_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is estDec+?

estDec+ is an algorithm for mining recent frequent itemsets from a data stream. It is an extension of estDec, proposed by Chang et al. in 2005. The main difference with estDec is that estDec+ uses a compressed tree (the CP-tree) to maintain information about recent frequent itemsets, which may be more memory efficient in some cases but may decrease accuracy. Note that the version of estDec+ implemented here is based on the 2014 paper by Shin et al.

Why is it useful? Most itemset mining algorithms, such as Apriori, FPGrowth and Eclat, are batch algorithms: if the input transaction database is updated, they must be run again from scratch to update the result, which is inefficient. Stream mining algorithms such as estDec+ are designed for discovering patterns in a stream (a potentially infinite sequence of transactions) and for updating the results incrementally after each new transaction. Stream mining algorithms assume that each transaction can only be read once. The estDec+ algorithm is also interesting because it mines recent frequent itemsets: it puts more weight on recent transactions than on older ones. This allows estDec+ to learn new trends and to forget older ones.

What is the input of estDec+?

The input of estDec+ is a stream of transactions and a support threshold minsup. Each transaction is a set of items (symbols). For example, consider the following six transactions (t1, t2, ..., t6) over 5 items (1, 2, 3, 4, 5), where the first transaction represents the set of items 1, 2, 4 and 5. Note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction. estDec+ is a stream-processing algorithm: it is allowed to read each transaction only once, because a stream is assumed to be potentially infinite and arriving at high speed.

Transaction ID Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 5}

What is the output of estDec+?

estDec+ produces as output the set of recent frequent itemsets contained in the transactions seen so far. estDec+ is said to mine recent frequent itemsets because it applies a decay function to the frequency of each itemset, so that recent transactions are given more weight than older ones. This allows estDec+ to learn new trends and to forget older ones.

The output is a set of recent frequent itemsets. The support count of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 2, 4} has a support count of 3 because it appears in t1, t3 and t5. The support of an itemset is the number of transactions where the itemset appears divided by the total number of transactions seen until now. A frequent itemset is an itemset that has a support higher than or equal to minsup.

The estDec+ algorithm is an approximate algorithm: it approximates the support of itemsets and returns the itemsets that have an estimated support higher than or equal to minsup.

For example, consider the example MainTestEstDecPlus_saveToFile.java. This example loads the transactions from a file named "contextIGB.txt" provided in the SPMF distribution and shows how to save the result to a file. Here is the output:

2 5 #SUP: 1.0
1 4 5 #SUP: 0.5
1 2 3 #SUP: 0.5
5 #SUP: 1.0
1 2 5 #SUP: 0.5
1 #SUP:0.66
1 5 #SUP: 0.5555555555555556
1 2 4 #SUP: 0.5
4 5 #SUP: 0.5
2 4 #SUP: 0.5
1 4 #SUP: 0.5555555555555556
1 3 #SUP: 0.5555555555555556
4 #SUP: 0.5
1 3 5 #SUP: 0.5
2 3 #SUP:0.66
1 2 #SUP: 0.5555555555555556
3 4 #SUP:0.66
2 #SUP: 1.0
3 5 #SUP:0.66
2 4 5 #SUP: 0.5
3 #SUP:0.66
2 3 5 #SUP:0.66

For example, consider the line "1 2 5 #SUP: 0.5". It indicates that the pattern {1, 2, 5} is a recent frequent itemset with an estimated support of 50%.

Note that we also provide a second example named MainTestEstDecPlus_saveToMemory.java. It shows how to process a set of transactions from memory instead of from a file and to keep the result in memory instead of saving it to a file. This is especially useful if you wish to integrate estDec+ into another Java program. The example also shows how to set the decay rate.

Input file format

The estDec+ algorithm can either take as input a stream in memory or read transactions from a file. The input file format of estDec+ is defined as follows. It is a text file. An item is represented by a positive integer. A transaction is a line in the text file. In each line (transaction), items are separated by a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line.

For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 5

Output file format

The output file format of estDec+ is defined as follows. It is a text file, where each line represents a recent frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer and is followed by a single space. After all the items, the keyword "#SUP:" appears, followed by a double value indicating the estimated support of the itemset, expressed as a ratio. For example, here are a few lines of the output file for this example.

1 2 3 #SUP: 0.5
5 #SUP: 1.0
1 2 5 #SUP: 0.5

For example, the first line indicates that the itemset {1, 2, 3} has an estimated support of 50%.

Performance

estDec+ is a reasonably efficient algorithm. When minsup is high, it may use less memory than the original estDec algorithm because the CP-Tree is generally smaller than the estTree.

Where can I get more information about this algorithm?

The estDec+ algorithm is described in this paper:

Se Jung Shin, Dae Su Lee, Won Suk Lee: CP-tree: An adaptive synopsis structure for compressing frequent itemsets over online data streams. Information Sciences, Volume 278, September 2014, pp. 559-576

Example 26 : Mining Frequent Itemsets from Uncertain Data Using the U-Apriori Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "UApriori" algorithm, (2) select the input file "contextUncertain.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum expected support to 0.10 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run UApriori contextUncertain.txt output.txt 10% in a folder containing spmf.jar and the example input file contextUncertain.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestUApriori.java" in the package ca.pfv.SPMF.tests.

What is UApriori?

UApriori is an algorithm for mining frequent itemsets from a transaction database where the data is uncertain (contains probabilities). The UApriori algorithm was proposed by Chui et al. (2007).

This algorithm can have multiple applications such as in mining medical data or sensor data where observations may be uncertain.

What is the input ?

UApriori takes as input a transaction database containing probabilities and a minimum expected support threshold (a value between 0 and 1). A transaction database is a set of transactions, where each transaction is a set of items. In UApriori, we assume that each item in a transaction is annotated with an existential probability. For example, consider the following transaction database, consisting of 4 transactions (t1, t2, t3, t4) and 5 items (1, 2, 3, 4, 5). The transaction t1 contains item 1 with a probability of 0.5, item 2 with a probability of 0.4, item 4 with a probability of 0.3 and item 5 with a probability of 0.7. This database is provided in the file "contextUncertain.txt" of the SPMF distribution:


Transaction ID Items (with existential probabilities)
t1 1(0.5) 2(0.4) 4(0.3) 5(0.7)
t2 2(0.5) 3(0.4) 5(0.4)
t3 1(0.6) 2(0.5) 4(0.1) 5(0.5)
t4 1(0.7) 2(0.4) 3(0.3) 5(0.9)

What is the output?

The output of U-Apriori is the set of frequent itemsets. Note that the definition of a frequent itemset is here different from the definition used by the regular Apriori algorithm because we have to consider the existential probabilities.

The expected support of an itemset in a transaction is defined as the product of the existential probabilities of its items in that transaction. It is a value between 0 and 1. For example, the expected support of itemset {1, 2} in transaction t1 is 0.5 x 0.4 = 0.2. The expected support of an itemset in a transaction database is the sum of its expected support in all transactions where it occurs. For example, the expected support of itemset {2, 3} is the sum of its expected support in t2 and t4: 0.5 x 0.4 + 0.4 x 0.3 = 0.32. A frequent itemset is an itemset that has an expected support higher than or equal to the minimum expected support set by the user. For example, by running U-Apriori with a minimum expected support of 0.10, we obtain 19 frequent itemsets, including:

itemsets expected support
{2 3 5} 0.19
{1 3 5} 0.19
{1 4 5} 0.14
{2 4 5} 0.11
{1 2 5} 0.54
{1 5} 1.28
{1 3} 0.21
{1 4} 0.21
{2 3} 0.32
{1 2} 0.78
... ...
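The expected-support computation can be sketched as follows, using the four uncertain transactions above. ExpectedSupportExample is a hypothetical class, not the SPMF UApriori implementation:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExpectedSupportExample {
    // Expected support of an itemset = sum over all transactions of the
    // product of the existential probabilities of its items (0 if an item
    // is absent from the transaction).
    static double expectedSupport(List<Map<Integer, Double>> db, int[] itemset) {
        double total = 0.0;
        for (Map<Integer, Double> transaction : db) {
            double product = 1.0;
            for (int item : itemset) {
                Double p = transaction.get(item);
                if (p == null) { product = 0.0; break; } // item absent
                product *= p;
            }
            total += product;
        }
        return total;
    }

    // Build one uncertain transaction from (item, probability) pairs.
    static Map<Integer, Double> tx(Object... pairs) {
        Map<Integer, Double> m = new HashMap<>();
        for (int i = 0; i < pairs.length; i += 2) {
            m.put((Integer) pairs[i], (Double) pairs[i + 1]);
        }
        return m;
    }

    // The four uncertain transactions t1, ..., t4 from the example above.
    static List<Map<Integer, Double>> exampleDb() {
        return Arrays.asList(
                tx(1, 0.5, 2, 0.4, 4, 0.3, 5, 0.7), // t1
                tx(2, 0.5, 3, 0.4, 5, 0.4),         // t2
                tx(1, 0.6, 2, 0.5, 4, 0.1, 5, 0.5), // t3
                tx(1, 0.7, 2, 0.4, 3, 0.3, 5, 0.9)  // t4
        );
    }

    public static void main(String[] args) {
        // Expected support of {2, 3}: 0.5 x 0.4 (t2) + 0.4 x 0.3 (t4) = 0.32
        System.out.println(expectedSupport(exampleDb(), new int[]{2, 3}));
    }
}
```

The result matches the table above: {2, 3} has an expected support of 0.32 and {1, 2} of 0.78.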

Input file format

The input file format of UApriori is defined as follows. It is a text file. An item is represented by a positive integer. Each item is associated with a probability, indicated as a double value between parentheses. A transaction is a line in the text file. In each line (transaction), each item is immediately followed by its probability between parentheses and a single space. It is assumed that all items within the same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item appears twice within the same line. Probabilities should be greater than 0 and not more than 1.

For example, for the previous example, the input file is defined as follows:

# This binary context contains uncertain data.
# Each line represents a transaction.
# For each item there is an existential probability.
1(0.5) 2(0.4) 4(0.3) 5(0.7)
2(0.5) 3(0.4) 5(0.4)
1(0.6) 2(0.5) 4(0.1) 5(0.5)
1(0.7) 2(0.4) 3(0.3) 5(0.9)

The first line represents the itemsets {1, 2, 4, 5} where items 1, 2, 4 and 5 respectively have the probabilities 0.5, 0.4, 0.3 and 0.7.
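A hypothetical parser for one line of this "item(probability)" format (not part of SPMF):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UncertainLineParser {
    // Parse one line of the UApriori input format, e.g.
    // "1(0.5) 2(0.4) 4(0.3) 5(0.7)", into an item -> probability map.
    static Map<Integer, Double> parseLine(String line) {
        Map<Integer, Double> result = new LinkedHashMap<>();
        for (String token : line.trim().split(" ")) {
            int open = token.indexOf('(');
            int item = Integer.parseInt(token.substring(0, open));
            double p = Double.parseDouble(token.substring(open + 1, token.length() - 1));
            result.put(item, p);
        }
        return result;
    }

    public static void main(String[] args) {
        // The first transaction of the example file.
        System.out.println(parseLine("1(0.5) 2(0.4) 4(0.3) 5(0.7)"));
    }
}
```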

Output file format

The output file format of UApriori is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed; each item is represented by an integer, followed by a value between parentheses and a single space. After all the items, the keyword "Support:" appears, followed by a double value indicating the expected support of the itemset. For example, we show below the output file for this example.

2 (0.4) Support: 1.7999999999999998
3 (0.4) Support: 0.7
4 (0.3) Support: 0.4
5 (0.7) Support: 2.5
1 (0.5) Support: 1.8
1 (0.5) 2 (0.4) Support: 0.78
1 (0.5) 3 (0.4) Support: 0.21
2 (0.4) 5 (0.7) Support: 1.09
2 (0.4) 4 (0.3) Support: 0.16999999999999998
1 (0.5) 5 (0.7) Support: 1.2799999999999998
3 (0.4) 5 (0.7) Support: 0.43000000000000005
2 (0.4) 3 (0.4) Support: 0.32
1 (0.5) 4 (0.3) Support: 0.21
4 (0.3) 5 (0.7) Support: 0.26
1 (0.5) 3 (0.4) 5 (0.7) Support: 0.189
2 (0.4) 3 (0.4) 5 (0.7) Support: 0.188
1 (0.5) 2 (0.4) 5 (0.7) Support: 0.542
1 (0.5) 4 (0.3) 5 (0.7) Support: 0.135
2 (0.4) 4 (0.3) 5 (0.7) Support: 0.10899999999999999

For example, the last line indicates that the itemset {2, 4, 5} has an expected support of approximately 0.109, and that the values 0.4, 0.3 and 0.7 are associated with items 2, 4 and 5, respectively.

Performance

UApriori is not the most efficient algorithm for uncertain itemset mining, but it is simple and it was the first algorithm designed for this task.

Where can I get more information about the UApriori algorithm?

Here is an article describing the UApriori algorithm:

C. Kit Chui, B. Kao, E. Hung: Mining Frequent Itemsets from Uncertain Data. PAKDD 2007: 47-58

Example 27 : Mining Erasable Itemsets from a Product Database with the VME algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "VME" algorithm, (2) select the input file "contextVME.txt", (3) set the output file name (e.g. "output.txt") (4) set the threshold to 15%. and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run VME contextVME.txt output.txt 15% in a folder containing spmf.jar and the example input file contextVME.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestVME.java" in the package ca.pfv.SPMF.tests.

What is the VME algorithm?

VME (Deng & Xu, 2010) is an algorithm for mining erasable itemsets from a product database with profit information.

What is the input?

VME takes as input a product database and a threshold (a value between 0 and 100%). A product is defined as a set of items (parts) that are used to assemble the product. Moreover, each product is annotated with a profit (a positive integer) indicating how much money this product generates for the company. For example, consider the following product database, consisting of 6 products and 7 items (this example is taken from the article of Deng & Xu, 2010). Each product is annotated with its profit. For example, the first line indicates that product 1 generates a total profit of 50$ for the company and that its assembly requires parts 2, 3, 4 and 6. This product database is provided in the file "contextVME.txt" of the SPMF distribution:


profit items
product1 50$ {2, 3, 4, 6}
product2 20$ {2, 5, 7}
product3 50$ {1, 2, 3, 5}
product4 800$ {1, 2, 4}
product5 30$ {6, 7}
product6 50$ {3, 4}

What is the output?

The output is the set of erasable itemsets generating a loss of profit lower than or equal to the user-specified threshold. The idea is to discover items that the company could stop manufacturing while keeping the profit lost from being unable to build the affected products below the threshold.

To explain more formally what an erasable itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The loss of profit generated by an itemset is defined as the sum of the profits of all products containing at least one item from this itemset. For example, the loss of profit of itemset {5, 6} is the sum of the profits of the products containing 5 and/or 6: 50$ + 20$ + 50$ + 30$ = 150$. The loss of profit can also be expressed as a percentage of the total profit of the database. For example, in this database the total profit is 50 + 20 + 50 + 800 + 30 + 50 = 1000$. Therefore, the loss of profit of the itemset {5, 6} can be expressed as 15% (150 / 1000 x 100).
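The loss-of-profit computation can be sketched as follows, using the product database above. LossOfProfitExample is a hypothetical class, not the SPMF VME implementation:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LossOfProfitExample {
    static class Product {
        final int profit;
        final Set<Integer> items;
        Product(int profit, Integer... items) {
            this.profit = profit;
            this.items = new HashSet<>(Arrays.asList(items));
        }
    }

    // Loss of profit of an itemset = sum of the profits of all products
    // that contain at least one item of the itemset.
    static int lossOfProfit(List<Product> db, Set<Integer> itemset) {
        int loss = 0;
        for (Product p : db) {
            for (int item : itemset) {
                if (p.items.contains(item)) {
                    loss += p.profit;
                    break; // count each product's profit only once
                }
            }
        }
        return loss;
    }

    // The six products from the example above.
    static List<Product> exampleDb() {
        return Arrays.asList(
                new Product(50, 2, 3, 4, 6),  // product1
                new Product(20, 2, 5, 7),     // product2
                new Product(50, 1, 2, 3, 5),  // product3
                new Product(800, 1, 2, 4),    // product4
                new Product(30, 6, 7),        // product5
                new Product(50, 3, 4)         // product6
        );
    }

    public static void main(String[] args) {
        // {5, 6} hits products 1, 2, 3 and 5: 50 + 20 + 50 + 30 = 150
        System.out.println(lossOfProfit(exampleDb(), new HashSet<>(Arrays.asList(5, 6))));
    }
}
```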

By running VME with a threshold of 15%, we obtain 8 erasable itemsets (each having a profit loss lower than or equal to 15% x 1000$ = 150$):

erasable itemsets loss of profit ("gain")
{3} 150
{5} 70
{6} 80
{7} 50
{5 6} 150
{5 7} 100
{6 7} 100
{5 6 7} 150

This means that if the items from one of these erasable itemsets are no longer manufactured, the loss of profit will be lower than or equal to 15%.

Input file format

The input file format of VME is defined as follows. It is a text file. Each line represents a transaction and is composed of two sections, as follows.

  • First, the profit of the transaction is indicated by an integer number, followed by a single space.
  • Second, the items in the transaction are listed. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction.

For example, for the previous example, the input file is defined as follows:

50 2 3 4 6
20 2 5 7
50 1 2 3 5
800 1 2 4
30 6 7
50 3 4

Consider the first line. It means that the transaction {2, 3, 4, 6} has a profit of 50 and contains the items 2, 3, 4 and 6. The following lines follow the same format.

Output file format

The output file format of VME is defined as follows. It is a text file, where each line represents an erasable itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#LOSS:" appears, followed by an integer value indicating the loss of profit for that itemset.

3 #LOSS: 150
5 #LOSS: 70
6 #LOSS: 80
7 #LOSS: 50
5 6 #LOSS: 150
5 7 #LOSS: 100
6 7 #LOSS: 100
5 6 7 #LOSS: 150

For example, the first line indicates that the itemset {3} would generate a loss of profit of 150. The following lines follow the same format.

Performance

The VME algorithm is Apriori-based. It is not the fastest algorithm for this problem, but it is the only one available in SPMF because this problem is not very popular. For more efficient algorithms, you can search for the authors' names; they have proposed a few improved algorithms for this problem.

Where can I get more information about the VME algorithm?

Here is an article describing the VME algorithm:

Z. Deng, X. Xu: An Efficient Algorithm for Mining Erasable Itemsets. ADMA (1) 2010: 214-225.

Example 28 : Building, updating incrementally and using an Itemset-Tree to generate targeted frequent itemsets and association rules.

How to run this example?

  • This algorithm is not offered in the release version of SPMF.
  • To run this example with the source code version of SPMF, launch the file "MainTestItemsetTree.java" in the package ca.pfv.SPMF.tests.

What is an itemset-tree?

An itemset-tree is a special data structure that can be used for performing efficient queries about itemsets and association rules in a transaction database without having to generate all of them beforehand.

An itemset-tree has the nice property of being incremental, which means that new transactions can be added to an existing itemset tree very efficiently without having to rebuild the tree from scratch. An itemset-tree also has the property of being compact.

How to use it?

An itemset-tree is built by inserting a set of transactions into the tree. A transaction is simply a set of distinct items. For example, we could insert the following 6 transactions (t1, t2, ..., t6) into an itemset-tree. In this example, the transaction t1 represents the set of items {1, 4}. This set of transactions is provided in the file "contextItemsetTree.txt" of the SPMF distribution.

transaction IDs items
t1 {1, 4}
t2 {2, 5}
t3 {1, 2, 3, 4, 5}
t4 {1, 2, 4}
t5 {2, 5}
t6 {2, 4}

The result of the insertion of these six transactions is the following itemset-tree (see the article by Kubat for more details).

{}   sup=6
[2 ] sup=3
[2 5 ] sup=2
[2 4 ] sup=1
[1 ] sup=3
[1 2 ] sup=2
[1 2 4 ] sup=1
[1 2 3 4 5 ] sup=1
[1 4 ] sup=1

The root is the empty itemset {} and the leaves are {2, 5}, {2, 4}, {1, 2, 4}, {1, 2, 3, 4, 5} and {1, 4}.

Once an itemset-tree has been created, it is possible to update it by inserting a new transaction. For example, in the example provided in the source code, we update the previous tree by adding a new transaction {4, 5}. The result is this tree:

{}   sup=7
[2 ] sup=3
[2 5 ] sup=2
[2 4 ] sup=1
[1 ] sup=3
[1 2 ] sup=2
[1 2 4 ] sup=1
[1 2 3 4 5 ] sup=1
[1 4 ] sup=1
[4 5 ] sup=1

Next, it is shown how to query the tree to efficiently determine the support of a target itemset. For example, if we execute the query of finding the support of the itemset {2}, the support is determined to be 5 because item 2 appears in 5 transactions.

After that, the source code offers an example of how to use the itemset-tree to get all itemsets that subsume a given itemset, together with their support. For example, if we use the itemset {1, 2} for this query, the result is:

[1 2 ]    supp:2
[1 2 3 ] supp:1
[1 2 4 ] supp:2
[1 2 5 ] supp:1
[1 2 3 4 ] supp:1
[1 2 3 5 ] supp:1
[1 2 4 5 ] supp:1
[1 2 3 4 5 ] supp:1

Another example provided is how to use the tree to find all itemsets that subsume a given itemset and have a support higher than or equal to a user-specified threshold named minsup (a positive integer representing a number of transactions). For example, if we execute this query with the itemset {1} and minsup = 2, we get this result:

[1 ]    supp:3
[1 2 ] supp:2
[1 4 ] supp:3
[1 2 4 ] supp:2

Lastly, another example is how to generate all association rules having a target itemset as antecedent, with a support and confidence respectively higher than or equal to the user-specified thresholds minsup (a positive integer representing a number of transactions) and minconf (a value between 0 and 1). For example, if the target itemset is {1}, minconf = 0.1 and minsup = 2, the result is:

[ 1  ] ==> [2  ]  sup=2  conf=0.666666666666666

[ 1 ] ==> [4 ] sup=3 conf=1.0

[ 1 ] ==> [2 4 ] sup=2 conf=0.66666666666666
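The confidence values in these rules follow the usual definition confidence(X ==> Y) = sup(X union Y) / sup(X). A minimal sketch of this computation (a hypothetical helper, not the itemset-tree code):

```java
public class ConfidenceExample {
    // confidence(X ==> Y) = sup(X union Y) / sup(X), where sup() is the
    // support count (number of transactions containing the itemset).
    static double confidence(int supXY, int supX) {
        return supXY / (double) supX;
    }

    public static void main(String[] args) {
        // From the example: sup({1}) = 3 and sup({1, 2}) = 2,
        // so {1} ==> {2} has confidence 2/3.
        System.out.println(confidence(2, 3));
        // sup({1, 4}) = 3, so {1} ==> {4} has confidence 1.0.
        System.out.println(confidence(3, 3));
    }
}
```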

Input and output file format

There is no need to use an input or output file with an itemset-tree because it is an incremental data structure designed for live updates and live targeted queries rather than batch processing.

However, it is possible to load a transaction database into an itemset-tree. In this case, a file is loaded. The file is a text file where each line represents a transaction. Each item is represented by an integer, and it is assumed that the items within each transaction are sorted according to a total order and that no item appears twice in the same transaction. On any given line, the items of the corresponding transaction are listed, each separated from the next by a single space. For example, the file "contextItemsetTree.txt" that is provided contains the following content:

1 4
2 5
1 2 3 4 5
1 2 4
2 5
2 4

There is a total of six transactions (six lines) in the file. The first line represents the transaction {1, 4} (containing items 1 and 4). The second line represents the transaction {2, 5}. The third line represents the transaction {1, 2, 3, 4, 5}. The following lines follow the same format.

Performance

The itemset-tree is an efficient data structure for the case of a database that needs to be updated frequently and where targeted queries need to be performed. For details about the complexity in terms of space and time, please refer to the article by Kubat et al., which provides an extensive discussion of the complexity.

Where can I get more information about the Itemset-tree data structure and related algorithms?

This article describes the itemset-tree and related algorithms for querying it:

Miroslav Kubat, Aladdin Hafez, Vijay V. Raghavan, Jayakrishna R. Lekkala, Wei Kian Chen: Itemset Trees for Targeted Association Querying. IEEE Trans. Knowl. Data Eng. 15(6): 1522-1534 (2003)

Example 29 : Building, updating incrementally and using a Memory Efficient Itemset-Tree to generate targeted frequent itemsets and association rules.

How to run this example?

  • This algorithm is not offered in the release version of SPMF.
  • To run this example with the source code version of SPMF, launch the file "MainTestMemoryEfficientItemsetTree.java" in the package ca.pfv.SPMF.tests.

What is a Memory-Efficient Itemset-Tree (MEIT)?

An itemset-tree (IT) is a special data structure that can be used for performing efficient queries about itemsets and association rules in a transaction database without having to generate all of them beforehand.

An itemset-tree has the nice property of being incremental, which means that new transactions can be added to an existing itemset tree very efficiently without having to rebuild the tree from scratch. An itemset-tree also has the property of being compact.

The Memory-Efficient Itemset-Tree (MEIT) is a modification of the original itemset-tree structure that uses about half as much memory as the regular itemset-tree (see the paper describing the MEIT for a performance comparison), but runs about twice as slow. Therefore, choosing between an IT and a MEIT is a trade-off between memory and speed.

How to use it?

A Memory-Efficient Itemset-Tree (MEIT) is built by inserting a set of transactions into the tree. A transaction is simply a set of distinct items. For example, we could insert the following 6 transactions (t1, t2, ..., t6) into a MEIT. In this example, the transaction t1 represents the set of items {1, 4}. This set of transactions is provided in the file "contextItemsetTree.txt" of the SPMF distribution.

transaction IDs items
t1 {1, 4}
t2 {2, 5}
t3 {1, 2, 3, 4, 5}
t4 {1, 2, 4}
t5 {2, 5}
t6 {2, 4}

The result of the insertion of these six transactions is the following MEIT.

{}   sup=6
[2 ] sup=3
[5 ] sup=2
[4 ] sup=1
[1 ] sup=3
[2 ] sup=2
[4 ] sup=1
[3 5 ] sup=1
[4 ] sup=1

The root is the empty itemset {} and the leaves are {5}, {4}, {4}, {3, 5} and {4}.

Once a MEIT has been created, it is possible to update it by inserting a new transaction. For example, in the example provided in the source code, we update the previous tree by adding a new transaction {4, 5}. The result is this tree:

{}   sup=7
[2 ] sup=3
[5 ] sup=2
[4 ] sup=1
[1 ] sup=3
[2 ] sup=2
[4 ] sup=1
[3 5 ] sup=1
[4 ] sup=1
[4 5 ] sup=1

Next, it is shown how to query the tree to efficiently determine the support of a target itemset. For example, if we execute the query of finding the support of the itemset {2}, the support is determined to be 5 because item 2 appears in 5 transactions.

After that, the source code offers an example of how to use the tree to get all itemsets that subsume a given itemset, together with their support. For example, if we use the itemset {1, 2} for this query, the result is:

[1 2 ]    supp:2
[1 2 3 ] supp:1
[1 2 4 ] supp:2
[1 2 5 ] supp:1
[1 2 3 4 ] supp:1
[1 2 3 5 ] supp:1
[1 2 4 5 ] supp:1
[1 2 3 4 5 ] supp:1

Another example provided is how to use the tree to find all itemsets that subsume a given itemset and have a support higher than or equal to a user-specified threshold named minsup (a positive integer representing a number of transactions). For example, if we execute this query with the itemset {1} and minsup = 2, we get this result:

[1 ]    supp:3
[1 2 ] supp:2
[1 4 ] supp:3
[1 2 4 ] supp:2

Lastly, another example is how to generate all association rules having a target itemset as antecedent, with a support and confidence respectively higher than or equal to the user-specified thresholds minsup (a positive integer representing a number of transactions) and minconf (a value between 0 and 1). For example, if the target itemset is {1}, minconf = 0.1 and minsup = 2, the result is:

[ 1  ] ==> [2  ]  sup=2  conf=0.666666666666666

[ 1 ] ==> [4 ] sup=3 conf=1.0

[ 1 ] ==> [2 4 ] sup=2 conf=0.66666666666666

Input and output file format

There is no need to use an input or output file with a memory-efficient itemset-tree because it is an incremental data structure designed for live updates and live targeted queries rather than batch processing.

However, it is possible to load a transaction database into a memory-efficient itemset-tree. In this case, a file is loaded. The file is a text file where each line represents a transaction. Each item is represented by an integer, and it is assumed that the items within each transaction are sorted according to a total order and that no item appears twice in the same transaction. On any given line, the items of the corresponding transaction are listed, each separated from the next by a single space. For example, the file "contextItemsetTree.txt" that is provided contains the following content:

1 4
2 5
1 2 3 4 5
1 2 4
2 5
2 4

There is a total of six transactions (six lines) in the file. The first line represents the transaction {1, 4} (containing items 1 and 4). The second line represents the transaction {2, 5}. The third line represents the transaction {1, 2, 3, 4, 5}. The following lines follow the same format.

Performance

The Memory-Efficient Itemset-Tree (MEIT) is an efficient data structure for the case of a database that needs to be updated frequently and where targeted queries need to be performed on itemsets and association rules.

The MEIT is a modification of the original itemset-tree (IT). According to our experiments, the MEIT uses about half as much memory as the IT but is about twice as slow for answering queries. Therefore, choosing between the MEIT and the IT is a compromise between speed and memory.

Where can I get more information about the Itemset-tree data structure and related algorithms?

This article describes the Memory-Efficient Itemset-tree:

Fournier-Viger, P., Mwamikazi, E., Gueniche, T., Faghihi, U. (2013). Memory Efficient Itemset Tree for Targeted Association Rule Mining. Proc. 9th International Conference on Advanced Data Mining and Applications (ADMA 2013) Part II, Springer LNAI 8347, pp. 95-106.

Example 30 : Mining Frequent Itemsets with Multiple Support Thresholds Using the MSApriori Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "MSApriori" algorithm, (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt") (4) set beta = 0.4 and LS = 0.2 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run MSApriori contextIGB.txt output.txt 0.4 0.2 in a folder containing spmf.jar and the example input file contextIGB.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestMSApriori_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is MSApriori?

MSApriori is an algorithm for mining frequent itemsets by using multiple minimum supports. It is a generalization of the Apriori algorithm, which uses a single minimum support threshold.

The idea behind MSApriori is that different minimum supports could be used to consider the fact that some items are less frequent than others in a dataset.

What is the input of this algorithm?

The input of MSApriori is a transaction database and two parameters named beta (a value between 0 and 1) and LS (a value between 0 and 1). These parameters are used to determine a minimum support for each item.

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t6) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output of this algorithm?

The output of MSApriori is the set of all frequent itemsets contained in the database.

Unlike the original Apriori algorithm, MSApriori uses multiple minimum support thresholds instead of just one. In fact, MSApriori uses a minimum support value for each item. Because it would be time-consuming to set a minimum support threshold for each item in a large database, the thresholds are determined automatically by using two user-specified parameters named beta (0 <= B <= 1) and LS (0 <= LS <= 1).

The minimum support of an item k is then defined as the greater of the following two values:

  • LS,
  • and B x f(k), where f(k) is the frequency (relative support) of item k, that is, the fraction of transactions containing k.

Note that if B is set to 0, a single minimum support (LS) is used for all items, which is equivalent to the regular Apriori algorithm.
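The threshold computation can be sketched in a few lines of Java (an illustrative helper under the assumptions above; the class and method names are hypothetical and this is not part of the SPMF API):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the per-item minimum support used by MSApriori:
// MIS(k) = max(LS, B * f(k)), where f(k) is the relative support of item k.
// Illustrative helper only, not the SPMF implementation.
public class MisSketch {

    static Map<Integer, Double> computeMis(Map<Integer, Double> frequency,
                                           double beta, double ls) {
        Map<Integer, Double> mis = new HashMap<>();
        for (Map.Entry<Integer, Double> e : frequency.entrySet()) {
            // LS acts as a floor on the per-item threshold
            mis.put(e.getKey(), Math.max(ls, beta * e.getValue()));
        }
        return mis;
    }

    public static void main(String[] args) {
        // Frequencies from the example database (6 transactions)
        Map<Integer, Double> f = new HashMap<>();
        f.put(1, 4 / 6.0); // item 1 appears in 4 of the 6 transactions
        f.put(2, 6 / 6.0);
        f.put(3, 4 / 6.0);
        System.out.println(computeMis(f, 0.4, 0.2));
    }
}
```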

The support of an itemset is the number of transactions containing it divided by the total number of transactions. An itemset is a frequent itemset if its support is greater than or equal to the smallest of the minimum support thresholds of its items.

Why is MSApriori useful? Because it allows discovering frequent itemsets containing rare items (if their minimum support is set low).

If we run MSApriori on the previous transaction database with beta = 0.4 and LS = 0.2, we obtain the following result:

1 support: 4
2 support: 6
3 support: 4
4 support: 4
5 support: 5
1 2 support: 4
1 3 support: 2
1 4 support: 3
1 5 support: 4
2 3 support: 4
2 4 support: 4
2 5 support: 5
3 4 support: 2
3 5 support: 3
4 5 support: 3
1 2 3 support: 2
1 2 4 support: 3
1 2 5 support: 4
1 3 5 support: 2
1 4 5 support: 3
2 3 4 support: 2
2 3 5 support: 3
2 4 5 support: 3
1 2 3 5 support: 2
1 2 4 5 support: 3

Note that here the support is expressed as an integer value representing the number of transactions containing the itemset. For example, itemset {2, 3, 5} has a support of 3 because it appears in three transactions, namely t2, t4 and t5. This integer value can be converted to a percentage by dividing it by the total number of transactions.

Input file format

The input file format of MSApriori is defined as follows. It is a text file. Each line represents a transaction. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item appears twice in the same transaction. For the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.
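A transaction line in this format can be parsed with plain Java (an illustrative sketch, not SPMF's actual loader; the class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative parser for the SPMF transaction-file format:
// one transaction per line, items as space-separated positive integers.
public class TransactionLineParser {

    static List<Integer> parseLine(String line) {
        List<Integer> transaction = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            transaction.add(Integer.parseInt(token));
        }
        return transaction;
    }

    public static void main(String[] args) {
        // The first transaction of the example file
        System.out.println(parseLine("1 2 4 5")); // prints [1, 2, 4, 5]
    }
}
```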

Output file format

The output file format of MSApriori is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer value indicating the support of that itemset.

1 #SUP: 4
2 #SUP: 6
3 #SUP: 4
4 #SUP: 4
5 #SUP: 5
1 2 #SUP: 4
1 3 #SUP: 2
1 4 #SUP: 3
1 5 #SUP: 4
2 3 #SUP: 4
2 4 #SUP: 4
2 5 #SUP: 5
3 4 #SUP: 2
3 5 #SUP: 3
4 5 #SUP: 3
1 2 3 #SUP: 2
1 2 4 #SUP: 3
1 2 5 #SUP: 4
1 3 5 #SUP: 2
1 4 5 #SUP: 3
2 3 4 #SUP: 2
2 3 5 #SUP: 3
2 4 5 #SUP: 3
1 2 3 5 #SUP: 2
1 2 4 5 #SUP: 3

For example, the first line indicates that the itemset {1} has a support of 4 transactions. The following lines follow the same format.

Performance

MSApriori is one of the first algorithms for mining itemsets with multiple minimum support thresholds. It is not the most efficient algorithm for this task because it is based on Apriori and thus suffers from the same limitations. If performance is important, it is recommended to use CFPGrowth++, which is based on FPGrowth and is more efficient.

Note that there is one important difference between the input of CFPGrowth++ and MSApriori in SPMF. MSApriori sets the multiple minimum supports by using the LS and beta parameters, whereas the CFPGrowth++ implementation uses a list of minimum support values stored in a text file.

Where can I get more information about the MSApriori algorithm?

This article describes the MSApriori algorithm:

B. Liu, W. Hsu, Y. Ma, "Mining Association Rules with Multiple Minimum Supports" Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-99), August 15-18, 1999, San Diego, CA, USA.

Example 31 : Mining Frequent Itemsets with Multiple Support Thresholds Using the CFPGrowth++ Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CFPGrowth++" algorithm, (2) select the input file "contextCFPGrowth.txt", (3) set the output file name (e.g. "output.txt") (4) put "MIS.TXT" in the "MIS file name" text field (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CFPGrowth++ contextCFPGrowth.txt output.txt MIS.txt in a folder containing spmf.jar and the example input file contextCFPgrowth.txt and MIS.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCFPGrowth_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is CFPGrowth++?

CFPGrowth++ is an algorithm for mining frequent itemsets by using multiple minimum supports. It is an extension of the CFPGrowth algorithm, which also mines frequent itemsets using multiple minimum support thresholds.

What is the input of this algorithm?

The input of CFPGrowth++ is a transaction database and a list of minimum support thresholds indicating the minimum support threshold for each item.

A transaction database is a set of transactions, where each transaction is a list of distinct items (symbols). For example, let's consider the following transaction database. It consists of 5 transactions (t1, t2, ..., t5) and 8 items (1, 2, 3, 4, 5, 6, 7, 8). For instance, transaction t1 is the set of items {1, 3, 4, 6}. This database is provided in the file "contextCFPGrowth.txt" of the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction ID items
t1 {1, 3, 4, 6}
t2 {1, 3, 5, 6, 7}
t3 {1, 2, 3, 6, 8}
t4 {2, 6, 7}
t5 {2, 3}

The list of minimum support threshold is stored in a text file that is read as input by the algorithm. This is provided in the file "MIS.txt":

item minimum support threshold
1 1
2 2
3 3
4 3
5 2
6 3
7 2
8 1

This file indicates, for example, that the minimum support threshold to be used for item 6 is 3.

What is the output of this algorithm?

The output of CFPGrowth++ is the set of all frequent itemsets contained in the database.

What is a frequent itemset? The support of an itemset is the number of transactions containing the itemset. An itemset is a frequent itemset if its support is greater than or equal to the smallest minimum support threshold among the minimum support thresholds of its items. For example, the itemset {1, 2, 8} is frequent because it appears in one transaction (t3), and its support is thus no less than the smallest of the minimum supports of items 1, 2 and 8, which are respectively 1, 2 and 1.

Why is CFPGrowth++ useful? Because it permits setting lower minimum support thresholds for rare items, and therefore allows discovering frequent itemsets containing rare items.
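The frequency test described above can be sketched in plain Java (illustrative code with hypothetical names, not the SPMF implementation):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the multiple-minimum-support frequency test: an itemset is
// frequent if its (absolute) support is at least the SMALLEST minimum
// support among its items. Illustrative only, not the SPMF implementation.
public class MisFrequencyCheck {

    static int support(Set<Integer> itemset, List<Set<Integer>> database) {
        int count = 0;
        for (Set<Integer> transaction : database) {
            if (transaction.containsAll(itemset)) count++;
        }
        return count;
    }

    static boolean isFrequent(Set<Integer> itemset, List<Set<Integer>> database,
                              Map<Integer, Integer> mis) {
        int smallestMis = Integer.MAX_VALUE;
        for (int item : itemset) smallestMis = Math.min(smallestMis, mis.get(item));
        return support(itemset, database) >= smallestMis;
    }

    public static void main(String[] args) {
        // The example database from this section
        List<Set<Integer>> db = Arrays.asList(
            new HashSet<>(Arrays.asList(1, 3, 4, 6)),
            new HashSet<>(Arrays.asList(1, 3, 5, 6, 7)),
            new HashSet<>(Arrays.asList(1, 2, 3, 6, 8)),
            new HashSet<>(Arrays.asList(2, 6, 7)),
            new HashSet<>(Arrays.asList(2, 3)));
        Map<Integer, Integer> mis = new HashMap<>();
        mis.put(1, 1); mis.put(2, 2); mis.put(8, 1);
        // {1, 2, 8} appears once (t3); the smallest MIS among its items is 1.
        System.out.println(isFrequent(new HashSet<>(Arrays.asList(1, 2, 8)), db, mis)); // prints true
    }
}
```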

If we run CFPGrowth++ on the previous transaction database with the MIS.txt file previously described, we get the following result, where each line represents an itemset followed by ":" and then its absolute support:

8:1
8 1:1
8 1 2:1 // for example, this itemset is {1, 2, 8}, and it has a support of 1.
8 1 2 6:1
8 1 2 6 3:1
8 1 2 3:1
8 1 6:1
8 1 6 3:1
8 1 3:1
8 2:1
8 2 6:1
8 2 6 3:1
8 2 3:1
8 6:1
8 6 3:1
8 3:1
1:3 // for example, this itemset is {1}, and it has a support of 3.
1 7:1
1 7 5:1
1 7 5 6:1
1 7 5 6 3:1
1 7 5 3:1
1 7 6:1
1 7 6 3:1
1 7 3:1
1 5:1
1 5 6:1
1 5 6 3:1
1 5 3:1
1 2:1
1 2 6:1
1 2 6 3:1
1 2 3:1
1 6:3
1 6 4:1
1 6 4 3:1
1 6 3:3
1 4:1
1 4 3:1
1 3:3
7:2
7 6:2
2:3
2 6:2
2 3:2
6:4
6 3:3
3:4

Note: If you are using the GUI version of SPMF, the file containing the minimum support values must be located in the same folder as the input file containing the transaction database.

Input file format

The input of CFPGrowth++ consists of two text files, defined as follows.

The first file (e.g. contextCFPGrowth.txt) is a text file containing the transactions. Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item appears twice in the same transaction. For the previous example, the input file is defined as follows:

1 3 4 6
1 3 5 6 7
1 2 3 6 8
2 6 7
2 3

Consider the first line. It means that the first transaction is the itemset {1, 3, 4, 6}. The following lines follow the same format.

The second file (e.g. MIS.txt) is a text file which provides the minimum support to be used for each item. Each line indicates the minimum support for an item and consists of two integer values separated by a single space. The first value is the item. The second value is the minimum support to be used for this item. For example, here is the file used in this example. The first line indicates that for item "1" the minimum support to be used is 1 (one transaction). The other lines follow the same format.

1 1
2 2
3 3
4 3
5 2
6 3
7 2
8 1
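Reading such a MIS file into a map can be sketched in plain Java (an illustrative sketch, not SPMF's actual loader; the class name is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative reader for the MIS file format: each non-empty line holds
// an item and its minimum support, separated by whitespace.
public class MisFileReader {

    static Map<Integer, Integer> parse(String content) {
        Map<Integer, Integer> mis = new HashMap<>();
        for (String line : content.split("\\R")) { // split on line breaks
            if (line.trim().isEmpty()) continue;
            String[] tokens = line.trim().split("\\s+");
            mis.put(Integer.parseInt(tokens[0]), Integer.parseInt(tokens[1]));
        }
        return mis;
    }

    public static void main(String[] args) {
        // First three lines of the example MIS.txt
        System.out.println(parse("1 1\n2 2\n3 3"));
    }
}
```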

Output file format

The output file format of CFPGrowth++ is defined as follows. It is a text file, where each line represents a frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer value indicating the support of that itemset.

8 #SUP: 1
8 1 #SUP: 1
8 1 2 #SUP: 1
8 1 2 6 #SUP: 1
8 1 2 6 3 #SUP: 1
8 1 2 3 #SUP: 1
8 1 6 #SUP: 1
8 1 6 3 #SUP: 1
8 1 3 #SUP: 1
8 2 #SUP: 1
8 2 6 #SUP: 1
8 2 6 3 #SUP: 1
8 2 3 #SUP: 1
8 6 #SUP: 1
8 6 3 #SUP: 1
8 3 #SUP: 1
1 #SUP: 3
1 7 #SUP: 1
1 7 5 #SUP: 1
1 7 5 6 #SUP: 1
1 7 5 6 3 #SUP: 1
1 7 5 3 #SUP: 1
1 7 6 #SUP: 1
1 7 6 3 #SUP: 1
1 7 3 #SUP: 1
1 5 #SUP: 1
1 5 6 #SUP: 1
1 5 6 3 #SUP: 1
1 5 3 #SUP: 1
1 2 #SUP: 1
1 2 6 #SUP: 1
1 2 6 3 #SUP: 1
1 2 3 #SUP: 1
1 6 #SUP: 3
1 6 4 #SUP: 1
1 6 4 3 #SUP: 1
1 6 3 #SUP: 3
1 4 #SUP: 1
1 4 3 #SUP: 1
1 3 #SUP: 3
7 #SUP: 2
7 6 #SUP: 2
2 #SUP: 3
2 6 #SUP: 2
2 3 #SUP: 2
6 #SUP: 4
6 3 #SUP: 3
3 #SUP: 4

For example, the last line indicates that the itemset {3} has a support of 4 transactions. The other lines follow the same format.

Implementation details

In the source code version of SPMF, there are two versions of CFPGrowth: one that saves the result to a file (MainTestCFPGrowth_saveToFile.java) and one that saves the result to memory (MainTestCFPGrowth_saveToMemory.java). In the graphical interface and command line interface, only the version that saves to file is offered.

Performance

CFPGrowth++ is a very efficient algorithm, as it is based on FPGrowth.

SPMF also offers the MSApriori algorithm, which is less efficient than CFPGrowth++. Note that there is one important difference between the input of CFPGrowth++ and MSApriori in SPMF. MSApriori sets the multiple minimum supports by using two special parameters named LS and beta (see the example describing MSApriori for more details). The CFPGrowth++ implementation instead uses a list of minimum support values stored in a text file.

Where can I get more information about the CFPGrowth++ algorithm?

This article describes the original CFPGrowth algorithm:

Y.-H. Hu, Y.-L. Chen: Mining association rules with multiple minimum supports: a new mining algorithm and a support tuning mechanism. Decision Support Systems 42(1): 1-24 (2006)

This article describes CFPGrowth++, the extension of CFPGrowth that is implemented in SPMF, which introduces a few additional optimizations:

Kiran, R. U., & Reddy, P. K. (2011). Novel techniques to reduce search space in multiple minimum supports-based frequent pattern mining algorithms. In Proceedings of the 14th International Conference on Extending Database Technology, ACM, pp. 11-20.

Example 32 : Mining High-Utility Itemsets from a Database with Utility Information with the Two-Phase Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Two-Phase" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Two-Phase DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestTwoPhaseAlgorithm_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is Two-Phase?

Two-Phase (Liu et al., 2005) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information.

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

Two-Phase takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1, t2, ..., t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction Items Transaction utility Item utilities for this transaction
t1 {3, 5, 1, 2, 4, 6} 30 1, 3, 5, 10, 6, 5
t2 {3, 5, 2, 4} 20 3, 3, 8, 6
t3 {3, 1, 4} 8 1, 5, 2
t4 {3, 5, 1, 7} 27 6, 6, 10, 5
t5 {3, 5, 2, 7} 11 2, 3, 4, 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of Two-Phase is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run Two-Phase with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemsets utility support
{2 4} 30 40 % (2 transactions)
{2 5} 31 60 % (3 transactions)
{1 3 5} 31 40 % (2 transactions)
{2 3 4} 34 40 % (2 transactions)
{2 3 5} 37 60 % (3 transactions)
{2 4 5} 36 40 % (2 transactions)
{2 3 4 5} 40 40 % (2 transactions)
{1 2 3 4 5 6} 30 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.
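The utility definitions above can be sketched in plain Java (illustrative code with hypothetical names, not the SPMF implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the utility computation: the utility of an itemset in a
// transaction is the sum of its items' utilities there, and its utility
// in the database is the sum over all transactions containing it.
public class UtilitySketch {

    // Each transaction maps item -> utility of that item in the transaction.
    static int utility(Set<Integer> itemset, List<Map<Integer, Integer>> database) {
        int total = 0;
        for (Map<Integer, Integer> transaction : database) {
            if (transaction.keySet().containsAll(itemset)) {
                for (int item : itemset) total += transaction.get(item);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Map<Integer, Integer>> db = new ArrayList<>();
        Map<Integer, Integer> t1 = new HashMap<>();
        t1.put(3, 1); t1.put(5, 3); t1.put(1, 5); t1.put(2, 10); t1.put(4, 6); t1.put(6, 5);
        Map<Integer, Integer> t3 = new HashMap<>();
        t3.put(3, 1); t3.put(1, 5); t3.put(4, 2);
        db.add(t1); db.add(t3);
        // {1, 4}: 5 + 6 = 11 in t1 and 5 + 2 = 7 in t3, so 18 in total.
        System.out.println(utility(new HashSet<>(Arrays.asList(1, 4)), db)); // prints 18
    }
}
```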

Input file format

The input file format of Two-Phase is defined as follows. It is a text file. Each line represents a transaction and is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.
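A line in this format can be parsed with plain Java (an illustrative sketch, not SPMF's actual loader; the class name is hypothetical):

```java
import java.util.Arrays;

// Illustrative parser for one line of the utility input format:
// "items : transaction utility : per-item utilities".
public class UtilityLineParser {

    public final int[] items;
    public final int transactionUtility;
    public final int[] utilities;

    UtilityLineParser(String line) {
        String[] parts = line.split(":");
        String[] itemTokens = parts[0].trim().split("\\s+");
        String[] utilTokens = parts[2].trim().split("\\s+");
        items = new int[itemTokens.length];
        utilities = new int[utilTokens.length];
        for (int i = 0; i < itemTokens.length; i++) {
            items[i] = Integer.parseInt(itemTokens[i]);
        }
        for (int i = 0; i < utilTokens.length; i++) {
            utilities[i] = Integer.parseInt(utilTokens[i]);
        }
        transactionUtility = Integer.parseInt(parts[1].trim());
    }

    public static void main(String[] args) {
        // The first line of the example file
        UtilityLineParser p = new UtilityLineParser("3 5 1 2 4 6:30:1 3 5 10 6 5");
        System.out.println(Arrays.toString(p.items) + " TU=" + p.transactionUtility);
    }
}
```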

Output file format

The output file format of Two-Phase is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears, followed by a double value indicating the support of the itemset. Then, the keyword "#UTIL:" appears, followed by the utility of the itemset. For example, we show below the output file for this example.

2 4 #SUP: 0.4 #UTIL: 30
2 5 #SUP: 0.6 #UTIL: 31
1 3 5 #SUP: 0.4 #UTIL: 31
2 3 4 #SUP: 0.4 #UTIL: 34
2 3 5 #SUP: 0.6 #UTIL: 37
2 4 5 #SUP: 0.4 #UTIL: 36
2 3 4 5 #SUP: 0.4 #UTIL: 40
1 2 3 4 5 6 #SUP: 0.2 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a support of 0.4 and a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a much more difficult problem than frequent itemset mining. Therefore, algorithms for high-utility itemset mining are generally slower than frequent itemset mining algorithms.

The Two-Phase algorithm is an important algorithm because it introduced the concept of mining high utility itemsets in two phases, by first overestimating the utility of itemsets in phase I and then calculating their exact utility in phase II. However, there are now more efficient algorithms. For efficiency, it is recommended to use EFIM, which is also included in SPMF and is one of the most efficient algorithms for this problem (see the performance page of this website).
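The phase-I overestimate can be illustrated with the transaction-weighted utilization (TWU): the TWU of an itemset is the sum of the utilities of the transactions containing it, which is always at least the itemset's exact utility and therefore allows safe pruning. The sketch below is illustrative code with hypothetical names, not the SPMF implementation:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the phase-I overestimate: TWU(X) = sum of the transaction
// utilities of the transactions that contain X. TWU(X) >= utility(X).
public class TwuSketch {

    static int twu(Set<Integer> itemset, List<Set<Integer>> transactions, int[] tu) {
        int total = 0;
        for (int i = 0; i < transactions.size(); i++) {
            if (transactions.get(i).containsAll(itemset)) total += tu[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // The example database from this section, with its transaction utilities
        List<Set<Integer>> db = Arrays.asList(
            new HashSet<>(Arrays.asList(3, 5, 1, 2, 4, 6)),
            new HashSet<>(Arrays.asList(3, 5, 2, 4)),
            new HashSet<>(Arrays.asList(3, 1, 4)),
            new HashSet<>(Arrays.asList(3, 5, 1, 7)),
            new HashSet<>(Arrays.asList(3, 5, 2, 7)));
        int[] tu = {30, 20, 8, 27, 11};
        // {1, 4} appears in t1 and t3: TWU = 30 + 8 = 38 >= its exact utility of 18.
        System.out.println(twu(new HashSet<>(Arrays.asList(1, 4)), db, tu)); // prints 38
    }
}
```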

Implementation details

In the source code version of SPMF, there are two versions of Two-Phase: one that saves the result to a file (MainTestTwoPhaseAlgorithm_saveToFile.java) and one that saves the result to memory (MainTestTwoPhaseAlgorithm_saveToMemory.java). In the graphical interface and command line interface, only the version that saves to file is offered.

Also note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the Two-Phase algorithm?

Here is an article describing the Two-Phase algorithm:

Y. Liu, W.-K. Liao, A. N. Choudhary: A Two-Phase Algorithm for Fast Discovery of High Utility Itemsets. PAKDD 2005: 689-695

Example 33 : Mining High-Utility Itemsets from a Database with Utility Information with the FHM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FHM" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FHM DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFHM.java" in the package ca.pfv.SPMF.tests.

What is FHM?

FHM (Fournier-Viger et al., ISMIS 2014) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information.

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

FHM takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction Items Transaction utility Item utilities for this transaction
t1 {3, 5, 1, 2, 4, 6} 30 1, 3, 5, 10, 6, 5
t2 {3, 5, 2, 4} 20 3, 3, 8, 6
t3 {3, 1, 4} 8 1, 5, 2
t4 {3, 5, 1, 7} 27 6, 6, 10, 5
t5 {3, 5, 2, 7} 11 2, 3, 4, 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of FHM is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run FHM with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemsets utility support
{2 4} 30 40 % (2 transactions)
{2 5} 31 60 % (3 transactions)
{1 3 5} 31 40 % (2 transactions)
{2 3 4} 34 40 % (2 transactions)
{2 3 5} 37 60 % (3 transactions)
{2 4 5} 36 40 % (2 transactions)
{2 3 4 5} 40 40 % (2 transactions)
{1 2 3 4 5 6} 30 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of FHM is defined as follows. It is a text file. Each line represents a transaction and is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of FHM is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#UTIL:" appears, followed by the utility of the itemset. For example, we show below the output file for this example.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms.

The FHM algorithm was shown to be up to six times faster than HUI-Miner (also included in SPMF), especially for sparse datasets (see the performance section of the website for a comparison). But the EFIM algorithm (also included in SPMF) greatly outperforms FHM (see performance section of the website).

Implementation details

The version offered in SPMF is the original implementation of FHM.

Note that the input format is not exactly the same as described in the article, but it is equivalent.

Where can I get more information about the FHM algorithm?

This is the reference of the article describing the FHM algorithm:

Fournier-Viger, P., Wu, C.-W., Zida, S., Tseng, V. (2014) FHM: A Faster High-Utility Itemset Mining Algorithm using Estimated Utility Co-occurrence Pruning. Proc. 21st International Symposium on Methodologies for Intelligent Systems (ISMIS 2014), Springer, LNAI, pp. 83-92

Example 34 : Mining High-Utility Itemsets from a Database with Utility Information with the EFIM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "EFIM" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run EFIM DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestEFIM_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is EFIM?

EFIM (Zida et al., 2015) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information.

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

EFIM takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction Items Transaction utility Item utilities for this transaction
t1 {3, 5, 1, 2, 4, 6} 30 1, 3, 5, 10, 6, 5
t2 {3, 5, 2, 4} 20 3, 3, 8, 6
t3 {3, 1, 4} 8 1, 5, 2
t4 {3, 5, 1, 7} 27 6, 6, 10, 5
t5 {3, 5, 2, 7} 11 2, 3, 4, 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of EFIM is the set of high utility itemsets having a utility no less than the min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in that transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utilities in all the transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run EFIM with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemset | utility | support
{2 4} | 30 | 40 % (2 transactions)
{2 5} | 31 | 60 % (3 transactions)
{1 3 5} | 31 | 40 % (2 transactions)
{2 3 4} | 34 | 40 % (2 transactions)
{2 3 5} | 37 | 60 % (3 transactions)
{2 4 5} | 36 | 40 % (2 transactions)
{2 3 4 5} | 40 | 40 % (2 transactions)
{1 2 3 4 5 6} | 30 | 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.
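The utility definitions above can be illustrated with a short standalone Java sketch (written for this tutorial, not SPMF code) that computes the utility of an itemset in the example database:

```java
import java.util.*;

// Compute the utility of an itemset in the example database: the sum, over
// the transactions that contain ALL items of the itemset, of the utilities
// of those items in each such transaction.
public class ItemsetUtility {
    static int[][] items = {
        {3, 5, 1, 2, 4, 6}, {3, 5, 2, 4}, {3, 1, 4}, {3, 5, 1, 7}, {3, 5, 2, 7}
    };
    static int[][] utils = {
        {1, 3, 5, 10, 6, 5}, {3, 3, 8, 6}, {1, 5, 2}, {6, 6, 10, 5}, {2, 3, 4, 2}
    };

    static int utility(Set<Integer> itemset) {
        int total = 0;
        for (int t = 0; t < items.length; t++) {
            int sum = 0, found = 0;
            for (int i = 0; i < items[t].length; i++) {
                if (itemset.contains(items[t][i])) { sum += utils[t][i]; found++; }
            }
            // count the transaction only if it contains the whole itemset
            if (found == itemset.size()) total += sum;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(utility(new HashSet<>(Arrays.asList(1, 4)))); // 11 + 7
        System.out.println(utility(new HashSet<>(Arrays.asList(2, 4)))); // 16 + 14
    }
}
```

Running it reproduces the values from the text: {1 4} has utility 18 and {2 4} has utility 30.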

Input file format

The input file format of EFIM is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.
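For illustration, here is a minimal standalone Java sketch (not SPMF's parser; the class and variable names are invented for this example) showing how one line of this format can be split into its three sections:

```java
// Split one line of the input format "items:transaction utility:item utilities"
// into its three sections using the ":" separator.
public class ParseInputLine {
    public static void main(String[] args) {
        String line = "3 5 1 2 4 6:30:1 3 5 10 6 5";
        String[] sections = line.split(":");
        String[] items = sections[0].split(" ");           // the items
        int transactionUtility = Integer.parseInt(sections[1]); // the transaction utility
        String[] itemUtilities = sections[2].split(" ");   // one utility per item
        System.out.println("items: " + items.length
            + ", TU: " + transactionUtility
            + ", utilities: " + itemUtilities.length);
    }
}
```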

Output file format

The output file format of EFIM is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.
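As a small illustration (not SPMF code), one output line can be parsed back into an itemset and its utility as follows:

```java
// Parse one line of the output format "items #UTIL: utility".
public class ParseOutputLine {
    public static void main(String[] args) {
        String line = "2 4 #UTIL: 30";
        String[] parts = line.split(" #UTIL: ");
        String[] items = parts[0].split(" ");      // the items of the itemset
        int utility = Integer.parseInt(parts[1]);  // the utility of the itemset
        System.out.println("itemset {" + String.join(", ", items)
            + "} utility " + utility);
    }
}
```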

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms.

The EFIM algorithm was shown to be up to two orders of magnitude faster than the previous state-of-the-art algorithms FHM, HUI-Miner, d2HUP and UP-Growth+ (also included in SPMF), while consuming up to four times less memory (see the performance section of the website for a comparison).

Implementation details

The implementation offered in SPMF is the original implementation of EFIM.

In the source code version of SPMF, there are two versions of EFIM: one that saves the result to a file (MainTestEFIM_saveToFile.java) and one that saves the result to memory (MainTestEFIM_saveToMemory.java). In the graphical interface and command line interface, only the version that saves to file is offered.

Note that the input format is not exactly the same as described in the article, but it is equivalent.

Where can I get more information about the EFIM algorithm?

This is the reference of the article describing the EFIM algorithm:

Zida, S., Fournier-Viger, P., Lin, J. C.-W., Wu, C.-W., Tseng, V. S. (2015). EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining. Proceedings of the 14th Mexican International Conference on Artificial Intelligence (MICAI 2015), Springer LNAI, to appear.

Example 35 : Mining High-Utility Itemsets from a Database with Utility Information with the HUI-Miner Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUI-Miner" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUI-Miner DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUIMiner.java" in the package ca.pfv.SPMF.tests.

What is HUI-Miner?

HUI-Miner (Liu & Qu, CIKM 2012) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information.

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

HUI-Miner takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of HUI-Miner is the set of high utility itemsets having a utility no less than the min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in that transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utilities in all the transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run HUI-Miner with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemset | utility | support
{2 4} | 30 | 40 % (2 transactions)
{2 5} | 31 | 60 % (3 transactions)
{1 3 5} | 31 | 40 % (2 transactions)
{2 3 4} | 34 | 40 % (2 transactions)
{2 3 5} | 37 | 60 % (3 transactions)
{2 4 5} | 36 | 40 % (2 transactions)
{2 3 4 5} | 40 | 40 % (2 transactions)
{1 2 3 4 5 6} | 30 | 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.
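The support values in the result table above can be recomputed with a short standalone Java sketch (written for this tutorial, not SPMF code): the support of an itemset is the proportion of transactions that contain all of its items.

```java
import java.util.*;

// Count in how many transactions of the example database an itemset appears,
// and express that count as a support percentage.
public class ItemsetSupport {
    static int[][] transactions = {
        {3, 5, 1, 2, 4, 6}, {3, 5, 2, 4}, {3, 1, 4}, {3, 5, 1, 7}, {3, 5, 2, 7}
    };

    static int supportCount(Set<Integer> itemset) {
        int count = 0;
        for (int[] t : transactions) {
            Set<Integer> set = new HashSet<>();
            for (int item : t) set.add(item);
            if (set.containsAll(itemset)) count++;  // transaction contains the itemset
        }
        return count;
    }

    public static void main(String[] args) {
        int count = supportCount(new HashSet<>(Arrays.asList(2, 5)));
        System.out.println(count + " transactions ("
            + (100 * count / transactions.length) + " %)");
    }
}
```

For {2 5}, this prints the same support as the table: 3 transactions (60 %).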

Input file format

The input file format of HUI-Miner is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of HUI-Miner is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. HUI-Miner was reported as one of the most efficient algorithms for high utility itemset mining. However, the FHM algorithm (also included in SPMF) was later shown to be up to six times faster than HUI-Miner, especially on sparse datasets (see the performance section of the website for a comparison). More recently, the EFIM algorithm (2015) was proposed and shown to outperform FHM (2014), HUI-Miner (2012) and HUP-Miner (2014). All these algorithms are offered in SPMF (see the "performance" page of this website).

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HUI-Miner. Note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the HUI-Miner algorithm?

This is the reference of the article describing the HUI-Miner algorithm:

M. Liu, J.-F. Qu: Mining high utility itemsets without candidate generation. CIKM 2012, 55-64

Example 36 : Mining High-Utility Itemsets from a Database with Utility Information with the HUP-Miner Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUP-Miner" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUP-Miner DB_utility.txt output.txt 30 2 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUPMiner.java" in the package ca.pfv.SPMF.tests.

What is HUP-Miner?

HUP-Miner (Krishnamoorthy, 2014) is an extension of the HUI-Miner algorithm (Liu & Qu, CIKM 2012) for discovering high-utility itemsets in a transaction database containing utility information. It introduces the idea of partitioning the database and a pruning strategy named LA-Prune. A drawback of HUP-Miner is that the user needs to set an additional parameter, the number of partitions. Moreover, according to our experiments, HUP-Miner is faster than HUI-Miner but slower than FHM.

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

HUP-Miner takes as input a transaction database with utility information, a minimum utility threshold min_utility (a positive integer) and a number of partitions k.

Note that the parameter k determines how many partitions HUP-Miner uses internally, which influences the performance of HUP-Miner but has no effect on the output of the algorithm. A typical value for k is 10. However, the optimal value of k may need to be found empirically for each dataset.
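To illustrate the idea of the parameter k only, here is a minimal standalone Java sketch that assigns the 5 example transactions to k = 2 partitions in round-robin fashion. This is an invented illustration, not HUP-Miner's actual partitioning scheme:

```java
import java.util.*;

// Illustrative sketch: split a database of n transactions into k partitions.
// The assignment here is round-robin; HUP-Miner's own partitioning may differ.
public class PartitionSketch {
    public static void main(String[] args) {
        int n = 5;   // number of transactions (t1..t5)
        int k = 2;   // number of partitions (the extra parameter of HUP-Miner)
        List<List<Integer>> partitions = new ArrayList<>();
        for (int p = 0; p < k; p++) partitions.add(new ArrayList<>());
        for (int t = 0; t < n; t++) {
            partitions.get(t % k).add(t + 1);  // assign transaction t+1 to a partition
        }
        System.out.println(partitions);
    }
}
```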

Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of HUP-Miner is the set of high utility itemsets having a utility no less than the min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in that transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utilities in all the transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run HUP-Miner with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemset | utility | support
{2 4} | 30 | 40 % (2 transactions)
{2 5} | 31 | 60 % (3 transactions)
{1 3 5} | 31 | 40 % (2 transactions)
{2 3 4} | 34 | 40 % (2 transactions)
{2 3 5} | 37 | 60 % (3 transactions)
{2 4 5} | 36 | 40 % (2 transactions)
{2 3 4 5} | 40 | 40 % (2 transactions)
{1 2 3 4 5 6} | 30 | 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of HUP-Miner is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of HUP-Miner is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. HUI-Miner was reported as one of the most efficient algorithms for high utility itemset mining. Both HUP-Miner and FHM are extensions of HUI-Miner and are faster than it, but HUP-Miner introduces an additional parameter, the number of partitions. In our experiments, FHM is faster than HUP-Miner. More recently, the EFIM algorithm (2015) was proposed and shown to outperform HUP-Miner and other recent algorithms such as FHM (2014) and HUI-Miner (2012). All these algorithms are offered in SPMF (see the "performance" page of this website).

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HUP-Miner. Note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the HUP-Miner algorithm?

This is the reference of the article describing the HUP-Miner algorithm:

Krishnamoorthy, S. (2014). Pruning Strategies for Mining High-Utility Itemsets. Expert Systems with Applications.

Example 37 : Mining High-Utility Itemsets from a Database with Utility Information with the UP-Growth / UPGrowth+ Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "UPGrowth" or "UPGrowth+" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run UPGrowth DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestUPGrowth.java" in the package ca.pfv.SPMF.tests.

What is UPGrowth?

UP-Growth (Tseng et al., KDD 2010) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information. UP-Growth+ (Tseng et al., KDD 2012) is an improved version.

These two algorithms are important because they introduced some interesting ideas. However, more efficient algorithms have since been proposed, such as FHM (2014) and HUI-Miner (2012). These algorithms were shown to be more than 100 times faster than UP-Growth+ in some cases, and are also offered in SPMF.

What is the input?

UP-Growth takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of UP-Growth is the set of high utility itemsets having a utility no less than the min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in that transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utilities in all the transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run UP-Growth with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemset | utility | support
{2 4} | 30 | 40 % (2 transactions)
{2 5} | 31 | 60 % (3 transactions)
{1 3 5} | 31 | 40 % (2 transactions)
{2 3 4} | 34 | 40 % (2 transactions)
{2 3 5} | 37 | 60 % (3 transactions)
{2 4 5} | 36 | 40 % (2 transactions)
{2 3 4 5} | 40 | 40 % (2 transactions)
{1 2 3 4 5 6} | 30 | 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of UP-Growth is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of UP-Growth is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The UP-Growth (2010) algorithm was the fastest algorithm for high-utility itemset mining in 2010. However, more efficient algorithms have since been proposed. HUI-Miner (2012) was shown to be up to 100 times faster than UP-Growth, and the FHM algorithm (2014) was shown to be up to six times faster than HUI-Miner. More recently, the EFIM algorithm (2015) was proposed and shown to outperform UP-Growth+ and other recent algorithms such as FHM (2014), HUI-Miner (2012) and HUP-Miner (2014). All these algorithms are offered in SPMF (see the "performance" page of this website).

Implementation details

The version implemented here contains all the optimizations described in the paper proposing UP-Growth (strategies DGU, DGN, DLU and DLN). Note that the input format is not exactly the same as described in the original article. But it is equivalent.

Where can I get more information about the UP-Growth algorithm?

This is the reference of the article describing the UP-Growth algorithm:

V. S. Tseng, C.-W. Wu, B.-E. Shie, P. S. Yu: UP-Growth: an efficient algorithm for high utility itemset mining. KDD 2010: 253-262

V. S. Tseng, B.-E. Shie, C.-W. Wu, and P. S. Yu. Efficient algorithms for mining high utility itemsets from transactional databases. IEEE Transactions on Knowledge and Data Engineering, 2012, doi: 10.1109/TKDE.2012.59.

Example 38 : Mining High-Utility Itemsets from a Database with Utility Information with the IHUP Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "IHUP" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run IHUP DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestIHUP.java" in the package ca.pfv.SPMF.tests.

What is IHUP?

IHUP (Ahmed et al., TKDE 2009) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information.

Note that the original IHUP algorithm is designed to be incremental. However, this implementation of IHUP can only be run in batch mode.

Also note that more efficient algorithms have been proposed recently, such as FHM (2014) and HUI-Miner (2012). These algorithms outperform IHUP by more than an order of magnitude, and are also offered in SPMF.

What is the input?

IHUP takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of IHUP is the set of high utility itemsets having a utility no less than the min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in that transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utilities in all the transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run IHUP with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemset | utility | support
{2 4} | 30 | 40 % (2 transactions)
{2 5} | 31 | 60 % (3 transactions)
{1 3 5} | 31 | 40 % (2 transactions)
{2 3 4} | 34 | 40 % (2 transactions)
{2 3 5} | 37 | 60 % (3 transactions)
{2 4 5} | 36 | 40 % (2 transactions)
{2 3 4 5} | 40 | 40 % (2 transactions)
{1 2 3 4 5 6} | 30 | 20 % (1 transaction)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of IHUP is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.
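The format described above is straightforward to parse. Below is a minimal sketch in Java of how one line could be read (the parser and its names are ours for illustration; SPMF's actual file readers may differ):

```java
import java.util.*;

public class InputLineParser {

    // Parses one line of the format "items:transactionUtility:itemUtilities"
    // and returns a map from each item to its utility in the transaction.
    static Map<Integer, Integer> parseLine(String line) {
        String[] sections = line.split(":");
        String[] items = sections[0].split(" ");
        int transactionUtility = Integer.parseInt(sections[1]);
        String[] utilities = sections[2].split(" ");

        Map<Integer, Integer> itemToUtility = new LinkedHashMap<>();
        int sum = 0;
        for (int i = 0; i < items.length; i++) {
            int u = Integer.parseInt(utilities[i]);
            itemToUtility.put(Integer.parseInt(items[i]), u);
            sum += u;
        }
        // Sanity check: the second section must equal the sum of the third.
        if (sum != transactionUtility) {
            throw new IllegalArgumentException("Inconsistent transaction utility");
        }
        return itemToUtility;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> t1 = parseLine("3 5 1 2 4 6:30:1 3 5 10 6 5");
        System.out.println(t1.get(2)); // utility of item 2 in t1: prints 10
    }
}
```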

Output file format

The output file format of IHUP is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The IHUP (2009) algorithm was the fastest algorithm for high-utility itemset mining in 2009. However, more efficient algorithms have since been proposed. UPGrowth (2010) is an improved version of IHUP. The HUI-Miner (2012) algorithm outperforms UPGrowth (2010) by more than an order of magnitude, and the FHM algorithm (2014) was shown to be up to six times faster than HUI-Miner. More recently, the EFIM algorithm (2015) was proposed and shown to outperform IHUP as well as other recent algorithms such as FHM (2014), HUI-Miner (2012) and HUP-Miner (2014). All these algorithms are offered in SPMF (see the "performance" page of this website).

Implementation details

The version of IHUP implemented here is designed to be run in batch mode rather than as an incremental algorithm. Also, note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the IHUP algorithm?

This is the reference of the article describing the IHUP algorithm:

C. F. Ahmed, S. K. Tanbeer, B.-S. Jeong, Y.-K. Lee: Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases. IEEE Trans. Knowl. Data Eng. 21(12): 1708-1721 (2009)

Example 39 : Mining High-Utility Itemsets from a Database with Utility Information with the d2HUP Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "D2HUP" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run D2HUP DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestD2HUP.java" in the package ca.pfv.SPMF.tests.

What is d2HUP?

d2HUP (Liu et al., ICDM 2012) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information.

In its paper, d2HUP was shown to be more efficient than UPGrowth and Two-Phase. However, its performance was not compared with some more recent algorithms such as FHM (2014), HUI-Miner (2012) and HUP-Miner (2014).

What is the input?

d2HUP takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


   | Items       | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30                  | 1 3 5 10 6 5
t2 | 3 5 2 4     | 20                  | 3 3 8 6
t3 | 3 1 4       | 8                   | 1 5 2
t4 | 3 5 1 7     | 27                  | 6 6 10 5
t5 | 3 5 2 7     | 11                  | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of d2HUP is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run d2HUP with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemsets utility
{2 4} 30
{2 5} 31
{1 3 5} 31
{2 3 4} 34
{2 3 5} 37
{2 4 5} 36
{2 3 4 5} 40
{1 2 3 4 5 6} 30

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of d2HUP is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of d2HUP is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The d2HUP (2012) algorithm was proposed in 2012 to discover high-utility itemsets without maintaining candidates. A similar idea for avoiding candidates was proposed in HUI-Miner (2012) at about the same time. This implementation of d2HUP includes all the optimizations proposed in the paper. In that paper, d2HUP was shown to be more efficient than UPGrowth and Two-Phase. More recently, the EFIM algorithm (also offered in SPMF) was proposed; it was shown to outperform d2HUP as well as other recent algorithms such as FHM (2014), HUI-Miner (2012) and HUP-Miner (2014).

Implementation details

Note that the input format is not exactly the same as described in the original article, but it is equivalent.

We have implemented the CAUL structure using pseudo-projections, as suggested in the paper.

Where can I get more information about the d2HUP algorithm?

This is the reference of the article describing the d2HUP algorithm:

Liu, J., Wang, K., Fung, B. (2012). Direct discovery of high utility itemsets without candidate generation. Proceedings of the 2012 IEEE 12th International Conference on Data Mining. IEEE Computer Society, 2012.

Example 40 : Mining High-Utility Itemsets from a Transaction Database with Utility Information while considering Length Constraints, using the FHM+ algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FHM+" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30, (5) set the minimum length to 2, (6) set the maximum length to 3, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FHM+ DB_utility.txt output.txt 30 2 3 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFHMPlus.java" in the package ca.pfv.SPMF.tests.

What is FHM+?

FHM+ (Fournier-Viger et al., IEA AIE 2016) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information. It extends the FHM algorithm by letting the user specify length constraints to find only patterns having a minimum and maximum size (length), and uses novel optimizations to mine patterns under these constraints efficiently. Using constraints on the length of itemsets is useful because it not only reduces the number of patterns found, but can also make the algorithm more than 10 times faster thanks to a novel optimization called Length Upper-Bound Reduction.

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

FHM+ takes as input a transaction database with utility information, a minimum utility threshold min_utility (a positive integer), a minimum pattern length (a positive integer), and a maximum pattern length (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


   | Items       | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30                  | 1 3 5 10 6 5
t2 | 3 5 2 4     | 20                  | 3 3 8 6
t3 | 3 1 4       | 8                   | 1 5 2
t4 | 3 5 1 7     | 27                  | 6 6 10 5
t5 | 3 5 2 7     | 11                  | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of FHM+ is the set of high utility itemsets having a utility no less than the min_utility threshold (a positive integer), and containing a number of items that is no less than the minimum pattern length and no greater than the maximum pattern length, set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run FHM+ with a minimum utility of 30, a minimum length of 2 items, and a maximum length of 3 items, we obtain 6 high-utility itemsets respecting these constraints:

itemsets utility support
{2 4} 30 40 % (2 transactions)
{2 5} 31 60 % (3 transactions)
{1 3 5} 31 40 % (2 transactions)
{2 3 4} 34 40 % (2 transactions)
{2 3 5} 37 60 % (3 transactions)
{2 4 5} 36 40 % (2 transactions)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more, and that contain at least 2 items and no more than 3 items.
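To illustrate the effect of the length constraints, the sketch below simply post-filters the eight high-utility itemsets of the running example by size. Note that FHM+ itself enforces the constraints during the search, which is far more efficient; this code only reproduces the final result (class and method names are ours, not SPMF's API):

```java
import java.util.*;

public class LengthConstraintExample {

    // Keeps only the itemsets whose size is within [minLength, maxLength].
    static List<Set<Integer>> filterByLength(List<Set<Integer>> itemsets,
                                             int minLength, int maxLength) {
        List<Set<Integer>> result = new ArrayList<>();
        for (Set<Integer> itemset : itemsets) {
            if (itemset.size() >= minLength && itemset.size() <= maxLength) {
                result.add(itemset);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The 8 high-utility itemsets found with min_utility = 30
        List<Set<Integer>> huis = List.of(
            Set.of(2, 4), Set.of(2, 5), Set.of(1, 3, 5), Set.of(2, 3, 4),
            Set.of(2, 3, 5), Set.of(2, 4, 5), Set.of(2, 3, 4, 5),
            Set.of(1, 2, 3, 4, 5, 6));
        // With minLength = 2 and maxLength = 3, only 6 itemsets remain
        System.out.println(filterByLength(huis, 2, 3).size()); // prints 6
    }
}
```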

Input file format

The input file format of FHM+ is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of FHM+ is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, the output file for this example is shown below.

1 3 5 #UTIL: 31
2 4 #UTIL: 30
2 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36

For example, the first line indicates that the itemset {1, 3, 5} has a utility of 31. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms.

The FHM algorithm was shown to be up to six times faster than HUI-Miner (also included in SPMF), especially for sparse datasets (see the performance section of the website for a comparison). The FHM+ algorithm is an optimized version of FHM for efficiently discovering high utility itemsets when length constraints are used. It can be more than 10 times faster than FHM when length constraints are applied, thanks to a novel technique called Length Upper-Bound Reduction.

Implementation details

The version offered in SPMF is the original implementation of FHM+.

Note that the input format is not exactly the same as described in the article, but it is equivalent.

Where can I get more information about the FHM+ algorithm?

This is the reference of the article describing the FHM+ algorithm:

Fournier-Viger, P., Lin, C. W., Duong, Q.-H., Dam, T.-L. (2016). FHM+: Faster High-Utility Itemset Mining using Length Upper-Bound Reduction. Proc. 29th Intern. Conf. on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA AIE 2016), Springer LNAI, to appear.

Example 41 : Mining Correlated High-Utility Itemsets in a Database with Utility Information with the FCHM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FCHM" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and the minbond parameter to 0.5 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FCHM DB_utility.txt output.txt 30 0.5 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFCHM.java" in the package ca.pfv.SPMF.tests.

What is FCHM?

FCHM (Fournier-Viger et al., 2016) is an algorithm for discovering correlated high-utility itemsets in a transaction database containing utility information.

A limitation of traditional high-utility itemset mining algorithms is that they may find many itemsets that have a high utility but contain items that are weakly correlated (as shown in the FCHM paper). FCHM addresses this issue by combining the idea of correlated patterns with high-utility patterns, to find high-utility itemsets in which the items are highly correlated. FCHM uses the bond measure to evaluate whether an itemset is a correlated itemset.

What is the input?

FCHM takes as input a transaction database with utility information, a minimum utility threshold min_utility (a positive integer), and a minbond threshold (a double number in the [0,1] interval). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


   | Items       | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30                  | 1 3 5 10 6 5
t2 | 3 5 2 4     | 20                  | 3 3 8 6
t3 | 3 1 4       | 8                   | 1 5 2
t4 | 3 5 1 7     | 27                  | 6 6 10 5
t5 | 3 5 2 7     | 11                  | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of FCHM is the set of correlated high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user, and a bond no less than a minbond threshold also set by the user.

To explain what a correlated high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset whose utility is no less than min_utility.

A correlated itemset is an itemset such that its bond is no less than a minbond threshold set by the user. The bond of an itemset is the number of transactions containing the itemset divided by the number of transactions containing any of its items. The bond is a value in the [0,1] interval. A high value means a highly correlated itemset. Note that single items have by default a bond of 1. A correlated high-utility itemset is a high-utility itemset that is also a correlated itemset.

For example, if we run FCHM with a minimum utility of 30 and minbond = 0.5, we obtain 3 correlated high-utility itemsets:

itemsets bond utility
{2 4} 0.5 30
{2 5} 0.75 31
{2 5 3} 0.6 37

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more, and containing items that are correlated (are likely to be bought together).
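The bond computation defined above can be sketched in Java as follows. This is only an illustration of the measure on the running example (the class and method names are ours, not SPMF's API):

```java
import java.util.*;

public class BondExample {

    // bond(X) = |transactions containing X| / |transactions containing any item of X|
    static double bond(Set<Integer> itemset, List<Set<Integer>> transactions) {
        int conjunctive = 0;  // transactions containing the whole itemset
        int disjunctive = 0;  // transactions containing at least one of its items
        for (Set<Integer> t : transactions) {
            if (t.containsAll(itemset)) conjunctive++;
            if (!Collections.disjoint(t, itemset)) disjunctive++;
        }
        return (double) conjunctive / disjunctive;
    }

    public static void main(String[] args) {
        // Items of the five transactions of the running example
        List<Set<Integer>> db = List.of(
            Set.of(3, 5, 1, 2, 4, 6),  // t1
            Set.of(3, 5, 2, 4),        // t2
            Set.of(3, 1, 4),           // t3
            Set.of(3, 5, 1, 7),        // t4
            Set.of(3, 5, 2, 7));       // t5

        // {2 5} appears in 3 transactions (t1, t2, t5), while item 2 or item 5
        // appears in 4 transactions (t1, t2, t4, t5), so bond = 3 / 4
        System.out.println(bond(Set.of(2, 5), db)); // prints 0.75
    }
}
```

Similarly, bond({2 4}) = 2 / 4 = 0.5 and bond({2 5 3}) = 3 / 5 = 0.6, matching the table above.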

Input file format

The input file format of FCHM is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of FCHM is defined as follows. It is a text file, where each line represents a correlated high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. Then, there is a single space, followed by the keyword "#BOND: ", followed by the bond of the itemset. For example, the output file for this example is shown below.

4 2 #UTIL: 30 #BOND: 0.5
2 5 #UTIL: 31 #BOND: 0.75
2 5 3 #UTIL: 37 #BOND: 0.6

For example, the first line indicates that the itemset {2, 4} has a utility of 30 and a bond of 0.5. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The FCHM algorithm is the first algorithm for mining correlated high-utility itemsets using the bond measure. It extends FHM, one of the fastest algorithms for high-utility itemset mining.

Implementation details

Note that the input format is not exactly the same as described in the original article. But it is equivalent.

Where can I get more information about the FCHM algorithm?

This is the reference of the article describing the FCHM algorithm:

Fournier-Viger, P., Lin, C. W., Dinh, T., Le, H. B. (2016). Mining Correlated High-Utility Itemsets Using the Bond Measure. Proc. 11th International Conference on Hybrid Artificial Intelligence Systems (HAIS 2016), Springer LNAI, 14 pages, to appear.

Example 42 : Mining Frequent High-Utility Itemsets from a Database with Utility Information with the FHMFreq Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FHMFreq" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30, (5) set the minimum support to 0.4, and (6) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FHMFreq DB_utility.txt output.txt 30 0.4 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFHMFreq.java" in the package ca.pfv.SPMF.tests.

What is FHMFreq?

FHM (Fournier-Viger et al., ISMIS 2014) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information. FHMFreq is a simple extension of FHM for discovering frequent high-utility itemsets (it combines frequent itemset mining with high-utility itemset mining).

High utility itemset mining has several applications such as discovering groups of items in transactions of a store that generate the most profit. A database containing utility information is a database where items can have quantities and a unit price. Although these algorithms are often presented in the context of market basket analysis, there exist other applications.

What is the input?

FHMFreq takes as input a transaction database with utility information, a minimum utility threshold min_utility (a positive integer), and a minimum support threshold minsup (a percentage value represented as a double in [0,1]). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


   | Items       | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30                  | 1 3 5 10 6 5
t2 | 3 5 2 4     | 20                  | 3 3 8 6
t3 | 3 1 4       | 8                   | 1 5 2
t4 | 3 5 1 7     | 27                  | 6 6 10 5
t5 | 3 5 2 7     | 11                  | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction) (the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of FHMFreq is the set of frequent high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user, and a support no less than the minsup threshold also set by the user. To explain what a frequent high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. The support of an itemset is the number of transactions containing the itemset. For example, the support of itemset {1 4} is 2 transactions because it appears in transactions t1 and t3. The support of an itemset can also be expressed as a percentage. For example, the support of itemset {1 4} is said to be 40% (or 0.4) because it appears in 2 out of 5 transactions in the database.

A frequent high utility itemset is an itemset such that its utility is no less than min_utility and its support is no less than the minsup threshold. For example, if we run FHMFreq with a minimum utility of 30 and a minimum support of 40 %, we obtain 7 frequent high-utility itemsets:

itemsets utility support
{2 4} 30 40 % (2 transactions)
{2 5} 31 60 % (3 transactions)
{1 3 5} 31 40 % (2 transactions)
{2 3 4} 34 40 % (2 transactions)
{2 3 5} 37 60 % (3 transactions)
{2 4 5} 36 40 % (2 transactions)
{2 3 4 5} 40 40 % (2 transactions)

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more, and appear in at least 2 transactions.
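The support computation described above is simple to express in Java. The sketch below checks it on the running example (the class and method names are ours, not SPMF's API):

```java
import java.util.*;

public class SupportExample {

    // Support of an itemset: the number of transactions containing it.
    static int support(Set<Integer> itemset, List<Set<Integer>> transactions) {
        int count = 0;
        for (Set<Integer> t : transactions) {
            if (t.containsAll(itemset)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Items of the five transactions of the running example
        List<Set<Integer>> db = List.of(
            Set.of(3, 5, 1, 2, 4, 6),  // t1
            Set.of(3, 5, 2, 4),        // t2
            Set.of(3, 1, 4),           // t3
            Set.of(3, 5, 1, 7),        // t4
            Set.of(3, 5, 2, 7));       // t5

        int sup = support(Set.of(1, 4), db);  // {1 4} appears in t1 and t3
        System.out.println(sup);                      // prints 2
        System.out.println(100.0 * sup / db.size());  // as a percentage: 40.0
    }
}
```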

Input file format

The input file format of FHMFreq is defined as follows. It is a text file in which each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.
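The three-section line format above can be decoded with a few lines of Java. The following is a hypothetical helper for illustration, not SPMF's own parser:

```java
import java.util.Arrays;

// Sketch (not SPMF's own parser): decoding one line of the input format
// described above: "items:transactionUtility:itemUtilities".
public class TransactionLine {
    final int[] items;
    final int transactionUtility;
    final int[] itemUtilities;

    TransactionLine(String line) {
        // The three sections are separated by ":".
        String[] parts = line.split(":");
        items = toInts(parts[0]);
        transactionUtility = Integer.parseInt(parts[1]);
        itemUtilities = toInts(parts[2]);
    }

    private static int[] toInts(String s) {
        return Arrays.stream(s.trim().split(" "))
                     .mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        TransactionLine t1 = new TransactionLine("3 5 1 2 4 6:30:1 3 5 10 6 5");
        System.out.println(Arrays.toString(t1.items));         // [3, 5, 1, 2, 4, 6]
        System.out.println(t1.transactionUtility);             // 30
        System.out.println(Arrays.toString(t1.itemUtilities)); // [1, 3, 5, 10, 6, 5]
    }
}
```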

Output file format

The output file format of FHMFreq is defined as follows. It is a text file, where each line represents a frequent high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. Then, the keyword " #SUP: " appears, followed by the support of the itemset. For example, we show below the output file for this example.

4 2 #UTIL: 30 #SUP: 2
4 2 5 #UTIL: 36 #SUP: 2
4 2 5 3 #UTIL: 40 #SUP: 2
4 2 3 #UTIL: 34 #SUP: 2
2 5 #UTIL: 31 #SUP: 3
2 5 3 #UTIL: 37 #SUP: 3
1 5 3 #UTIL: 31 #SUP: 2

For example, the first line indicates that the itemset {2, 4} has a utility of 30 and a support of two transactions. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms.

The FHMFreq algorithm described here is a simple extension of the FHM algorithm that adds the minsup threshold as a parameter.

For high-utility itemset mining, the FHM algorithm was shown to be up to six times faster than HUI-Miner (also included in SPMF), especially for sparse datasets (see the performance section of the website for a comparison). But the EFIM algorithm (also included in SPMF) greatly outperforms FHM (see performance section of the website).

Implementation details

The version of FHMFreq offered in SPMF extends the original implementation of FHM.

Note that the input format is not exactly the same as described in the article. But it is equivalent.

Where can I get more information about the FHMFreq algorithm?

This is the reference of the article describing the FHM algorithm:

Fournier-Viger, P., Wu, C.-W., Zida, S., Tseng, V. (2014) FHM: A Faster High-Utility Itemset Mining Algorithm using Estimated Utility Co-occurrence Pruning. Proc. 21st International Symposium on Methodologies for Intelligent Systems (ISMIS 2014), Springer, LNAI, pp. 83-92

The FHMFreq algorithm is a simple extension of that algorithm.

Example 43 : Mining High-Utility Itemsets from a Database with Positive or Negative Unit Profit using the FHN Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FHN" algorithm, (2) select the input file "DB_NegativeUtility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FHN DB_NegativeUtility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_NegativeUtility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFHN_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is FHN?

FHN (Fournier-Viger et al, 2014) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information. It is an extension of the FHM algorithm designed for mining patterns in a transaction database where items may have negative unit profit values.

Items with negative unit profit values are interesting in real-life scenarios. Often, a retail store will sell some items at a loss. It was demonstrated that if traditional high utility itemset mining algorithms such as Two-Phase, IHUP, UPGrowth, HUI-Miner and FHM are applied on such a database, they may not discover the correct results. To address this issue, algorithms such as HUINIV-Mine and FHN were proposed. When FHN was proposed (2014), it was the state-of-the-art algorithm for mining high-utility itemsets with both positive and negative unit profit values.

This is the original implementation of FHN.

What is the input?

FHN takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 10 transactions (t1,t2...t10) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "DB_NegativeUtility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 1 4 5 | 27 | 5 12 10
t2 | 2 3 4 | 36 | -3 -4 36
t3 | 1 4 | 45 | 15 30
t4 | 1 5 | 15 | 5 10
t5 | 2 3 4 | 36 | -3 -4 36
t6 | 2 3 5 | 20 | -3 -2 20
t7 | 1 | 10 | 10
t8 | 1 4 | 21 | 15 6
t9 | 2 3 4 | 24 | -3 -2 24
t10 | 1 5 | 15 | 5 10

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer, named "t1", bought items 1, 4 and 5. The amount of profit generated by the sale of these items is respectively 5 $, 12 $ and 10 $. The total amount of money spent in this transaction is 5 + 12 + 10 = 27 $.

What is the output?

The output of FHN is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 12 = 17, and the utility of {1 4} in transaction t3 is 15 + 30 = 45. The utility of an itemset in a database is the sum of its utilities in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1, plus the utility of {1 4} in t3, plus the utility of {1 4} in t8, for a total of 17 + 45 + 21 = 83. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run FHN with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemsets utility ($)
{5} 50
{1 5} 45
{1} 55
{1 4} 83
{4} 144
{2 4} 87
{2 3 4} 77
{3 4} 86

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.
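As a sanity check, the utilities reported above can be reproduced by summing per-transaction utilities, negative values included. A minimal sketch (not SPMF code; the database is hard-coded from the table above):

```java
import java.util.*;

// Sketch (not SPMF code): summing per-transaction utilities, negative values
// included, to reproduce the utilities in the result table above.
public class NegativeUtilityCheck {
    // item -> utility, one map per transaction of the example database
    static final List<Map<Integer, Integer>> DB = List.of(
        tx(new int[]{1,4,5}, new int[]{5,12,10}),   // t1
        tx(new int[]{2,3,4}, new int[]{-3,-4,36}),  // t2
        tx(new int[]{1,4},   new int[]{15,30}),     // t3
        tx(new int[]{1,5},   new int[]{5,10}),      // t4
        tx(new int[]{2,3,4}, new int[]{-3,-4,36}),  // t5
        tx(new int[]{2,3,5}, new int[]{-3,-2,20}),  // t6
        tx(new int[]{1},     new int[]{10}),        // t7
        tx(new int[]{1,4},   new int[]{15,6}),      // t8
        tx(new int[]{2,3,4}, new int[]{-3,-2,24}),  // t9
        tx(new int[]{1,5},   new int[]{5,10}));     // t10

    static Map<Integer, Integer> tx(int[] items, int[] utils) {
        Map<Integer, Integer> t = new LinkedHashMap<>();
        for (int i = 0; i < items.length; i++) t.put(items[i], utils[i]);
        return t;
    }

    // Utility of the itemset in the database (negative item utilities included).
    static int utility(Set<Integer> itemset) {
        int total = 0;
        for (Map<Integer, Integer> t : DB)
            if (t.keySet().containsAll(itemset))
                for (int item : itemset) total += t.get(item);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(utility(Set.of(2, 3, 4))); // 29 + 29 + 19 = 77
        System.out.println(utility(Set.of(1, 4)));    // 17 + 45 + 21 = 83
    }
}
```

The printed values 77 and 83 match the #UTIL values of {2, 3, 4} and {1, 4} in the output shown below.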

Input file format

The input file format of FHN is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

1 4 5:27:5 12 10
2 3 4:36:-3 -4 36
1 4:45:15 30
1 5:15:5 10
2 3 4:36:-3 -4 36
2 3 5:20:-3 -2 20
1:10:10
1 4:21:15 6
2 3 4:24:-3 -2 24
1 5:15:5 10

Consider the first line. It means that the transaction {1, 4, 5} has a total utility of 27 and that items 1, 4 and 5 respectively have a utility of 5, 12 and 10 in this transaction. The following lines follow the same format.

Output file format

The output file format of FHN is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, we show below the output file for this example.

5 #UTIL: 50
5 1 #UTIL: 45
1 #UTIL: 55
1 4 #UTIL: 83
4 #UTIL: 144
4 2 #UTIL: 87
4 2 3 #UTIL: 77
4 3 #UTIL: 86

For example, the second line indicates that the itemset {1, 5} has a utility of 45. The other lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The FHN (2014) algorithm is up to 100 times faster than HUINIV-Mine, the previous state-of-the-art algorithm for high-utility itemset mining with negative unit profit.

Implementation details

The version of FHN in SPMF is the original implementation.

Where can I get more information about the FHN algorithm?

This is the reference of the article describing the FHN algorithm:

Fournier-Viger, P. (2014). FHN: Efficient Mining of High-Utility Itemsets with Negative Unit Profits. Proc. 10th International Conference on Advanced Data Mining and Applications (ADMA 2014), Springer LNCS 8933, pp. 16-29.

Example 44 : Mining High-Utility Itemsets from a Database with Positive or Negative Unit Profit using the HUINIV-Mine Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUINIV-Mine" algorithm, (2) select the input file "DB_NegativeUtility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUINIV-Mine DB_NegativeUtility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_NegativeUtility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUINIVMine_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is HUINIV-Mine?

HUINIV-Mine is an algorithm for discovering high-utility itemsets in a transaction database containing utility information. It is an extension of the Two-Phase algorithm designed for mining patterns in a transaction database where items may have negative unit profit values.

Items with negative unit profit values are interesting in real-life scenarios. Often, a retail store will sell some items at a loss. It was demonstrated that if traditional high utility itemset mining algorithms such as Two-Phase, IHUP, UPGrowth, HUI-Miner and FHM are applied on such a database, they may not discover the correct results. To address this issue, the HUINIV-Mine algorithm was proposed. However, faster algorithms now exist, such as FHN, which is also offered in SPMF.

What is the input?

HUINIV-Mine takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 10 transactions (t1,t2...t10) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "DB_NegativeUtility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 1 4 5 | 27 | 5 12 10
t2 | 2 3 4 | 36 | -3 -4 36
t3 | 1 4 | 45 | 15 30
t4 | 1 5 | 15 | 5 10
t5 | 2 3 4 | 36 | -3 -4 36
t6 | 2 3 5 | 20 | -3 -2 20
t7 | 1 | 10 | 10
t8 | 1 4 | 21 | 15 6
t9 | 2 3 4 | 24 | -3 -2 24
t10 | 1 5 | 15 | 5 10

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer, named "t1", bought items 1, 4 and 5. The amount of profit generated by the sale of these items is respectively 5 $, 12 $ and 10 $. The total amount of money spent in this transaction is 5 + 12 + 10 = 27 $.

What is the output?

The output of HUINIV-Mine is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 12 = 17, and the utility of {1 4} in transaction t3 is 15 + 30 = 45. The utility of an itemset in a database is the sum of its utilities in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1, plus the utility of {1 4} in t3, plus the utility of {1 4} in t8, for a total of 17 + 45 + 21 = 83. A high utility itemset is an itemset whose utility is no less than min_utility. For example, if we run HUINIV-Mine with a minimum utility of 30, we obtain 8 high-utility itemsets:

itemsets utility ($)
{5} 50
{1 5} 45
{1} 55
{1 4} 83
{4} 144
{2 4} 87
{2 3 4} 77
{3 4} 86

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of HUINIV-Mine is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

1 4 5:27:5 12 10
2 3 4:36:-3 -4 36
1 4:45:15 30
1 5:15:5 10
2 3 4:36:-3 -4 36
2 3 5:20:-3 -2 20
1:10:10
1 4:21:15 6
2 3 4:24:-3 -2 24
1 5:15:5 10

Consider the first line. It means that the transaction {1, 4, 5} has a total utility of 27 and that items 1, 4 and 5 respectively have a utility of 5, 12 and 10 in this transaction. The following lines follow the same format.

Output file format

The output file format of HUINIV-Mine is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, we show below the output file for this example.

5 #UTIL: 50
5 1 #UTIL: 45
1 #UTIL: 55
1 4 #UTIL: 83
4 #UTIL: 144
4 2 #UTIL: 87
4 2 3 #UTIL: 77
4 3 #UTIL: 86

For example, the second line indicates that the itemset {1, 5} has a utility of 45. The other lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. HUINIV-Mine was the first algorithm for high-utility itemset mining with negative unit profit. However, faster algorithms such as FHN (2014), also offered in SPMF, have since been proposed.

Where can I get more information about the HUINIV-Mine algorithm?

This is the reference of the article describing the HUINIV-Mine algorithm:

Chu, Chun-Jung, Vincent S. Tseng, and Tyne Liang. "An efficient algorithm for mining high utility itemsets with negative item values in large databases." Applied Mathematics and Computation 215.2 (2009): 767-778.

Example 45 : Mining On-Shelf High-Utility Itemsets from a Transaction Database using the FOSHU Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FOSHU" algorithm, (2) select the input file "DB_FOSHU.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility ratio to 0.8 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FOSHU DB_FOSHU.txt output.txt 0.8 in a folder containing spmf.jar and the example input file DB_FOSHU.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestFOSHU_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is FOSHU?

FOSHU (Fournier-Viger et al, 2015) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information and information about the time periods where items are sold. The task of on-shelf high-utility itemset mining is an extension of the task of high utility itemset mining.

The FOSHU algorithm for on-shelf high-utility itemset mining is interesting because it addresses two limitations of high-utility itemset mining algorithms. First, most algorithms cannot handle databases where items may have a negative unit profit/weight. But such items often occur in real-life transaction databases. For example, it is common that a retail store will sell items at a loss to stimulate the sale of other related items or simply to attract customers to its retail location. If classical HUIM algorithms are applied on a database containing items with negative unit profit, they can generate an incomplete set of high-utility itemsets. Second, most algorithms assume that items have the same shelf time, i.e. that all items are on sale for the same time period. However, in real life, some items are only sold during short time periods (e.g. the summer). Algorithms ignoring the shelf time of items are biased toward items with more shelf time, since those have more chances to generate a high profit.

FOSHU is the state-of-the-art algorithm for on-shelf high-utility itemset mining. It was shown to outperform TS-HOUN by up to three orders of magnitude in terms of execution time.

This is the original implementation of FOSHU.

What is the input?

FOSHU takes as input a transaction database with information about the utility of items and their shelf time, and a minimum utility ratio threshold min_utility_ratio (a positive double value in the [0,1] interval). For example, let's consider the following database consisting of 5 transactions (t1, t2, ..., t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_FOSHU.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction | Items | Transaction utility (positive) | Item utilities for this transaction | Time period
t1 | 1 3 4 | 3 | -5 1 2 | 0
t2 | 1 3 5 7 | 17 | -10 6 6 5 | 0
t3 | 1 2 3 4 5 6 | 25 | -5 4 1 12 3 5 | 1
t4 | 2 3 4 5 | 20 | 8 3 6 3 | 1
t5 | 2 3 5 7 | 11 | 4 2 3 2 | 2

Each line of the database represents a transaction and contains the following information:

  • a set of items (the second column of the table),
  • the sum of the utilities (e.g. profit) of items having positive utilities in this transaction (the third column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the fourth column of the table).
  • the time period where this transaction occurred (the fifth column).

Note that the value in the third column for each line is the sum of the positive values in the fourth column. Moreover, note that utility values may be positive or negative integers. Time periods are numbered 0, 1, 2, 3, ..., and may represent, for example, periods such as "summer", "fall", "winter" and "spring".

What are real-life examples of such a database? There are several applications in real life. The main application is customer transaction databases. Imagine that each transaction represents the items purchased by a customer. The first customer, named "t1", bought items 1, 3 and 4. The amount of profit generated by the sale of these items is respectively -5 $, 1 $ and 2 $, so the transaction utility (the sum of the positive utilities) is 1 + 2 = 3 $. This transaction occurred during time period "0", which may for example represent the summer.

What is the output?

The output of the FOSHU algorithm is the set of on-shelf high utility itemsets having a relative utility no less than the min_utility_ratio threshold set by the user. To explain what an on-shelf high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1, 3, 4} in transaction t1 is -5 + 1 + 2 = -2, and the utility of {1, 3, 4} in transaction t3 is -5 + 1 + 12 = 8. The utility of an itemset in a database is the sum of its utilities in all transactions where it appears. For example, the utility of {1, 3, 4} in the database is the utility of {1, 3, 4} in t1 plus the utility of {1, 3, 4} in t3, for a total of -2 + 8 = 6. The relative utility of an itemset is the utility of that itemset divided by the sum of the transaction utilities (including the negative utilities) for the time periods where the itemset was sold. For example, itemset {1, 3, 4} was sold in time periods "0" and "1". The total utility of time periods "0" and "1" is 5 + 40 = 45. Thus, the relative utility of {1, 3, 4} is 6 / 45 ≈ 0.13. The relative utility can be interpreted as the ratio of the profit generated by a given itemset to the total profit of the time periods when it was sold.
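The relative utility computation can be sketched in a few lines of Java. This is not SPMF code: the class name and data layout are illustrative, with the example database hard-coded from the table above.

```java
import java.util.*;

// Sketch (not SPMF code): computing the relative utility of an itemset,
// i.e. its utility divided by the signed total utility of the time periods
// in which it appears (negative item utilities included).
public class RelativeUtility {
    static final int[][][] DB = { // per transaction: {items}, {utilities}, {period}
        {{1,3,4},       {-5,1,2},        {0}}, // t1
        {{1,3,5,7},     {-10,6,6,5},     {0}}, // t2
        {{1,2,3,4,5,6}, {-5,4,1,12,3,5}, {1}}, // t3
        {{2,3,4,5},     {8,3,6,3},       {1}}, // t4
        {{2,3,5,7},     {4,2,3,2},       {2}}  // t5
    };

    static double relativeUtility(Set<Integer> itemset) {
        int utility = 0;
        Set<Integer> periods = new HashSet<>();
        for (int[][] t : DB) {
            Set<Integer> items = new HashSet<>();
            for (int i : t[0]) items.add(i);
            if (items.containsAll(itemset)) {
                periods.add(t[2][0]);
                for (int i = 0; i < t[0].length; i++)
                    if (itemset.contains(t[0][i])) utility += t[1][i];
            }
        }
        // Signed total utility of every transaction in those periods.
        int periodTotal = 0;
        for (int[][] t : DB)
            if (periods.contains(t[2][0]))
                for (int u : t[1]) periodTotal += u;
        return (double) utility / periodTotal;
    }

    public static void main(String[] args) {
        System.out.println(relativeUtility(Set.of(1, 3, 4)));    // 6 / 45 ≈ 0.1333
        System.out.println(relativeUtility(Set.of(2, 3, 4, 5))); // 40 / 40 = 1.0
    }
}
```

The second value matches the relative utility of 1.0 reported for {2, 3, 4, 5} in the result table.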

An on-shelf high utility itemset is an itemset whose relative utility is no less than min_utility_ratio. For example, if we run FOSHU with a minimum utility ratio of 0.8, we obtain the following on-shelf high-utility itemsets:

itemsets utility ($) relative utility
{2, 5, 7} 9 $ 0.81
{2, 3, 5, 7} 11 $ 1
{5, 7} 16 $ 1
{3, 5, 7} 24 $ 1.5
{1, 3, 5, 7} 7 $ 1.4
{3, 7} 15 $ 0.9375
{2, 4, 5} 36 $ 0.9
{2, 3, 4, 5} 40 $ 1
{2, 3, 4} 34 $ 0.85

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together whose profit represents a ratio of at least 0.8 of the total profit of the time periods during which they were sold.

Input file format

The input file format of FOSHU is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of four sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.
  • Fourth, the symbol ":" appears and is followed by a positive integer such as 0,1,2.... indicating the time period of the transaction

For example, for the previous example, the input file is defined as follows:

1 3 4:3:-5 1 2:0
1 3 5 7:17:-10 6 6 5:0
1 2 3 4 5 6:25:-5 4 1 12 3 5:1
2 3 4 5:20:8 3 6 3:1
2 3 5 7:11:4 2 3 2:2

Consider the first line. It means that the transaction {1, 3, 4} has a total (positive) utility of 3, that items 1, 3 and 4 respectively have a utility of -5, 1 and 2 in this transaction, and that the transaction occurred during time period 0. The following lines follow the same format.
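The four-section format extends the usual three-section utility format with a time period. A hypothetical helper (not SPMF's own parser) that splits such a line could look like this:

```java
import java.util.Arrays;

// Sketch (hypothetical helper, not SPMF's parser): splitting one FOSHU input
// line into its four ":"-separated sections, including the time period.
public class FoshuLine {
    final int[] items, itemUtilities;
    final int transactionUtility, period;

    FoshuLine(String line) {
        String[] parts = line.split(":");
        items = toInts(parts[0]);
        transactionUtility = Integer.parseInt(parts[1]);
        itemUtilities = toInts(parts[2]); // may contain negative values
        period = Integer.parseInt(parts[3]);
    }

    private static int[] toInts(String s) {
        return Arrays.stream(s.trim().split(" "))
                     .mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        FoshuLine t1 = new FoshuLine("1 3 4:3:-5 1 2:0");
        System.out.println(Arrays.toString(t1.items));         // [1, 3, 4]
        System.out.println(t1.transactionUtility);             // 3
        System.out.println(Arrays.toString(t1.itemUtilities)); // [-5, 1, 2]
        System.out.println(t1.period);                         // 0
    }
}
```

Note that splitting on ":" is safe here because negative utilities use "-", which does not conflict with the section separator.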

Output file format

The output file format of FOSHU is defined as follows. It is a text file, where each line represents an on-shelf high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. Then, the keyword " #RUTIL: " appears, followed by the relative utility of this itemset. For example, we show below the output file for this example.

7 2 5 #UTIL: 9 #RUTIL: 0.8181818181818182
7 2 5 3 #UTIL: 11 #RUTIL: 1.0
7 5 #UTIL: 16 #RUTIL: 1.0
7 5 3 #UTIL: 24 #RUTIL: 1.5
7 5 3 1 #UTIL: 7 #RUTIL: 1.4
7 3 #UTIL: 15 #RUTIL: 0.9375
4 2 5 #UTIL: 36 #RUTIL: 0.9
4 2 5 3 #UTIL: 40 #RUTIL: 1.0
4 2 3 #UTIL: 34 #RUTIL: 0.85

For example, the second line indicates that the itemset {2, 3, 5, 7} has a utility of 11 $ and a relative utility of 1. The other lines follow the same format.

Performance

On-shelf high utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, on-shelf high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The FOSHU (2015) algorithm is up to 1000 times faster than TS-HOUN, the previous state-of-the-art algorithm for on-shelf high-utility itemset mining.

Implementation details

The version of FOSHU offered in SPMF is the original implementation.

Where can I get more information about the FOSHU algorithm?

This is the reference of the article describing the FOSHU algorithm:

Fournier-Viger, P., Zida, S. (2015). FOSHU: Faster On-Shelf High Utility Itemset Mining– with or without negative unit profit. Proc. 30th Symposium on Applied Computing (ACM SAC 2015). ACM Press, pp. 857-864.

Example 46 : Mining On-Shelf High-Utility Itemsets from a Transaction Database using the TS-HOUN Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "TS-HOUN" algorithm, (2) select the input file "DB_FOSHU.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility ratio to 0.8 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run TS-HOUN DB_FOSHU.txt output.txt 0.8 in a folder containing spmf.jar and the example input file DB_FOSHU.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestTSHOUN_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is TS-HOUN?

TS-HOUN (Lan et al, 2014) is an algorithm for discovering high-utility itemsets in a transaction database containing utility information and information about the time periods where items are sold. The task of on-shelf high-utility itemset mining is an extension of the task of high utility itemset mining.

The TS-HOUN algorithm for on-shelf high-utility itemset mining is interesting because it addresses two limitations of high-utility itemset mining algorithms. First, most algorithms cannot handle databases where items may have a negative unit profit/weight. But such items often occur in real-life transaction databases. For example, it is common that a retail store will sell items at a loss to stimulate the sale of other related items or simply to attract customers to its retail location. If classical HUIM algorithms are applied on a database containing items with negative unit profit, they can generate an incomplete set of high-utility itemsets. Second, most algorithms assume that items have the same shelf time, i.e. that all items are on sale for the same time period. However, in real life, some items are only sold during short time periods (e.g. the summer). Algorithms ignoring the shelf time of items are biased toward items with more shelf time, since those have more chances to generate a high profit.

TS-HOUN is the first algorithm for on-shelf high utility itemset mining with both positive and negative profit values. However, it was outperformed by FOSHU (also offered in SPMF). FOSHU was shown to outperform TS-HOUN by up to three orders of magnitude in terms of execution time (see "Performance" section of this website for more details).


What is the input?

TS-HOUN takes as input a transaction database with information about the utility of items and their shelf time, and a minimum utility ratio threshold min_utility_ratio (a positive double value in the [0,1] interval). For example, let's consider the following database consisting of 5 transactions (t1, t2, ..., t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_FOSHU.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction | Items | Transaction utility (positive) | Item utilities for this transaction | Time period
t1 | 1 3 4 | 3 | -5 1 2 | 0
t2 | 1 3 5 7 | 17 | -10 6 6 5 | 0
t3 | 1 2 3 4 5 6 | 25 | -5 4 1 12 3 5 | 1
t4 | 2 3 4 5 | 20 | 8 3 6 3 | 1
t5 | 2 3 5 7 | 11 | 4 2 3 2 | 2

Each line of the database represents a transaction and contains the following information:

  • a set of items (the second column of the table),
  • the sum of the utilities (e.g. profit) of items having positive utilities in this transaction (the third column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the fourth column of the table).
  • the time period where this transaction occurred (the fifth column).

Note that the value in the third column for each line is the sum of the positive values in the fourth column. Moreover, note that utility values may be positive or negative integers. Time periods are numbered 0, 1, 2, 3, ..., and may represent, for example, periods such as "summer", "fall", "winter" and "spring".

What are real-life examples of such a database? There are several applications in real life. The main application is customer transaction databases. Imagine that each transaction represents the items purchased by a customer. The first customer, named "t1", bought items 1, 3 and 4. The amount of profit generated by the sale of these items is respectively -5 $, 1 $ and 2 $, so the transaction utility (the sum of the positive utilities) is 1 + 2 = 3 $. This transaction occurred during time period "0", which may for example represent the summer.

What is the output?

The output of the TS-HOUN algorithm is the set of on-shelf high utility itemsets having a relative utility no less than the min_utility_ratio threshold set by the user. To explain what an on-shelf high utility itemset is, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {1, 3, 4} in transaction t1 is -5 + 1 + 2 = -2, and the utility of {1, 3, 4} in transaction t3 is -5 + 1 + 12 = 8. The utility of an itemset in a database is the sum of its utilities in all transactions where it appears. For example, the utility of {1, 3, 4} in the database is the utility of {1, 3, 4} in t1 plus the utility of {1, 3, 4} in t3, for a total of -2 + 8 = 6. The relative utility of an itemset is the utility of that itemset divided by the sum of the transaction utilities (including the negative utilities) for the time periods where the itemset was sold. For example, itemset {1, 3, 4} was sold in time periods "0" and "1". The total utility of time periods "0" and "1" is 5 + 40 = 45. Thus, the relative utility of {1, 3, 4} is 6 / 45 ≈ 0.13. The relative utility can be interpreted as the ratio of the profit generated by a given itemset to the total profit of the time periods when it was sold.

An on-shelf high utility itemset is an itemset such that its relative utility is no less than min_utility_ratio. For example, if we run TS-HOUN with a min_utility_ratio of 0.8, we obtain the following on-shelf high-utility itemsets:

itemsets utility ($) relative utility
{2, 5, 7} 9 $ 0.81
{2, 3, 5, 7} 11 $ 1
{5, 7} 16 $ 1
{3, 5, 7} 24 $ 1.5
{1, 3, 5, 7} 7 $ 1.4
{3, 7} 15 $ 0.9375
{2, 4, 5} 36 $ 0.9
{2, 3, 4, 5} 40 $ 1
{2, 3, 4} 34 $ 0.85

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit ratio of at least 0.8 during the time periods when they were sold.

Input file format

The input file format of TS-HOUN is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of four sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.
  • Fourth, the symbol ":" appears and is followed by a non-negative integer (0, 1, 2, ...) indicating the time period of the transaction.

For example, for the previous example, the input file is defined as follows:

1 3 4:3:-5 1 2:0
1 3 5 7:17:-10 6 6 5:0
1 2 3 4 5 6:25:-5 4 1 12 3 5:1
2 3 4 5:20:8 3 6 3:1
2 3 5 7:11:4 2 3 2:2

Consider the first line. It means that the transaction {1, 3, 4} has a total utility of 3 and that items 1, 3 and 4 respectively have a utility of -5, 1 and 2 in this transaction. The following lines follow the same format.
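As a sketch, such a line can be parsed by splitting on ":" and then on single spaces (illustrative only; this parser is not part of SPMF and its names are made up):

```java
import java.util.Arrays;

public class TsHounLineParser {

    // Parsed view of one input line: the items, the transaction utility,
    // the per-item utilities and the time period.
    public record Line(int[] items, int transactionUtility, int[] itemUtilities, int period) {}

    public static Line parse(String line) {
        // The four sections are separated by ":".
        String[] parts = line.split(":");
        int[] items = toInts(parts[0]);
        int[] utilities = toInts(parts[2]);
        return new Line(items, Integer.parseInt(parts[1]), utilities, Integer.parseInt(parts[3]));
    }

    private static int[] toInts(String s) {
        // Values within a section are separated by single spaces.
        return Arrays.stream(s.split(" ")).mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        Line l = parse("1 3 4:3:-5 1 2:0");
        System.out.println(Arrays.toString(l.items()));         // [1, 3, 4]
        System.out.println(l.transactionUtility());             // 3
        System.out.println(Arrays.toString(l.itemUtilities())); // [-5, 1, 2]
        System.out.println(l.period());                         // 0
    }
}
```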

Output file format

The output file format of TS-HOUN is defined as follows. It is a text file, where each line represents an on-shelf high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears and is followed by the utility of the itemset. Then, the keyword "#RUTIL:" appears, followed by the relative utility of this itemset. For example, we show below the output file for this example.

7 2 5 #UTIL: 9 #RUTIL: 0.8181818181818182
7 2 5 3 #UTIL: 11 #RUTIL: 1.0
7 5 #UTIL: 16 #RUTIL: 1.0
7 5 3 #UTIL: 24 #RUTIL: 1.5
7 5 3 1 #UTIL: 7 #RUTIL: 1.4
7 3 #UTIL: 15 #RUTIL: 0.9375
4 2 5 #UTIL: 36 #RUTIL: 0.9
4 2 5 3 #UTIL: 40 #RUTIL: 1.0
4 2 3 #UTIL: 34 #RUTIL: 0.85

For example, the second line indicates that the itemset {2, 3, 5, 7} has a utility of 11 $ and a relative utility of 1. The other lines follow the same format.

Performance

On-shelf high utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, on-shelf high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. TS-HOUN (2014) is the first algorithm for on-shelf high utility itemset mining with both positive and negative profit values. However, it was outperformed by FOSHU (2015) (also offered in SPMF). FOSHU was shown to outperform TS-HOUN by up to three orders of magnitude in terms of execution time (see the "Performance" section of this website for more details).

Where can I get more information about the TS-HOUN algorithm?

This is the reference of the article describing the TS-HOUN algorithm:

G.-C. Lan, T.-P. Hong, J.-P. Huang and V.S. Tseng. On-shelf utility mining with negative item values. In Expert Systems with Applications. 41:3450–3459, 2014.

Example 47 : Incremental High-Utility Itemset Mining in a Database with Utility Information with the EIHI Algorithm

How to run this example?

  • This algorithm is not offered in the graphical user interface of SPMF.
  • If you are using the source code version of SPMF, launch the file "MainTestEIHI.java" in the package ca.pfv.SPMF.tests.

What is EIHI?

EIHI (Fournier-Viger et al., 2015) is an algorithm for maintaining high-utility itemsets in a transaction database containing utility information that is updated incrementally by inserting new transactions. This task, called "incremental high-utility itemset mining", is a generalization of the task of high utility itemset mining, where the database is not assumed to be static.

What is the input?

EIHI takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 4 transactions (t1,t2...t4) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_incremental1.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Items Transaction utility Item utilities for this transaction
t1 3 5 1 2 4 6 30 1 3 5 10 6 5
t2 3 5 2 4 20 3 3 8 6
t3 3 1 4 8 1 5 2
t4 3 5 1 7 27 6 6 10 5

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

The EIHI algorithm is an incremental algorithm, which means that it can efficiently update the result when new transactions are inserted into the database. In this example, we will consider that a new transaction is inserted into the database, as follows:

t5 3 5 2 7 11 2 3 4 2

This transaction is provided in the file "DB_incremental2.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

What is the output?

The output of EIHI is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what is a high utility itemset, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, consider the initial database containing transactions t1, t2, t3 and t4. In this database, the utility of {1 4} is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility. For example, if we run EIHI with a minimum utility of 30 on the initial database containing t1, t2, t3 and t4, we obtain 6 high-utility itemsets:
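The utility computation described above can be sketched in Java (a simplified, illustrative computation only; it is not SPMF's EIHI implementation and its names are made up):

```java
import java.util.*;

public class ItemsetUtility {

    // Utility of an itemset in a database: for every transaction that
    // contains all items of the itemset, sum the utilities of those items.
    public static int utility(List<Map<Integer, Integer>> db, Set<Integer> itemset) {
        int total = 0;
        for (Map<Integer, Integer> transaction : db) {
            if (transaction.keySet().containsAll(itemset)) {
                for (int item : itemset) total += transaction.get(item);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // The example database t1..t4 (item -> utility in the transaction).
        List<Map<Integer, Integer>> db = List.of(
            Map.of(3, 1, 5, 3, 1, 5, 2, 10, 4, 6, 6, 5), // t1
            Map.of(3, 3, 5, 3, 2, 8, 4, 6),              // t2
            Map.of(3, 1, 1, 5, 4, 2),                    // t3
            Map.of(3, 6, 5, 6, 1, 10, 7, 5));            // t4
        System.out.println(utility(db, Set.of(1, 4)));       // 11 + 7 = 18
        System.out.println(utility(db, Set.of(2, 3, 4, 5))); // 40, a high-utility itemset
    }
}
```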

itemsets utility
{2 4} 30
{1 3 5} 31
{2 3 4} 34
{2 4 5} 36
{2 3 4 5} 40
{1 2 3 4 5 6} 30

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

EIHI is an incremental algorithm. It is designed to update the set of high-utility itemsets when new transactions are inserted. For example, consider that transaction t5 is now inserted. The result is thus updated as follows, where 8 high-utility itemsets are found:

itemsets utility
{2 4} 30
{2 5} 31
{1 3 5} 31
{2 3 4} 34
{2 3 5} 37
{2 4 5} 36
{2 3 4 5} 40
{1 2 3 4 5 6} 30
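To see why the result changes, the utility of the newly qualifying itemset {2, 5} can be recomputed naively before and after the insertion (a from-scratch recomputation shown only to check the numbers; EIHI itself updates its internal structures instead of rescanning the database, and this code is not part of SPMF):

```java
import java.util.*;

public class IncrementalExample {

    // Naive utility computation: sum the itemset's item utilities over all
    // transactions that contain the whole itemset.
    public static int utility(List<Map<Integer, Integer>> db, Set<Integer> itemset) {
        int total = 0;
        for (Map<Integer, Integer> transaction : db) {
            if (transaction.keySet().containsAll(itemset)) {
                for (int item : itemset) total += transaction.get(item);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Map<Integer, Integer>> db = new ArrayList<>(List.of(
            Map.of(3, 1, 5, 3, 1, 5, 2, 10, 4, 6, 6, 5), // t1
            Map.of(3, 3, 5, 3, 2, 8, 4, 6),              // t2
            Map.of(3, 1, 1, 5, 4, 2),                    // t3
            Map.of(3, 6, 5, 6, 1, 10, 7, 5)));           // t4
        // Before the insertion: 13 (t1) + 11 (t2) = 24, below min_utility = 30.
        System.out.println(utility(db, Set.of(2, 5))); // 24
        // Insert t5; {2, 5} gains 4 + 3 = 7 and becomes a high-utility itemset.
        db.add(Map.of(3, 2, 5, 3, 2, 4, 7, 2));
        System.out.println(utility(db, Set.of(2, 5))); // 31
    }
}
```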

Input file format

The input file format of EIHI is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file "DB_incremental1.txt" is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5

And the input file "DB_incremental2.txt" is defined as follows:

3 5 2 7:11:2 3 4 2

Consider the first line of the file "DB_incremental1.txt". It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of EIHI is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears and is followed by the utility of the itemset. For example, we show below the output file after all transactions have been processed from both files.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.
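A sketch of how such an output line could be parsed back into an itemset and its utility (illustrative only; this class is not part of SPMF and its names are made up):

```java
import java.util.Arrays;

public class HuiOutputParser {

    // One high-utility itemset read back from the output file.
    public record Hui(int[] items, int utility) {}

    public static Hui parse(String line) {
        // Items and utility are separated by the keyword " #UTIL: ".
        String[] parts = line.split(" #UTIL: ");
        int[] items = Arrays.stream(parts[0].trim().split(" "))
                .mapToInt(Integer::parseInt).toArray();
        return new Hui(items, Integer.parseInt(parts[1].trim()));
    }

    public static void main(String[] args) {
        Hui hui = parse("2 4 #UTIL: 30");
        System.out.println(Arrays.toString(hui.items())); // [2, 4]
        System.out.println(hui.utility());                // 30
    }
}
```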

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms.

The EIHI algorithm was shown to be up to 100 times faster than HUI-LIST-INS (also included in SPMF), the previous state-of-the-art algorithm for maintaining high-utility itemsets in transactions databases where transaction insertions are performed.

Implementation details

The version offered in SPMF is the original implementation of EIHI.

Note that the input format is not exactly the same as described in the article. But it is equivalent.

Note also that a file "MainTestEIHI_Xruns.java" is provided in the package "ca.pfv.spmf.tests". This file can be used to run experiments such as those presented in the article proposing EIHI, in which the number of updates is varied on some datasets. This example uses a single file as input and divides it into several parts. Then, the algorithm is run incrementally by processing each part of the file one after the other.

Where can I get more information about the EIHI algorithm?

This is the reference of the article describing the EIHI algorithm:

Fournier-Viger, P., Lin, J. C.-W., Gueniche, T., Barhate, P. (2015). Efficient Incremental High Utility Itemset Mining. Proc. 5th ASE International Conference on Big Data (BigData 2015), to appear.

Example 48 : Incremental High-Utility Itemset Mining in a Database with Utility Information with the HUI-LIST-INS Algorithm

How to run this example?

  • This algorithm is not offered in the graphical user interface of SPMF.
  • If you are using the source code version of SPMF, launch the file "MainTestHUI_LIST_INS.java" in the package ca.pfv.SPMF.tests.

What is HUI-LIST-INS?

HUI-LIST-INS (Lin et al., 2014) is an algorithm for maintaining high-utility itemsets in a transaction database containing utility information that is updated incrementally by inserting new transactions. This task, called "incremental high-utility itemset mining", is a generalization of the task of high utility itemset mining, where the database is not assumed to be static.

Note that the faster algorithm EIHI is also offered in SPMF.

What is the input?

HUI-LIST-INS takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 4 transactions (t1,t2...t4) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_incremental1.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Items Transaction utility Item utilities for this transaction
t1 3 5 1 2 4 6 30 1 3 5 10 6 5
t2 3 5 2 4 20 3 3 8 6
t3 3 1 4 8 1 5 2
t4 3 5 1 7 27 6 6 10 5

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

The HUI-LIST-INS algorithm is an incremental algorithm, which means that it can efficiently update the result when new transactions are inserted into the database. In this example, we will consider that a new transaction is inserted into the database, as follows:

t5 3 5 2 7 11 2 3 4 2

This transaction is provided in the file "DB_incremental2.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

What is the output?

The output of HUI-LIST-INS is the set of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what is a high utility itemset, it is necessary to review some definitions. An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, consider the initial database containing transactions t1, t2, t3 and t4. In this database, the utility of {1 4} is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility. For example, if we run HUI-LIST-INS with a minimum utility of 30 on the initial database containing t1, t2, t3 and t4, we obtain 6 high-utility itemsets:

itemsets utility
{2 4} 30
{1 3 5} 31
{2 3 4} 34
{2 4 5} 36
{2 3 4 5} 40
{1 2 3 4 5 6} 30

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more.

HUI-LIST-INS is an incremental algorithm. It is designed to update the set of high-utility itemsets when new transactions are inserted. For example, consider that transaction t5 is now inserted. The result is thus updated as follows, where 8 high-utility itemsets are found:

itemsets utility
{2 4} 30
{2 5} 31
{1 3 5} 31
{2 3 4} 34
{2 3 5} 37
{2 4 5} 36
{2 3 4 5} 40
{1 2 3 4 5 6} 30

Input file format

The input file format of HUI-LIST-INS is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file "DB_incremental1.txt" is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5

And the input file "DB_incremental2.txt" is defined as follows:

3 5 2 7:11:2 3 4 2

Consider the first line of the file "DB_incremental1.txt". It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of HUI-LIST-INS is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears and is followed by the utility of the itemset. For example, we show below the output file after all transactions have been processed from both files.

2 4 #UTIL: 30
2 5 #UTIL: 31
1 3 5 #UTIL: 31
2 3 4 #UTIL: 34
2 3 5 #UTIL: 37
2 4 5 #UTIL: 36
2 3 4 5 #UTIL: 40
1 2 3 4 5 6 #UTIL: 30

For example, the first line indicates that the itemset {2, 4} has a utility of 30. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms.

The EIHI algorithm was shown to be up to 100 times faster than HUI-LIST-INS (also included in SPMF) for maintaining high-utility itemsets in transaction databases where transaction insertions are performed.

Implementation details

Note that the input format is not exactly the same as described in the article. But it is equivalent.

Note also that a file "MainTestHUI_LIST_INS_Xruns.java" is provided in the package "ca.pfv.spmf.tests". This file can be used to run experiments such as those presented in the article proposing HUI-LIST-INS, in which the number of updates is varied on some datasets. This example uses a single file as input and divides it into several parts. Then, the algorithm is run incrementally by processing each part of the file one after the other.

Where can I get more information about the HUI-LIST-INS algorithm?

This is the reference of the article describing the HUI-LIST-INS algorithm:

J. C.-W. Lin, W. Gan, T.P. Hong, J. S. Pan, Incrementally Updating High-Utility Itemsets with Transaction Insertion. In: Proc. 10th Intern. Conference on Advanced Data Mining and Applications (ADMA 2014), Springer (2014)

Example 49 : Mining Closed High-Utility Itemsets from a transaction database with utility information using the EFIM-Closed Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "EFIM-Closed" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run EFIM-Closed DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestEFIM_Closed_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is EFIM-Closed?

EFIM-Closed (Fournier-Viger et al., 2016) is an algorithm for discovering closed high-utility itemsets in a transaction database containing utility information.

There has been much work on the design of algorithms for high-utility itemset mining. However, a limitation of many high-utility itemset mining algorithms is that they output too many itemsets. As a result, it may be inconvenient for a user to analyze the result of traditional high utility itemset mining algorithms. As a solution, algorithms have been designed to discover only the high-utility itemsets that are closed. The concept of closed itemset was previously introduced in frequent itemset mining. An itemset is closed if it has no proper superset having the same support (frequency) in the database. In terms of application to transaction databases, the concept of closed itemset can be understood as any itemset that is the largest set of items bought in common by a given set of customers. For more details, you may look at the paper about EFIM-Closed. It provides more details about the motivation for mining closed high-utility itemsets. Other popular alternative algorithms for closed high-utility itemset mining are CHUI-Miner (2015, also offered in SPMF) and CHUD (2011, 2013, currently not offered in SPMF).

What is the input?

EFIM-Closed takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Items Transaction utility Item utilities for this transaction
t1 3 5 1 2 4 6 30 1 3 5 10 6 5
t2 3 5 2 4 20 3 3 8 6
t3 3 1 4 8 1 5 2
t4 3 5 1 7 27 6 6 10 5
t5 3 5 2 7 11 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of EFIM-Closed is the set of closed high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what is a closed high utility itemset, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility.

To explain what is a closed itemset it is necessary to review a few definitions.

The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3, 5} has a support of 2 because it appears in two transactions from the database (t1 and t4). A closed itemset is an itemset X such that there does not exist an itemset Y strictly containing X that has the same support. For example, itemset {1, 3, 5} is a closed itemset.

A closed high utility itemset (CHUI) is a high-utility itemset that is a closed itemset.
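These definitions can be checked with a short Java sketch (illustrative only, not SPMF code; its names are made up). It uses the fact that an itemset is closed if and only if it equals the intersection of all transactions that contain it:

```java
import java.util.*;

public class ClosedItemsetCheck {

    // Support: the number of transactions containing the itemset.
    public static int support(List<Set<Integer>> db, Set<Integer> itemset) {
        int count = 0;
        for (Set<Integer> transaction : db) {
            if (transaction.containsAll(itemset)) count++;
        }
        return count;
    }

    // An itemset is closed iff it equals the intersection of all the
    // transactions that contain it (no strict superset has the same support).
    public static boolean isClosed(List<Set<Integer>> db, Set<Integer> itemset) {
        Set<Integer> closure = null;
        for (Set<Integer> transaction : db) {
            if (transaction.containsAll(itemset)) {
                if (closure == null) closure = new HashSet<>(transaction);
                else closure.retainAll(transaction);
            }
        }
        return itemset.equals(closure);
    }

    public static void main(String[] args) {
        // The example database t1..t5 (items only; utilities are not needed here).
        List<Set<Integer>> db = List.of(
            Set.of(3, 5, 1, 2, 4, 6), Set.of(3, 5, 2, 4), Set.of(3, 1, 4),
            Set.of(3, 5, 1, 7), Set.of(3, 5, 2, 7));
        System.out.println(support(db, Set.of(1, 3, 5)));  // 2 (t1 and t4)
        System.out.println(isClosed(db, Set.of(1, 3, 5))); // true
        // {1, 5} is not closed: its strict superset {1, 3, 5} also has support 2.
        System.out.println(isClosed(db, Set.of(1, 5)));    // false
    }
}
```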

For example, if we run EFIM-Closed with a minimum utility of 30 we obtain 4 closed high-utility itemsets:

itemsets utility support
{1, 2, 3, 4, 5, 6} 30 1 transaction
{2, 3, 4, 5} 40 2 transactions
{2, 3, 5} 37 3 transactions
{1, 3, 5} 31 2 transactions

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more, and that are maximal sets of items in common for a group of customers.

Input file format

The input file format of EFIM-Closed is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of EFIM-Closed is defined as follows. It is a text file, where each line represents a closed high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears and is followed by the support of the itemset. Then, the keyword "#UTIL:" appears and is followed by the utility of the itemset. For example, we show below the output file for this example.

6 4 2 1 5 3 #SUP: 1 #UTIL: 30
4 3 2 5 #SUP: 2 #UTIL: 40
2 5 3 #SUP: 3 #UTIL: 37
1 3 5 #SUP: 2 #UTIL: 31

For example, the third line indicates that the itemset {2, 3, 5} has a support of 3 transactions and a utility of 37 $. The other lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The EFIM-Closed algorithm was proposed in 2016 to discover only the high-utility itemsets that are closed itemsets. It is generally faster to mine closed high-utility itemsets than to discover all high-utility itemsets. Thus, this algorithm can in some cases outperform algorithms such as FHM and HUI-Miner, which discover all high-utility itemsets. The EFIM-Closed algorithm was shown to outperform CHUD, the original algorithm for mining closed high-utility itemsets (published in the proceedings of the ICDM 2011 conference).

Implementation details

This is an implementation of EFIM-Closed, implemented by P. Fournier-Viger. This is an alternative implementation that was not used in the paper. The main differences with the implementation in the paper is that this implementation (1) does not calculate utility-unit arrays (see the paper) and (2) adds the EUCP optimizations introduced in the FHM algorithm.

In the source code version of SPMF, there are two examples of using EFIM-Closed in the package ca.pfv.spmf.tests. The first one is MainTestEFIM_Closed_saveToFile, which saves the result to an output file. The second one is MainTestEFIM_Closed_saveToMemory, which saves the result to memory.

Where can I get more information about the EFIM-Closed algorithm?

This is the reference of the article describing the EFIM-Closed algorithm:

Fournier-Viger, P., Zida, S. Lin, C.W., Wu, C.-W., Tseng, V. S. (2016). EFIM-Closed: Fast and Memory Efficient Discovery of Closed High-Utility Itemsets. Proc. 12th Intern. Conference on Machine Learning and Data Mining (MLDM 2016). Springer, LNAI, 15 pages, to appear

Example 50 : Mining Closed High-Utility Itemsets from a transaction database with utility information using the CHUI-Miner Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CHUI-Miner" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CHUI-Miner DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCHUIMiner_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is CHUI-Miner?

CHUI-Miner (Wu et al., 2015) is an algorithm for discovering closed high-utility itemsets in a transaction database containing utility information.

There has been much work on the topic of high-utility itemset mining. A limitation of many high-utility itemset mining algorithms is that they generate too many itemsets as output. The CHUI-Miner algorithm was designed to discover only the high-utility itemsets that are closed. The concept of closed itemset was previously introduced in frequent itemset mining. An itemset is closed if it has no proper superset having the same support (frequency) in the database. In terms of application to transaction databases, the concept of closed itemset can be understood as any itemset that is the largest set of items bought in common by a given set of customers. For more details, see the paper by Wu et al. (2015). It provides more details about the motivation for mining closed high-utility itemsets.

What is the input?

CHUI-Miner takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Items Transaction utility Item utilities for this transaction
t1 3 5 1 2 4 6 30 1 3 5 10 6 5
t2 3 5 2 4 20 3 3 8 6
t3 3 1 4 8 1 5 2
t4 3 5 1 7 27 6 6 10 5
t5 3 5 2 7 11 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of CHUI-Miner is the set of closed high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a closed high utility itemset is, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility.
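The utility computation above can be sketched in Java. This is a standalone illustration with the example database hard-coded, not part of the SPMF API:

```java
import java.util.*;

// Standalone sketch: computes the utility of an itemset in the example
// database DB_utility.txt (not the SPMF implementation).
public class UtilityExample {

    // Items of each transaction (t1..t5)
    static final int[][] ITEMS = {
        {3, 5, 1, 2, 4, 6},
        {3, 5, 2, 4},
        {3, 1, 4},
        {3, 5, 1, 7},
        {3, 5, 2, 7}
    };
    // Utility of each item in the corresponding transaction
    static final int[][] UTILS = {
        {1, 3, 5, 10, 6, 5},
        {3, 3, 8, 6},
        {1, 5, 2},
        {6, 6, 10, 5},
        {2, 3, 4, 2}
    };

    /** Utility of an itemset in one transaction (0 if not fully contained). */
    static int utilityInTransaction(Set<Integer> itemset, int t) {
        int sum = 0, matched = 0;
        for (int i = 0; i < ITEMS[t].length; i++) {
            if (itemset.contains(ITEMS[t][i])) {
                sum += UTILS[t][i];
                matched++;
            }
        }
        return matched == itemset.size() ? sum : 0;
    }

    /** Utility of an itemset in the whole database. */
    static int utility(Set<Integer> itemset) {
        int total = 0;
        for (int t = 0; t < ITEMS.length; t++) {
            total += utilityInTransaction(itemset, t);
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Integer> x = new HashSet<>(Arrays.asList(1, 4));
        System.out.println(utility(x)); // 11 (t1) + 7 (t3) = 18
    }
}
```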

To explain what a closed itemset is, it is necessary to review a few definitions.

The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 3, 5} has a support of 2 because it appears in two transactions from the database (t1 and t4). A closed itemset is an itemset X such that there does not exist an itemset Y strictly containing X that has the same support. For example, itemset {1, 3, 5} is a closed itemset.

A closed high utility itemset (CHUI) is a high-utility itemset that is a closed itemset.
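On a small database, closedness can be checked directly using a well-known fact: an itemset is closed if and only if it equals the intersection of all transactions that contain it. A standalone Java sketch (not the SPMF implementation), with the example database hard-coded:

```java
import java.util.*;

// Standalone sketch: tests whether an itemset is closed in the example
// database, using the fact that an itemset is closed iff it equals the
// intersection of all transactions containing it.
public class ClosedCheck {

    static final int[][] DB = {
        {3, 5, 1, 2, 4, 6},  // t1
        {3, 5, 2, 4},        // t2
        {3, 1, 4},           // t3
        {3, 5, 1, 7},        // t4
        {3, 5, 2, 7}         // t5
    };

    /** Intersection of all transactions containing the itemset (its closure). */
    static Set<Integer> closure(Set<Integer> itemset) {
        Set<Integer> result = null;
        for (int[] t : DB) {
            Set<Integer> trans = new HashSet<>();
            for (int item : t) trans.add(item);
            if (trans.containsAll(itemset)) {
                if (result == null) result = new HashSet<>(trans);
                else result.retainAll(trans);
            }
        }
        // Edge case: an itemset contained in no transaction is its own closure
        return result == null ? new HashSet<>(itemset) : result;
    }

    static boolean isClosed(Set<Integer> itemset) {
        return closure(itemset).equals(itemset);
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 3, 5));
        Set<Integer> b = new HashSet<>(Arrays.asList(1, 5));
        System.out.println(isClosed(a)); // true: {1,3,5} is closed
        System.out.println(isClosed(b)); // false: the closure of {1,5} is {1,3,5}
    }
}
```

For example, {1, 5} is not closed because its superset {1, 3, 5} has the same support of 2.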

For example, if we run CHUI-Miner with a minimum utility of 30 we obtain 4 closed high-utility itemsets:

itemsets utility support
{1, 2, 3, 4, 5, 6} 30 1 transaction
{2, 3, 4, 5} 40 2 transactions
{2, 3, 5} 37 3 transactions
{1, 3, 5} 31 2 transactions

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 30 $ or more, and that are maximal sets of items in common for a group of customers.

Input file format

The input file format of CHUI-Miner is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.
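Reading one line of this format can be sketched as follows in Java. This is a standalone illustration; the real parsers are inside the SPMF library:

```java
// Standalone sketch (not SPMF code): parses one line of the
// "items:transactionUtility:itemUtilities" format used by CHUI-Miner.
public class UtilityLineParser {

    final int[] items;
    final int transactionUtility;
    final int[] itemUtilities;

    UtilityLineParser(String line) {
        // The three sections are separated by ":"
        String[] parts = line.split(":");
        items = toInts(parts[0]);
        transactionUtility = Integer.parseInt(parts[1]);
        itemUtilities = toInts(parts[2]);
    }

    /** Converts a space-separated list of integers to an array. */
    private static int[] toInts(String s) {
        String[] tokens = s.trim().split(" ");
        int[] result = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            result[i] = Integer.parseInt(tokens[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        UtilityLineParser t = new UtilityLineParser("3 5 1 2 4 6:30:1 3 5 10 6 5");
        System.out.println(java.util.Arrays.toString(t.items));         // [3, 5, 1, 2, 4, 6]
        System.out.println(t.transactionUtility);                       // 30
        System.out.println(java.util.Arrays.toString(t.itemUtilities)); // [1, 3, 5, 10, 6, 5]
    }
}
```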

Output file format

The output file format of CHUI-Miner is defined as follows. It is a text file, where each line represents a closed high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears and is followed by the support of the itemset. Then, the keyword "#UTIL:" appears and is followed by the utility of the itemset. For example, we show below the output file for this example.

6 4 2 1 5 3 #SUP: 1 #UTIL: 30
4 3 2 5 #SUP: 2 #UTIL: 40
2 5 3 #SUP: 3 #UTIL: 37
1 3 5 #SUP: 2 #UTIL: 31

For example, the third line indicates that the itemset {2, 3, 5} has a support of 3 transactions and a utility of 37 $. The other lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The CHUI-Miner algorithm was proposed in 2015 to discover only the high-utility itemsets that are closed itemsets. Discovering only these itemsets is generally faster than discovering all high-utility itemsets. Thus, this algorithm can in some cases outperform algorithms such as FHM and HUI-Miner, which discover all high-utility itemsets. The CHUI-Miner algorithm is an improved version of the CHUD algorithm published in the proceedings of the ICDM 2011 conference.

Implementation details

This is an implementation of CHUI-Miner, implemented by P. Fournier-Viger. It is an alternative implementation that was not used in the paper. The main differences with the implementation in the paper are that this implementation (1) does not calculate utility-unit arrays (see the paper) and (2) adds the EUCP optimization introduced in the FHM algorithm.

In the source code version of SPMF, there are two examples of using CHUI-Miner in the package ca.pfv.spmf.tests. The first one is MainTestCHUIMiner_saveToFile, which saves the result to an output file. The second one is MainTestCHUIMiner_saveToMemory, which saves the result to memory.

Where can I get more information about the CHUI-Miner algorithm?

This is the reference of the article describing the CHUI-Miner algorithm:

Wu, C.W., Fournier-Viger, P., Gu, J.-Y., Tseng, V.S. (2015). Mining Closed+ High Utility Itemsets without Candidate Generation. Proc. 2015 Conference on Technologies and Applications of Artificial Intelligence (TAAI 2015), pp. 187-194.

Example 51: Mining Generators of High-Utility Itemsets from a transaction database with utility information using the GHUI-Miner Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "GHUI-Miner" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run GHUI-Miner DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestGHUIMiner_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is GHUI-Miner?

GHUI-Miner (Fournier-Viger et al., 2014) is an algorithm for discovering generators of high-utility itemsets in a transaction database containing utility information.

There has been a large amount of work on the topic of high-utility itemset mining. A limitation of several high-utility itemset mining algorithms is that they generate too many results. The GHUI-Miner algorithm was designed to discover only the generators of high-utility itemsets. The concept of generator was previously introduced in frequent itemset mining. An itemset is a generator if it has no subset having the same support (frequency) in the database. An itemset is closed if it has no superset having the same support (frequency) in the database. In terms of application to transaction databases, a generator can be understood as the smallest set of items bought in common by a given set of customers, while a closed itemset is the maximal set of items. Generators have been shown to be more useful than closed or maximal itemsets in the field of pattern mining for various tasks such as classification. The GHUI-Miner algorithm discovers all generators of high-utility itemsets, that is, generators that (1) are high-utility itemsets or (2) have a superset that is a high-utility itemset and has the same support.

For more details, see the paper by Fournier-Viger et al. (2014), which provides more details about the motivation for mining generators of high-utility itemsets.

This is the original implementation of GHUI-Miner.

What is the input?

GHUI-Miner takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of GHUI-Miner is the set of generators of high utility itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a generator of high utility itemsets is, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility.

To explain what is a generator, it is necessary to review a few definitions.

The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 5} has a support of 2 because it appears in two transactions from the database (t1 and t4). A generator is an itemset X such that there does not exist an itemset Y strictly included in X that has the same support. For example, itemset {1, 5} is a generator.

A generator of high-utility itemsets (GHUI) is a generator itemset that (1) is a high-utility itemset or (2) has a superset that is a high-utility itemset and has the same support.
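Because support can only stay equal or decrease when items are added to an itemset, it suffices to compare an itemset against its immediate subsets (one item removed) to test the generator property. A standalone Java sketch (not the SPMF implementation), with the example database hard-coded:

```java
import java.util.*;

// Standalone sketch: checks whether an itemset is a generator in the
// example database. Since support is anti-monotone, comparing against
// the immediate subsets (one item removed) is sufficient.
public class GeneratorCheck {

    static final int[][] DB = {
        {3, 5, 1, 2, 4, 6},  // t1
        {3, 5, 2, 4},        // t2
        {3, 1, 4},           // t3
        {3, 5, 1, 7},        // t4
        {3, 5, 2, 7}         // t5
    };

    /** Number of transactions containing the itemset. */
    static int support(Set<Integer> itemset) {
        int count = 0;
        for (int[] t : DB) {
            Set<Integer> trans = new HashSet<>();
            for (int item : t) trans.add(item);
            if (trans.containsAll(itemset)) count++;
        }
        return count;
    }

    static boolean isGenerator(Set<Integer> itemset) {
        int sup = support(itemset);
        for (Integer item : itemset) {
            Set<Integer> subset = new HashSet<>(itemset);
            subset.remove(item);
            // A proper subset with the same support disqualifies the itemset
            if (support(subset) == sup) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isGenerator(new HashSet<>(Arrays.asList(1, 5))));    // true
        System.out.println(isGenerator(new HashSet<>(Arrays.asList(1, 3, 5)))); // false
    }
}
```

For example, {1, 3, 5} is not a generator because its subset {1, 5} has the same support of 2.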

For example, if we run GHUI-Miner with a minimum utility of 30, we obtain 7 generators of high-utility itemsets:

itemsets utility support
{2} 22 3 transactions
{2, 4} 30 2 transactions
{1, 5} 24 2 transactions
{6} 5 1 transaction
{4, 5} 18 2 transactions
{1, 4, 5} 20 1 transaction
{1, 2} 15 1 transaction

If the database is a transaction database from a retail store, we could interpret each itemset found as the smallest set of items common to a group of customers that have bought a given high-utility itemset.

Input file format

The input file format of GHUI-Miner is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of GHUI-Miner is defined as follows. It is a text file, where each line represents a generator of high-utility itemsets. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears and is followed by the support of the itemset. Then, the keyword "#UTIL:" appears and is followed by the utility of the itemset. For example, we show below the output file for this example.

6 #SUP: 1 #UTIL: 5
2 #SUP: 3 #UTIL: 22
4 2 #SUP: 2 #UTIL: 30
4 5 #SUP: 2 #UTIL: 18
4 1 5 #SUP: 1 #UTIL: 20
2 1 #SUP: 1 #UTIL: 15
1 5 #SUP: 2 #UTIL: 24

For example, the third line indicates that the itemset {2, 4} has a support of 2 transactions and a utility of 30 $. The other lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The GHUI-Miner algorithm was proposed in 2014 to discover only the generators of high-utility itemsets. Discovering only these itemsets is generally faster than discovering all high-utility itemsets. Thus, this algorithm can outperform algorithms such as FHM and HUI-Miner, which discover all high-utility itemsets. This implementation of GHUI-Miner relies on the CHUI-Miner algorithm for discovering closed high-utility itemsets (a necessary step to find GHUIs efficiently).

Implementation details

This is the original implementation of GHUI-Miner.

Where can I get more information about the GHUI-Miner algorithm?

This is the reference of the article describing the GHUI-Miner algorithm:

Fournier-Viger, P., Wu, C.W., Tseng, V.S. (2014). Novel Concise Representations of High Utility Itemsets using Generator Patterns. Proc. 10th Intern. Conference on Advanced Data Mining and Applications (ADMA 2014), Springer LNCS 8933, pp. 30-43.

Note that in this article, another algorithm called HUG-Miner is also proposed. It is a different algorithm, which is also offered in SPMF.

Example 52 : Mining High-Utility Generator Itemsets from a transaction database with utility information using the HUG-Miner Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUG-Miner" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 20 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUG-Miner DB_utility.txt output.txt 20 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTest_HUGMINER_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is HUG-Miner?

HUG-Miner (Fournier-Viger et al., 2014) is an algorithm for discovering high-utility generator itemsets in a transaction database containing utility information.

There has been a large amount of work on the topic of high-utility itemset mining. A limitation of several high-utility itemset mining algorithms is that they generate too many results. The HUG-Miner algorithm was designed to discover only the high-utility itemsets that are generators. The concept of generator was previously introduced in frequent itemset mining. An itemset is a generator if it has no subset having the same support (frequency) in the database. In terms of application to transaction databases, a generator can be understood as the smallest set of items bought in common by a given set of customers. Generators have been shown to be more useful than closed or maximal itemsets in the field of pattern mining for various tasks such as classification. For more details, see the paper by Fournier-Viger et al. (2014), which provides more details about the motivation for mining high-utility generator itemsets.

This is the original implementation of HUG-Miner.

What is the input?

HUG-Miner takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of HUG-Miner is the set of high utility generator itemsets having a utility no less than a min_utility threshold (a positive integer) set by the user. To explain what a high utility generator itemset is, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility.

To explain what is a generator, it is necessary to review a few definitions.

The support of an itemset is the number of transactions that contain the itemset. For example, the itemset {1, 5} has a support of 2 because it appears in two transactions from the database (t1 and t4). A generator is an itemset X such that there does not exist an itemset Y strictly included in X that has the same support. For example, itemset {1, 5} is a generator.

A high utility generator itemset (HUG) is a high-utility itemset that is a generator.

For example, if we run HUG-Miner with a minimum utility of 20, we obtain 4 high-utility generator itemsets:

itemsets utility support
{2} 22 3 transactions
{1} 20 3 transactions
{2, 4} 30 2 transactions
{1 5} 24 2 transactions

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated a profit of 20 $ or more, and that are minimal sets of items in common for a group of customers.

Input file format

The input file format of HUG-Miner is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of HUG-Miner is defined as follows. It is a text file, where each line represents a high utility generator itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears and is followed by the support of the itemset. Then, the keyword "#UTIL:" appears and is followed by the utility of the itemset. For example, we show below the output file for this example.

2 #SUP: 3 #UTIL: 22
1 #SUP: 3 #UTIL: 20
4 2 #SUP: 2 #UTIL: 30
1 5 #SUP: 2 #UTIL: 24

For example, the third line indicates that the itemset {2, 4} has a support of 2 transactions and a utility of 30 $. The other lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The HUG-Miner algorithm was proposed in 2014 to discover only the high-utility itemsets that are generators. Discovering only these itemsets is generally faster than discovering all high-utility itemsets. Thus, this algorithm can outperform algorithms such as FHM and HUI-Miner, which discover all high-utility itemsets.

Implementation details

This is the original implementation of HUG-Miner.

Where can I get more information about the HUG-Miner algorithm?

This is the reference of the article describing the HUG-Miner algorithm:

Fournier-Viger, P., Wu, C.W., Tseng, V.S. (2014). Novel Concise Representations of High Utility Itemsets using Generator Patterns. Proc. 10th Intern. Conference on Advanced Data Mining and Applications (ADMA 2014), Springer LNCS 8933, pp. 30-43.

Note that in this article, another algorithm called GHUI-Miner is also proposed. It is a different algorithm, which is also offered in SPMF.

Example 53 : Mining High-Utility Sequential Rules from a Sequence Database with utility information using the HUSRM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUSRM" algorithm, (2) select the input file "DataBase_HUSRM.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minimum utility to 40, (5) set the minimum confidence to 0.7, (6) set the maximum antecedent size to 4, (7) set the maximum consequent size to 4, and (8) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUSRM DataBase_HUSRM.txt output.txt 40 0.7 4 4 in a folder containing spmf.jar and the example input file DataBase_HUSRM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTest_HUSRM_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is HUSRM?

HUSRM (Zida et al, 2015) is the first algorithm for discovering high-utility sequential rules in a sequence database containing utility information.

A typical example of a sequence database with utility information is a database of customer transactions containing sequences of transactions performed by customers, where each transaction is a set of items annotated with the profit generated by the sale of the items. The goal of high-utility sequential rule mining is to find rules of the form A -> B, meaning that if a customer buys items A, the customer will then buy items B with a high confidence, and this rule generates a high profit. Although this algorithm is designed for the scenario of sequences of transactions, the task is general and could be applied to other types of data such as sequences of webpages visited by users on a website, where the sale profit is replaced by the time spent on webpages.

This is the original implementation of HUSRM.

Note that the problem of high-utility sequential rule mining is similar to high-utility sequential pattern mining. However, a key advantage of high-utility sequential rule mining is that discovered rules provide information about the probability that if some customers buy some items A, they will then buy other items B. High-utility sequential patterns do not consider the confidence that a pattern will be followed.

What is the input?

HUSRM takes as input a sequence database with utility information, a minimum utility threshold min_utility (a positive integer), a minimum confidence threshold (a double value in the [0,1] interval), a maximum antecedent size (a positive integer) and a maximum consequent size (a positive integer).

Let's consider the following sequence database consisting of 4 sequences of transactions (s1,s2, s3, s4) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DataBase_HUSRM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Sequence Sequence utility
s1 {1[1],2[4]},{3[10]},{6[9]},{7[2]},{5[1]} 27
s2 {1[1],4[12]},{3[20]},{2[4]},{5[1],7[2]} 40
s3 {1[1]},{2[4]},{6[9]},{5[1]} 15
s4 {1[3],2[4],3[5]},{6[3],7[1]} 16

Each line of the database is a sequence:

  • each sequence is an ordered list of transactions, such that transactions are enclosed by { } in this example,
  • each transaction contains a set of items represented by integers,
  • each item is annotated with a utility value (e.g. sale profit), indicated between square brackets [ ],
  • the sum of the utilities (e.g. profit) of all items in the sequence is also indicated (the "sequence utility" column).

What are real-life examples of such a database? A typical example is a database containing sequences of customer transactions. Imagine that each sequence represents the transactions made by a customer. The first customer named "s1" bought items 1 and 2, and those items respectively generated a profit of 1$ and 4$. Then, the customer bought item 3 for 10$. Then, the customer bought item 6 for 9 $. Then, the customer bought items 7 for 2$. Then the customer bought item 5 for 1 $.

What is the output?

The output of HUSRM is the set of high utility sequential rules meeting the criteria specified by the user.

A sequential rule X==>Y is a sequential relationship between two sets of items X and Y such that X and Y are disjoint, and X and Y are each unordered. The support of a rule X==>Y is the number of sequences that contain the items in X before the items in Y, divided by the number of sequences in the database. The confidence of a rule is the number of sequences that contain the items in X before the items in Y, divided by the number of sequences that contain X. For example, the rule {1,2,3} ==> {7} has a support of 3/4 because it appears in 3 out of 4 sequences in the database.

The utility (profit) of a rule is the total utility (profit) generated by the rule in the sequences where it appears. For example, the rule {1,2,3} ==> {7} appears in sequences s1, s2, and s4. In s1, the profit generated by that rule is 1$ + 4$ + 10$ + 2$ = 17$. In s2, the profit generated by that rule is 1$ + 20$ + 4$ + 2$ = 27$. In s4, the profit generated by that rule is 3$ + 4$ + 5$ + 1$ = 13$. Thus, the total utility of that rule in the database is 17$ + 27$ + 13$ = 57$.
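The rule utility computation above can be sketched in Java. This is a standalone illustration with the example sequences hard-coded, not the SPMF implementation; because every item occurs at most once per sequence in this example, a rule X ==> Y occurs in a sequence exactly when every item of X appears in an itemset strictly before every item of Y:

```java
import java.util.*;

// Standalone sketch: computes the utility of a sequential rule X ==> Y
// on the example database. Each row of a sequence is
// {itemset position, item, utility}.
public class RuleMeasures {

    static final int[][][] DB = {
        // s1: {1[1],2[4]},{3[10]},{6[9]},{7[2]},{5[1]}
        {{0,1,1},{0,2,4},{1,3,10},{2,6,9},{3,7,2},{4,5,1}},
        // s2: {1[1],4[12]},{3[20]},{2[4]},{5[1],7[2]}
        {{0,1,1},{0,4,12},{1,3,20},{2,2,4},{3,5,1},{3,7,2}},
        // s3: {1[1]},{2[4]},{6[9]},{5[1]}
        {{0,1,1},{1,2,4},{2,6,9},{3,5,1}},
        // s4: {1[3],2[4],3[5]},{6[3],7[1]}
        {{0,1,3},{0,2,4},{0,3,5},{1,6,3},{1,7,1}}
    };

    /** Utility of the rule in one sequence, or -1 if the rule does not occur. */
    static int utilityInSequence(Set<Integer> x, Set<Integer> y, int[][] seq) {
        int maxX = -1, minY = Integer.MAX_VALUE, util = 0, found = 0;
        for (int[] entry : seq) {
            int pos = entry[0], item = entry[1], u = entry[2];
            if (x.contains(item)) { maxX = Math.max(maxX, pos); util += u; found++; }
            if (y.contains(item)) { minY = Math.min(minY, pos); util += u; found++; }
        }
        // All items present, and every X item strictly before every Y item
        boolean occurs = found == x.size() + y.size() && maxX < minY;
        return occurs ? util : -1;
    }

    /** Total utility of the rule in the database. */
    static int utility(Set<Integer> x, Set<Integer> y) {
        int total = 0;
        for (int[][] seq : DB) {
            int u = utilityInSequence(x, y, seq);
            if (u >= 0) total += u;
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Integer> x = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> y = new HashSet<>(Arrays.asList(7));
        System.out.println(utility(x, y)); // 17 + 27 + 13 = 57
    }
}
```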

The HUSRM algorithm returns all high-utility sequential rules, that is, each rule that meets the four following criteria:

  • the utility of the rule in the database is no less than a minimum utility threshold set by the user,
  • the confidence of the rule in the database is no less than a minimum confidence threshold set by the user,
  • the antecedent (left side) of the rule contains no more than a maximum number of items specified by the user,
  • the consequent (right side) of the rule contains no more than a maximum number of items specified by the user.

For example, if we run HUSRM with a minimum utility of 40 and minconf = 0.70 (70 %), and a maximum antecedent and consequent size of 4 items, we obtain 7 high-utility sequential rules:

rule confidence utility support
1,4 ==> 2,3,5,7 100 % 40 1 sequence(s)
1,3,4 ==> 2,5,7 100 % 40 1 sequence(s)
1,2,3,4 ==> 5,7 100 % 40 1 sequence(s)
1,2,3 ==> 7 100 % 57 3 sequence(s)
1,3 ==> 7 100 % 45 3 sequence(s)
2,3 ==> 7 100 % 52 3 sequence(s)
3 ==> 7 100 % 40 3 sequence(s)

If the database is a transaction database from a store, we could interpret these results as rules representing the purchasing behavior of customers, such that these rules have a high confidence and generate a high profit. For example, the rule {1,3} -> {7} means that all customers buying the items 1 and 3 always buy the item 7 thereafter (since the confidence is 100%), and that this rule has generated a profit of 45 $ and appears in three sequences.

Input file format

The input file format of HUSRM is defined as follows. It is a text file.

  • Each line represents a sequence of transactions.
  • Each transaction is separated by the keyword -1.
  • A transaction is a list of items (positive integers) separated by single spaces and where each item is annotated with a generated sale profit indicated between square brackets [ ]. The sale profit is a positive integer.
  • In a transaction, it is assumed that items are sorted according to some order (e.g. alphabetical order).
  • Each sequence ends with the keyword "-2", which is followed by the keyword "SUtility:" and the sum of the utility (profit) of all items in that sequence.

For example, for the previous example, the input file is defined as follows:

1[1] 2[4] -1 3[10] -1 6[9] -1 7[2] -1 5[1] -1 -2 SUtility:27
1[1] 4[12] -1 3[20] -1 2[4] -1 5[1] 7[2] -1 -2 SUtility:40
1[1] -1 2[4] -1 6[9] -1 5[1] -1 -2 SUtility:15
1[3] 2[4] 3[5] -1 6[3] 7[1] -1 -2 SUtility:16

For example, consider the first line. It means that the first customer bought items 1 and 2, and those items respectively generated a profit of 1$ and 4$. Then, the customer bought item 3 for 10$. Then, the customer bought item 6 for 9$. Then, the customer bought item 7 for 2$. Then the customer bought item 5 for 1$. Thus, this customer has made 5 transactions. The total utility (profit) generated by that sequence of transactions is 1$ + 4$ + 10$ + 9$ + 2$ + 1$ = 27$.
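Parsing one line of this format can be sketched as follows. This is a standalone illustration, not the SPMF parser:

```java
import java.util.*;

// Standalone sketch (not SPMF code): parses one line of the HUSRM input
// format, e.g. "1[1] 2[4] -1 3[10] -1 -2 SUtility:15".
public class SequenceLineParser {

    final List<List<Integer>> transactions = new ArrayList<>();
    final List<List<Integer>> utilities = new ArrayList<>();
    final int sequenceUtility;

    SequenceLineParser(String line) {
        List<Integer> items = new ArrayList<>();
        List<Integer> utils = new ArrayList<>();
        int sUtility = 0;
        for (String token : line.trim().split(" ")) {
            if (token.equals("-1")) {            // end of a transaction
                transactions.add(items);
                utilities.add(utils);
                items = new ArrayList<>();
                utils = new ArrayList<>();
            } else if (token.equals("-2")) {     // end of the sequence
                // nothing to do; the "SUtility:" token follows
            } else if (token.startsWith("SUtility:")) {
                sUtility = Integer.parseInt(token.substring("SUtility:".length()));
            } else {                             // an item such as "3[10]"
                int open = token.indexOf('[');
                items.add(Integer.parseInt(token.substring(0, open)));
                utils.add(Integer.parseInt(token.substring(open + 1, token.length() - 1)));
            }
        }
        sequenceUtility = sUtility;
    }

    public static void main(String[] args) {
        SequenceLineParser s = new SequenceLineParser(
            "1[1] 2[4] -1 3[10] -1 6[9] -1 7[2] -1 5[1] -1 -2 SUtility:27");
        System.out.println(s.transactions);     // [[1, 2], [3], [6], [7], [5]]
        System.out.println(s.sequenceUtility);  // 27
    }
}
```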

Output file format

The output file format of HUSRM is defined as follows. It is a text file, where each line represents a high utility sequential rule. On each line, the items of the left side of the rule (antecedent) are first listed, each represented by an integer and separated by ",". Then the keyword "==>" appears, followed by the items in the right side of the rule (consequent), each separated by ",". Then, the keyword "#SUP:" appears, followed by the support of the rule. Then, the keyword "#CONF:" appears, followed by the confidence of the rule. Then, the keyword "#UTIL:" appears, followed by the utility of the rule.

1,4 ==> 2,3,5,7 #SUP: 1.0 #CONF: 1.0 #UTIL: 40.0
1,3,4 ==> 2,5,7 #SUP: 1.0 #CONF: 1.0 #UTIL: 40.0
1,2,3,4 ==> 5,7 #SUP: 1.0 #CONF: 1.0 #UTIL: 40.0
1,2,3 ==> 7 #SUP: 3.0 #CONF: 1.0 #UTIL: 57.0
1,3 ==> 7 #SUP: 3.0 #CONF: 1.0 #UTIL: 45.0
2,3 ==> 7 #SUP: 3.0 #CONF: 1.0 #UTIL: 52.0
3 ==> 7 #SUP: 3.0 #CONF: 1.0 #UTIL: 40.0

For example, the fourth line indicates that all customers buying the items 1, 2 and 3 will then buy item 7 with a confidence of 100 %, and that this rule has generated a profit of 57 $ and appears in three sequences.

Performance

High-utility sequential rule mining is a more difficult problem than sequential rule mining and sequential pattern mining. Therefore, high-utility sequential rule mining algorithms are generally slower than algorithms for those tasks. The HUSRM algorithm is the first algorithm for high-utility sequential rule mining.

Implementation details

This is the original implementation of HUSRM.

Where can I get more information about the HUSRM algorithm?

This is the article describing the HUSRM algorithm:

Zida, S., Fournier-Viger, P., Wu, C.-W., Lin, J. C. W., Tseng, V.S., (2015). Efficient Mining of High Utility Sequential Rules. Proc. 11th Intern. Conference on Machine Learning and Data Mining (MLDM 2015). Springer, LNAI 9166, pp. 157-171.

Example 54 : Mining Minimal High-Utility Itemsets from a transaction database with utility information using the MinFHM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "MinFHM" algorithm, (2) select the input file "DB_utility.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 30 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run MinFHM DB_utility.txt output.txt 30 in a folder containing spmf.jar and the example input file DB_utility.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestMinFHM.java" in the package ca.pfv.SPMF.tests.

What is MinFHM?

MinFHM (Fournier-Viger et al., 2016) is an algorithm for discovering minimal high-utility itemsets in a transaction database containing utility information.

There has been a large amount of work on the topic of high-utility itemset mining in recent years. High-utility itemset mining consists of finding sets of items that yield a high profit in a database of customer transactions, where the purchase quantities of items in transactions are indicated and each item has a unit profit. Several algorithms have been proposed for high-utility itemset mining. However, they may find a huge number of patterns. These patterns are often very long and often represent rare cases since, in real life, few customers buy exactly the same large set of items. For marketing purposes, a retailer may be more interested in finding the smallest sets of items that generate a high profit, since it is easier to co-promote a small set of items targeted at many customers than a large set of items targeted at few customers. The MinFHM algorithm was designed to address this issue by discovering only the high-utility itemsets that are minimal.

A high-utility itemset is said to be minimal if it has no subset that is also a high-utility itemset. In terms of application to transaction databases, the concept of minimal high-utility itemsets can be understood as the smallest sets of items that yield a high profit. The concept of minimal high-utility itemset can also be understood as the opposite of the concept of maximal high-utility itemset proposed in other work.

This is the original implementation of MinFHM.

What is the input?

MinFHM takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 5 transactions (t1,t2...t5) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DB_utility.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 3 5 1 2 4 6 | 30 | 1 3 5 10 6 5
t2 | 3 5 2 4 | 20 | 3 3 8 6
t3 | 3 1 4 | 8 | 1 5 2
t4 | 3 5 1 7 | 27 | 6 6 10 5
t5 | 3 5 2 7 | 11 | 2 3 4 2

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 3, 5, 1, 2, 4 and 6. The amount of money spent for each item is respectively 1 $, 3 $, 5 $, 10 $, 6 $ and 5 $. The total amount of money spent in this transaction is 1 + 3 + 5 + 10 + 6 + 5 = 30 $.

What is the output?

The output of MinFHM is the set of minimal high utility itemsets having a utility no less than the min_utility threshold (a positive integer) set by the user. To explain what is a minimal high-utility itemset, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The utility of an itemset in a transaction is the sum of the utility of its items in the transaction. For example, the utility of the itemset {1 4} in transaction t1 is 5 + 6 = 11 and the utility of {1 4} in transaction t3 is 5 + 2 = 7. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {1 4} in the database is the utility of {1 4} in t1 plus the utility of {1 4} in t3, for a total of 11 + 7 = 18. A high utility itemset is an itemset such that its utility is no less than min_utility.
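The utility computations above can be reproduced directly from the definitions. The sketch below (an illustration only, not SPMF code) hard-codes the example database as item-to-utility maps and sums the utilities of an itemset over the transactions that contain it:

```java
import java.util.*;

// Illustration of the utility definitions above (not SPMF code).
public class ItemsetUtility {

    // Each transaction maps item -> utility of that item in the transaction,
    // taken from the example database (t1 .. t5).
    static final List<Map<Integer, Integer>> DB = List.of(
            Map.of(3, 1, 5, 3, 1, 5, 2, 10, 4, 6, 6, 5),   // t1
            Map.of(3, 3, 5, 3, 2, 8, 4, 6),                // t2
            Map.of(3, 1, 1, 5, 4, 2),                      // t3
            Map.of(3, 6, 5, 6, 1, 10, 7, 5),               // t4
            Map.of(3, 2, 5, 3, 2, 4, 7, 2));               // t5

    /** Sum, over every transaction containing all items of x, of the utilities of x's items. */
    public static int utility(Set<Integer> x) {
        int total = 0;
        for (Map<Integer, Integer> t : DB) {
            if (t.keySet().containsAll(x)) {
                for (int item : x) total += t.get(item);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(utility(Set.of(1, 4)));  // 11 (t1) + 7 (t3) = 18
    }
}
```

For instance, utility(Set.of(1, 4)) returns 11 + 7 = 18, as computed above.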

A minimal high utility itemset (MinHUI) is a high-utility itemset that has no subset that is also a high-utility itemset.

For example, if we run MinFHM with a minimum utility of 30, we obtain 2 minimal high-utility itemsets:

itemset | utility
{2, 4} | 30
{2, 5} | 31

If the database is a transaction database from a store, we could interpret these results as the smallest groups of items bought together that generated a profit of 30 $ or more.

Input file format

The input file format of MinFHM is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 5 1 2 4 6:30:1 3 5 10 6 5
3 5 2 4:20:3 3 8 6
3 1 4:8:1 5 2
3 5 1 7:27:6 6 10 5
3 5 2 7:11:2 3 4 2

Consider the first line. It means that the transaction {3, 5, 1, 2, 4, 6} has a total utility of 30 and that items 3, 5, 1, 2, 4 and 6 respectively have a utility of 1, 3, 5, 10, 6 and 5 in this transaction. The following lines follow the same format.
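A line in this three-section format is easy to split on the ":" separators. Below is a hypothetical parser (not part of SPMF) that also lets you check that the second section equals the sum of the third:

```java
import java.util.Arrays;

// Hypothetical helper (not part of SPMF): parses one line of the
// "items:transactionUtility:itemUtilities" format described above.
public class TransactionLineParser {

    /** Items listed in the first section of the line. */
    public static int[] items(String line) {
        return toInts(line.split(":")[0]);
    }

    /** Transaction utility from the second section. */
    public static int transactionUtility(String line) {
        return Integer.parseInt(line.split(":")[1]);
    }

    /** Per-item utilities from the third section. */
    public static int[] itemUtilities(String line) {
        return toInts(line.split(":")[2]);
    }

    private static int[] toInts(String section) {
        return Arrays.stream(section.trim().split("\\s+"))
                     .mapToInt(Integer::parseInt).toArray();
    }

    public static void main(String[] args) {
        String line = "3 5 1 2 4 6:30:1 3 5 10 6 5";
        int sum = Arrays.stream(itemUtilities(line)).sum();
        // The second section must equal the sum of the third section.
        System.out.println(transactionUtility(line) == sum);  // true
    }
}
```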

Output file format

The output file format of MinFHM is defined as follows. It is a text file, where each line represents a minimal high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears, followed by the support of the itemset. Then, the keyword "#UTIL:" appears, followed by the utility of the itemset. For example, we show below the output file for this example.

2 #SUP: 5 #UTIL: 22
1 #SUP: 5 #UTIL: 20
4 2 #SUP: 2 #UTIL: 30
1 5 #SUP: 2 #UTIL: 24

For example, the third line indicates that the itemset {2, 4} has a support of 2 transactions and a utility of 30 $. The following lines follow the same format.

Performance

High utility itemset mining is a more difficult problem than frequent itemset mining. Therefore, high-utility itemset mining algorithms are generally slower than frequent itemset mining algorithms. The MinFHM algorithm was proposed in 2016 to discover only the high-utility itemsets that are minimal. It was found that MinFHM can be orders of magnitude faster than algorithms such as FHM that mine all high-utility itemsets.

Implementation details

This is the original implementation of the MinFHM algorithm.

Where can I get more information about the MinFHM algorithm?

This is the reference of the article describing the MinFHM algorithm:

Fournier-Viger, P., Lin, C.W., Wu, C.-W., Tseng, V. S., Faghihi, U. (2016). Mining Minimal High-Utility Itemsets. Proc. 27th International Conference on Database and Expert Systems Applications (DEXA 2016). Springer, LNCS, 13 pages, to appear

Example 55 : Mining Skyline High-Utility Itemsets in a transaction database with utility information using the SkyMine Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "SkyMine" algorithm, (2) select the input file "SkyMineTransaction.txt", (3) set the output file name (e.g. "output.txt"), (4) set the name of the second input file to "SkyMineItemUtilities.txt" and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run SkyMine SkyMineTransaction.txt output.txt SkyMineItemUtilities.txt in a folder containing spmf.jar and the example input files SkyMineTransaction.txt and SkyMineItemUtilities.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestSkyMine_saveToFile" in the package ca.pfv.SPMF.tests.

What is SkyMine?

SkyMine (Goyal et al., 2015) is an algorithm for discovering skyline high-utility itemsets in a transaction database containing utility information.

This is the original implementation of SkyMine.

What is the input?

SkyMine takes as input a transaction database with purchase quantities and a table indicating the unit profit of items. Let's consider the following database consisting of 6 transactions (t1, t2, ..., t6) and 9 items (1, 2, 3, 4, 5, 6, 7, 8, 9). This database is provided in the text file "SkyMineTransaction.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction | Items | Item purchase quantities for this transaction
t1 | 1 3 4 8 | 1 1 1 1
t2 | 1 3 5 7 | 2 6 2 5
t3 | 1 2 3 4 5 6 | 1 2 1 6 1 5
t4 | 2 3 4 5 | 4 3 3 1
t5 | 2 3 5 7 | 2 2 1 2
t6 | 1 3 4 9 | 1 1 1 1

Each line of the database is:

  • a set of items (the first column of the table),
  • the purchase quantities of these items in this transaction (the second column of the table),

For example, the second line of the database indicates that in the second transaction, the items 1, 3, 5, and 7 were purchased respectively with quantities of 2, 6, 2, and 5.

Moreover, another table must be provided to indicate the unit profit of each item (how much profit is generated by the sale of one unit of each item). For example, consider the utility table provided in the file "SkyMineItemUtilities.txt" (below). The first line indicates that each unit sold of item 1 yields a profit of 5 $.

Item Utility (unit profit)
1 5
2 2
3 1
4 2
5 3
6 1
7 1
8 1
9 25

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer, named "t1", bought items 1, 3, 4 and 8, each with a purchase quantity of 1. Since these items respectively have a unit profit of 5 $, 1 $, 2 $ and 1 $, the total amount of money spent in this transaction is (1 × 5) + (1 × 1) + (1 × 2) + (1 × 1) = 9 $.

 

What is the output?

The output of SkyMine is the set of skyline high utility itemsets. To explain what is a skyline high-utility itemset, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The utility of an item in a transaction is the product of its purchase quantity in the transaction and its unit profit. For example, the utility of item 3 in transaction t2 is (6 × 1) = 6 $. The utility of an itemset in a transaction is the sum of the utilities of its items in the transaction. For example, the utility of the itemset {5 7} in transaction t2 is (2 × 3) + (5 × 1) = 11 $ and the utility of {5 7} in transaction t5 is (1 × 3) + (2 × 1) = 5 $. The utility of an itemset in a database is the sum of its utility in all transactions where it appears. For example, the utility of {5 7} in the database is the utility of {5 7} in t2 plus the utility of {5 7} in t5, for a total of 11 + 5 = 16 $. The utility of an itemset X is denoted as u(X). Thus u({5 7}) = 16 $.

The support of an itemset is the number of transactions that contain the itemset. For example, the support of the itemset {5 7} is sup({5 7}) = 2 transactions because it appears in transactions t2 and t5.

An itemset X is said to dominate another itemset Y if and only if sup(X) ≥ sup(Y) and u(X) > u(Y), or sup(X) > sup(Y) and u(X) ≥ u(Y).

A skyline high utility itemset is an itemset that is not dominated by another itemset in the transaction database.
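The dominance relation above is a simple predicate over the two measures. Below is a minimal sketch (not SPMF code) where the support and utility values of the two itemsets are passed in directly; the example numbers are illustrative, not taken from the database above:

```java
// Minimal sketch (not SPMF code) of the dominance relation defined above.
// supX/utilX and supY/utilY are the support and utility of itemsets X and Y.
public class Dominance {

    /** X dominates Y iff X is at least as good on both measures and strictly better on one. */
    public static boolean dominates(int supX, int utilX, int supY, int utilY) {
        return (supX >= supY && utilX > utilY) || (supX > supY && utilX >= utilY);
    }

    public static void main(String[] args) {
        System.out.println(dominates(3, 20, 2, 12));  // true: better on both measures
        System.out.println(dominates(2, 40, 3, 20));  // false: higher utility but lower support
        System.out.println(dominates(3, 20, 3, 20));  // false: equal measures, no strict improvement
    }
}
```

A skyline high utility itemset is then exactly one for which no other itemset makes this predicate true.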

For example, if we run SkyMine, we obtain 3 skyline high-utility itemsets:

itemset | utility
{3} | 14
{1, 3} | 34
{2, 3, 4, 5} | 40

If the database is a transaction database from a store, we could interpret these results as the itemsets that are not dominated by any other itemset in terms of selling frequency and utility.

Input file format

The input file format of the transaction file of SkyMine is defined as follows. It is a text file. Each line represents a transaction. Each transaction is a list of items separated by single spaces. Each item is a positive integer followed by ":" and its purchase quantity in the transaction. Note that it is assumed that items on each line are ordered according to some total order such as alphabetical order. For example, for the previous example, the input file SkyMineTransaction.txt is defined as follows:

1:1 3:1 4:1 8:1
1:2 3:6 5:2 7:5
1:1 2:2 3:1 4:6 5:1 6:5
2:4 3:3 4:3 5:1
2:2 3:2 5:1 7:2
1:1 3:1 4:1 9:1

For example, the second line indicates that the items 1, 3, 5 and 7 respectively have a purchase quantity of 2, 6, 2 and 5 in that transaction.

The input format of the second file, indicating the utility (unit profit) of each item, is defined as follows. Each line is an item, followed by a space, followed by the unit profit of the item. For example, consider the content of the file "SkyMineItemUtilities.txt", shown below. The first line indicates that the item 1 has a unit profit of 5$. The other lines follow the same format.

1 5
2 2
3 1
4 2
5 3
6 1
7 1
8 1
9 25

Output file format

The output file format of SkyMine is defined as follows. It is a text file, where each line represents a skyline high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. Then, the keyword "#UTIL:" appears, followed by the utility of the itemset. For example, we show below the output file for this example.

3 #UTIL: 14
1 3 #UTIL: 34
2 3 4 5 #UTIL: 40

For example, the third line indicates that the itemset {2, 3, 4, 5} has a utility of 40 $. The other lines follow the same format.

Performance

SkyMine is the original algorithm for mining skyline high-utility itemsets.

Where can I get more information about the algorithm?

This is the reference of the article describing the algorithm:

Goyal, V., Sureka, A., & Patel, D. (2015). Efficient Skyline Itemsets Mining. In Proceedings of the Eighth International C* Conference on Computer Science & Software Engineering (pp. 119-124). ACM.

Example 56 : Mining High-Utility Sequential Patterns from a Sequence Database with utility information using the USpan Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "USpan" algorithm, (2) select the input file "DataBase_HUSRM.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minimum utility to 35, (5) set the maximum pattern length to 4, and (6) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run USpan DataBase_HUSRM.txt output.txt 35 4 in a folder containing spmf.jar and the example input file DataBase_HUSRM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestUSpan.java" in the package ca.pfv.SPMF.tests.

What is USpan?

USpan (Yin et al., 2012) is a famous algorithm for discovering high-utility sequential patterns in a sequence database containing utility information.

A typical example of a sequence database with utility information is a database of customer transactions containing sequences of transactions performed by customers, where each transaction is a set of items annotated with the profit generated by the sale of these items. The goal of high-utility sequential pattern mining is to find patterns of the form A, B, C, meaning that several customers have bought item A, followed by item B, followed by item C, and that this pattern generated a high profit. Although this algorithm is designed for the scenario of sequences of transactions, the task is general and could be applied to other types of data such as sequences of webpages visited by users on a website, where the sale profit is replaced by the time spent on webpages.

A limitation of the problem of high-utility sequential pattern mining is that patterns are selected only based on the profit that they generate; there is no measure of the confidence that these patterns will be followed. For example, a pattern A, B, C may have a high utility even though most customers buy items A and B without buying C. An alternative that addresses this problem is high-utility sequential rule mining, which discovers rules of the form A -> BC with a confidence (conditional probability). The HUSRM algorithm, also offered in SPMF, finds such high-utility sequential rules.

What is the input?

USpan takes as input a sequence database with utility information, a minimum utility threshold min_utility (a positive integer), and optionally a maximum pattern length parameter (a positive integer) indicating the maximum number of items that a pattern may contain.

Let's consider the following sequence database consisting of 4 sequences of transactions (s1,s2, s3, s4) and 7 items (1, 2, 3, 4, 5, 6, 7). This database is provided in the text file "DataBase_HUSRM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.


Sequence | Transactions | Sequence utility
s1 | {1[1],2[4]},{3[10]},{6[9]},{7[2]},{5[1]} | 27
s2 | {1[1],4[12]},{3[20]},{2[4]},{5[1],7[2]} | 40
s3 | {1[1]},{2[4]},{6[9]},{5[1]} | 15
s4 | {1[3],2[4],3[5]},{6[3],7[1]} | 16

Each line of the database is a sequence:

  • each sequence is an ordered list of transactions, where transactions are enclosed by { } in this example,
  • each transaction contains a set of items represented by integers,
  • each item is annotated with a utility value (e.g. sale profit), indicated between square brackets [ ],
  • the sum of the utilities (e.g. profit) of all items in the sequence is also indicated (the "sequence utility" column).

Note that this representation of the input database is not exactly the same as in the paper about USpan. However, it is equivalent.

What are real-life examples of such a database? A typical example is a database containing sequences of customer transactions. Imagine that each sequence represents the transactions made by a customer. The first customer, named "s1", bought items 1 and 2, and those items respectively generated a profit of 1 $ and 4 $. Then, the customer bought item 3 for 10 $. Then, the customer bought item 6 for 9 $. Then, the customer bought item 7 for 2 $. Then, the customer bought item 5 for 1 $.

What is the output?

The output of USpan is the set of high utility sequential patterns meeting the criteria specified by the user.

A sequential pattern is a sequence of itemsets X1, X2, ..., Xk, where X1, X2, ..., Xk are itemsets (sets of items). A sequential pattern is said to occur in another sequence SB = Y1, Y2, ..., Ym, where Y1, Y2, ..., Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 is a subset of Yi1, X2 is a subset of Yi2, ..., Xk is a subset of Yik.
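This containment test can be implemented with a single greedy pass over the sequence: match each itemset of the pattern against the earliest following itemset of the sequence that contains it. A minimal sketch (not SPMF code):

```java
import java.util.*;

// Sketch (not SPMF code) of the containment test defined above: the
// pattern's itemsets must map, in order, to itemsets of the sequence
// that contain them.
public class Containment {

    public static boolean occursIn(List<Set<Integer>> pattern, List<Set<Integer>> sequence) {
        int i = 0;  // index of the next pattern itemset to match
        for (Set<Integer> itemset : sequence) {
            if (i < pattern.size() && itemset.containsAll(pattern.get(i))) {
                i++;  // greedy earliest matching is sufficient for plain containment
            }
        }
        return i == pattern.size();
    }

    public static void main(String[] args) {
        // Sequence s2 of the example: {1,4},{3},{2},{5,7}
        List<Set<Integer>> s2 = List.of(Set.of(1, 4), Set.of(3), Set.of(2), Set.of(5, 7));
        System.out.println(occursIn(List.of(Set.of(3), Set.of(7)), s2));  // true
        System.out.println(occursIn(List.of(Set.of(7), Set.of(3)), s2));  // false
    }
}
```

Matching each pattern itemset as early as possible is safe here: if any embedding exists, matching earlier only leaves more of the sequence available for the remaining itemsets.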

The utility (profit) of a sequential pattern is the sum of the maximum utility (profit) generated by the pattern in each sequence where it appears. For example, the pattern (3)(7) appears in sequences s1, s2 and s4. In s1, the profit generated by that pattern is 10 + 2 = 12 $. In s2, the profit generated by that pattern is 20 + 2 = 22 $. In s4, the profit generated by that pattern is 5 + 1 = 6 $. Thus, the total utility of that pattern in the database is 12 + 22 + 6 = 40 $.

The USpan algorithm returns all high-utility sequential patterns, such that each pattern satisfies the two following criteria:

  • the utility of the pattern in the database is no less than the minimum utility threshold set by the user,
  • the pattern contains no more items than the maximum pattern length specified by the user.

For example, if we run USpan with a minimum utility of 35 and a maximum pattern length of 4 items, we obtain 9 high-utility sequential patterns:

pattern | utility
(1, 4) (3) (2) | 37
(1, 4) (3) (7) | 35
(1) (3) (7) | 36
(3) | 35
(3) (7) | 40
(4) (3) (2) | 36
(4) (3) (2) (5) | 37
(4) (3) (2) (7) | 38
(4) (3) (5, 7) | 35

If the database is a transaction database from a store, we could interpret these results as patterns representing the purchasing behavior of customers that generate a high profit. For example, the pattern (3)(7) means that customers bought item 3 and then bought item 7, and that this pattern generated a total profit of 40 $.

Input file format

The input file format of USpan is defined as follows. It is a text file.

  • Each line represents a sequence of transactions.
  • Transactions within a sequence are separated by the keyword -1.
  • A transaction is a list of items (positive integers) separated by single spaces, where each item is annotated with the sale profit that it generated, indicated between square brackets [ ]. The sale profit is a positive integer.
  • In a transaction, it is assumed that items are sorted according to some order (e.g. alphabetical order).
  • Each sequence ends with the keyword "-2", followed by the keyword "SUtility:" and the sum of the utility (profit) of all items in that sequence.

For example, for the previous example, the input file is defined as follows:

1[1] 2[4] -1 3[10] -1 6[9] -1 7[2] -1 5[1] -1 -2 SUtility:27
1[1] 4[12] -1 3[20] -1 2[4] -1 5[1] 7[2] -1 -2 SUtility:40
1[1] -1 2[4] -1 6[9] -1 5[1] -1 -2 SUtility:15
1[3] 2[4] 3[5] -1 6[3] 7[1] -1 -2 SUtility:16

For example, consider the first line. It means that the first customer bought items 1 and 2, and those items respectively generated a profit of 1 $ and 4 $. Then, the customer bought item 3 for 10 $. Then, the customer bought item 6 for 9 $. Then, the customer bought item 7 for 2 $. Then, the customer bought item 5 for 1 $. Thus, this customer made 5 transactions. The total utility (profit) generated by that sequence of transactions is 1 $ + 4 $ + 10 $ + 9 $ + 2 $ + 1 $ = 27 $.

Output file format

The output file format of USpan is defined as follows. It is a text file, where each line represents a high utility sequential pattern. Each line first indicates the sequential pattern, which is a list of itemsets. Each itemset is represented by a list of positive integers and is followed by -1. Then, the keyword "#UTIL:" appears, followed by the utility of the sequential pattern.

1 4 -1 3 -1 2 -1 #UTIL: 37
1 4 -1 3 -1 7 -1 #UTIL: 35
1 -1 3 -1 7 -1 #UTIL: 36
3 -1 #UTIL: 35
3 -1 7 -1 #UTIL: 40
4 -1 3 -1 2 -1 #UTIL: 36
4 -1 3 -1 2 -1 5 -1 #UTIL: 37
4 -1 3 -1 2 -1 7 -1 #UTIL: 38
4 -1 3 -1 5 7 -1 #UTIL: 35

For example, the first line represents the pattern of buying items 1 and 4 together, then buying item 3, then buying item 2. This pattern has a total utility of 37, meaning that it generated a 37 $ profit. The other lines follow the same format.
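Such output lines can be parsed back by splitting on the -1 markers and the "#UTIL:" keyword. A hypothetical helper (not part of SPMF):

```java
import java.util.*;

// Hypothetical helper (not part of SPMF): parses one line of the USpan
// output format into its list of itemsets and its utility.
public class PatternLineParser {

    /** The sequential pattern as a list of itemsets. */
    public static List<List<Integer>> itemsets(String line) {
        List<List<Integer>> result = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (String token : line.split("#UTIL:")[0].trim().split("\\s+")) {
            if (token.equals("-1")) {          // ends the current itemset
                result.add(current);
                current = new ArrayList<>();
            } else {
                current.add(Integer.parseInt(token));
            }
        }
        return result;
    }

    /** The utility announced after the "#UTIL:" keyword. */
    public static int utility(String line) {
        return Integer.parseInt(line.split("#UTIL:")[1].trim());
    }

    public static void main(String[] args) {
        String line = "1 4 -1 3 -1 2 -1 #UTIL: 37";
        System.out.println(itemsets(line) + " utility " + utility(line));
        // [[1, 4], [3], [2]] utility 37
    }
}
```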

Performance

High utility sequential pattern mining is a more difficult problem than sequential pattern mining. Therefore, high-utility sequential pattern mining algorithms are generally slower than sequential pattern mining algorithms. For this reason, it is wise to use the optional maximum pattern length constraint when using USpan, to reduce the size of the search space and thus the number of patterns found.

It is also worth noting that the USpan paper does not compare the algorithm's performance with that of previous algorithms for high-utility sequential pattern mining.

Where can I get more information about the USpan algorithm?

This is the article describing the USpan algorithm:

Yin, Junfu, Zhigang Zheng, and Longbing Cao. "USpan: an efficient algorithm for mining high utility sequential patterns." Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012.

Example 57 : Mining High-Utility Itemsets based on Particle Swarm Optimization with the HUIM-BPSO Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUIM-BPSO" algorithm, (2) select the input file "contextHUIM.txt", (3) set the minutil parameter to 40, (4) set the output file name (e.g. "output.txt") and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command: java -jar spmf.jar run HUIM-BPSO contextHUIM.txt output.txt 40 in a folder containing spmf.jar and the example input file contextHUIM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUIM_BPSO.java" in the package ca.pfv.SPMF.tests.

What is HUIM-BPSO?

HUIM-BPSO is an algorithm for discovering high utility itemsets (HUIs), i.e. itemsets that have a utility no less than a minimum utility threshold, in a transaction database. The HUIM-BPSO algorithm discovers HUIs using binary particle swarm optimization (BPSO).

What is the input?

HUIM-BPSO takes as input a transaction database with utility information and a minimum utility threshold min_utility (a positive integer). Let's consider the following database consisting of 7 transactions (t1, t2, ..., t7) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextHUIM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction | Items | Transaction utility | Item utilities for this transaction
t1 | 2 3 4 | 9 | 2 2 5
t2 | 1 2 3 4 5 | 18 | 4 2 3 5 4
t3 | 1 3 4 | 11 | 4 2 5
t4 | 3 4 5 | 11 | 2 5 4
t5 | 1 2 4 5 | 22 | 5 4 5 8
t6 | 1 2 3 4 | 17 | 3 8 1 5
t7 | 4 5 | 9 | 5 4

Each line of the database is:

    • a set of items (the first column of the table),
    • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
    • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.

What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer, named "t1", bought items 2, 3 and 4. The amount of money spent for each item is respectively 2 $, 2 $ and 5 $. The total amount of money spent in this transaction is 2 + 2 + 5 = 9 $.

What is the output?

The output of HUIM-BPSO is the set of high utility itemsets. An itemset X in a database D is a high-utility itemset (HUI) if and only if its utility is no less than the minimum utility threshold (a positive integer) set by the user.
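On a database this small, the set of high utility itemsets can be checked by exhaustive enumeration over all itemsets, which is the same search space that HUIM-BPSO explores heuristically. A brute-force sketch (an illustration only, not SPMF code) over the five items of the example:

```java
import java.util.*;

// Brute-force baseline (not SPMF code): enumerate every itemset over
// items 1..5 and keep those whose utility reaches minutil. HUIM-BPSO
// searches this same space with binary particle swarm optimization
// instead of exhaustively.
public class BruteForceHUI {

    // Each transaction maps item -> utility of that item (t1 .. t7 above).
    static final List<Map<Integer, Integer>> DB = List.of(
            Map.of(2, 2, 3, 2, 4, 5),                 // t1
            Map.of(1, 4, 2, 2, 3, 3, 4, 5, 5, 4),     // t2
            Map.of(1, 4, 3, 2, 4, 5),                 // t3
            Map.of(3, 2, 4, 5, 5, 4),                 // t4
            Map.of(1, 5, 2, 4, 4, 5, 5, 8),           // t5
            Map.of(1, 3, 2, 8, 3, 1, 4, 5),           // t6
            Map.of(4, 5, 5, 4));                      // t7

    static int utility(Set<Integer> x) {
        int total = 0;
        for (Map<Integer, Integer> t : DB)
            if (t.keySet().containsAll(x))
                for (int item : x) total += t.get(item);
        return total;
    }

    /** All high-utility itemsets over items 1..5 for the given threshold. */
    public static Map<Set<Integer>, Integer> mine(int minutil) {
        Map<Set<Integer>, Integer> result = new HashMap<>();
        for (int mask = 1; mask < (1 << 5); mask++) {  // each bitmask encodes one itemset
            Set<Integer> itemset = new TreeSet<>();
            for (int item = 1; item <= 5; item++)
                if ((mask & (1 << (item - 1))) != 0) itemset.add(item);
            int u = utility(itemset);
            if (u >= minutil) result.put(itemset, u);
        }
        return result;
    }

    public static void main(String[] args) {
        // With minutil = 40, only {4,5} (utility 40) and {1,2,4} (utility 41) qualify.
        System.out.println(mine(40));
    }
}
```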

For example, if we run HUIM-BPSO with a minimum utility threshold of 40, we obtain 2 high utility itemsets:


itemset | utility
{4, 5} | 40
{1, 2, 4} | 41

Input file format

The input file format of HUIM-BPSO is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

2 3 4:9:2 2 5
1 2 3 4 5:18:4 2 3 5 4
1 3 4:11:4 2 5
3 4 5:11:2 5 4
1 2 4 5:22:5 4 5 8
1 2 3 4:17:3 8 1 5
4 5:9:5 4

Consider the first line. It means that the transaction {2, 3, 4} has a total utility of 9 and that items 2, 3 and 4 respectively have a utility of 2, 2 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of HUIM-BPSO is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#UTIL:" appears, followed by the utility of the itemset. For example, we show below the output file for this example.

4 5 #UTIL: 40
1 2 4 #UTIL: 41

For example, the first line indicates that the itemset {4, 5} is a high utility itemset with a utility of 40. The following lines follow the same format.

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HUIM-BPSO. Note that the input format is not exactly the same as described in the original article. But it is equivalent.

Where can I get more information about the HUIM-BPSO algorithm?

This is the reference of the article describing the HUIM-BPSO algorithm:

Jerry Chun-Wei Lin, Lu Yang, Philippe Fournier-Viger, Ming-Thai Wu, Tzung-Pei Hong, Leon Shyue-Liang Wang, and Justin Zhan, “Mining High-Utility Itemsets based on Particle Swarm Optimization,” Engineering Applications of Artificial Intelligence, Vol. 55, pp: 320-330, 2016.

Example 58 : Mining High-Utility Itemsets based on Particle Swarm Optimization with the HUIM-BPSO-tree Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUIM-BPSO-tree" algorithm, (2) select the input file "contextHUIM.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minutil threshold to 40, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUIM-BPSO-tree contextHUIM.txt output.txt 40
    in a folder containing spmf.jar and the example input contextHUIM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUIM_BPSO_tree.java" in the package ca.pfv.SPMF.tests.

What is HUIM-BPSO-tree?

HUIM-BPSO-tree is an algorithm for discovering high utility itemsets (HUIs), that is, itemsets having a utility value no less than a minimum utility threshold, in a transaction database. The HUIM-BPSO-tree algorithm discovers HUIs using the binary particle swarm optimization (BPSO) algorithm together with an OR/NOR-tree structure that avoids generating invalid combinations, which improves the efficiency of discovering HUIs.

What is the input?

HUIM-BPSO-tree takes as input a transaction database with utility information. Let's consider the following database consisting of 7 transactions (t1,t2, ..., t7) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextHUIM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction   Items        Transaction utility   Item utilities for this transaction
t1            2 3 4        9                     2 2 5
t2            1 2 3 4 5    18                    4 2 3 5 4
t3            1 3 4        11                    4 2 5
t4            3 4 5        11                    2 5 4
t5            1 2 4 5      22                    5 4 5 8
t6            1 2 3 4      17                    3 8 1 5
t7            4 5          9                     5 4

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.
What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 2, 3 and 4. The amount of money spent for each item is respectively 2 $, 2 $ and 5 $. The total amount of money spent in this transaction is 2 + 2 + 5 = 9 $.

What is the output?

The output of HUIM-BPSO-tree is the set of high utility itemsets. An itemset X in a database D is a high-utility itemset (HUI) iff its utility is no less than the minimum utility threshold. For example, if we run HUIM-BPSO-tree and set the minimum utility threshold to 40, we obtain 2 high utility itemsets.


itemsets    utility
{4, 5}      40
{1, 2, 4}   41
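To make the utility computation concrete, here is a minimal self-contained sketch (plain Java, not SPMF code; the class and method names are illustrative) that computes the utility of an itemset on the example database, reproducing the two values above.

```java
import java.util.*;

// Sketch (not SPMF code): the utility of an itemset is the sum, over the
// transactions that contain all of its items, of the utilities of those
// items in each transaction.
public class UtilityCheck {

    // Each transaction maps an item to its utility in that transaction.
    static List<Map<Integer, Integer>> db() {
        List<Map<Integer, Integer>> db = new ArrayList<>();
        db.add(tx(new int[]{2, 3, 4}, new int[]{2, 2, 5}));
        db.add(tx(new int[]{1, 2, 3, 4, 5}, new int[]{4, 2, 3, 5, 4}));
        db.add(tx(new int[]{1, 3, 4}, new int[]{4, 2, 5}));
        db.add(tx(new int[]{3, 4, 5}, new int[]{2, 5, 4}));
        db.add(tx(new int[]{1, 2, 4, 5}, new int[]{5, 4, 5, 8}));
        db.add(tx(new int[]{1, 2, 3, 4}, new int[]{3, 8, 1, 5}));
        db.add(tx(new int[]{4, 5}, new int[]{5, 4}));
        return db;
    }

    static Map<Integer, Integer> tx(int[] items, int[] utils) {
        Map<Integer, Integer> t = new LinkedHashMap<>();
        for (int i = 0; i < items.length; i++) t.put(items[i], utils[i]);
        return t;
    }

    static int utility(Set<Integer> itemset) {
        int total = 0;
        for (Map<Integer, Integer> t : db()) {
            if (t.keySet().containsAll(itemset)) {
                for (int item : itemset) total += t.get(item);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(utility(Set.of(4, 5)));    // 40
        System.out.println(utility(Set.of(1, 2, 4))); // 41
    }
}
```

With minutil = 40, both itemsets qualify as high utility itemsets.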

Input file format

The input file format is defined as follows. It is a text file, where each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:
2 3 4:9:2 2 5
1 2 3 4 5:18:4 2 3 5 4
1 3 4:11:4 2 5
3 4 5:11:2 5 4
1 2 4 5:22:5 4 5 8
1 2 3 4:17:3 8 1 5
4 5:9:5 4

Consider the first line. It means that the transaction {2, 3, 4} has a total utility of 9 and that items 2, 3 and 4 respectively have a utility of 2, 2 and 5 in this transaction. The following lines follow the same format.
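The three sections of a line can be parsed with a few lines of code. The sketch below (plain Java, not SPMF's parser; names are illustrative) splits a line on ":" and also checks the property mentioned earlier, namely that the transaction utility equals the sum of the item utilities.

```java
// Sketch (not SPMF code): parsing one line of the "items:TU:utilities" format.
public class UtilityLineParser {

    static int[] items;      // items of the last parsed transaction
    static int tu;           // its transaction utility
    static int[] utilities;  // the utility of each item

    static void parse(String line) {
        String[] sections = line.split(":");
        String[] itemTokens = sections[0].trim().split(" ");
        String[] utilTokens = sections[2].trim().split(" ");
        items = new int[itemTokens.length];
        utilities = new int[utilTokens.length];
        tu = Integer.parseInt(sections[1].trim());
        int sum = 0;
        // The format guarantees one utility value per item.
        for (int i = 0; i < itemTokens.length; i++) {
            items[i] = Integer.parseInt(itemTokens[i]);
            utilities[i] = Integer.parseInt(utilTokens[i]);
            sum += utilities[i];
        }
        if (sum != tu) {
            throw new IllegalArgumentException("transaction utility must be the sum of item utilities");
        }
    }

    public static void main(String[] args) {
        parse("2 3 4:9:2 2 5");
        System.out.println(java.util.Arrays.toString(items) + " tu=" + tu); // [2, 3, 4] tu=9
    }
}
```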

Output file format

The output file format for high utility itemsets is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, we show below the output file for this example.
4 5 #UTIL: 40
1 2 4 #UTIL: 41

For example, the first line indicates that the itemset {4, 5} is a high utility itemset with a utility equal to 40. The following lines follow the same format.

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HUIM-BPSO-tree. Note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the HUIM-BPSO-tree algorithm?

This is the reference of the article describing the HUIM-BPSO-tree algorithm:

Jerry Chun-Wei Lin, Lu Yang, Philippe Fournier-Viger, Tzung-Pei Hong, and Miroslav Voznak, “A Binary PSO Approach to Mine High-Utility Itemsets,” Soft Computing, pp: 1-19, 2016.

Example # 59: Discovery of High Utility Itemsets Using a Genetic Algorithm with the HUIM-GA algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUIM-GA" algorithm, (2) select the input file "contextHUIM.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minutil parameter to 40, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUIM-GA contextHUIM.txt output.txt 40
    in a folder containing spmf.jar and the example input contextHUIM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUIM_GA.java" in the package ca.pfv.SPMF.tests.

What is HUIM-GA?

HUIM-GA is an algorithm for discovering high utility itemsets (HUIs), that is, itemsets having a utility value no less than a minimum utility threshold, in a transaction database. The HUIM-GA algorithm discovers HUIs using a genetic algorithm (GA).

What is the input?

HUIM-GA takes as input a transaction database with utility information. Let's consider the following database consisting of 7 transactions (t1,t2, ..., t7) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextHUIM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction   Items        Transaction utility   Item utilities for this transaction
t1            2 3 4        9                     2 2 5
t2            1 2 3 4 5    18                    4 2 3 5 4
t3            1 3 4        11                    4 2 5
t4            3 4 5        11                    2 5 4
t5            1 2 4 5      22                    5 4 5 8
t6            1 2 3 4      17                    3 8 1 5
t7            4 5          9                     5 4

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.
What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 2, 3 and 4. The amount of money spent for each item is respectively 2 $, 2 $ and 5 $. The total amount of money spent in this transaction is 2 + 2 + 5 = 9 $.

What is the output?

The output of HUIM-GA is the set of high utility itemsets. An itemset X in a database D is a high-utility itemset (HUI) if and only if its utility is no less than the minimum utility threshold. For example, if we run HUIM-GA and set the minimum utility threshold to 40, we obtain 2 high utility itemsets.


itemsets    utility
{4, 5}      40
{1, 2, 4}   41

Input file format

The input file format is defined as follows. It is a text file, where each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:
2 3 4:9:2 2 5
1 2 3 4 5:18:4 2 3 5 4
1 3 4:11:4 2 5
3 4 5:11:2 5 4
1 2 4 5:22:5 4 5 8
1 2 3 4:17:3 8 1 5
4 5:9:5 4

Consider the first line. It means that the transaction {2, 3, 4} has a total utility of 9 and that items 2, 3 and 4 respectively have a utility of 2, 2 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format for high utility itemsets is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, we show below the output file for this example.
4 5 #UTIL: 40
1 2 4 #UTIL: 41

For example, the first line indicates that the itemset {4, 5} is a high utility itemset with a utility equal to 40. The following lines follow the same format.

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HUIM-GA. Note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the HUIM-GA algorithm?

This is the reference of the article describing the HUIM-GA algorithm:

Kannimuthu S, Premalatha K, “Discovery of High Utility Itemsets Using Genetic Algorithm with Ranked Mutation,” Applied Artificial Intelligence, 2014, 28(4): 337-359.

Example # 60: Discovery of High Utility Itemsets Using a Genetic Algorithm with the HUIM-GA-tree algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HUIM-GA-tree" algorithm, (2) select the input file "contextHUIM.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minutil parameter to 40, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HUIM-GA-tree contextHUIM.txt output.txt 40
    in a folder containing spmf.jar and the example input contextHUIM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHUIM_GA_tree.java" in the package ca.pfv.SPMF.tests.

What is HUIM-GA-tree?

HUIM-GA-tree is an algorithm for discovering high utility itemsets (HUIs), that is, itemsets having a utility value no less than a minimum utility threshold, in a transaction database. The HUIM-GA-tree algorithm discovers HUIs using a genetic algorithm (GA) and an OR/NOR-tree structure.

What is the input?

HUIM-GA-tree takes as input a transaction database with utility information. Let's consider the following database consisting of 7 transactions (t1,t2, ..., t7) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextHUIM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction   Items        Transaction utility   Item utilities for this transaction
t1            2 3 4        9                     2 2 5
t2            1 2 3 4 5    18                    4 2 3 5 4
t3            1 3 4        11                    4 2 5
t4            3 4 5        11                    2 5 4
t5            1 2 4 5      22                    5 4 5 8
t6            1 2 3 4      17                    3 8 1 5
t7            4 5          9                     5 4

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.
What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 2, 3 and 4. The amount of money spent for each item is respectively 2 $, 2 $ and 5 $. The total amount of money spent in this transaction is 2 + 2 + 5 = 9 $.

What is the output?

The output of HUIM-GA-tree is the set of high utility itemsets. An itemset X in a database D is a high-utility itemset (HUI) iff its utility is no less than the minimum utility threshold. For example, if we run HUIM-GA-tree and set the minimum utility threshold to 40, we obtain 2 high utility itemsets.


itemsets    utility
{4, 5}      40
{1, 2, 4}   41

Input file format

The input file format is defined as follows. It is a text file, where each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:
2 3 4:9:2 2 5
1 2 3 4 5:18:4 2 3 5 4
1 3 4:11:4 2 5
3 4 5:11:2 5 4
1 2 4 5:22:5 4 5 8
1 2 3 4:17:3 8 1 5
4 5:9:5 4

Consider the first line. It means that the transaction {2, 3, 4} has a total utility of 9 and that items 2, 3 and 4 respectively have a utility of 2, 2 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format for high utility itemsets is defined as follows. It is a text file, where each line represents a high utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #UTIL: " appears, followed by the utility of the itemset. For example, we show below the output file for this example.
4 5 #UTIL: 40
1 2 4 #UTIL: 41

For example, the first line indicates that the itemset {4, 5} is a high utility itemset with a utility equal to 40. The following lines follow the same format.

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HUIM-GA-tree. Note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the HUIM-GA-tree algorithm?

The HUIM-GA-tree algorithm is a combination of the HUIM-GA algorithm and the OR/NOR-tree structure. The reference of the article describing the original HUIM-GA algorithm:

Kannimuthu S, Premalatha K, “Discovery of High Utility Itemsets Using Genetic Algorithm with Ranked Mutation,” Applied Artificial Intelligence, 2014, 28(4): 337-359.

The HUIM-GA-tree algorithm with OR/NOR-tree structure is described in:

Jerry Chun-Wei Lin, Lu Yang, Philippe Fournier-Viger, Tzung-Pei Hong, and Miroslav Voznak, “A Binary PSO Approach to Mine High-Utility Itemsets,” Soft Computing, pp: 1-19, 2016.

 

Example 61 : Mining All Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "FPGrowth_association_rules" algorithm, (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup to 50% and minconf to 60%, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FPGrowth_association_rules contextIGB.txt output.txt 50% 60% in a folder containing spmf.jar and the example input file contextIGB.txt.
  • If you are using the source code version of SPMF, launch this file "MainTestAllAssociationRules_FPGrowth_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

It is an algorithm for discovering all association rules in a transaction database, following the two steps approach proposed by Agrawal & Srikant (1993). The first step is to discover frequent itemsets. The second step is to generate rules by using the frequent itemsets. The main difference with Agrawal & Srikant in this implementation is that FPGrowth is used to generate frequent itemsets instead of Apriori because FPGrowth is more efficient.
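The second step can be sketched as follows (plain Java, not SPMF's implementation; names are illustrative): each frequent itemset I is split in every possible way into a non-empty antecedent A and consequent I \ A, producing candidate rules; a real implementation then keeps only the candidates whose confidence, computed from the itemset supports found in the first step, is at least minconf.

```java
import java.util.*;

// Sketch (not SPMF code): enumerating the candidate rules of one frequent
// itemset by splitting it into an antecedent and a consequent (Step 2).
public class RuleGeneration {

    static List<String> rulesOf(List<Integer> itemset) {
        List<String> rules = new ArrayList<>();
        int n = itemset.size();
        // Each bitmask selects a non-empty proper subset as the antecedent.
        for (int mask = 1; mask < (1 << n) - 1; mask++) {
            List<Integer> antecedent = new ArrayList<>();
            List<Integer> consequent = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) antecedent.add(itemset.get(i));
                else consequent.add(itemset.get(i));
            }
            rules.add(antecedent + " ==> " + consequent);
        }
        return rules;
    }

    public static void main(String[] args) {
        // A frequent itemset of size k yields 2^k - 2 candidate rules.
        for (String rule : rulesOf(List.of(1, 2, 5))) System.out.println(rule);
    }
}
```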

What is the input?

The input is a transaction database (aka binary context) and two thresholds named minsup (a value between 0 and 1) and minconf (a value between 0 and 1).

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t5, t6) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output?

The output of an association rule mining algorithm is a set of association rules respecting the user-specified minsup and minconf thresholds. To explain how this algorithm works, it is necessary to review some definitions. An association rule X==>Y is a relationship between two itemsets (sets of items) X and Y such that the intersection of X and Y is empty. The support of a rule is the number of transactions that contain X∪Y. The confidence of a rule is the number of transactions that contain X∪Y divided by the number of transactions that contain X.

If we apply an association rule mining algorithm, it will return all the rules having a support and confidence respectively no less than minsup and minconf.

For example, by applying the algorithm with minsup = 0.5 (50%) and minconf = 0.6 (60%), we obtain 55 association rules (run the example in the SPMF distribution to see the result).
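The two measures are straightforward to compute directly. The sketch below (plain Java, not SPMF code; names are illustrative) evaluates a rule on the transaction database above.

```java
import java.util.*;

// Sketch (not SPMF code): support and confidence of a rule X ==> Y
// on the example database contextIGB.
public class RuleMeasures {

    static final List<Set<Integer>> DB = List.of(
            Set.of(1, 2, 4, 5), Set.of(2, 3, 5), Set.of(1, 2, 4, 5),
            Set.of(1, 2, 3, 5), Set.of(1, 2, 3, 4, 5), Set.of(2, 3, 4));

    // Number of transactions containing every item of the itemset.
    static int count(Set<Integer> itemset) {
        int c = 0;
        for (Set<Integer> t : DB) if (t.containsAll(itemset)) c++;
        return c;
    }

    // support(X ==> Y) = number of transactions containing X ∪ Y
    static int support(Set<Integer> x, Set<Integer> y) {
        Set<Integer> union = new HashSet<>(x);
        union.addAll(y);
        return count(union);
    }

    // confidence(X ==> Y) = support(X ==> Y) / count(X)
    static double confidence(Set<Integer> x, Set<Integer> y) {
        return support(x, y) / (double) count(x);
    }

    public static void main(String[] args) {
        System.out.println(support(Set.of(1), Set.of(2, 4, 5)));    // 3
        System.out.println(confidence(Set.of(1), Set.of(2, 4, 5))); // 0.75
    }
}
```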

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items of the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are first listed. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed. Each item is represented by an integer, followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer. Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here are a few lines from the output file for this example:

1 ==> 2 4 5 #SUP: 3 #CONF: 0.75
5 ==> 1 2 4 #SUP: 3 #CONF: 0.6
4 ==> 1 2 5 #SUP: 3 #CONF: 0.75

For example, the first line indicates that the association rule {1} --> {2, 4, 5} has a support of 3 transactions and a confidence of 75 %. The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Implementation details

Association rule mining is traditionally performed in two steps: (1) mining frequent itemsets and (2) generating association rules from the frequent itemsets. In this implementation, we use the FPGrowth algorithm for Step 1 because it is very efficient. For Step 2, we use the algorithm that was proposed by Agrawal & Srikant (1994).

Note that SPMF also offers the alternative of choosing Apriori instead of FPGrowth for Step 1. This is called the "Apriori_association_rules" algorithm in the graphical user interface or command line interface.

Lastly, note that we also offer the alternative of choosing CFPGrowth++ instead of FPGrowth for Step 1. This is called the "CFPGrowth++_association_rules" algorithm in the graphical user interface or command line interface. CFPGrowth++ allows using multiple minimum support thresholds instead of a single minsup threshold, so the input and output are slightly different (see the example about CFPGrowth++ for more details about this algorithm).

Where can I get more information about this algorithm?

The following technical report published in 1994 describes how to generate association rules from frequent itemsets (Step 2):

R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. Research Report RJ 9839, IBM Almaden Research Center, San Jose, California, June 1994.

You can also read chapter 6 of the book "Introduction to Data Mining", which provides a nice and easy-to-understand introduction to discovering frequent itemsets and generating association rules.

The following article describes the FPGrowth algorithm for mining frequent itemsets:

Jiawei Han, Jian Pei, Yiwen Yin, Runying Mao: Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Min. Knowl. Discov. 8(1): 53-87 (2004)

Example # 62: Mining Skyline Frequent-Utility Patterns using the SFUPMinerUemax algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "SFUPMinerUemax" algorithm, (2) select the input file "contextHUIM.txt", (3) set the output file name (e.g. "output.txt") and (4) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run SFUPMinerUemax contextHUIM.txt output.txt in a folder containing spmf.jar and the example input contextHUIM.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestSFUPMinerUemax.java" in the package ca.pfv.SPMF.tests.

What is SFUPMinerUemax?

SFUPMinerUemax is an algorithm for discovering skyline frequent-utility patterns (SFUPs) in a transaction database containing utility information. The SFUPMinerUemax algorithm discovers SFUPs by exploring a utility-list structure using a depth-first search. An efficient pruning strategy is also used to prune unpromising candidates early and thus reduce the search space.

What is the input?

SFUPMinerUemax takes as input a transaction database with utility information. Let's consider the following database consisting of 7 transactions (t1,t2, ..., t7) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextHUIM.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction   Items        Transaction utility   Item utilities for this transaction
t1            2 3 4        9                     2 2 5
t2            1 2 3 4 5    18                    4 2 3 5 4
t3            1 3 4        11                    4 2 5
t4            3 4 5        11                    2 5 4
t5            1 2 4 5      22                    5 4 5 8
t6            1 2 3 4      17                    3 8 1 5
t7            4 5          9                     5 4

Each line of the database is:

  • a set of items (the first column of the table),
  • the sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • the utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.
What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 2, 3 and 4. The amount of money spent for each item is respectively 2 $, 2 $ and 5 $. The total amount of money spent in this transaction is 2 + 2 + 5 = 9 $.

What is the output?

The output of SFUPMinerUemax is the set of skyline frequent-utility patterns. An itemset X in a database D is a skyline frequent-utility pattern (SFUP) iff it is not dominated by any other itemset in the database when both the frequency and utility measures are considered. An itemset X dominates another itemset Y in D iff f(X) >= f(Y) and u(X) >= u(Y), where f denotes the support (frequency) of an itemset and u its utility. For example, if we run SFUPMinerUemax, we obtain 3 skyline frequent-utility patterns.


itemsets    support   utility
{2, 1, 4}   3         41
{5, 4}      4         40
{4}         7         35
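The dominance test can be illustrated on (support, utility) pairs. Below is a small sketch (plain Java, not SPMF code; names are illustrative) that keeps the non-dominated points among a few pairs; the last point, (5, 29), is a made-up example of a dominated point.

```java
import java.util.*;

// Sketch (not SPMF code): filtering (support, utility) pairs to keep only
// the skyline, i.e. points not dominated by any other point.
public class SkylineCheck {

    // p dominates q iff f(p) >= f(q) and u(p) >= u(q).
    static boolean dominates(int[] p, int[] q) {
        return p[0] >= q[0] && p[1] >= q[1];
    }

    static List<int[]> skyline(List<int[]> points) {
        List<int[]> result = new ArrayList<>();
        for (int[] p : points) {
            boolean dominated = false;
            for (int[] q : points) {
                if (q != p && dominates(q, p)) dominated = true;
            }
            if (!dominated) result.add(p);
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> points = List.of(
                new int[]{3, 41},  // {2, 1, 4}
                new int[]{4, 40},  // {5, 4}
                new int[]{7, 35},  // {4}
                new int[]{5, 29}); // hypothetical point, dominated by (7, 35)
        System.out.println(skyline(points).size()); // 3
    }
}
```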

Input file format

The input file format of SFUPMinerUemax is defined as follows. It is a text file, where each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:
2 3 4:9:2 2 5
1 2 3 4 5:18:4 2 3 5 4
1 3 4:11:4 2 5
3 4 5:11:2 5 4
1 2 4 5:22:5 4 5 8
1 2 3 4:17:3 8 1 5
4 5:9:5 4

Consider the first line. It means that the transaction {2, 3, 4} has a total utility of 9 and that items 2, 3 and 4 respectively have a utility of 2, 2 and 5 in this transaction. The following lines follow the same format.

Output file format

The output file format of the algorithm is defined as follows. It is a text file, where the first line records the number of skyline frequent-utility patterns, and each following line represents a skyline frequent-utility pattern. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword "#SUP:" appears, followed by an integer indicating the support of the itemset. Then, the keyword "#UTILITY:" appears, followed by the utility of the itemset. For example, we show below the output file for this example.
Total skyline frequent-utility itemset: 3
2 1 4 #SUP:3 #UTILITY:41
5 4 #SUP:4 #UTILITY:40
4 #SUP:7 #UTILITY:35

For example, the first line indicates that there are 3 skyline frequent-utility patterns in the example. The second line indicates that the itemset {2, 1, 4} is a skyline frequent-utility itemset with a support of 3 and a utility of 41. The following lines follow the same format.

Implementation details

The version implemented here contains all the optimizations described in the paper proposing SFUPMinerUemax. Note that the input format is not exactly the same as described in the original article, but it is equivalent.

Where can I get more information about the SFUPMinerUemax algorithm?

This is the reference of the article describing the SFUPMinerUemax algorithm:

Jerry Chun-Wei Lin, Lu Yang, Philippe Fournier-Viger, Siddharth Dawar, Vikram Goyal, Ashish Sureka, and Bay Vo, “A More Efficient Algorithm to Mine Skyline Frequent-Utility Patterns,” International Conference on Genetic and Evolutionary Computing, 2016. (ICGEC 2016)

Example 63 : Mining High Average-Utility Itemsets in a Transaction Database with Utility Information using the HAUI-Miner Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HAUI-Miner" algorithm, (2) select the input file "contextHAUIMiner.txt", (3) set the output file name (e.g. "output.txt") (4) set the minimum utility to 24 and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HAUI-Miner contextHAUIMiner.txt output.txt 24 in a folder containing spmf.jar and the example input file contextHAUIMiner.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHAUIMiner_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is HAUI-Miner?

HAUI-Miner is an algorithm for discovering high average-utility itemsets (HAUIs) in a transaction database containing utility information. The HAUI-Miner algorithm discovers HAUIs by exploring a set-enumeration tree using a depth-first search. An efficient pruning strategy is also used to prune unpromising candidates early and thus reduce the search space.

What is the input?

HAUI-Miner takes as input a transaction database with utility information and a minimum utility threshold minAUtility (a positive integer). Let's consider the following database consisting of six transactions (t1, t2, ... , t6) and 6 items (1, 2, 3, 4, 5, 6). This database is provided in the text file "contextHAUIMiner.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Transaction   Items        Transaction utility   Item utilities for this transaction
t1            1 2 3 4 6    32                    5 6 6 9 6
t2            2 3 5        16                    2 6 8
t3            1 3 4 5      22                    10 2 6 4
t4            1 2 3 4 6    28                    5 9 6 6 2
t5            1 2 3 4 5    37                    15 9 6 3 4
t6            3 4 5        15                    8 3 4

Each line of the database is:

  • A set of items (the first column of the table),
  • The sum of the utilities (e.g. profit) of these items in this transaction (the second column of the table),
  • The utility of each item for this transaction (e.g. profit generated by this item for this transaction)(the third column of the table).

Note that the value in the second column for each line is the sum of the values in the third column.
What are real-life examples of such a database? There are several applications in real life. One application is a customer transaction database. Imagine that each transaction represents the items purchased by a customer. The first customer named "t1" bought items 1, 2, 3, 4 and 6. The amount of money spent for each item is respectively 5 $, 6 $, 6 $, 9 $ and 6 $. The total amount of money spent in this transaction is 5 + 6 + 6 + 9 + 6 = 32 $.

What is the output?

The output of HAUI-Miner is the set of high average-utility itemsets having an average-utility no less than a minAUtility threshold (a positive integer) set by the user. The average-utility measure estimates the utility of an itemset by taking its length into account. It is defined as the sum of the utilities of the itemset in the transactions where it appears, divided by the number of items that it contains. For example, the average-utility of {2, 3, 5} in the database is the utility of {2, 3, 5} in t2 plus the utility of {2, 3, 5} in t5, for a total of 16 + 19 = 35, divided by 3 (the number of items), which is about 11.67. A high average-utility itemset is an itemset whose average-utility is no less than minAUtility. For example, if we run HAUI-Miner with a minimum average-utility of 24, we obtain 10 high average-utility itemsets.

itemset   | average-utility
{1}       | 35
{2}       | 26
{3}       | 34
{4}       | 27
{1, 2}    | 24
{1, 3}    | 27
{1, 4}    | 29
{2, 3}    | 25
{3, 4}    | 27
{1, 3, 4} | 26

If the database is a transaction database from a store, we could interpret these results as all the groups of items bought together that generated an average profit of 24 $ or more per item.
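The average-utility computation described above can be sketched in a few lines of Java. This is not the SPMF implementation, just a minimal illustration of the definition (sum of the itemset's utilities over the transactions containing it, divided by the itemset's length), using the example database:

```java
import java.util.*;

public class AverageUtility {

    // The example database: each transaction maps an item to its utility.
    static List<Map<Integer, Integer>> db() {
        List<Map<Integer, Integer>> db = new ArrayList<>();
        db.add(tx(new int[]{1,2,3,4,6}, new int[]{5,6,6,9,6}));   // t1
        db.add(tx(new int[]{2,3,5},     new int[]{2,6,8}));       // t2
        db.add(tx(new int[]{1,3,4,5},   new int[]{10,2,6,4}));    // t3
        db.add(tx(new int[]{1,2,3,4,6}, new int[]{5,9,6,6,2}));   // t4
        db.add(tx(new int[]{1,2,3,4,5}, new int[]{15,9,6,3,4}));  // t5
        db.add(tx(new int[]{3,4,5},     new int[]{8,3,4}));       // t6
        return db;
    }

    static Map<Integer, Integer> tx(int[] items, int[] utils) {
        Map<Integer, Integer> m = new LinkedHashMap<>();
        for (int i = 0; i < items.length; i++) m.put(items[i], utils[i]);
        return m;
    }

    // Sum the utilities of 'itemset' in every transaction containing it,
    // then divide by the number of items in the itemset.
    static double averageUtility(Set<Integer> itemset) {
        int total = 0;
        for (Map<Integer, Integer> t : db()) {
            if (t.keySet().containsAll(itemset)) {
                for (int item : itemset) total += t.get(item);
            }
        }
        return total / (double) itemset.size();
    }

    public static void main(String[] args) {
        // {2, 3, 5}: utility 16 in t2 plus 19 in t5 = 35, over 3 items
        System.out.println(averageUtility(new HashSet<>(Arrays.asList(2, 3, 5))));
    }
}
```

For the itemset {2, 3, 5}, this prints the value 35/3 ≈ 11.67 discussed above.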

Input file format

The input file format of HAUI-Miner is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

1 2 3 4 6:32:5 6 6 9 6
2 3 5:16:2 6 8
1 3 4 5:22:10 2 6 4
1 2 3 4 6:28:5 9 6 6 2
1 2 3 4 5:37:15 9 6 3 4
3 4 5:15:8 3 4

Consider the first line. It means that the transaction {1, 2, 3, 4, 6} has a total utility of 32 and that items 1, 2, 3, 4, and 6 respectively have a utility of 5, 6, 6, 9 and 6 in this transaction. The following lines follow the same format.
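A minimal parser for this line format can be written as follows. This is only an illustrative sketch (not SPMF's own parser); it also verifies the constraint stated above, namely that the transaction utility equals the sum of the per-item utilities:

```java
public class UtilityLineParser {

    static int[] items, utilities;
    static int transactionUtility;

    // Parse one line of the form "items:transaction utility:item utilities".
    static void parse(String line) {
        String[] parts = line.split(":");
        String[] itemTokens = parts[0].trim().split(" ");
        String[] utilTokens = parts[2].trim().split(" ");
        transactionUtility = Integer.parseInt(parts[1].trim());
        items = new int[itemTokens.length];
        utilities = new int[utilTokens.length];
        for (int i = 0; i < itemTokens.length; i++) items[i] = Integer.parseInt(itemTokens[i]);
        for (int i = 0; i < utilTokens.length; i++) utilities[i] = Integer.parseInt(utilTokens[i]);
        // Sanity check: the transaction utility must equal the sum of item utilities.
        int sum = 0;
        for (int u : utilities) sum += u;
        if (sum != transactionUtility)
            throw new IllegalArgumentException("inconsistent line: " + line);
    }

    public static void main(String[] args) {
        parse("1 2 3 4 6:32:5 6 6 9 6");
        System.out.println(java.util.Arrays.toString(items) + " TU=" + transactionUtility);
    }
}
```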

Output file format

The output file format of HAUI-Miner is defined as follows. It is a text file, where each line represents a high average-utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #AUTIL: " appears, followed by the average-utility of the itemset. For example, the output file for this example is shown below.

2  #AUTIL: 26
2 1  #AUTIL: 24
2 3  #AUTIL: 25
1  #AUTIL: 35
1 4  #AUTIL: 29
1 4 3  #AUTIL: 26
1 3  #AUTIL: 27
4  #AUTIL: 27
4 3  #AUTIL: 27
3  #AUTIL: 34

For example, the first line indicates that the itemset {2} has an average-utility of 26. The following lines follow the same format.

Implementation details

The version implemented here contains all the optimizations described in the paper proposing HAUI-Miner. Note that the input format is not exactly the same as described in the original article. But it is equivalent.

Where can I get more information about the HAUI-Miner algorithm?

This is the reference of the article describing the HAUI-Miner algorithm:

Jerry Chun-Wei Lin, Ting Li, Philippe Fournier-Viger, Tzung-Pei Hong, Justin Zhan, and Miroslav Voznak. An Efficient Algorithm to Mine High Average-Utility Itemsets. Advanced Engineering Informatics, 30(2): 233-243, 2016.

Example 64 : Mining High Average-Utility Itemsets with Multiple Thresholds in a Transaction Database using the HAUI-MMAU Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HAUI-MMAU" algorithm, (2) select the input file "contextHAUIMMAU.txt", (3) set the output file name (e.g. "output.txt") (4) set the GLMAU to 0 and the MAUI file to "MAU_Utility.txt" and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run HAUI-MMAU contextHAUIMMAU.txt output.txt 0 MAU_Utility.txt in a folder containing spmf.jar and the example input file contextHAUIMMAU.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHAUI_MMAU_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is HAUI-MMAU?

HAUI-MMAU is an algorithm for mining high average-utility itemsets using multiple minimum average-utility thresholds. Unlike some other algorithms such as HAUI-Miner, the HAUI-MMAU algorithm allows setting a different threshold for each item, rather than a single threshold for all items. Setting multiple thresholds is useful because it allows setting lower minimum average-utility thresholds for low-profit items. Therefore, it allows discovering high average-utility itemsets containing low-profit items.

What is the input?

The input of HAUI-MMAU is a transaction database and a list of minimum average-utility thresholds indicating the minimum average-utility threshold for each item.
A transaction database is a set of transactions, where each transaction is a list of distinct items (symbols). For example, let's consider the following transaction database. It consists of 5 transactions (t1, t2, ..., t5) and 6 items (1, 2, 3, 4, 5, 6). For instance, transaction t1 is the set of items {2, 3, 4, 5}. This database is provided in the file "contextHAUIMMAU.txt" of the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction.


Transaction ID | Items   | Utilities
t1             | 2 3 4 5 | 14 2 6 4
t2             | 2 3 4   | 8 3 6
t3             | 1 4     | 10 2
t4             | 1 3 6   | 5 6 4
t5             | 2 3 4 6 | 4 3 2 2

The list of minimum average-utility thresholds is stored in a text file that is read as input by the algorithm. It is provided in the file "MAU_Utility.txt":


item | minimum average-utility threshold
1    | 5
2    | 2
3    | 1
4    | 2
5    | 4
6    | 1

This file indicates, for example, that the minimum average-utility threshold to be used for item 1 is 5.

What is the output?

The output of HAUI-MMAU is the set of all high average-utility itemsets contained in the database.

What is a high average-utility itemset? The average-utility of an itemset is the sum of its utilities in the transactions containing it, divided by the number of items that it contains. An itemset is a high average-utility itemset if its average-utility is no less than the average of the minimum average-utility thresholds of its items. For example, the itemset {1, 4} is a high average-utility itemset because it appears in transaction t3, where its average-utility is (10 + 2) / 2 = 6, and 6 is higher than the average of the minimum average-utility thresholds of item 1 and item 4, which is (5 + 2) / 2 = 3.5.
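The per-itemset threshold used by HAUI-MMAU (the average of the minimum average-utility thresholds of the itemset's items) can be sketched as follows, using the MAU_Utility.txt values shown above. This is an illustrative sketch, not the SPMF implementation:

```java
import java.util.*;

public class MauThreshold {

    // Minimum average-utility thresholds from MAU_Utility.txt.
    static final Map<Integer, Integer> MAU = new HashMap<>();
    static {
        MAU.put(1, 5); MAU.put(2, 2); MAU.put(3, 1);
        MAU.put(4, 2); MAU.put(5, 4); MAU.put(6, 1);
    }

    // mau(X) = average of mau(i) over the items i of X.
    static double mau(int... items) {
        double sum = 0;
        for (int i : items) sum += MAU.get(i);
        return sum / items.length;
    }

    public static void main(String[] args) {
        // {1, 4}: (5 + 2) / 2 = 3.5; its average-utility 6 >= 3.5,
        // so {1, 4} is a high average-utility itemset.
        System.out.println(mau(1, 4));
    }
}
```

For example, mau(3, 2, 4, 5) = (1 + 2 + 2 + 4) / 4 = 2.25, matching the "#mau: 2.25" value shown for that itemset in the output below.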

Why is HAUI-MMAU useful? It is useful because it permits setting lower minimum average-utility thresholds for low-profit items. Therefore, it allows discovering high average-utility itemsets containing low-profit items.
If we run HAUI-MMAU on the previous transaction database with the MAU_Utility.txt file previously described, we get the following result, where each line represents an itemset, followed by its average-utility (after "#AUTIL:") and its minimum average-utility threshold (after "#mau:"):

1  #AUTIL: 15 #mau: 5.0
2  #AUTIL: 26 #mau: 2.0
3  #AUTIL: 14 #mau: 1.0
4  #AUTIL: 16 #mau: 2.0
5  #AUTIL: 4 #mau: 4.0
6  #AUTIL: 6 #mau: 1.0
3 6  #AUTIL: 7 #mau: 1.0
3 2  #AUTIL: 17 #mau: 1.5
3 4  #AUTIL: 11 #mau: 1.5
3 5  #AUTIL: 3 #mau: 2.5
3 1  #AUTIL: 5 #mau: 3.0
6 2  #AUTIL: 3 #mau: 1.5
6 4  #AUTIL: 2 #mau: 1.5
6 1  #AUTIL: 4 #mau: 3.0
2 4  #AUTIL: 20 #mau: 2.0
2 5  #AUTIL: 9 #mau: 3.0
4 5  #AUTIL: 5 #mau: 3.0
4 1  #AUTIL: 6 #mau: 3.5
3 6 2  #AUTIL: 3 #mau: 1.3333334
3 6 4  #AUTIL: 2 #mau: 1.3333334
3 6 1  #AUTIL: 5 #mau: 2.3333333
3 2 4  #AUTIL: 16 #mau: 1.6666666
3 2 5  #AUTIL: 6 #mau: 2.3333333
3 4 5  #AUTIL: 4 #mau: 2.3333333
6 2 4  #AUTIL: 2 #mau: 1.6666666
2 4 5  #AUTIL: 8 #mau: 2.6666667
3 6 2 4  #AUTIL: 2 #mau: 1.5
3 2 4 5  #AUTIL: 6 #mau: 2.25

For example, the line "3 2 #AUTIL: 17 #mau: 1.5" indicates that the itemset {3, 2} has an average-utility of 17 and a minimum average-utility threshold of 1.5. The other lines follow the same format.

Input file format

HAUI-MMAU takes two files as input, defined as follows.

The first file (e.g. contextHAUIMMAU.txt) is a text file containing transactions. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the transaction utility (an integer).
  • Third, the symbol ":" appears and is followed by the utility of each item in this transaction (an integer), separated by single spaces.

2 3 4 5:26:14 2 6 4
2 3 4:17:8 3 6
1 4:12:10 2
1 3 6:15:5 6 4
2 3 4 6:11:4 3 2 2

Consider the first line. It means that the first transaction is the itemset {2, 3, 4, 5} with utilities {14, 2, 6, 4}. The following lines follow the same format.

The second file (e.g. MAU_Utility.txt) is a text file which provides the minimum average-utility threshold to be used for each item. Each line indicates the minimum average-utility threshold for an item and consists of two integer values separated by a single space. The first value is the item. The second value is the minimum average-utility threshold to be used for this item. For example, here is the file used in this example. The first line indicates that the minimum average-utility threshold to be used for item "1" is 5. The other lines follow the same format.

1 5
2 2
3 1
4 2
5 4
6 1

Output file format

The output file format of HAUI-MMAU is defined as follows. It is a text file, where each line represents a high average-utility itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by a single space. After all the items, the keyword " #AUTIL: " appears, followed by the average-utility of the itemset, and then the keyword " #mau: " appears, followed by its minimum average-utility threshold. For example, the output file for this example is shown below.

1  #AUTIL: 15 #mau: 5.0
2  #AUTIL: 26 #mau: 2.0
3  #AUTIL: 14 #mau: 1.0
4  #AUTIL: 16 #mau: 2.0
5  #AUTIL: 4 #mau: 4.0
6  #AUTIL: 6 #mau: 1.0
3 6  #AUTIL: 7 #mau: 1.0
3 2  #AUTIL: 17 #mau: 1.5
3 4  #AUTIL: 11 #mau: 1.5
3 5  #AUTIL: 3 #mau: 2.5
3 1  #AUTIL: 5 #mau: 3.0
6 2  #AUTIL: 3 #mau: 1.5
6 4  #AUTIL: 2 #mau: 1.5
6 1  #AUTIL: 4 #mau: 3.0
2 4  #AUTIL: 20 #mau: 2.0
2 5  #AUTIL: 9 #mau: 3.0
4 5  #AUTIL: 5 #mau: 3.0
4 1  #AUTIL: 6 #mau: 3.5
3 6 2  #AUTIL: 3 #mau: 1.3333334
3 6 4  #AUTIL: 2 #mau: 1.3333334
3 6 1  #AUTIL: 5 #mau: 2.3333333
3 2 4  #AUTIL: 16 #mau: 1.6666666
3 2 5  #AUTIL: 6 #mau: 2.3333333
3 4 5  #AUTIL: 4 #mau: 2.3333333
6 2 4  #AUTIL: 2 #mau: 1.6666666
2 4 5  #AUTIL: 8 #mau: 2.6666667
3 6 2 4  #AUTIL: 2 #mau: 1.5
3 2 4 5  #AUTIL: 6 #mau: 2.25

For example, the last line indicates that the itemset {3, 2, 4, 5} has an average-utility of 6, which is larger than its minimum average-utility threshold of 2.25. The other lines follow the same format.

Implementation details

This is the original implementation of the algorithm.

Where can I get more information about the HAUI-MMAU algorithm?

This is the reference of the article describing the HAUI-MMAU algorithm:

Jerry Chun-Wei Lin, Ting Li, Philippe Fournier-Viger, Tzung-Pei Hong, and Ja-Hwung Su. Efficient Mining of High Average-Utility Itemsets with Multiple Minimum Thresholds. Proceedings of the Industrial Conference on Data Mining, 2016: 14-28.

Example # 65: Mining Fuzzy Frequent Itemsets in a quantitative transaction database using the FFI-Miner algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FFI-Miner" algorithm, (2) select the input file "contextFFIMiner.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minsup parameter to 2, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command: 
    java -jar spmf.jar run FFI-Miner contextFFIMiner.txt output.txt 2 in a folder containing spmf.jar and the example input file contextFFIMiner.txt .
  • If you are using the source code version of SPMF, launch the file "MainTestFFIMiner_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is FFI-Miner?

FFI-Miner is an algorithm for mining fuzzy frequent itemsets in a quantitative transaction database. In simple words, a quantitative transaction database is a database where items have quantities.

What is the input?

FFI-Miner takes as input (1) a transaction database with quantity information and (2) a minimum support threshold minSupport (a positive integer). Let's consider the following database consisting of 8 transactions (t1, t2, ..., t8) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextFFIMiner.txt" in the package ca.pfv.spmf.tests of the SPMF distribution. Moreover, consider the membership function, shown below, which defines three ranges (low, medium, high).


Transaction ID | Items   | Quantities
t1             | 1 3 4 5 | 5 10 2 9
t2             | 1 2 3   | 8 2 3
t3             | 2 3     | 3 9
t4             | 1 2 3 5 | 5 3 10 3
t5             | 1 3 4   | 7 9 3
t6             | 2 3 4   | 2 8 3
t7             | 1 2 3   | 5 2 5
t8             | 1 3 4 5 | 3 10 2 2

(Figure: fuzzy membership function defining the low, medium and high ranges)

Why is FFI-Miner useful?

In real-life situations, it is difficult to handle quantitative databases using crisp sets. Fuzzy-set theory is useful for handling quantitative databases. Based on fuzzy-set theory, the fuzzy frequent itemset mining algorithm FFI-Miner was proposed. It relies on a designed fuzzy-list structure to discover fuzzy itemsets. Compared to previous works, FFI-Miner has excellent performance for the discovery of fuzzy itemsets.
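To give an intuition of how a quantity is converted into fuzzy low/medium/high values, here is a small sketch. The exact membership function used in this example is the one shown in the figure; the triangular function below (low peaking at quantity 1, medium at 6, high at 11, each with width 5) is only an illustrative assumption, not the function used by FFI-Miner:

```java
public class FuzzyMembership {

    // Triangular membership: 1 at 'peak', decreasing linearly to 0
    // at 'peak - width' and 'peak + width'.
    static double triangular(double q, double peak, double width) {
        double d = Math.abs(q - peak);
        return d >= width ? 0.0 : 1.0 - d / width;
    }

    // Degrees of membership of a quantity in the three ranges
    // (hypothetical peaks/widths, for illustration only).
    static double[] fuzzify(double quantity) {
        return new double[] {
            triangular(quantity, 1, 5),   // low
            triangular(quantity, 6, 5),   // medium
            triangular(quantity, 11, 5)   // high
        };
    }

    public static void main(String[] args) {
        // Under this illustrative function, a quantity of 5 is mostly
        // "medium" (0.8) and slightly "low" (0.2).
        double[] m = fuzzify(5);
        System.out.println(m[0] + " " + m[1] + " " + m[2]);
    }
}
```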

Input file format

The input file format of FFI-Miner is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the sum of the quantities in that transaction.
  • Third, the symbol ":" appears and is followed by the quantity of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

1 3 4 5:26:5 10 2 9
1 2 3:13:8 2 3
2 3:12:3 9
1 2 3 5:21:5 3 10 3
1 3 4:19:7 9 3
2 3 4:13:2 8 3
1 2 3:12:5 2 5
1 3 4 5:19:3 10 2 2

Consider the first line. It means that the transaction {1, 3, 4, 5} has a total quantity of 26 and that items 1, 3, 4 and 5 respectively have a quantity of 5, 10, 2 and 9 in this transaction. The following lines follow the same format.

Output file format

The output file format of FFI-Miner is defined as follows. It is a text file, where each line represents a fuzzy frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by "." and either the letter "L", "M" or "H". These letters indicate whether an item is in the low, medium or high range of the fuzzy membership function. After all the items, the keyword "#FVL:" appears, followed by a float value indicating the fuzzy value of that itemset.

1.M #FVL: 4.2
1.M 2.L #FVL: 2.0
1.M 3.H #FVL: 2.6000001
2.L #FVL: 3.6
4.L #FVL: 2.8
4.L 3.H #FVL: 2.6000001
3.H #FVL: 4.0000005

For example, the first line indicates that item 1 in the medium range of the fuzzy membership function (1.M) has a fuzzy value of 4.2. The other lines follow the same format.

Performance

FFI-Miner is a very efficient algorithm. It uses a designed fuzzy-list structure to identify unpromising candidates early, which speeds up the discovery of fuzzy itemsets.

Where can I get more information about the algorithm?

This is the reference of the article describing the FFI-Miner algorithm:

Jerry Chun-Wei Lin, Ting Li, Philippe Fournier-Viger, and Tzung-Pei Hong. A Fast Algorithm for Mining Fuzzy Frequent Itemsets. Journal of Intelligent & Fuzzy Systems, 29(6): 2373-2379, 2015.

Example # 66: Mining Multiple Fuzzy Frequent Itemsets in a quantitative transaction database using the MFFI-Miner algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "MFFI-Miner" algorithm, (2) select the input file "contextMFFIMiner.txt", (3) set the output file name (e.g. "output.txt"), (4) set the minsup parameter to 2, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command: 
    java -jar spmf.jar run MFFI-Miner contextMFFIMiner.txt output.txt 2 in a folder containing spmf.jar and the example input file contextMFFIMiner.txt .
  • If you are using the source code version of SPMF, launch the file "MainTestMFFIMiner_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is MFFI-Miner?

MFFI-Miner is an algorithm for mining multiple fuzzy frequent itemsets in a quantitative transaction database. In simple words, a quantitative transaction database is a database where items have quantities.

What is the input?

MFFI-Miner takes as input (1) a transaction database with quantity information and (2) a minimum support threshold minSupport (a positive integer). Let's consider the following database consisting of 8 transactions (t1, t2, ..., t8) and 5 items (1, 2, 3, 4, 5). This database is provided in the text file "contextMFFIMiner.txt" in the package ca.pfv.spmf.tests of the SPMF distribution.

Moreover, consider the membership function, shown below, which defines three ranges (low, medium, high).

Transaction ID | Items   | Quantities
t1             | 3 4 5   | 3 2 1
t2             | 2 3 4   | 1 2 1
t3             | 2 3 5   | 3 3 1
t4             | 1 3 4   | 3 5 3
t5             | 1 2 3 4 | 1 1 2 1
t6             | 2 4 5   | 1 1 2
t7             | 1 2 4 5 | 4 3 5 3
t8             | 2 3 4   | 1 2 1

(Figure: fuzzy membership function defining the low, medium and high ranges)

Why is MFFI-Miner useful?

Previous work on fuzzy frequent itemset mining used the maximum scalar cardinality to mine fuzzy frequent itemsets (FFIs), in which at most one linguistic term was used to represent each item in the database. Although this reduces the amount of computation for mining FFIs, the discovered information may be invalid or incomplete. A gradual data-reduction approach (GDF) was then proposed for mining multiple fuzzy frequent itemsets (MFFIs), and the tree-based UBMFFP-tree algorithm mines FFIs with multiple fuzzy regions based on an Apriori-like mechanism, but it suffers from building a huge tree structure. The MFFI-Miner algorithm efficiently mines MFFIs without candidate generation, based on a designed fuzzy-list structure. This reduces the amount of computation and avoids a generate-candidate-and-test approach with a level-wise exploration of the search space.

Input file format

The input file format of MFFI-Miner is defined as follows. It is a text file. Each line represents a transaction. Each line is composed of three sections, as follows.

  • First, the items contained in the transaction are listed. An item is represented by a positive integer. Each item is separated from the next item by a single space. It is assumed that all items within a same transaction (line) are sorted according to a total order (e.g. ascending order) and that no item can appear twice within the same transaction.
  • Second, the symbol ":" appears and is followed by the sum of the item quantities (an integer).
  • Third, the symbol ":" appears and is followed by the quantity of each item in this transaction (an integer), separated by single spaces.

For example, for the previous example, the input file is defined as follows:

3 4 5:6:3 2 1
2 3 4:4:1 2 1
2 3 5:7:3 3 1
1 3 4:11:3 5 3
1 2 3 4:5:1 1 2 1
2 4 5:4:1 1 2
1 2 4 5:15:4 3 5 3
2 3 4:4:1 2 1

Consider the first line. It means that the transaction {3, 4, 5} has a total quantity of 6 and that items 3, 4 and 5 respectively have a quantity of 3, 2 and 1 in this transaction. The following lines follow the same format.

Output file format

The output file format of MFFI-Miner is defined as follows. It is a text file, where each line represents a fuzzy frequent itemset. On each line, the items of the itemset are first listed. Each item is represented by an integer, followed by "." and the letter "L", "M" or "H", and a single space. The letters L, M and H indicate whether an item is in the low, medium or high range of the fuzzy membership function, respectively. After all the items, the keyword "#SUP:" appears, followed by a float value indicating the fuzzy support of that itemset.

3.H #SUP: 2.0
2.L #SUP: 4.0
2.L 3.M #SUP: 2.0
2.L 3.M 4.L #SUP: 2.0
2.L 4.L #SUP: 4.0
3.M #SUP: 3.3333335
3.M 4.L #SUP: 2.5000002
5.L #SUP: 2.5
4.L #SUP: 4.5

For example, the first line indicates that item 3 in the high range of the fuzzy membership function (3.H) has a fuzzy support of 2.0. The other lines follow the same format.

Performance

MFFI-Miner is a very efficient algorithm. It uses a designed fuzzy-list structure to identify unpromising candidates early, which speeds up the discovery of fuzzy itemsets.

Where can I get more information about the algorithm?

This is the article describing the MFFI-Miner algorithm:

Jerry Chun-Wei Lin, Ting Li, Philippe Fournier-Viger, Tzung-Pei Hong, Jimmy Ming-Thai Wu, and Justin Zhan. Efficient Mining of Multiple Fuzzy Frequent Itemsets. International Journal of Fuzzy Systems, 2016: 1-9.

Some other related papers:

T. P. Hong, G. C. Lan, Y. H. Lin, and S. T. Pan. An effective gradual data-reduction strategy for fuzzy itemset mining. International Journal of Fuzzy Systems, 15(2): 170-181, 2013. (GDF)

J. C. W. Lin, T. P. Hong, T. C. Lin, and S. T. Pan. An UBMFFP tree for mining multiple fuzzy frequent itemsets. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 23(6): 861-879, 2015. (UBMFFP-tree)

Example 67 : Mining All Association Rules with the Lift Measure

How to run this example?

  • If you are using the graphical interface, (1) choose the "FPGrowth_association_rules_with_lift" algorithm , (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50 %, minconf= 90% and minlift = 1 (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FPGrowth_association_rules_with_lift contextIGB.txt output.txt 50% 90% 1 in a folder containing spmf.jar and the example input file contextIGB.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestAllAssociationRules_FPGrowth_version_with_lift.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

This is a variation of the algorithm for mining all association rules from a transaction database, described in the previous example.

Traditionally, association rule mining is performed by using two interestingness measures named the support and confidence to evaluate rules. In this example, we show how to use another popular measure that is called the lift or interest.

What is the input?

The input is a transaction database (aka binary context) and three thresholds named minsup (a value between 0 and 1), minconf (a value between 0 and 1) and minlift (a value from 0 to +infinity).

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t5, t6) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output ?

The output of this algorithm is a set of all the association rules that have a support, confidence and lift respectively higher than minsup, minconf and minlift.

The lift of a rule X-->Y is calculated as lift(X-->Y) = (sup(X∪Y) / N) / ((sup(X) / N) × (sup(Y) / N)), where

  • N is the number of transactions in the transaction database,
  • sup(X∪Y) is the number of transactions containing X and Y,
  • sup(X) is the number of transactions containing X
  • sup(Y) is the number of transactions containing Y.

The confidence of a rule X-->Y is calculated as conf(X-->Y) = sup(X U Y) / (sup(X)).

The support of a rule X -->Y is defined as sup(X-->Y) = sup(X∪Y) / N
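The three measures can be sketched directly from these definitions on the contextIGB database shown above. This is only an illustration of the formulas, not SPMF's FP-Growth-based implementation:

```java
import java.util.*;

public class LiftExample {

    // The contextIGB example database.
    static final List<Set<Integer>> DB = Arrays.asList(
        new HashSet<>(Arrays.asList(1, 2, 4, 5)),    // t1
        new HashSet<>(Arrays.asList(2, 3, 5)),       // t2
        new HashSet<>(Arrays.asList(1, 2, 4, 5)),    // t3
        new HashSet<>(Arrays.asList(1, 2, 3, 5)),    // t4
        new HashSet<>(Arrays.asList(1, 2, 3, 4, 5)), // t5
        new HashSet<>(Arrays.asList(2, 3, 4))        // t6
    );

    // sup(X): number of transactions containing all items of X.
    static int sup(Set<Integer> x) {
        int count = 0;
        for (Set<Integer> t : DB) if (t.containsAll(x)) count++;
        return count;
    }

    // conf(X ==> Y) = sup(X u Y) / sup(X)
    static double confidence(Set<Integer> x, Set<Integer> y) {
        Set<Integer> xy = new HashSet<>(x); xy.addAll(y);
        return sup(xy) / (double) sup(x);
    }

    // lift(X ==> Y) = (sup(X u Y)/N) / ((sup(X)/N) * (sup(Y)/N))
    static double lift(Set<Integer> x, Set<Integer> y) {
        int n = DB.size();
        Set<Integer> xy = new HashSet<>(x); xy.addAll(y);
        return (sup(xy) / (double) n) / ((sup(x) / (double) n) * (sup(y) / (double) n));
    }

    public static void main(String[] args) {
        Set<Integer> x = new HashSet<>(Arrays.asList(1));
        Set<Integer> y = new HashSet<>(Arrays.asList(5));
        // rule 2 below: 1 ==> 5 has confidence 1.0 and lift 1.2
        System.out.println(confidence(x, y) + " " + lift(x, y));
    }
}
```

For the rule 1 ==> 5: sup({1}) = 4, sup({5}) = 5, sup({1, 5}) = 4, so the confidence is 4/4 = 1.0 and the lift is (4/6) / ((4/6) × (5/6)) = 1.2, matching rule 2 in the output below.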

By applying the algorithm with minsup = 0.5, minconf = 0.9 and minlift = 1 on the previous database, we obtain 18 association rules:

rule 0: 4 ==> 2 support : 0.66 (4/6) confidence : 1.0 lift : 1.0
rule 1: 3 ==> 2 support : 0.66 (4/6) confidence : 1.0 lift : 1.0
rule 2: 1 ==> 5 support : 0.66 (4/6) confidence : 1.0 lift : 1.2
rule 3: 1 ==> 2 support : 0.66 (4/6) confidence : 1.0 lift : 1.0
rule 4: 5 ==> 2 support : 0.833 (5/6) confidence : 1.0 lift : 1.0
rule 5: 4 5 ==> 2 support : 0.5 (3/6) confidence : 1.0 lift : 1.0
rule 6: 1 4 ==> 5 support : 0.5 (3/6) confidence : 1.0 lift : 1.2
rule 7: 4 5 ==> 1 support : 0.5 (3/6) confidence : 1.0 lift : 1.5
rule 8: 1 4 ==> 2 support : 0.5 (3/6) confidence : 1.0 lift : 1.0
rule 9: 3 5 ==> 2 support : 0.5 (3/6) confidence : 1.0 lift : 1.0
rule 10: 1 5 ==> 2 support : 0.66 (4/6) confidence : 1.0 lift : 1.0
rule 11: 1 2 ==> 5 support : 0.66 (4/6) confidence : 1.0 lift : 1.2
rule 12: 1 ==> 2 5 support : 0.66 (4/6) confidence : 1.0 lift : 1.2
rule 13: 1 4 5 ==> 2 support : 0.5 (3/6) confidence : 1.0 lift : 1.0
rule 14: 1 2 4 ==> 5 support : 0.5 (3/6) confidence : 1.0 lift : 1.2
rule 15: 2 4 5 ==> 1 support : 0.5 (3/6) confidence : 1.0 lift : 1.5
rule 16: 4 5 ==> 1 2 support : 0.5 (3/6) confidence : 1.0 lift : 1.5
rule 17: 1 4 ==> 2 5 support : 0.5 (3/6) confidence : 1.0 lift : 1.5

How to interpret the results?

For an association rule X ==> Y, if the lift is equal to 1, it means that X and Y are independent. If the lift is higher than 1, it means that X and Y are positively correlated. If the lift is lower than 1, it means that X and Y are negatively correlated. For example, if we consider the rule {1, 4} ==> {2, 5}, it has a lift of 1.5, which means that the occurrence of the itemset {1, 4} is positively correlated with the occurrence of {2, 5}.

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result must also be converted. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are first listed. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed. Each item is represented by an integer, followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer. Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). Then, the keyword " #LIFT: " appears, followed by the lift of the rule represented by a double value (a value from 0 to +infinity). For example, here are a few lines from the output file for this example:

1 ==> 2 4 5 #SUP: 3 #CONF: 0.75 #LIFT: 1.5
5 ==> 1 2 4 #SUP: 3 #CONF: 0.6 #LIFT: 1.2

For example, the first line indicates that the association rule {1} --> {2, 4, 5} has a support of 3 transactions, a confidence of 75 % and a lift of 1.5 indicating a positive correlation (when the value is higher than 1). The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Implementation details

In the source code version of SPMF, there are two versions of this algorithm. The first version saves the result into memory ("MainTestAllAssociationRules_FPGrowth_wifthLift"). The second one saves the result to a file ("MainTestAllAssociationRules_FPGrowth_saveToFile_wifthLift").

Note that we also offer the alternative of choosing CFPGrowth++ instead of FPGrowth. This is called the "CFPGrowth++_association_rules_lift" algorithm in the graphical user interface or command line interface. CFPGrowth++ allows using multiple minimum support thresholds instead of a single minsup threshold, so the input and output are slightly different (see the example about CFPGrowth++ for more details about this algorithm).

Where can I get more information about this algorithm?

The following technical report published in 1994 describes how to generate association rules from frequent itemsets (Step 2):

R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. Research Report RJ 9839, IBM Almaden Research Center, San Jose, California, June 1994.

You can also read chapter 6 of the book "Introduction to Data Mining", which provides a nice and easy-to-understand introduction to discovering frequent itemsets and generating association rules, and also describes the advantages of using the lift measure.

The following article describes the FPGrowth algorithm for mining frequent itemsets:

Jiawei Han, Jian Pei, Yiwen Yin, Runying Mao: Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach. Data Min. Knowl. Discov. 8(1): 53-87 (2004)

Example 68 : Mining All Association Rules using the GCD algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "GCD_association_rules" algorithm, (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50 %, minconf= 60%, maxcomb = 3, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run GCD_association_rules contextIGB.txt output.txt 50% 60% 3 in a folder containing spmf.jar and the example input file contextIGB.txt.
  • If you are using the source code version of SPMF, launch this file "MainTestGCDAssociationRules.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

This algorithm finds the association rules in a given transaction database or sequence of transactions/events, using GCD calculations on prime numbers. It is an original algorithm implemented by Ahmed El-Serafy and Hazem El-Raffiee.

What is the input?

The input is a transaction database (aka binary context) and three thresholds named minsup (a value between 0 and 1), minconf (a value between 0 and 1), and maxcomb (a positive integer).

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t5, t6) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output?

The output of an association rule mining algorithm is a set of association rules respecting the user-specified minsup and minconf thresholds. To explain how this algorithm works, it is necessary to review some definitions. An association rule X ==> Y is a relationship between two itemsets (sets of items) X and Y such that the intersection of X and Y is empty. The support of a rule is the number of transactions that contain X∪Y. The confidence of a rule is the number of transactions that contain X∪Y divided by the number of transactions that contain X.
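These support and confidence definitions can be checked with a small Java sketch. The class and method names below are illustrative only (they are not part of SPMF); the database is the example database contextIGB.txt.

```java
import java.util.*;

public class RuleMetrics {
    // Count the transactions that contain every item of the given itemset.
    static int count(List<Set<Integer>> db, Set<Integer> itemset) {
        int c = 0;
        for (Set<Integer> t : db) if (t.containsAll(itemset)) c++;
        return c;
    }

    // Support of the rule X ==> Y: number of transactions containing X ∪ Y.
    static int support(List<Set<Integer>> db, Set<Integer> x, Set<Integer> y) {
        Set<Integer> union = new HashSet<>(x);
        union.addAll(y);
        return count(db, union);
    }

    // Confidence of the rule X ==> Y: support(X ∪ Y) / count(X).
    static double confidence(List<Set<Integer>> db, Set<Integer> x, Set<Integer> y) {
        return (double) support(db, x, y) / count(db, x);
    }

    public static void main(String[] args) {
        // The example database contextIGB.txt.
        List<Set<Integer>> db = List.of(
            Set.of(1, 2, 4, 5), Set.of(2, 3, 5), Set.of(1, 2, 4, 5),
            Set.of(1, 2, 3, 5), Set.of(1, 2, 3, 4, 5), Set.of(2, 3, 4));
        // Rule {1} ==> {2, 4, 5}: support 3, confidence 3/4 = 0.75.
        System.out.println(support(db, Set.of(1), Set.of(2, 4, 5)));    // 3
        System.out.println(confidence(db, Set.of(1), Set.of(2, 4, 5))); // 0.75
    }
}
```

The rule {1} ==> {2, 4, 5} appears in transactions t1, t3 and t5, while item 1 appears in four transactions, which gives the support of 3 and confidence of 0.75 shown in the output example below.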

If we apply an association rule mining algorithm, it will return all the rules having a support and confidence respectively no less than minsup and minconf.

For example, by applying the algorithm with minsup = 0.5 (50%), minconf = 0.6 (60%), and maxcomb = 3, we obtain 56 association rules (run the example in the SPMF distribution to see the result).

Now let's explain the "maxcomb" parameter taken by the GCD algorithm. This parameter is used by the algorithm when finding the GCD (greatest common divisors) between two transactions. For example, consider 385, which comes from the multiplication of (5, 7 and 11), this actually means that (5), (7), (11), (5, 7), (5, 11), (7, 11), (5, 7, 11) are all common combinations between these two transactions. For larger GCD's, calculating all combinations grows exponentially in both time and memory. Hence, we introduced this parameter, to limit the maximum combinations' length generated from a single GCD. Although increasing this number might seem to provide more accurate results, the experiments showed that larger association rules occur at lower support (less important to the user). Hence, setting this parameter to values from 1 to 4 produces reasonable results.

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items of the transaction are listed on the corresponding line. An item is represented by a positive integer, and each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.
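A minimal Java sketch for reading one line of this format (illustrative only, not the parser used inside SPMF):

```java
import java.util.*;

public class InputParser {
    // Parse one transaction line of the default SPMF format:
    // space-separated positive integers, sorted, with no duplicates.
    static Set<Integer> parseLine(String line) {
        Set<Integer> items = new LinkedHashSet<>();
        for (String token : line.trim().split("\\s+"))
            items.add(Integer.parseInt(token));
        return items;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("1 2 4 5")); // [1, 2, 4, 5]
    }
}
```

Reading a whole database is then just applying parseLine to each line of the file.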

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are listed first. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed, each represented by an integer followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer. Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here are a few lines from the output file for this example:

1 ==> 2 4 5 #SUP: 3 #CONF: 0.75
5 ==> 1 2 4 #SUP: 3 #CONF: 0.6
4 ==> 1 2 5 #SUP: 3 #CONF: 0.75

For example, the first line indicates that the association rule {1} --> {2, 4, 5} has a support of 3 transactions and a confidence of 75 %. The other lines follow the same format.
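An output line can be split back into its four fields with a short Java sketch (illustrative, not part of SPMF; it also tolerates a comma used as decimal separator, which some locale settings may produce):

```java
public class OutputParser {
    // Split "X ==> Y #SUP: s #CONF: c" into {antecedent, consequent, support, confidence}.
    static String[] parseRule(String line) {
        String[] supSplit = line.split(" #SUP: ");
        String[] confSplit = supSplit[1].split(" #CONF: ");
        String[] rule = supSplit[0].split(" ==> ");
        // Normalize a possible comma decimal separator in the confidence.
        String conf = confSplit[1].replace(',', '.');
        return new String[] { rule[0], rule[1], confSplit[0], conf };
    }

    public static void main(String[] args) {
        String[] f = parseRule("1 ==> 2 4 5 #SUP: 3 #CONF: 0.75");
        System.out.println(f[0] + " / " + f[1] + " / " + f[2] + " / " + f[3]);
        // prints "1 / 2 4 5 / 3 / 0.75"
    }
}
```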

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

 

Where can I get more information about this algorithm?

The GCD Association Rules algorithm is an original algorithm. More information about it can be obtained from the bitbucket repository dedicated to this algorithm: https://bitbucket.org/aelserafy/gcd-association-rules

Example 69 : Mining the IGB basis of Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "IGB" algorithm , (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50 %, minconf= 61% (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run IGB contextIGB.txt output.txt 50% 61% in a folder containing spmf.jar and the example input file contextIGB.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestIGB_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

This algorithm mines a subset of all association rules that is called IGB association rules (Informative and Generic Basis of Association Rules) from a transaction database.

To discover the IGB association rules, this algorithm performs two steps: (1) first, it discovers closed itemsets and their associated generators by applying the Zart algorithm; then (2), association rules are generated using the closed itemsets and generators.

What is the input?

The input is a transaction database and two thresholds named minsup (a value between 0 and 1) and minconf (a value between 0 and 1).

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t6) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt of the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted in lexicographical order within a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output?

The output is the IGB basis of association rules. It is a compact set of association rules that is both informative and generic. To explain what the IGB basis of association rules is, it is necessary to review some definitions. An itemset is a group of items. The support of an itemset is the number of times that it appears in the database divided by the total number of transactions in the database. For example, the itemset {1, 3} has a support of 33 % because it appears in 2 out of 6 transactions of the database.

An association rule X --> Y is an association between two itemsets X and Y that are disjoint. The support of an association rule is the number of transactions that contain X and Y divided by the total number of transactions. The confidence of an association rule is the number of transactions that contain X and Y divided by the number of transactions that contain X.

A closed itemset is an itemset that is strictly included in no itemset having the same support. An itemset Y is the closure of an itemset X if Y is a closed itemset, X is a subset of Y and X and Y have the same support. A generator Y of a closed itemset X is an itemset such that (1) it has the same support as X and (2) it does not have a subset having the same support.
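Closedness can be tested directly from these definitions: the closure of an itemset X is the intersection of all transactions containing X, and X is closed if and only if it equals its own closure. A small Java sketch (the names are illustrative, not SPMF code), using the example database contextIGB.txt:

```java
import java.util.*;

public class ClosureCheck {
    // Closure of X: the intersection of all transactions that contain X.
    static Set<Integer> closure(List<Set<Integer>> db, Set<Integer> x) {
        Set<Integer> result = null;
        for (Set<Integer> t : db) {
            if (!t.containsAll(x)) continue;
            if (result == null) result = new TreeSet<>(t);
            else result.retainAll(t);
        }
        return result;
    }

    // X is closed if it equals its own closure.
    static boolean isClosed(List<Set<Integer>> db, Set<Integer> x) {
        return closure(db, x).equals(x);
    }

    public static void main(String[] args) {
        // The example database contextIGB.txt.
        List<Set<Integer>> db = List.of(
            Set.of(1, 2, 4, 5), Set.of(2, 3, 5), Set.of(1, 2, 4, 5),
            Set.of(1, 2, 3, 5), Set.of(1, 2, 3, 4, 5), Set.of(2, 3, 4));
        // {1} is not closed: every transaction containing 1 also contains 2 and 5.
        System.out.println(closure(db, Set.of(1)));        // [1, 2, 5]
        System.out.println(isClosed(db, Set.of(1, 2, 5))); // true
    }
}
```

Here {1} is a generator of the closed itemset {1, 2, 5}, since both appear in the same four transactions.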

The IGB set of association rules is the set of association rules of the form X ==> Y - X, where X is a minimal generator of Y, Y is a closed itemset having a support higher than or equal to minsup, and the confidence of the rule is higher than or equal to minconf.

For example, by applying the IGB algorithm on the transaction database previously described with minsup = 0.50 and minconf= 0.61, we obtain the following set of association rules:

Rule Support Confidence
1 ==> 2, 4, 5 0.50 0.75
4 ==> 1, 2, 5 0.50 0.75
3 ==> 2, 5 0.50 0.75
{} ==> 2, 3 0.66 0.66
{} ==> 1, 2, 5 0.66 0.66
{} ==> 2, 4 0.66 0.66
{} ==> 2, 5 0.83 0.83
{} ==> 2 1 1

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items of the transaction are listed on the corresponding line. An item is represented by a positive integer, and each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are listed first. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed, each represented by an integer followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer. Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here are a few lines from the output file for this example:

1 ==> 2 4 5 #SUP: 3 #CONF: 0.75
3 ==> 2 5 #SUP: 3 #CONF: 0.75

For example, the first line indicates that the association rule {1} --> {2, 4, 5} has a support of 3 transactions and a confidence of 75 %. The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about IGB association rules?

This article described IGB rules:

G. Gasmi, S. Ben Yahia, E. Mephu Nguifo, Y. Slimani: IGB: A New Informative Generic Base of Association Rules. PAKDD 2005: 81-90

Example 70 : Mining Perfectly Sporadic Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "Sporadic_association_rules" algorithm , (2) select the input file "contextInverse.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 1 %, maxsup = 60%, minconf= 60% (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Sporadic_association_rules contextInverse.txt output.txt 1% 60% 60% in a folder containing spmf.jar and the example input file contextInverse.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestAllPerfectlySporadicRules.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

This is an algorithm for mining perfectly sporadic association rules. The algorithm first uses AprioriInverse to generate perfectly rare itemsets. Then, it uses these itemsets to generate the association rules.

What is the input?

The input of this algorithm is a transaction database and three thresholds named minsup, maxsup and minconf. A transaction database is a set of transactions. A transaction is a set of distinct items (symbols), assumed to be sorted in lexicographical order. For example, the following transaction database contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). This database is provided in the file "contextInverse.txt" of the SPMF distribution:

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3}
t5 {1, 2, 4, 5}

What is the output?

The output is the set of perfectly sporadic association rules respecting the minconf (a value in [0,1]), minsup (a value in [0,1]) and maxsup (a value in [0,1]) parameters.

To explain what a perfectly sporadic association rule is, we need to review some definitions. An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset divided by the total number of transactions. For example, the itemset {1, 2} has a support of 60% because it appears in 3 transactions out of 5 (it appears in t1, t3 and t5). A frequent itemset is an itemset that has a support no less than the maxsup parameter.

A perfectly rare itemset (aka sporadic itemset) is an itemset that is not a frequent itemset and whose subsets are all not frequent itemsets either. Moreover, it must have a support higher than or equal to the minsup threshold.

An association rule X==>Y is a relationship between two itemsets (sets of items) X and Y such that the intersection of X and Y is empty. The support of a rule is the number of transactions that contain X∪Y divided by the total number of transactions. The confidence of a rule is the number of transactions that contain X∪Y divided by the number of transactions that contain X.

A perfectly sporadic association rule X==>Y is an association rule such that the confidence is higher than or equal to minconf and the support of any non-empty subset of X∪Y is lower than maxsup.
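This definition can be checked by brute force with a small Java sketch (illustrative names, not SPMF's implementation). One assumption to note: following the perfectly rare itemsets listed in this example, a subset whose support exactly equals maxsup is still treated as rare.

```java
import java.util.*;

public class SporadicCheck {
    // Relative support: fraction of transactions containing the itemset.
    static double support(List<Set<Integer>> db, Set<Integer> itemset) {
        int c = 0;
        for (Set<Integer> t : db) if (t.containsAll(itemset)) c++;
        return (double) c / db.size();
    }

    // Brute-force check: confidence >= minconf, and every non-empty subset
    // of X ∪ Y has a support not exceeding maxsup (supports equal to maxsup
    // are allowed, matching the worked example in this documentation).
    static boolean isPerfectlySporadic(List<Set<Integer>> db, Set<Integer> x,
            Set<Integer> y, double minconf, double maxsup) {
        Set<Integer> union = new HashSet<>(x);
        union.addAll(y);
        if (support(db, union) / support(db, x) < minconf) return false;
        List<Integer> items = new ArrayList<>(union);
        // Enumerate every non-empty subset of X ∪ Y via bitmasks.
        for (int mask = 1; mask < (1 << items.size()); mask++) {
            Set<Integer> subset = new HashSet<>();
            for (int i = 0; i < items.size(); i++)
                if ((mask & (1 << i)) != 0) subset.add(items.get(i));
            if (support(db, subset) > maxsup) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // The example database contextInverse.txt.
        List<Set<Integer>> db = List.of(
            Set.of(1, 2, 4, 5), Set.of(1, 3), Set.of(1, 2, 3, 5),
            Set.of(2, 3), Set.of(1, 2, 4, 5));
        System.out.println(isPerfectlySporadic(db, Set.of(5), Set.of(4), 0.6, 0.6)); // true
        System.out.println(isPerfectlySporadic(db, Set.of(1), Set.of(2), 0.6, 0.6)); // false
    }
}
```

The rule 1 ==> 2 fails the check because the subset {1} has a support of 80 %, above maxsup.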

For example, let's apply the algorithm with minsup = 0.1 %, maxsup = 60 % and minconf = 60 %.

The first step that the algorithm performs is to apply the AprioriInverse algorithm with minsup = 0.1 % and maxsup = 60 %. The result is the following set of perfectly rare itemsets:

Perfectly Rare Itemsets Support
{3} 60 %
{4} 40 %
{5} 60 %
{4, 5} 40 %
{3, 5} 20 %

Then, the second step is to generate all perfectly sporadic association rules respecting minconf by using the perfectly rare itemsets found in the first step. The result is :

Rule Support Confidence
5 ==> 4 40 % 60 %
4 ==> 5 40 % 100 %

How to interpret the result?

For example, consider the rule 5 ==> 4. It means that if item 5 appears in a transaction, it is likely that item 4 also appears, with a confidence of 60 %. Moreover, this rule has a support of 40 % because items 4 and 5 appear together in 40 % of the transactions of this database.

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items of the transaction are listed on the corresponding line. An item is represented by a positive integer, and each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3
1 2 4 5

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are listed first. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed, each represented by an integer followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer. Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here is the output file for this example:

5 ==> 4 #SUP: 2 #CONF: 0.6
4 ==> 5 #SUP: 2 #CONF: 1

For example, the first line indicates that the association rule {5} --> {4} has a support of 2 transactions and a confidence of 60 %. The second line follows the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about this algorithm?

The AprioriInverse algorithm and how to generate sporadic rules are described in this paper:

Yun Sing Koh, Nathan Rountree: Finding Sporadic Rules Using Apriori-Inverse. PAKDD 2005: 97-106

Example 71 : Mining Closed Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "Closed_association_rules" algorithm , (2) select the input file "contextZart.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 60 %, minconf= 60% (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Closed_association_rules(using_fpclose) contextZart.txt output.txt 60% 60% in a folder containing spmf.jar and the example input file contextZart.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestClosedAssociationRulesWithFPClose_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

It is an algorithm for mining "closed association rules", which are a concise subset of all association rules.

What is the input of this algorithm?

The input is a transaction database (aka binary context) and two thresholds named minsup (a value in [0,1] that represents a percentage) and minconf (a value in [0,1] that represents a percentage).

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextZart.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3, 5}
t5 {1, 2, 3, 5}

What is the output of this algorithm?

Given the minimum support threshold (minsup) and minimum confidence threshold (minconf) set by the user, the algorithm returns the set of closed association rules that respect these thresholds. To explain what a closed association rule is, it is necessary to review some definitions.

An itemset is an unordered set of distinct items. The support of an itemset is the number of transactions that contain the itemset divided by the total number of transactions. For example, the itemset {1, 2} has a support of 60% because it appears in 3 transactions out of 5 (it appears in t1, t3 and t5). A closed itemset is an itemset that is strictly included in no itemset having the same support.

An association rule X==>Y is a relationship between two itemsets (sets of items) X and Y such that the intersection of X and Y is empty. The support of a rule X==>Y is the number of transactions that contain X∪Y divided by the total number of transactions. The confidence of a rule X==>Y is the number of transactions that contain X∪Y divided by the number of transactions that contain X. A closed association rule is an association rule of the form X ==> Y such that the union of X and Y is a closed itemset.

The algorithm returns all closed association rules such that their support and confidence are respectively higher or equal to the minsup and minconf thresholds set by the user.

For instance, by applying this algorithm with minsup = 60 % and minconf = 60 %, we obtain 16 closed association rules:

1 ==> 3 #SUP: 3 #CONF: 0.75 // which means that this rule has a support of 3 transactions and a confidence of 75 %
3 ==> 1 #SUP: 3 #CONF: 0.75 // which means that this rule has a support of 3 transactions and a confidence of 75 %
2 ==> 5 #SUP: 4 #CONF: 1.0 // which means that this rule has a support of 4 transactions and a confidence of 100 %
5 ==> 2 #SUP: 4 #CONF: 1.0 // ...
2 5 ==> 1 #SUP: 3 #CONF: 0.75
1 5 ==> 2 #SUP: 3 #CONF: 1.0
1 2 ==> 5 #SUP: 3 #CONF: 1.0
1 ==> 2 5 #SUP: 3 #CONF: 0.75
2 ==> 1 5 #SUP: 3 #CONF: 0.75
5 ==> 1 2 #SUP: 3 #CONF: 0.75
3 5 ==> 2 #SUP: 3 #CONF: 1.0
2 3 ==> 5 #SUP: 3 #CONF: 1.0
2 5 ==> 3 #SUP: 3 #CONF: 0.75
5 ==> 2 3 #SUP: 3 #CONF: 0.75
3 ==> 2 5 #SUP: 3 #CONF: 0.75
2 ==> 3 5 #SUP: 3 #CONF: 0.75

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items of the transaction are listed on the corresponding line. An item is represented by a positive integer, and each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3 5
1 2 3 5

This file contains five lines (five transactions). Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are listed first. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed, each represented by an integer followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer. Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here is the output file for this example:

1 ==> 3 #SUP: 3 #CONF: 0.75
3 ==> 1 #SUP: 3 #CONF: 0.75
2 ==> 5 #SUP: 4 #CONF: 1.0
5 ==> 2 #SUP: 4 #CONF: 1.0
1 2 ==> 5 #SUP: 3 #CONF: 1.0
2 5 ==> 1 #SUP: 3 #CONF: 0.75
1 5 ==> 2 #SUP: 3 #CONF: 1.0
5 ==> 1 2 #SUP: 3 #CONF: 0.75
2 ==> 1 5 #SUP: 3 #CONF: 0.75
1 ==> 2 5 #SUP: 3 #CONF: 0.75
2 5 ==> 3 #SUP: 3 #CONF: 0.75
2 3 ==> 5 #SUP: 3 #CONF: 1.0
3 5 ==> 2 #SUP: 3 #CONF: 1.0
5 ==> 2 3 #SUP: 3 #CONF: 0.75
2 ==> 3 5 #SUP: 3 #CONF: 0.75
3 ==> 2 5 #SUP: 3 #CONF: 0.75

For example, the last line indicates that the association rule {3} --> {2, 5} has a support of 3 transactions and a confidence of 75 %. The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Implementation details and performance

There are two versions of this algorithm implemented in SPMF. The first one uses CHARM for finding the frequent closed itemsets before generating the rules. The second one uses FPClose for finding the frequent closed itemsets before generating the rules. The version based on FPClose is generally faster than the version based on CHARM.

In the release version of SPMF, the algorithm "Closed_association_rules(using_fpclose)" denotes the version using FPClose, while "Closed_association_rules" denotes the version based on CHARM.

In the source code version of SPMF, the files "MainTestClosedAssociationRulesWithFPClose_saveToMemory" and "MainTestClosedAssociationRulesWithFPClose_saveToFile" denote the version using FPClose that saves the result to memory and to a file, respectively. Moreover, the files "MainTestClosedAssociationRules_saveToMemory" and "MainTestClosedAssociationRules_saveToFile" denote the version using CHARM that saves the result to memory and to a file, respectively.

Where can I get more information about closed association rules?

The following Ph.D. thesis proposed "closed association rules".

Szathmary, L. (2006). Symbolic Data Mining Methods with the Coron Platform. PhD thesis, University Henri Poincaré - Nancy 1, France.

Example 72 : Mining Minimal Non Redundant Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "MNR" algorithm , (2) select the input file "contextZart.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 60 %, minconf= 60% (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run MNR contextZart.txt output.txt 60% 60% in a folder containing spmf.jar and the example input file contextZart.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestMNRRules_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

This algorithm discovers the set of "minimal non redundant association rules" (Kryszkiewicz, 1998), which is a lossless and compact set of association rules.

In this implementation we use the Zart algorithm for discovering closed itemsets and their associated generators. Then, this information is used to generate the "minimal non redundant association rules".

What is the input?

The input is a transaction database (aka binary context), a threshold named minconf (a value in [0,1] that represents a percentage) and a threshold named minsup (a value in [0,1] that represents a percentage).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextZart.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {1, 3}
t3 {1, 2, 3, 5}
t4 {2, 3, 5}
t5 {1, 2, 3, 5}

What is the output?

This algorithm returns the set of minimal non redundant association rules.

To explain what the set of minimal non redundant association rules is, it is necessary to review some definitions. An itemset is a set of distinct items. The support of an itemset is the number of times that it appears in the database divided by the total number of transactions in the database. For example, the itemset {1, 3} has a support of 60 % because it appears in 3 out of 5 transactions of this database (it appears in t2, t3 and t5).

An association rule X --> Y is an association between two itemsets X and Y that are disjoint. The support of an association rule is the number of transactions that contain X and Y divided by the total number of transactions. The confidence of an association rule is the number of transactions that contain X and Y divided by the number of transactions that contain X.

A closed itemset is an itemset that is strictly included in no itemset having the same support. An itemset Y is the closure of an itemset X if Y is a closed itemset, X is a subset of Y and X and Y have the same support. A generator Y of a closed itemset X is an itemset such that (1) it has the same support as X and (2) it does not have a subset having the same support.

The set of minimal non redundant association rules is defined as the set of association rules of the form P1 ==> P2 - P1, where P1 is a generator of P2, P2 is a closed itemset, and the rule has a support and a confidence respectively no less than minsup and minconf.

For example, by applying this algorithm with minsup = 60 % and minconf = 60 % on the previous database, we obtain 14 minimal non redundant association rules:

2 3 ==> 5 support: 0.6 confidence: 1
3 5 ==> 2 support: 0.6 confidence: 1
1 ==> 3 support: 0.6 confidence: 0.75
1 ==> 2 5 support: 0.6 confidence: 0.75
1 2 ==> 5 support: 0.6 confidence: 1
1 5 ==> 2 support: 0.6 confidence: 1
3 ==> 1 support: 0.6 confidence: 0.75
3 ==> 2 5 support: 0.6 confidence: 0.75
2 ==> 3 5 support: 0.6 confidence: 0.75
2 ==> 1 5 support: 0.6 confidence: 0.75
2 ==> 5 support: 0.8 confidence: 1
5 ==> 2 3 support: 0.6 confidence: 0.75
5 ==> 1 2 support: 0.6 confidence: 0.75
5 ==> 2 support: 0.8 confidence: 1

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items of the transaction are listed on the corresponding line. An item is represented by a positive integer, and each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
1 3
1 2 3 5
2 3 5
1 2 3 5

This file contains five lines (five transactions). Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are listed first. Each item is represented by an integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed, each represented by an integer followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by a double value (the percentage of transactions containing the rule). Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here is the output file for this example:

2 ==> 5 #SUP: 0.8 #CONF: 1
2 ==> 3 5 #SUP: 0.6 #CONF: 0.75
2 ==> 1 5 #SUP: 0.6 #CONF: 0.75
5 ==> 2 #SUP: 0.8 #CONF: 1
5 ==> 2 3 #SUP: 0.6 #CONF: 0.75
5 ==> 1 2 #SUP: 0.6 #CONF: 0.75
3 ==> 2 5 #SUP: 0.6 #CONF: 0.75
3 ==> 1 #SUP: 0.6 #CONF: 0.75
2 3 ==> 5 #SUP: 0.6 #CONF: 1
3 5 ==> 2 #SUP: 0.6 #CONF: 1
1 2 ==> 5 #SUP: 0.6 #CONF: 1
1 5 ==> 2 #SUP: 0.6 #CONF: 1
1 ==> 3 #SUP: 0.6 #CONF: 0.75
1 ==> 2 5 #SUP: 0.6 #CONF: 0.75

For example, the last line indicates that the association rule {1} --> {2, 5} has a support of 60 % and a confidence of 75%. The other lines follow the same format.
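The support and confidence on each line can be recomputed directly from the transaction database. Below is a small sketch (the class name RuleMeasures is made up for this example; it is not SPMF code) that verifies the values for the rule {1} ==> {2, 5} on the five example transactions:

```java
import java.util.List;
import java.util.Set;

public class RuleMeasures {
    // Support of rule X ==> Y: fraction of transactions containing X ∪ Y.
    public static double support(List<Set<Integer>> db, Set<Integer> x, Set<Integer> y) {
        long count = db.stream().filter(t -> t.containsAll(x) && t.containsAll(y)).count();
        return (double) count / db.size();
    }

    // Confidence of rule X ==> Y: transactions containing X ∪ Y
    // divided by transactions containing X.
    public static double confidence(List<Set<Integer>> db, Set<Integer> x, Set<Integer> y) {
        long both = db.stream().filter(t -> t.containsAll(x) && t.containsAll(y)).count();
        long antecedent = db.stream().filter(t -> t.containsAll(x)).count();
        return (double) both / antecedent;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = List.of(
                Set.of(1, 2, 4, 5), Set.of(1, 3), Set.of(1, 2, 3, 5),
                Set.of(2, 3, 5), Set.of(1, 2, 3, 5));
        // Rule {1} ==> {2, 5}
        System.out.println(support(db, Set.of(1), Set.of(2, 5)));    // 0.6
        System.out.println(confidence(db, Set.of(1), Set.of(2, 5))); // 0.75
    }
}
```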

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Where can I get more information about minimal non-redundant association rules?

The following article provides detailed information about Minimal Non Redundant Association Rules:

M. Kryszkiewicz (1998). Representative Association Rules and Minimum Condition Maximum Consequence Association Rules. Proc. of PKDD '98, Nantes, France, September 23-26.

Example 73 : Mining Indirect Association Rules with the INDIRECT algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "Indirect_association_rules" algorithm, (2) select the input file "contextIndirect.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 60 %, ts = 50 % and minconf= 10% (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Indirect_association_rules contextIndirect.txt output.txt 60% 50% 10% in a folder containing spmf.jar and the example input file contextIndirect.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestIndirectRules_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is the INDIRECT algorithm?

Indirect (Tan et al., KDD 2000; Tan, Steinbach & Kumar, 2006, p. 469) is an algorithm for discovering indirect associations between items in transaction databases.

Why is this algorithm important? Traditional association rule mining algorithms focus on direct associations between itemsets. This algorithm can discover indirect associations, which can be useful in domains such as biology. Indirect association rule mining has various applications, such as stock market analysis and competitive product analysis (Tan et al., 2000).

What is the input?

The input of the indirect algorithm is a transaction database and three parameters named minsup (a value in [0,1] that represents a percentage), ts (a value in [0,1] that represents a percentage) and minconf (a value in [0,1] that represents a percentage).

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 5 transactions (t1, t2, ..., t5) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1 and 4. This database is provided as the file contextIndirect.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 4}
t2 {2, 3, 4}
t3 {1, 2, 4, 5}
t4 {4, 5}
t5 {1, 2, 4, 5}

The three numeric parameters of the indirect algorithm are:

  • minsup : called the "mediator minimum support".
  • ts : called the "itempair minimum support"
  • minconf : representing the minimal confidence required for indirect associations (note that in the original article it uses the IS measure instead of the confidence).

What is the output?

The result is all indirect associations respecting the parameters minsup, ts and minconf. An indirect association has the form {x,y} ==> M, where x and y are single items and M is an itemset called the "mediator".

An indirect association has to respect the following conditions:

  • The number of transactions containing all items of {x} ∪ M divided by the total number of transactions must be greater than or equal to minsup.
  • The number of transactions containing all items of {y} ∪ M divided by the total number of transactions must be greater than or equal to minsup.
  • The number of transactions containing {x, y} divided by the total number of transactions must be smaller than ts.
  • The confidence of {x} with respect to M and the confidence of {y} with respect to M must be greater than or equal to minconf. The confidence of an itemset X with respect to another itemset Y is defined as the number of transactions that contain X and Y divided by the number of transactions that contain X.
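The four conditions above can be checked by brute force. The sketch below (the class name IndirectCheck is made up; the real INDIRECT algorithm generates and prunes candidates much more efficiently) tests a candidate {x, y} ==> M against a transaction database:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IndirectCheck {
    static long count(List<Set<Integer>> db, Set<Integer> itemset) {
        return db.stream().filter(t -> t.containsAll(itemset)).count();
    }

    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> u = new HashSet<>(a);
        u.addAll(b);
        return u;
    }

    // Test the four conditions for a candidate indirect association {x, y} ==> M.
    public static boolean isIndirect(List<Set<Integer>> db, int x, int y, Set<Integer> m,
                                     double minsup, double ts, double minconf) {
        double n = db.size();
        double supXM = count(db, union(Set.of(x), m)) / n;
        double supYM = count(db, union(Set.of(y), m)) / n;
        double supXY = count(db, Set.of(x, y)) / n;
        double confX = count(db, union(Set.of(x), m)) / (double) count(db, Set.of(x));
        double confY = count(db, union(Set.of(y), m)) / (double) count(db, Set.of(y));
        return supXM >= minsup && supYM >= minsup && supXY < ts
                && confX >= minconf && confY >= minconf;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = List.of(
                Set.of(1, 4), Set.of(2, 3, 4), Set.of(1, 2, 4, 5),
                Set.of(4, 5), Set.of(1, 2, 4, 5));
        // {1, 2 | {4}} satisfies the conditions with minsup = 0.6, ts = 0.5, minconf = 0.1
        System.out.println(isIndirect(db, 1, 2, Set.of(4), 0.6, 0.5, 0.1)); // true
    }
}
```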

For example, by applying the indirect algorithm with minsup = 60 %, ts = 50 % and minconf= 10%, we obtain 3 indirect association rules:

  1. {1, 2 | {4}}, which means that 1 and 2 are indirectly associated by the mediator {4}.
  2. {1, 5 | {4}}, which means that 1 and 5 are indirectly associated by the mediator {4}.
  3. {2, 5 | {4}}, which means that 2 and 5 are indirectly associated by the mediator {4}.

To see additional details about each of these three indirect rules, run this example.

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 4
2 3 4
1 2 4 5
4 5
1 2 4 5

This file contains five lines (five transactions). Consider the first line. It means that the first transaction is the itemset {1, 4}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an indirect association rule. Each line starts with "(a=x b=y | mediator=M )", indicating that the line represents the rule {x,y} ==> M, where x, y and M are integers representing items. Then, the keyword "#sup(a,mediator)=" is followed by the support of {x} ∪ M expressed as a number of transactions (an integer). Then, the keyword "#sup(b,mediator)=" is followed by the support of {y} ∪ M expressed as a number of transactions (an integer). Then, the keyword "#conf(a,mediator)= " is followed by the confidence of a with respect to the mediator, expressed as a double value in the [0, 1] interval. Then, the keyword "#conf(b,mediator)= " appears, followed by the confidence of b with respect to the mediator, expressed as a double value in the [0, 1] interval.

For example, the output file of this example is:

(a=1 b=2 | mediator=4 ) #sup(a,mediator)= 3 #sup(b,mediator)= 3 #conf(a,mediator)= 1.0 #conf(b,mediator)= 1.0
(a=1 b=5 | mediator=4 ) #sup(a,mediator)= 3 #sup(b,mediator)= 3 #conf(a,mediator)= 1.0 #conf(b,mediator)= 1.0
(a=2 b=5 | mediator=4 ) #sup(a,mediator)= 3 #sup(b,mediator)= 3 #conf(a,mediator)= 1.0 #conf(b,mediator)= 1.0

This file contains three lines (three indirect association rules). Consider the first line. It represents that items 1 and 2 are indirectly associated by the item 4 as mediator. Furthermore, it indicates that the support of {1, 4} is 3 transactions, the support of {2,4} is 3 transactions, the confidence of item 1 with respect to item 4 is 100 % and the confidence of item 2 with respect to item 4 is 100%. The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Implementation details

The implementation attempts to be as faithful as possible to the original algorithm, except that the confidence is used instead of the IS measure.

Note that some algorithms, such as HI-Mine, have been claimed to be more efficient than Indirect, but they have not been implemented in SPMF.

Where can I get more information about indirect association rules?

The concept of indirect associations was proposed by Tan et al. (2000) in this conference paper:

Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava: Indirect Association: Mining Higher Order Dependencies in Data. PKDD 2000: 632-637

Moreover, note that the book "Introduction to Data Mining" by Tan, Steinbach and Kumar provides an overview of indirect association rules that is easy to read.

Example 74 : Hiding Sensitive Association Rules with the FHSAR algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "FHSAR" algorithm, (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt"), (4) set minsup = 50 %, minconf = 60 % and sar_file = "sar.txt", and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run FHSAR contextIGB.txt output.txt 50% 60% sar.txt in a folder containing spmf.jar and the example input file contextIGB.txt.
  • To run this example with the source code version of SPMF, launch the file "MainTestFHSAR_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is FHSAR?

FHSAR is an algorithm for hiding sensitive association rules in a transaction database.

What are the applications? For example, consider a company that wants to release a transaction database to the public, but does not want to disclose sensitive associations between items that appear in the database and that could give a competitive advantage to its competitors. The FHSAR algorithm can hide these associations by modifying the database.

What is the input?

The FHSAR algorithm is designed to hide sensitive association rules in a transaction database so that they will not be found for given minsup and minconf thresholds generally used by association rule mining algorithms. The inputs are: minsup (a value in [0,1] that represents a percentage), minconf (a value in [0,1] that represents a percentage), a transaction database, and some sensitive association rules to be hidden.

A transaction database is a set of transactions. Each transaction is a set of distinct items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t5, t6) and 5 items (1, 2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

An association rule X ==> Y is an association between two sets of items X and Y such that X and Y are disjoint. The support of an association rule X ==> Y is the number of transactions that contain both X and Y divided by the total number of transactions. The confidence of an association rule X ==> Y is the number of transactions that contain both X and Y divided by the number of transactions that contain X. For example, the rule {1 2} ==> {4 5} has a support of 50 % because {1, 2, 4, 5} appears in 3 transactions out of 6. Furthermore, it has a confidence of 75 % because {1 2} appears in 4 transactions and {1, 2, 4, 5} appears in 3 transactions.

What is the output?

The output is a new transaction database such that the sensitive rules will not be found if an association rule mining algorithm is applied with minsup and minconf.

For example, we can apply FHSAR with the parameters minsup = 0.5 and minconf = 0.60 to hide the following association rules provided in the file "sar.txt":

  • 4 ==> 1
  • 1 2 ==> 4 5
  • 5 ==> 2

The result is a new transaction database where these rules are hidden for the given thresholds minsup and minconf:

Transaction id Items
t1 {4, 5}
t2 {3, 5}
t3 {4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

Note that the result of the algorithm is not always the same, because I use the HashSet data structure to represent transactions internally and this data structure does not preserve the order of items. Therefore, the items that are removed may not be the same if the algorithm is run twice.
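To illustrate the general idea of hiding, here is a deliberately naive sketch (this is not the FHSAR heuristic, which carefully selects which item and transaction to modify in order to minimize side effects on non-sensitive rules; class and method names are made up): delete one item of the sensitive rule from supporting transactions until the rule's support falls below minsup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NaiveHiding {
    // Number of transactions containing every item of the given itemset.
    public static long support(List<Set<Integer>> db, Set<Integer> itemset) {
        return db.stream().filter(t -> t.containsAll(itemset)).count();
    }

    // Naive hiding: remove one chosen item of the rule from supporting
    // transactions until the rule's support drops below minsup.
    public static List<Set<Integer>> hide(List<Set<Integer>> db, Set<Integer> rule,
                                          double minsup, int itemToRemove) {
        List<Set<Integer>> copy = new ArrayList<>();
        for (Set<Integer> t : db) copy.add(new HashSet<>(t));
        for (Set<Integer> t : copy) {
            if ((double) support(copy, rule) / copy.size() < minsup) break;
            if (t.containsAll(rule)) t.remove(itemToRemove);
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = List.of(
                Set.of(1, 2, 4, 5), Set.of(2, 3, 5), Set.of(1, 2, 4, 5),
                Set.of(1, 2, 3, 5), Set.of(1, 2, 3, 4, 5), Set.of(2, 3, 4));
        // Hide the rule {1 2} ==> {4 5} (itemset {1, 2, 4, 5}) for minsup = 0.5
        List<Set<Integer>> hidden = hide(db, Set.of(1, 2, 4, 5), 0.5, 1);
        // The rule's support is now below 0.5
        System.out.println((double) support(hidden, Set.of(1, 2, 4, 5)) / hidden.size());
    }
}
```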

Input file format

This algorithm takes two files as input.

The first file is a text file containing transactions (a transaction database) (e.g. contextIGB.txt). Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

This file contains six lines (six transactions). Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

The second file is a text file containing sensitive association rules to be hidden (e.g. sar.txt). Each line is an association rule. First, the rule antecedent is written. It is an itemset, where each item is represented by a positive integer, and each item is separated from the following item by a single space. Note that it is assumed that items within an itemset cannot appear more than once and are sorted according to a total order. Then the keyword " ==> " appears followed by the rule consequent. The consequent is an itemset where each item is represented by a positive integer, and each item is separated from the following item by a single space. For example, consider the file sar.txt.

4 ==> 1
1 2 ==> 4 5
5 ==> 2

This file contains three lines (three association rules). The second line indicates that the rule {1, 2} ==> {4, 5} should be hidden by the FHSAR algorithm.

Output file format

The output file format is defined as follows. It is a text file representing a transaction database. Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, an output file generated by the FHSAR algorithm is:

4 5
3 5
4 5
1 2 3 5
1 2 3 4 5
2 3 4

In this example, the first line represents the transaction {4, 5}. Other lines follow the same format.

Where can I get more information about the FHSAR algorithm?

This algorithm was proposed in this paper:

C.-C.Weng, S.-T. Chen, H.-C. Lo: A Novel Algorithm for Completely Hiding Sensitive Association Rules. ISDA (3) 2008: 202-208

Example 75 : Mining the Top-K Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "TopKRules" algorithm, (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt") (4) set k = 2 and minconf = 0.8 (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run TopKRules contextIGB.txt output.txt 2 80% in a folder containing spmf.jar and the example input file contextIGB.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestTopKRules.java" in the package ca.pfv.SPMF.tests.

What is TopKRules?

TopKRules is an algorithm for discovering the top-k association rules appearing in a transaction database.

Why is it useful to discover top-k association rules? Because other association rule mining algorithms require the user to set a minimum support (minsup) parameter that is hard to set (usually, users set it by trial and error, which is time-consuming). TopKRules solves this problem by letting users directly indicate k, the number of rules to be discovered, instead of using minsup.

What is the input of TopKRules ?

TopKRules takes three parameters as input:

  • a transaction database,
  • a parameter k representing the number of association rules to be discovered (a positive integer),
  • a parameter minconf representing the minimum confidence that the association rules should have (a value in [0,1] representing a percentage).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t5, t6) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output of TopKRules ?

TopKRules outputs the top-k association rules.

To explain what are top-k association rules, it is necessary to review some definitions. An itemset is a set of distinct items. The support of an itemset is the number of times that it appears in the database divided by the total number of transactions in the database. For example, the itemset {1 3} has a support of 33 % because it appears in 2 out of 6 transactions from the database.

An association rule X --> Y is an association between two itemsets X and Y that are disjoint. The support of an association rule is the number of transactions that contain X and Y divided by the total number of transactions. The confidence of an association rule is the number of transactions that contain X and Y divided by the number of transactions that contain X.

The top-k association rules are the k most frequent association rules in the database having a confidence higher or equal to minconf.

For example, if we run TopKRules with k = 2 and minconf = 0.8, we obtain the top-2 rules in the database having a confidence higher than or equal to 80 %:

  • 2 ==> 5, which has a support of 5 (it appears in 5 transactions) and a confidence of 83 %
  • 5 ==> 2, which has a support of 5 (it appears in 5 transactions) and a confidence of 100 %

For instance, the rule 2 ==> 5 means that if item 2 appears in a transaction, item 5 is also likely to appear, with a confidence of 83 %. Moreover, this rule has a support of 83 % because it appears in five transactions (t1, t2, t3, t4 and t5) out of the six transactions contained in this database.

It is important to note that for some values of k, the algorithm may return slightly more than k rules. This can happen if several rules have exactly the same support.
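The final selection of the top-k rules can be illustrated by brute force over rules between single items (TopKRules itself explores the full rule space and dynamically raises an internal minsup threshold as rules are found; the class name below is made up for this sketch):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class TopKSketch {
    // Number of transactions containing all the given items.
    static long count(List<Set<Integer>> db, int... items) {
        return db.stream().filter(t -> {
            for (int i : items) if (!t.contains(i)) return false;
            return true;
        }).count();
    }

    // Among all rules {x} ==> {y} between single items, keep those with
    // confidence >= minconf and return the k rules with the highest support.
    // Each result is an int[]{antecedent, consequent, support}.
    public static List<int[]> topK(List<Set<Integer>> db, List<Integer> items,
                                   int k, double minconf) {
        List<int[]> rules = new ArrayList<>();
        for (int x : items)
            for (int y : items)
                if (x != y) {
                    long sup = count(db, x, y);
                    double conf = (double) sup / count(db, x);
                    if (conf >= minconf) rules.add(new int[]{x, y, (int) sup});
                }
        rules.sort((a, b) -> b[2] - a[2]); // highest support first
        return rules.subList(0, Math.min(k, rules.size()));
    }

    public static void main(String[] args) {
        List<Set<Integer>> db = List.of(
                Set.of(1, 2, 4, 5), Set.of(2, 3, 5), Set.of(1, 2, 4, 5),
                Set.of(1, 2, 3, 5), Set.of(1, 2, 3, 4, 5), Set.of(2, 3, 4));
        // Prints 2 ==> 5 #SUP: 5 and 5 ==> 2 #SUP: 5, the top-2 rules above
        for (int[] r : topK(db, List.of(1, 2, 3, 4, 5), 2, 0.8)) {
            System.out.println(r[0] + " ==> " + r[1] + " #SUP: " + r[2]);
        }
    }
}
```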

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are first listed. Each item is represented by a positive integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed. Each item is represented by an integer, followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer (a number of transactions). Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here is the output file if we run TopKRules on contextIGB.txt with k = 2 and minconf = 0.8 (80 %):

2 ==> 5 #SUP: 5 #CONF: 0.8333333333333334
5 ==> 2 #SUP: 5 #CONF: 1.0

For example, the first line indicates that the association rule {2} --> {5} has a support of 5 transactions and a confidence of 83.3 %. The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

TopKRules is a very efficient algorithm for mining the top-k association rules.

It has the benefit of being very intuitive to use. It should be noted, however, that the problem of top-k association rule mining is more computationally expensive than the problem of association rule mining. Using TopKRules is recommended for k values of up to 5000, depending on the dataset.

Besides, note that there is a variation of TopKRules named TNR that is available in SPMF. The improvement in TNR is that it eliminates some association rules that are deemed "redundant" (rules that are included in other rules having the same support and confidence - see the TNR example for the formal definition). Using TNR is more costly than using TopKRules but it brings the benefit of eliminating a type of redundancy in results.

Where can I get more information about this algorithm?

The TopKRules algorithm was proposed in this paper:

Fournier-Viger, P., Wu, C.-W., Tseng, V. S. (2012). Mining Top-K Association Rules. Proceedings of the 25th Canadian Conf. on Artificial Intelligence (AI 2012), Springer, LNAI 7310, pp. 61-73.

Example 76 : Mining the Top-K Non-Redundant Association Rules

How to run this example?

  • If you are using the graphical interface, (1) choose the "TNR" algorithm, (2) select the input file "contextIGB.txt", (3) set the output file name (e.g. "output.txt"), (4) set k = 30, minconf = 0.5 and delta = 2, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run TNR contextIGB.txt output.txt 30 0.5 2 in a folder containing spmf.jar and the example input file contextIGB.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestTNR.java" in the package ca.pfv.SPMF.tests.

What is TNR?

TNR is an algorithm for discovering the top-k non-redundant association rules appearing in a transaction database. It is an approximate algorithm in the sense that it always generates non-redundant rules, but these rules may not always be the top-k non-redundant association rules. TNR uses a parameter named delta, an integer >= 0 that can be used to improve the chance that the result is exact (the higher the delta value, the better the chance that the result is exact).

Why is it important to discover top-k non-redundant association rules? Because other association rule mining algorithms require the user to set a minimum support (minsup) parameter that is hard to set (usually, users set it by trial and error, which is time-consuming). Moreover, the result of association rule mining algorithms usually contains a high level of redundancy (for example, thousands of rules can be found that are variations of other rules having the same support and confidence). The TNR algorithm provides a solution to both of these problems by letting users directly indicate k, the number of rules to be discovered, and by eliminating redundancy in the results.

What is the input of TNR ?

TNR takes four parameters as input:

  • a transaction database,
  • a parameter k representing the number of rules to be discovered (a positive integer >= 1),
  • a parameter minconf representing the minimum confidence that association rules should have (a value in [0,1] representing a percentage).
  • a parameter delta (an integer >= 0) that is used to increase the chances that the result is exact (because the TNR algorithm is approximate).

A transaction database is a set of transactions. Each transaction is a set of items. For example, consider the following transaction database. It contains 6 transactions (t1, t2, ..., t5, t6) and 5 items (1,2, 3, 4, 5). For example, the first transaction represents the set of items 1, 2, 4 and 5. This database is provided as the file contextIGB.txt in the SPMF distribution. It is important to note that an item is not allowed to appear twice in the same transaction and that items are assumed to be sorted by lexicographical order in a transaction.

Transaction id Items
t1 {1, 2, 4, 5}
t2 {2, 3, 5}
t3 {1, 2, 4, 5}
t4 {1, 2, 3, 5}
t5 {1, 2, 3, 4, 5}
t6 {2, 3, 4}

What is the output of TNR?

TNR outputs an approximation of the k most frequent non redundant association rules having a confidence higher or equal to minconf.

To explain what are top-k non redundant association rules, it is necessary to review some definitions. An itemset is a set of distinct items. The support of an itemset is the number of times that it appears in the database divided by the total number of transactions in the database. For example, the itemset {1 3} has a support of 33 % because it appears in 2 out of 6 transactions from the database.

An association rule X --> Y is an association between two itemsets X and Y that are disjoint. The support of an association rule is the number of transactions that contain X and Y divided by the total number of transactions. The confidence of an association rule is the number of transactions that contain X and Y divided by the number of transactions that contain X.

An association rule ra: X → Y is redundant with respect to another rule rb : X1 → Y1 if and only if:

  • conf(ra) = conf(rb)
  • sup(ra) = sup(rb)
  • X1 ⊆ X ∧ Y ⊆ Y1.

The top-k non-redundant association rules are the k most frequent association rules in the database that are non-redundant and have a confidence higher than or equal to minconf.
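The redundancy definition translates directly into code. The sketch below (the class name is made up; this is not SPMF code) checks it for two rules of the example database: {1, 2} ==> {5} and {1} ==> {2, 5} both have a support of 4 transactions and a confidence of 1.0, so the first is redundant with respect to the second:

```java
import java.util.Set;

public class RedundancyCheck {
    // ra: x --> y is redundant with respect to rb: x1 --> y1 iff the two rules
    // have the same support, the same confidence, x1 ⊆ x and y ⊆ y1.
    public static boolean isRedundant(Set<Integer> x, Set<Integer> y, long supA, double confA,
                                      Set<Integer> x1, Set<Integer> y1, long supB, double confB) {
        return supA == supB && confA == confB
                && x.containsAll(x1) && y1.containsAll(y);
    }

    public static void main(String[] args) {
        // {1, 2} ==> {5} (sup 4, conf 1.0) is redundant w.r.t. {1} ==> {2, 5} (sup 4, conf 1.0)
        System.out.println(isRedundant(Set.of(1, 2), Set.of(5), 4, 1.0,
                                       Set.of(1), Set.of(2, 5), 4, 1.0)); // true
    }
}
```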

For example, if we run TNR with k = 10, minconf = 0.5 and delta = 2, the following set of rules is found:

4, ==> 2,  sup= 4  conf= 1.0
2, ==> 1,5, sup= 4 conf=0.66
2, ==> 5, sup= 5 conf= 0.8333333333333334
5, ==> 2, sup= 5 conf= 1.0
5, ==> 1,2, sup= 4 conf= 0.8
1, ==> 2,5, sup= 4 conf= 1.0
2, ==> 3, sup= 4 conf=0.66
2, ==> 4, sup= 4 conf=0.66
3, ==> 2, sup= 4 conf= 1.0
1,4, ==> 2,5, sup= 3 conf= 1.0

For instance, the association rule 2 ==> 1 5 means that if item 2 appears, it is likely to be associated with items 1 and 5, with a confidence of 66 %. Moreover, this rule has a support of 66 % (sup = 4) because it appears in four transactions (t1, t3, t4 and t5) out of the six transactions contained in this database.

Note that for some values of k and some datasets, TNR may return more than k association rules. This can happen if several rules have exactly the same support, and it is normal. It is also possible for the algorithm to return slightly fewer than k association rules in some circumstances, because the algorithm is approximate.

Input file format

The input file format is a text file containing transactions. Each line represents a transaction. The items in the transaction are listed on the corresponding line. An item is represented by a positive integer. Each item is separated from the following item by a space. It is assumed that items are sorted according to a total order and that no item can appear twice in the same transaction. For example, for the previous example, the input file is defined as follows:

1 2 4 5
2 3 5
1 2 4 5
1 2 3 5
1 2 3 4 5
2 3 4

Consider the first line. It means that the first transaction is the itemset {1, 2, 4, 5}. The following lines follow the same format.

Note that it is also possible to use the ARFF format as an alternative to the default input format. The specification of the ARFF format can be found here. Most features of the ARFF format are supported, except that (1) the character "=" is forbidden and (2) escape characters are not considered. Note that when the ARFF format is used, the performance of the data mining algorithms will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result also has to be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file, where each line represents an association rule. On each line, the items of the rule antecedent are first listed. Each item is represented by a positive integer, followed by a single space. After that, the keyword "==>" appears, followed by a space. Then, the items of the rule consequent are listed. Each item is represented by an integer, followed by a single space. Then, the keyword " #SUP: " appears, followed by the support of the rule represented by an integer (a number of transactions). Then, the keyword " #CONF: " appears, followed by the confidence of the rule represented by a double value (a value between 0 and 1, inclusively). For example, here is the output file if we run TNR on contextIGB.txt with k = 10, minconf = 0.5 and delta = 2:

2 ==> 4 #SUP: 4 #CONF:0.66
5 ==> 1 2 #SUP: 4 #CONF: 0.8
5 ==> 2 #SUP: 5 #CONF: 1.0
2 ==> 5 #SUP: 5 #CONF: 0.8333333333333334
2 ==> 1 5 #SUP: 4 #CONF:0.66
1 ==> 2 5 #SUP: 4 #CONF: 1.0
2 ==> 3 #SUP: 4 #CONF:0.66
3 ==> 2 #SUP: 4 #CONF: 1.0
4 ==> 2 #SUP: 4 #CONF: 1.0
4 5 ==> 1 2 #SUP: 3 #CONF: 1.0

For example, the first line indicates that the association rule {2} --> {4} has a support of 4 transactions and a confidence of 66.66 %. The other lines follow the same format.

Note that if the ARFF format is used as input instead of the default input format, the output format will be the same except that items will be represented by strings instead of integers.

Performance

TNR is an efficient algorithm. It is based on the TopKRules algorithm for discovering top-k association rules. The main difference between TNR and TopKRules is that TNR includes additional strategies to eliminate redundancy in results, and that TNR is an approximate algorithm, while TopKRules is not.

TNR and TopKRules are more intuitive to use than regular association rule mining algorithms. However, it should be noted that the problem of top-k association rule mining is more computationally expensive than the problem of association rule mining. Therefore, it is recommended to use TNR or TopKRules for k values of up to 5000, depending on the dataset. If more rules have to be found, it may be better to use a classical association rule mining algorithm such as FPGrowth, for more efficiency.

Where can I get more information about this algorithm?

The TNR algorithm is described in this paper:

Fournier-Viger, P., Tseng, V.S. (2012). Mining Top-K Non-Redundant Association Rules. Proc. 20th International Symposium on Methodologies for Intelligent Systems (ISMIS 2012), Springer, LNCS 7661, pp. 31-40.

Example 77 : Clustering Values with the K-Means algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "KMeans" algorithm, (2) select the input file "inputDBScan2.txt", (3) set the output file name (e.g. "output.txt") (4) set k =3 and distance function = euclidian, and (5) click "Run algorithm".
    Note that there is also an optional parameter called "separator". It can be used to read other types of input files where values are not separated by spaces. For example, for a file separated by the ',' character, the parameter "separator" would have to be set to ",".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run KMeans inputDBScan2.txt output.txt 3 euclidian in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestKMeans_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is K-Means?

K-Means is one of the most famous clustering algorithms. It is used to automatically separate a set of instances (vectors of double values) into groups of similar instances (clusters).

In this implementation, the user can choose between various distance functions to assess the similarity between vectors. SPMF offers the Euclidean distance, correlation distance, cosine distance, Manhattan distance and Jaccard distance.
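As an illustration, two of these distance functions can be sketched as follows (a simplified version, not the SPMF implementation), for vectors of doubles of equal length:

```java
// Sketch of the Euclidean and Manhattan distances mentioned above
// (a simplified illustration, not the actual SPMF implementation).
public class Distances {

    // Euclidean distance: square root of the sum of squared differences
    public static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Manhattan distance: sum of absolute differences
    public static double manhattan(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++)
            sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {0, 0}, q = {3, 4};
        System.out.println(euclidean(p, q)); // 5.0
        System.out.println(manhattan(p, q)); // 7.0
    }
}
```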

What is the input?

K-Means takes as input a set of instances having a name and containing one or more double values, a parameter K (a positive integer >=1) indicating the number of clusters to be created, and a distance function.

The input file format of K-Means is a text file containing several instances.

The first lines (optional) specify the names of the attributes used for describing the instances. In this example, two attributes are used, named X and Y. But note that more than two attributes could be used. Each attribute is specified on a separate line by the keyword "@ATTRIBUTEDEF=", followed by the attribute name.

Then, each instance is described by two lines. The first line (which is optional) contains the string "@NAME=" followed by the name of the instance. Then, the second line provides a list of double values separated by single spaces.
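As an illustration of the format just described, a minimal parser might look like this (a hypothetical sketch; SPMF uses its own reader internally):

```java
// Hypothetical sketch (not the SPMF reader) of a parser for the input format
// described above: optional @ATTRIBUTEDEF= lines, an optional @NAME= line per
// instance, then a line of space-separated double values.
import java.util.ArrayList;
import java.util.List;

public class InstanceFileParser {

    public static class Instance {
        public final String name;
        public final double[] values;
        Instance(String name, double[] values) {
            this.name = name;
            this.values = values;
        }
    }

    public static List<Instance> parse(List<String> lines) {
        List<Instance> instances = new ArrayList<>();
        String pendingName = null;
        int unnamed = 0;
        for (String line : lines) {
            line = line.trim();
            // attribute definitions and blank lines carry no instance data
            if (line.isEmpty() || line.startsWith("@ATTRIBUTEDEF=")) continue;
            if (line.startsWith("@NAME=")) {
                pendingName = line.substring("@NAME=".length());
                continue;
            }
            // a value line: parse the space-separated doubles
            String[] tokens = line.split(" ");
            double[] values = new double[tokens.length];
            for (int i = 0; i < tokens.length; i++)
                values[i] = Double.parseDouble(tokens[i]);
            // @NAME= is optional, so fall back to a generated name
            String name = (pendingName != null) ? pendingName : ("Unnamed" + (++unnamed));
            instances.add(new Instance(name, values));
            pendingName = null;
        }
        return instances;
    }

    public static void main(String[] args) {
        List<Instance> parsed = parse(java.util.Arrays.asList(
                "@ATTRIBUTEDEF=X", "@ATTRIBUTEDEF=Y",
                "@NAME=Instance1", "1 1",
                "@NAME=Instance2", "0 1"));
        System.out.println(parsed.size() + " instances read"); // 2 instances read
    }
}
```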

An example of input is provided in the file "inputDBScan2.txt" of the SPMF distribution. It contains 31 instances, each described by two attributes called X and Y.

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
@NAME=Instance1
1 1
@NAME=Instance2
0 1
@NAME=Instance3
1 0
@NAME=Instance4
11 12
@NAME=Instance5
11 13
@NAME=Instance6
13 13
@NAME=Instance7
12 8.5
@NAME=Instance8
13 8
@NAME=Instance9
13 9
@NAME=Instance10
13 7
@NAME=Instance11
11 7
@NAME=Instance12
8 2
@NAME=Instance13
9 2
@NAME=Instance14
10 1
@NAME=Instance15
7 13
@NAME=Instance16
5 9
@NAME=Instance17
16 16
@NAME=Instance18
11.5 8
@NAME=Instance20
13 10
@NAME=Instance21
12 13
@NAME=Instance21
14 12.5
@NAME=Instance22
14.5 11.5
@NAME=Instance23
15 10.5
@NAME=Instance24
15 9.5
@NAME=Instance25
12 9.5
@NAME=Instance26
10.5 11
@NAME=Instance27
10 10.5
@NAME=Instance28
9 3
@NAME=Instance29
9 4
@NAME=Instance30
9 5

For example, the first instance is named "Instance1", and it is a vector with two values: 1 and 1 for the attributes X and Y, respectively.

This input file represents a set of 2D points. But note that it is possible to use more than two attributes to describe instances. To give a better idea of what the input file looks like, here is a visual representation: [figure: scatter plot of the input points]

The K-Means algorithm will group the instances according to their similarity. To do this, it is also necessary to specify the distance function to be used for comparing the instances. The distance function can be the Euclidean distance, correlation distance, cosine distance, Manhattan distance or Jaccard distance. In the command line or GUI of SPMF, the distance function is specified by using one of these keywords: "euclidian", "correlation", "cosine", "manathan" and "jaccard" as parameter. In this example, the Euclidean distance is used.

What is the output?

K-Means groups instances into clusters according to their similarity. In SPMF, the similarity is defined according to the distance function chosen by the user, such as the Euclidean distance. K-Means returns K clusters or fewer.

Note that running K-Means with the same data does not always generate the same result because K-Means initializes clusters randomly.
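The assignment/update loop at the heart of K-Means can be sketched as follows. This is a simplified illustration, not the SPMF implementation: in particular, it deterministically uses the first K points as initial centroids, whereas SPMF initializes them randomly.

```java
// Minimal K-Means sketch (not the SPMF implementation): alternate between
// assigning each point to its nearest centroid and recomputing each centroid
// as the mean of its assigned points, until no assignment changes.
import java.util.Arrays;

public class SimpleKMeans {

    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return Math.sqrt(s);
    }

    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++)
            if (dist(p, centroids[c]) < dist(p, centroids[best])) best = c;
        return best;
    }

    // returns the cluster index of each point; requires k <= points.length
    public static int[] cluster(double[][] points, int k) {
        // deterministic init for this sketch: the first k points
        // (SPMF instead picks initial centroids randomly)
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) centroids[c] = points[c].clone();
        int[] assign = new int[points.length];
        boolean changed = true;
        while (changed) {
            changed = false;
            // assignment step
            for (int i = 0; i < points.length; i++) {
                int c = nearest(points[i], centroids);
                if (c != assign[i]) { assign[i] = c; changed = true; }
            }
            // update step: recompute each centroid as the mean of its points
            for (int c = 0; c < k; c++) {
                double[] sum = new double[points[0].length];
                int count = 0;
                for (int i = 0; i < points.length; i++)
                    if (assign[i] == c) {
                        count++;
                        for (int d = 0; d < sum.length; d++) sum[d] += points[i][d];
                    }
                if (count > 0)
                    for (int d = 0; d < sum.length; d++) centroids[c][d] = sum[d] / count;
            }
        }
        return assign;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {10, 10}, {0, 1}, {10, 11}};
        System.out.println(Arrays.toString(cluster(points, 2))); // [0, 1, 0, 1]
    }
}
```

Because the real algorithm starts from random centroids, two runs can converge to different local optima, which is why SPMF may not always output the same clusters.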

By running K-Means on the previous input file with K = 3, we can obtain the following output file containing 3 clusters:

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
[Instance12 8.0 2.0][Instance13 9.0 2.0][Instance14 10.0 1.0][Instance28 9.0 3.0][Instance29 9.0 4.0][Instance30 9.0 5.0]
[Instance4 11.0 12.0][Instance5 11.0 13.0][Instance6 13.0 13.0][Instance7 12.0 8.5][Instance8 13.0 8.0][Instance9 13.0 9.0][Instance11 11.0 7.0][Instance15 7.0 13.0][Instance16 5.0 9.0][Instance17 16.0 16.0][Instance18 11.5 8.0][Instance20 13.0 10.0][Instance21 12.0 13.0][Instance21 14.0 12.5][Instance22 14.5 11.5][Instance23 15.0 10.5][Instance24 15.0 9.5][Instance25 12.0 9.5][Instance26 10.5 11.0][Instance27 10.0 10.5][Instance10 13.0 7.0]
[Instance1 1.0 1.0][Instance2 0.0 1.0][Instance3 1.0 0.0]

The output file format is defined as follows. The first few lines indicate the attribute names. Each attribute is specified on a separate line with the keyword "@ATTRIBUTEDEF=" followed by the attribute name (a string). Then, the list of clusters is indicated. Each cluster is specified on a separate line, listing the instances contained in the cluster. An instance is a name followed by a list of double values separated by " " and enclosed between the "[" and "]" characters.

The clusters found by the algorithm can be viewed visually using the "Cluster Viewer" provided in SPMF. If you are using the graphical interface of SPMF, click the checkbox "Cluster Viewer" before pressing the "Run Algorithm" button. The result will be displayed in the Cluster Viewer.

As can be seen in this example, the result makes sense, as points close to each other are in the same cluster.

Applying K-Means to time series

Note that the K-Means algorithm implementation in SPMF can also be applied to time series databases such as the file contextSAX.txt in the SPMF distribution. To apply K-Means to time series, it is necessary to set the "separator" parameter of the K-Means algorithm to "," since time series files separate values by "," instead of spaces.

Where can I get more information about K-Means?

K-Means was proposed by MacQueen in 1967. It is one of the most famous data mining algorithms and is described in almost all data mining books that focus on algorithms, as well as on many websites. By searching on the web, you will find plenty of resources explaining K-Means.

Example 78 : Clustering Values with the Bisecting K-Means algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "BisectingKMeans" algorithm, (2) select the input file "inputDBScan2.txt", (3) set the output file name (e.g. "output.txt"), (4) set k = 3, distance function = euclidian, iter = 10, and (5) click "Run algorithm".
    Note that there is also an optional parameter called "separator". It can be used to read other types of input files where values are not separated by spaces. For example, for a file separated by the ',' character, the parameter "separator" would have to be set to ",".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run BisectingKMeans inputDBScan2.txt output.txt 3 euclidian 10 in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestBisectingKMeans_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is Bisecting K-Means?

K-Means is one of the most famous clustering algorithms. It is used to separate a set of instances (vectors of double values) into groups of instances (clusters) according to their similarity.

The Bisecting K-Means algorithm is a variation of the regular K-Means algorithm that is reported to perform better for some applications. It consists of the following steps: (1) pick a cluster, (2) split it into two sub-clusters using the basic K-Means algorithm (the bisecting step), (3) repeat step 2, the bisecting step, ITER times and keep the best split, and (4) repeat steps 1 to 3 until the desired number of clusters is reached.

In this implementation, the user can choose between various distance functions to assess the distance between vectors. SPMF offers the Euclidean distance, correlation distance, cosine distance, Manhattan distance and Jaccard distance.

What is the input?

Bisecting K-Means takes as input a set of instances (each having a name and containing one or more double values), a parameter K (a positive integer >=1) indicating the number of clusters to be created, a distance function, and the parameter ITER.

The input file format of Bisecting K-Means is a text file containing several instances.

The first lines (optional) specify the names of the attributes used for describing the instances. In this example, two attributes are used, named X and Y. But note that more than two attributes could be used. Each attribute is specified on a separate line by the keyword "@ATTRIBUTEDEF=", followed by the attribute name.

Then, each instance is described by two lines. The first line (which is optional) contains the string "@NAME=" followed by the name of the instance. Then, the second line provides a list of double values separated by single spaces.

An example of input is provided in the file "inputDBScan2.txt" of the SPMF distribution. It contains 31 instances, each described by two attributes called X and Y.

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
@NAME=Instance1
1 1
@NAME=Instance2
0 1
@NAME=Instance3
1 0
@NAME=Instance4
11 12
@NAME=Instance5
11 13
@NAME=Instance6
13 13
@NAME=Instance7
12 8.5
@NAME=Instance8
13 8
@NAME=Instance9
13 9
@NAME=Instance10
13 7
@NAME=Instance11
11 7
@NAME=Instance12
8 2
@NAME=Instance13
9 2
@NAME=Instance14
10 1
@NAME=Instance15
7 13
@NAME=Instance16
5 9
@NAME=Instance17
16 16
@NAME=Instance18
11.5 8
@NAME=Instance20
13 10
@NAME=Instance21
12 13
@NAME=Instance21
14 12.5
@NAME=Instance22
14.5 11.5
@NAME=Instance23
15 10.5
@NAME=Instance24
15 9.5
@NAME=Instance25
12 9.5
@NAME=Instance26
10.5 11
@NAME=Instance27
10 10.5
@NAME=Instance28
9 3
@NAME=Instance29
9 4
@NAME=Instance30
9 5

For example, the first instance is named "Instance1", and it is a vector with two values: 1 and 1 for the attributes X and Y, respectively.

This input file represents a set of 2D points. But note that it is possible to use more than two attributes to describe instances. To give a better idea of what the input file looks like, here is a visual representation: [figure: scatter plot of the input points]

The Bisecting K-Means algorithm will group the instances according to their similarity. To do this, it is also necessary to specify the distance function to be used for comparing the instances. The distance function can be the Euclidean distance, correlation distance, cosine distance, Manhattan distance or Jaccard distance. In the command line or GUI of SPMF, the distance function is specified by using one of these keywords: "euclidian", "correlation", "cosine", "manathan" and "jaccard" as parameter. In this example, the Euclidean distance is used.

The ITER parameter specifies how many times the algorithm should repeat a split and keep the best one. Setting it to a higher value should provide better results, but makes the algorithm slower. Splits are evaluated using the Sum of Squared Errors (SSE).
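The SSE measure used to compare splits can be sketched as follows (a simplified illustration, not the SPMF implementation): for each cluster, sum the squared Euclidean distances between each point and the cluster centroid. A lower SSE means tighter clusters.

```java
// Sketch (not SPMF code) of the Sum of Squared Errors (SSE): for each cluster,
// compute the centroid (mean of its points), then add up the squared Euclidean
// distance from every point to that centroid.
public class SSE {

    public static double sse(double[][][] clusters) {
        double total = 0;
        for (double[][] cluster : clusters) {
            int dim = cluster[0].length;
            // centroid = mean of the cluster's points, dimension by dimension
            double[] centroid = new double[dim];
            for (double[] point : cluster)
                for (int d = 0; d < dim; d++)
                    centroid[d] += point[d] / cluster.length;
            // accumulate squared distances to the centroid
            for (double[] point : cluster)
                for (int d = 0; d < dim; d++) {
                    double diff = point[d] - centroid[d];
                    total += diff * diff;
                }
        }
        return total;
    }

    public static void main(String[] args) {
        // one cluster with two points; centroid is (1.0, 0.0), so SSE = 1 + 1
        double[][][] clusters = {{{0, 0}, {2, 0}}};
        System.out.println(sse(clusters)); // 2.0
    }
}
```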

What is the output?

Bisecting K-Means groups vectors into clusters according to their similarity. In SPMF, the similarity is defined according to the distance function chosen by the user, such as the Euclidean distance. Bisecting K-Means returns K clusters or fewer.

Note that running Bisecting K-Means with the same data does not always generate the same result because Bisecting K-Means initializes clusters randomly.

By running Bisecting K-Means on the previous input file, we can obtain the following output file containing 3 clusters:

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
[Instance1 1.0 1.0][Instance2 0.0 1.0][Instance3 1.0 0.0][Instance12 8.0 2.0][Instance13 9.0 2.0][Instance14 10.0 1.0][Instance16 5.0 9.0][Instance28 9.0 3.0][Instance29 9.0 4.0][Instance30 9.0 5.0]
[Instance9 13.0 9.0][Instance24 15.0 9.5][Instance7 12.0 8.5][Instance8 13.0 8.0][Instance10 13.0 7.0][Instance11 11.0 7.0][Instance18 11.5 8.0][Instance20 13.0 10.0][Instance25 12.0 9.5][Instance23 15.0 10.5]
[Instance4 11.0 12.0][Instance5 11.0 13.0][Instance6 13.0 13.0][Instance17 16.0 16.0][Instance21 12.0 13.0][Instance21 14.0 12.5][Instance22 14.5 11.5][Instance15 7.0 13.0][Instance26 10.5 11.0][Instance27 10.0 10.5]

The output file format is defined as follows. The first few lines indicate the attribute names. Each attribute is specified on a separate line with the keyword "@ATTRIBUTEDEF=" followed by the attribute name (a string). Then, the list of clusters is indicated. Each cluster is specified on a separate line, listing the instances contained in the cluster. An instance is a name followed by a list of double values separated by " " and enclosed between the "[" and "]" characters.

The clusters found by the algorithm can be viewed visually using the "Cluster Viewer" provided in SPMF. If you are using the graphical interface of SPMF, click the checkbox "Cluster Viewer" before pressing the "Run Algorithm" button. The result will be displayed in the Cluster Viewer.

As can be seen in this example, the result makes sense, as points close to each other are in the same cluster.

Applying Bisecting K-Means to time series

Note that the Bisecting K-Means algorithm implementation in SPMF can also be applied to time series databases such as the file contextSAX.txt in the SPMF distribution. To apply this algorithm to time series, it is necessary to set the "separator" parameter of this algorithm to "," since time series files separate values by "," instead of spaces.

Where can I get more information about Bisecting K-Means ?

The original K-Means was proposed by MacQueen in 1967. It is one of the most famous data mining algorithms and is described in almost all data mining books that focus on algorithms, as well as on many websites. By searching on the web, you will find plenty of resources explaining K-Means.

The Bisecting K-Means algorithm is described in this paper:

Steinbach, M., Karypis, G., Kumar, V. (2000). A Comparison of Document Clustering Techniques. Proc. KDD Workshop on Text Mining, 2000.

Example 79 : Clustering Values with the DBScan algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "DBScan" algorithm, (2) select the input file "inputDBScan2.txt", (3) set the output file name (e.g. "output.txt"), (4) set minPts = 2 and epsilon = 2, and (5) click "Run algorithm".
    Note that there is also an optional parameter called "separator". It can be used to read other types of input files where values are not separated by spaces. For example, for a file separated by the ',' character, the parameter "separator" would have to be set to ",".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run DBScan inputDBScan2.txt output.txt 2 2 in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestDBScan_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is DBScan?

DBScan is an old but famous clustering algorithm. It is used to find clusters of points based on density.

Implementation note: to avoid an O(n^2) time complexity, this implementation uses a KD-tree to store points internally.

What is the input?

DBScan takes as input (1) a set of instances having a name and containing one or more double values, (2) a parameter minPts (a positive integer >= 1) indicating the number of points that a core point needs to have in its neighborhood (see the paper about DBScan for more details), and (3) a radius epsilon that defines the neighborhood of a point.
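The notions of epsilon-neighborhood and core point just mentioned can be sketched as follows. This is a naive O(n^2) illustration, unlike the KD-tree-based SPMF implementation:

```java
// Naive sketch (not the SPMF implementation) of the two DBScan notions used
// above: the epsilon-neighborhood of a point, and the core-point test against
// the minPts parameter.
import java.util.ArrayList;
import java.util.List;

public class DBScanBasics {

    static double euclidean(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return Math.sqrt(s);
    }

    // all points within distance epsilon of p (p itself included)
    public static List<double[]> neighborhood(double[][] points, double[] p, double epsilon) {
        List<double[]> result = new ArrayList<>();
        for (double[] q : points)
            if (euclidean(p, q) <= epsilon) result.add(q);
        return result;
    }

    // p is a core point if its epsilon-neighborhood contains at least minPts points
    public static boolean isCorePoint(double[][] points, double[] p, double epsilon, int minPts) {
        return neighborhood(points, p, epsilon).size() >= minPts;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {1, 0}, {5, 5}};
        // (0,0) has two points within radius 2 (itself and (1,0)), so with
        // minPts = 2 it is a core point, while the isolated (5,5) is not
        System.out.println(isCorePoint(points, points[0], 2, 2)); // true
        System.out.println(isCorePoint(points, points[2], 2, 2)); // false
    }
}
```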

The input file format is a text file containing several instances.

The first lines (optional) specify the names of the attributes used for describing the instances. In this example, two attributes are used, named X and Y. But note that more than two attributes could be used. Each attribute is specified on a separate line by the keyword "@ATTRIBUTEDEF=", followed by the attribute name.

Then, each instance is described by two lines. The first line (which is optional) contains the string "@NAME=" followed by the name of the instance. Then, the second line provides a list of double values separated by single spaces.

An example of input is provided in the file "inputDBScan2.txt" of the SPMF distribution. It contains 31 instances, each described by two attributes called X and Y.

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
@NAME=Instance1
1 1
@NAME=Instance2
0 1
@NAME=Instance3
1 0
@NAME=Instance4
11 12
@NAME=Instance5
11 13
@NAME=Instance6
13 13
@NAME=Instance7
12 8.5
@NAME=Instance8
13 8
@NAME=Instance9
13 9
@NAME=Instance10
13 7
@NAME=Instance11
11 7
@NAME=Instance12
8 2
@NAME=Instance13
9 2
@NAME=Instance14
10 1
@NAME=Instance15
7 13
@NAME=Instance16
5 9
@NAME=Instance17
16 16
@NAME=Instance18
11.5 8
@NAME=Instance20
13 10
@NAME=Instance21
12 13
@NAME=Instance21
14 12.5
@NAME=Instance22
14.5 11.5
@NAME=Instance23
15 10.5
@NAME=Instance24
15 9.5
@NAME=Instance25
12 9.5
@NAME=Instance26
10.5 11
@NAME=Instance27
10 10.5
@NAME=Instance28
9 3
@NAME=Instance29
9 4
@NAME=Instance30
9 5

For example, the first instance is named "Instance1", and it is a vector with two values: 1 and 1 for the attributes X and Y, respectively.

This input file represents a set of 2D points. But note that it is possible to use more than two attributes to describe instances. To give a better idea of what the input file looks like, here is a visual representation: [figure: scatter plot of the input points]

The distance function used by DBScan is the Euclidean distance.

What is the output?

DBScan groups vectors (points) into clusters based on density and the distance between points.

Note that it is normal that DBScan may generate a cluster having fewer than minPts points (this happens if the neighbors of a core point get "stolen" by another cluster).

Note also that DBScan eliminates points that are considered as noise (points having fewer than minPts neighbors within a radius of epsilon).

By running DBScan on the previous input file with minPts = 2 and epsilon = 2, we obtain the following output file containing 4 clusters:

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
[Instance1 1.0 1.0][Instance3 1.0 0.0][Instance2 0.0 1.0]
[Instance14 10.0 1.0][Instance13 9.0 2.0][Instance28 9.0 3.0][Instance29 9.0 4.0][Instance12 8.0 2.0][Instance30 9.0 5.0]
[Instance27 10.0 10.5][Instance26 10.5 11.0][Instance4 11.0 12.0][Instance5 11.0 13.0][Instance21 12.0 13.0][Instance6 13.0 13.0][Instance21 14.0 12.5][Instance22 14.5 11.5][Instance23 15.0 10.5][Instance24 15.0 9.5]
[Instance11 11.0 7.0][Instance18 11.5 8.0][Instance7 12.0 8.5][Instance25 12.0 9.5][Instance8 13.0 8.0][Instance9 13.0 9.0][Instance10 13.0 7.0][Instance20 13.0 10.0]

The output file format is defined as follows. The first few lines indicate the attribute names. Each attribute is specified on a separate line with the keyword "@ATTRIBUTEDEF=" followed by the attribute name (a string). Then, the list of clusters is indicated. Each cluster is specified on a separate line, listing the instances contained in the cluster. An instance is a name followed by a list of double values separated by " " and enclosed between the "[" and "]" characters.

The clusters found by the algorithm can be viewed visually using the "Cluster Viewer" provided in SPMF. If you are using the graphical interface of SPMF, click the checkbox "Cluster Viewer" before pressing the "Run Algorithm" button. The result will be displayed in the Cluster Viewer.

As can be seen in this example, the result makes sense: points that are close to each other have been put in the same clusters. An interesting property of DBScan is that it can find clusters of various shapes.

Applying DBSCAN to time series

Note that the DBScan algorithm implementation in SPMF can also be applied to time series databases such as the file contextSAX.txt in the SPMF distribution. To apply this algorithm to time series, it is necessary to set the "separator" parameter of this algorithm to "," since time series files separate values by "," instead of spaces.

Where can I get more information about DBScan?

DBScan is one of the most famous data mining algorithms for clustering. It is described in almost all data mining books that focus on algorithms, and on many websites. The original article describing DBScan is:

Ester, Martin; Kriegel, Hans-Peter; Sander, Jörg; Xu, Xiaowei (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In: Simoudis, Evangelos; Han, Jiawei; Fayyad, Usama M. (eds.), Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96). AAAI Press. pp. 226–231.

Example 80 : Using Optics to extract a cluster-ordering of points and DB-Scan style clusters

What is OPTICS?

OPTICS is a classic clustering algorithm. It takes as input a set of instances (vectors of double values) and outputs a cluster-ordering of the instances (points), that is, a total order on the set of instances.

This "cluster-ordering" of points can then be used to generate density-based clusters similar to those generated by DBScan.

In the paper describing OPTICS, the authors also proposed other tasks that can be done using the cluster-ordering of points, such as interactive visualization and automatically extracting hierarchical clusters. Those tasks are not implemented here.

Implementation note: to avoid an O(n^2) time complexity, this implementation uses a KD-tree to store points internally.

In this implementation, the user can choose between various distance functions to assess the similarity between vectors. SPMF offers the Euclidean distance, correlation distance, cosine distance, Manhattan distance and Jaccard distance.

How to run this example?

To generate a cluster-ordering of points using OPTICS:

  • If you are using the graphical interface, (1) choose the "OPTICS-cluster-ordering" algorithm, (2) select the input file "inputDBScan2.txt", (3) set the output file name (e.g. "output.txt") (4) set minPts =2 and epsilon = 2, and (5) click "Run algorithm".
    Note that there is also an optional parameter called "separator". It can be used to read other types of input files where values are not separated by spaces. For example, for a file separated by the ',' character, the parameter "separator" would have to be set to ",".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run OPTICS-cluster-ordering inputDBScan2.txt output.txt 2 2 in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestOPTICS_extractClusterOrdering_saveToFile.java" in the package ca.pfv.SPMF.tests.

To generate a DB-Scan style cluster of points using OPTICS:

  • If you are using the graphical interface, (1) choose the "OPTICS-dbscan-clusters" algorithm, (2) select the input file "inputDBScan2.txt", (3) set the output file name (e.g. "output.txt"), (4) set minPts = 2, epsilon = 2 and epsilonPrime = 5, and (5) click "Run algorithm".
    Note that there is also an optional parameter called "separator". It can be used to read other types of input files where values are not separated by spaces. For example, for a file separated by the ',' character, the parameter "separator" would have to be set to ",".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run OPTICS-dbscan-clusters inputDBScan2.txt output.txt 2 2 5 in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestOPTICS_extractDBScan_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is the input?

Optics takes as input (1) a set of instances (points) having a name and containing one or more double values, (2) a parameter minPts (a positive integer >= 1) indicating the number of instances (points) that a core point needs to have in its neighborhood (see the paper about OPTICS for more details), and (3) a radius epsilon that defines the neighborhood of a point. If clusters are generated, an extra parameter named epsilonPrime is also taken as parameter. This latter parameter can be set to the same value as epsilon or a different value (see the paper for details).

The input file format is a text file containing several instances.

The first lines (optional) specify the names of the attributes used for describing the instances. In this example, two attributes are used, named X and Y. But note that more than two attributes could be used. Each attribute is specified on a separate line by the keyword "@ATTRIBUTEDEF=", followed by the attribute name.

Then, each instance is described by two lines. The first line (which is optional) contains the string "@NAME=" followed by the name of the instance. Then, the second line provides a list of double values separated by single spaces.

An example of input is provided in the file "inputDBScan2.txt" of the SPMF distribution. It contains 31 instances, each described by two attributes called X and Y.

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
@NAME=Instance1
1 1
@NAME=Instance2
0 1
@NAME=Instance3
1 0
@NAME=Instance4
11 12
@NAME=Instance5
11 13
@NAME=Instance6
13 13
@NAME=Instance7
12 8.5
@NAME=Instance8
13 8
@NAME=Instance9
13 9
@NAME=Instance10
13 7
@NAME=Instance11
11 7
@NAME=Instance12
8 2
@NAME=Instance13
9 2
@NAME=Instance14
10 1
@NAME=Instance15
7 13
@NAME=Instance16
5 9
@NAME=Instance17
16 16
@NAME=Instance18
11.5 8
@NAME=Instance20
13 10
@NAME=Instance21
12 13
@NAME=Instance21
14 12.5
@NAME=Instance22
14.5 11.5
@NAME=Instance23
15 10.5
@NAME=Instance24
15 9.5
@NAME=Instance25
12 9.5
@NAME=Instance26
10.5 11
@NAME=Instance27
10 10.5
@NAME=Instance28
9 3
@NAME=Instance29
9 4
@NAME=Instance30
9 5

For example, the first instance is named "Instance1", and it is a vector with two values: 1 and 1 for the attributes X and Y, respectively.

This input file represents a set of 2D points. But note that it is possible to use more than two attributes to describe instances. To give a better idea of what the input file looks like, here is a visual representation: [figure: scatter plot of the input points]

The distance function used by Optics is the Euclidean distance.

What is the output?

Optics generates a so-called cluster-ordering of points, which is a list of points with their reachability distances. For example, for minPts = 2 and epsilon = 2, the following cluster ordering is generated, where each line represents an instance (a name and its vector of values) and a reachability distance. Note that a reachability distance equal to "Infinity" means "UNDEFINED" in the original paper.

Cluster orderings
Instance2 0.0 1.0 Infinity
Instance1 1.0 1.0 1.0
Instance3 1.0 0.0 1.0
Instance14 10.0 1.0 Infinity
Instance13 9.0 2.0 1.4142135623730951
Instance28 9.0 3.0 1.0
Instance12 8.0 2.0 1.0
Instance29 9.0 4.0 1.0
Instance30 9.0 5.0 1.0
Instance16 5.0 9.0 Infinity
Instance15 7.0 13.0 Infinity
Instance27 10.0 10.5 Infinity
Instance26 10.5 11.0 0.7071067811865476
Instance4 11.0 12.0 1.118033988749895
Instance5 11.0 13.0 1.0
Instance21 12.0 13.0 1.0
Instance6 13.0 13.0 1.0
Instance21 14.0 12.5 1.118033988749895
Instance22 14.5 11.5 1.118033988749895
Instance23 15.0 10.5 1.118033988749895
Instance24 15.0 9.5 1.0
Instance11 11.0 7.0 Infinity
Instance18 11.5 8.0 1.118033988749895
Instance7 12.0 8.5 0.7071067811865476
Instance25 12.0 9.5 1.0
Instance8 13.0 8.0 1.118033988749895
Instance9 13.0 9.0 1.0
Instance10 13.0 7.0 1.0
Instance20 13.0 10.0 1.0
Instance17 16.0 16.0 Infinity
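The reachability distances listed above follow from two definitions in the OPTICS paper: the core-distance of a point p (roughly, the distance from p to its minPts-th nearest neighbor) and the reachability-distance of a point o with respect to p. A naive sketch of these two notions (not SPMF's KD-tree-based implementation):

```java
// Naive sketch (not the SPMF implementation) of the distances behind the
// values above: core-distance(p) and reachability-distance(o, p), with
// Infinity playing the role of UNDEFINED as in the output shown above.
import java.util.Arrays;

public class OpticsDistances {

    static double euclidean(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return Math.sqrt(s);
    }

    // distance from p to the minPts-th point of its neighborhood (p itself
    // counts, at distance 0), or Infinity (UNDEFINED) if that point lies
    // farther than epsilon, i.e. p is not a core point
    public static double coreDistance(double[][] points, double[] p,
                                      double epsilon, int minPts) {
        double[] dists = new double[points.length];
        for (int i = 0; i < points.length; i++) dists[i] = euclidean(p, points[i]);
        Arrays.sort(dists);
        double d = dists[minPts - 1];
        return d <= epsilon ? d : Double.POSITIVE_INFINITY;
    }

    // reachability-distance(o, p) = max(core-distance(p), dist(p, o))
    public static double reachabilityDistance(double[][] points, double[] o,
                                              double[] p, double epsilon, int minPts) {
        double core = coreDistance(points, p, epsilon, minPts);
        return core == Double.POSITIVE_INFINITY
                ? Double.POSITIVE_INFINITY
                : Math.max(core, euclidean(p, o));
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {1, 0}, {3, 0}};
        System.out.println(coreDistance(points, points[0], 5, 2)); // 1.0
        System.out.println(reachabilityDistance(points, points[2], points[0], 5, 2)); // 3.0
    }
}
```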

The cluster ordering found by Optics can be used to do various things. Among others, it can be used to generate DBScan-style clusters based on the density of points. This feature is implemented in SPMF and is called ExtractDBScanClusters() in the original paper presenting OPTICS. When extracting DBScan clusters, it is possible to specify a different epsilon value than the one used to extract the cluster ordering. This new epsilon value is called "epsilonPrime" (see the paper for details). By extracting clusters with epsilonPrime = 5, we can obtain the four clusters below:

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
[Instance1 1.0 1.0][Instance3 1.0 0.0][Instance2 0.0 1.0]
[Instance14 10.0 1.0][Instance13 9.0 2.0][Instance28 9.0 3.0][Instance12 8.0 2.0][Instance29 9.0 4.0][Instance30 9.0 5.0]
[Instance27 10.0 10.5][Instance26 10.5 11.0][Instance4 11.0 12.0][Instance5 11.0 13.0][Instance21 12.0 13.0][Instance6 13.0 13.0][Instance21 14.0 12.5][Instance22 14.5 11.5][Instance23 15.0 10.5][Instance24 15.0 9.5]
[Instance11 11.0 7.0][Instance18 11.5 8.0][Instance7 12.0 8.5][Instance25 12.0 9.5][Instance8 13.0 8.0][Instance9 13.0 9.0][Instance10 13.0 7.0][Instance20 13.0 10.0]

The output file format is defined as follows. The first few lines indicate the attribute names. Each attribute is specified on a separate line with the keyword "@ATTRIBUTEDEF=" followed by the attribute name (a string). Then, the list of clusters is indicated. Each cluster is specified on a separate line, listing the instances contained in the cluster. An instance is a name followed by a list of double values separated by " " and enclosed between the "[" and "]" characters.

The clusters found by the algorithm can be viewed visually using the "Cluster Viewer" provided in SPMF. If you are using the graphical interface of SPMF, click the checkbox "Cluster Viewer" before pressing the "Run Algorithm" button. The result will be displayed in the Cluster Viewer.

Note that it is normal that OPTICS may generate a cluster having fewer than minPts points in some cases. Note also that OPTICS eliminates points that are considered as noise.


Applying OPTICS to time series

Note that the OPTICS algorithm implementation in SPMF can also be applied to time series databases such as the file contextSAX.txt in the SPMF distribution. To apply this algorithm to time series, it is necessary to set the "separator" parameter of this algorithm to "," since time series files separate values by "," instead of spaces.

Where can I get more information about OPTICS?

OPTICS is a quite popular data mining algorithm. The original paper proposing this algorithm is:

Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, Jörg Sander (1999). OPTICS: Ordering Points To Identify the Clustering Structure. ACM SIGMOD international conference on Management of data. ACM Press. pp. 49–60.

Example 81 : Clustering Values with a Hierarchical Clustering algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "HierarchicalClustering" algorithm, (2) select the input file "inputDBScan2.txt", (3) set the output file name (e.g. "output.txt"), (4) set maxDistance to 4, and (5) click "Run algorithm".
    Note that there is also an optional parameter called "separator" if you want to use other types of input files where values are not separated by spaces. For example, for a file separated by ',', the parameter "separator" would have to be set to ",".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Hierarchical_clustering inputDBScan2.txt output.txt 4 euclidian in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestHierarchicalClustering.java" in the package ca.pfv.SPMF.tests.

What is this algorithm?

We have implemented a hierarchical clustering algorithm based on the description of hierarchical clustering algorithms from
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html. The algorithm separates a set of instances (vectors of double values, each having a name) into groups of instances (clusters) according to their similarity. In this example, the Euclidean distance is used to compute the similarity.

The algorithm works as follows. It first creates a cluster for each single instance (vector). Then, it recursively tries to merge clusters together to create larger clusters. To determine if two clusters can be merged, a constant "threshold" indicates the maximal distance between two clusters for merging.
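
The merge loop described above can be sketched in Java as follows. This is an illustrative sketch only, not the SPMF implementation: the class name and method names are invented for the example, and clusters are merged when the distance between their means is within the threshold, as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of agglomerative clustering with a distance threshold.
// Not the SPMF implementation; names are invented for this example.
public class SimpleHierarchicalClustering {

    // A cluster is a list of instances; an instance is a vector of doubles.
    public static List<List<double[]>> cluster(List<double[]> instances, double maxDistance) {
        // Step 1: create one cluster per instance.
        List<List<double[]>> clusters = new ArrayList<>();
        for (double[] inst : instances) {
            List<double[]> c = new ArrayList<>();
            c.add(inst);
            clusters.add(c);
        }
        // Step 2: repeatedly merge two clusters whose mean-to-mean
        // distance is no greater than maxDistance, until no merge is possible.
        boolean merged = true;
        while (merged) {
            merged = false;
            outer:
            for (int i = 0; i < clusters.size(); i++) {
                for (int j = i + 1; j < clusters.size(); j++) {
                    double d = euclidean(mean(clusters.get(i)), mean(clusters.get(j)));
                    if (d <= maxDistance) {
                        clusters.get(i).addAll(clusters.remove(j));
                        merged = true;
                        break outer;
                    }
                }
            }
        }
        return clusters;
    }

    // Mean vector of a cluster (component-wise average).
    static double[] mean(List<double[]> cluster) {
        double[] m = new double[cluster.get(0).length];
        for (double[] v : cluster)
            for (int k = 0; k < m.length; k++) m[k] += v[k];
        for (int k = 0; k < m.length; k++) m[k] /= cluster.size();
        return m;
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int k = 0; k < a.length; k++) sum += (a[k] - b[k]) * (a[k] - b[k]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        List<double[]> pts = new ArrayList<>();
        pts.add(new double[]{1, 1});
        pts.add(new double[]{0, 1});
        pts.add(new double[]{11, 12});
        // The two nearby points merge; the distant point stays alone.
        System.out.println(cluster(pts, 4).size()); // 2
    }
}
```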

In this implementation, the user can choose between various distance functions to assess the similarity between vectors. SPMF offers the Euclidean distance, correlation distance, cosine distance, Manhattan distance and Jaccard distance.

What is the input?

The input is a set of instances (each having a name and containing a vector of double values), a parameter "maxDistance" and a distance function.

The input file format is a text file containing several instances.

The first lines (optional) specify the names of the attributes used for describing the instances. In this example, two attributes are used, named X and Y, but note that more than two attributes could be used. Each attribute is specified on a separate line by the keyword "@ATTRIBUTEDEF=", followed by the attribute name.

Then, each instance is described by two lines. The first line (which is optional) contains the string "@NAME=" followed by the name of the instance. Then, the second line provides a list of double values separated by single spaces.

An example of input is provided in the file "inputDBScan2.txt" of the SPMF distribution. It contains 31 instances, each described by two attributes called X and Y.

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
@NAME=Instance1
1 1
@NAME=Instance2
0 1
@NAME=Instance3
1 0
@NAME=Instance4
11 12
@NAME=Instance5
11 13
@NAME=Instance6
13 13
@NAME=Instance7
12 8.5
@NAME=Instance8
13 8
@NAME=Instance9
13 9
@NAME=Instance10
13 7
@NAME=Instance11
11 7
@NAME=Instance12
8 2
@NAME=Instance13
9 2
@NAME=Instance14
10 1
@NAME=Instance15
7 13
@NAME=Instance16
5 9
@NAME=Instance17
16 16
@NAME=Instance18
11.5 8
@NAME=Instance20
13 10
@NAME=Instance21
12 13
@NAME=Instance21
14 12.5
@NAME=Instance22
14.5 11.5
@NAME=Instance23
15 10.5
@NAME=Instance24
15 9.5
@NAME=Instance25
12 9.5
@NAME=Instance26
10.5 11
@NAME=Instance27
10 10.5
@NAME=Instance28
9 3
@NAME=Instance29
9 4
@NAME=Instance30
9 5

For example, the first instance is named "Instance1", and it is a vector with two values: 1 and 1 for the attributes X and Y, respectively.
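
A file in this format can be read with a short Java sketch. This is illustrative only (SPMF has its own reader classes); the class name and method are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative reader for the instance file format described above:
// optional @ATTRIBUTEDEF= lines, then (@NAME= line, values line) pairs.
// Not the SPMF source code.
public class InstanceFileReader {

    public static Map<String, double[]> parse(List<String> lines) {
        Map<String, double[]> instances = new LinkedHashMap<>();
        String name = null;
        int unnamed = 0;
        for (String line : lines) {
            line = line.trim();
            // Skip blank lines and attribute definitions.
            if (line.isEmpty() || line.startsWith("@ATTRIBUTEDEF=")) continue;
            if (line.startsWith("@NAME=")) {
                name = line.substring("@NAME=".length());
            } else {
                // A values line: doubles separated by single spaces.
                String[] tokens = line.split(" ");
                double[] values = new double[tokens.length];
                for (int i = 0; i < tokens.length; i++) values[i] = Double.parseDouble(tokens[i]);
                // The @NAME= line is optional, so generate a name if missing.
                instances.put(name != null ? name : "Instance" + (++unnamed), values);
                name = null;
            }
        }
        return instances;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "@ATTRIBUTEDEF=X", "@ATTRIBUTEDEF=Y",
            "@NAME=Instance1", "1 1",
            "@NAME=Instance2", "0 1");
        System.out.println(parse(lines).size()); // 2
    }
}
```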

This input file represents a set of 2D points. But note that, it is possible to use more than two attributes to describe instances. To give a better idea of what the input file looks like, here is a visual representation:

The algorithm will group the instances according to their similarity. To do this, it is also necessary to specify the distance function to be used for comparing the instances. The distance function can be the Euclidean distance, correlation distance, cosine distance, Manhattan distance or Jaccard distance. In the command line or GUI of SPMF, the distance function is specified as a parameter by using one of these keywords: "euclidian", "correlation", "cosine", "manathan" and "jaccard". In this example, the Euclidean distance is used.
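
For illustration, two of these distance functions can be sketched in Java (an illustrative sketch, not the SPMF source code; the class name is invented):

```java
// Illustrative implementations of two of the distance functions mentioned
// above. Not the SPMF source code.
public class Distances {

    // Euclidean distance: square root of the sum of squared differences.
    public static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    // Manhattan distance: sum of absolute differences.
    public static double manhattan(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    public static void main(String[] args) {
        double[] p = {1, 1}, q = {4, 5};
        System.out.println(euclidean(p, q)); // 5.0
        System.out.println(manhattan(p, q)); // 7.0
    }
}
```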

Furthermore, the user should also provide a parameter called maxDistance (a value > 0) to the algorithm. This parameter indicates the maximal distance allowed between the means of two clusters for merging them into a single cluster.

What is the output?

The algorithm groups instances into clusters according to their similarity. In SPMF, the similarity is defined according to the distance function chosen by the user, such as the Euclidean distance.

By running the algorithm on the previous input file with maxDistance = 4, we obtain the following output file containing 6 clusters:

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
[Instance1 1.0 1.0][Instance2 0.0 1.0][Instance3 1.0 0.0]
[Instance4 11.0 12.0][Instance5 11.0 13.0][Instance6 13.0 13.0][Instance21 12.0 13.0][Instance26 10.5 11.0][Instance27 10.0 10.5][Instance7 12.0 8.5][Instance18 11.5 8.0][Instance8 13.0 8.0][Instance9 13.0 9.0][Instance20 13.0 10.0][Instance25 12.0 9.5][Instance10 13.0 7.0][Instance11 11.0 7.0][Instance21 14.0 12.5][Instance22 14.5 11.5][Instance23 15.0 10.5][Instance24 15.0 9.5]
[Instance12 8.0 2.0][Instance13 9.0 2.0][Instance14 10.0 1.0][Instance28 9.0 3.0][Instance29 9.0 4.0][Instance30 9.0 5.0]
[Instance15 7.0 13.0]
[Instance16 5.0 9.0]
[Instance17 16.0 16.0]

The output file format is defined as follows. The first few lines indicate the attribute names. Each attribute is specified on a separate line with the keyword "@ATTRIBUTEDEF=" followed by the attribute name (a string). Then, the list of clusters is given. Each cluster is specified on a separate line, listing the instances contained in the cluster. An instance is a name followed by a list of double values separated by " " and enclosed between the "[" and "]" characters.

The clusters found by the algorithm can be viewed visually using the "Cluster Viewer" provided in SPMF. If you are using the graphical interface of SPMF, click the checkbox "Cluster Viewer" before pressing the "Run Algorithm" button. The result will be displayed in the Cluster Viewer.

As can be seen in this example, the result makes sense: points that are close to each other have been put in the same clusters.

Applying this algorithm to time series

Note that this algorithm implementation in SPMF can also be applied to time series database such as the file contextSAX.txt in the SPMF distribution. To apply this algorithm to time series, it is necessary to set the "separator" parameter of this algorithm to "," since time series files separate values by "," instead of separating by spaces.

Where can I get more information about Hierarchical clustering?

There is a good introduction here:
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/hierarchical.html.

Moreover, you could also read the free chapter on clustering from the book "Introduction to Data Mining" by Tan, Steinbach and Kumar, available on the book's website.

Example 82 : Visualizing clusters of points using the Cluster Viewer

How to run this example?

  • If you are using the graphical interface, (1) choose the "Vizualize_clusters_of_instances" algorithm, (2) select the input file "clustersDBScan.txt", and (3) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Vizualize_clusters_of_instances clustersDBScan.txt
    in a folder containing spmf.jar and the example input file clustersDBScan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestClusterViewerFile.java" in the package ca.pfv.SPMF.tests.

What is the Cluster Viewer?

The Cluster Viewer is a tool offered in SPMF for visualizing a set of clusters using a chart. The Cluster Viewer provides some basic functions like zooming in, zooming out, printing, and saving the picture as an image. It is useful for visualizing the clusters found by clustering algorithms such as DBScan, K-Means and others.

What is the input of the Cluster Viewer?

The input is one or more clusters. A cluster is a list of instances. An instance is here a list of floating-point decimal numbers (a vector of double values).

Clusters are produced by clustering algorithms such as K-Means and DBScan. An example of clusters found by the DBScan algorithm is the following:

Cluster Data points
Cluster1 (1, 0), (1, 1), (0, 1)
Cluster2 (10, 10), (10, 13), (13, 13)
Cluster3 (54, 54), (57, 55), (55, 55)

This example set of clusters is provided in the file clustersDBScan.txt of the SPMF distribution.

What is the result of running the Cluster Viewer?

Running the Cluster Viewer will display the clusters visually. For example, for the above clusters, the clusters will be displayed as follows (note that this may vary depending on your version of SPMF).

 

Input file format

The input file format used by the cluster viewer is defined as follows. It is a text file.

The text file first defines the attributes used to describe the instances that have been clustered. An attribute is defined using the keyword "@ATTRIBUTEDEF=" followed by an attribute name, which is a string. Each attribute is defined on a separate line.

Then, the list of clusters is given; each cluster is specified on a separate line. For each cluster, the list of instances contained in the cluster is specified. An instance is a name followed by a list of double values separated by " " and enclosed between the "[" and "]" characters.

For instance, the input file for this example is the following:

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
[Instance2 1.0 0.0][Instance0 1.0 1.0][Instance1 0.0 1.0]
[Instance3 10.0 10.0][Instance4 10.0 13.0][Instance5 13.0 13.0]
[Instance6 54.0 54.0][Instance9 57.0 55.0][Instance7 55.0 55.0]

It indicates that there are two attributes named "X" and "Y" and that there are three clusters. The first cluster contains three instances: (1, 0), (1, 1) and (0, 1).
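
One cluster line of this format can be parsed with a short Java sketch (illustrative only; SPMF has its own parser, and the class name here is invented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative parser for one cluster line of the format described above,
// e.g. "[Instance1 1.0 1.0][Instance2 0.0 1.0]". Not the SPMF source code.
public class ClusterLineParser {

    // Matches one bracketed instance: everything between "[" and "]".
    private static final Pattern INSTANCE = Pattern.compile("\\[([^\\]]+)\\]");

    // Returns the instance names found on one cluster line.
    public static List<String> instanceNames(String line) {
        List<String> names = new ArrayList<>();
        Matcher m = INSTANCE.matcher(line);
        while (m.find()) {
            // Inside the brackets: a name followed by double values.
            String[] tokens = m.group(1).split(" ");
            names.add(tokens[0]);
        }
        return names;
    }

    public static void main(String[] args) {
        String line = "[Instance2 1.0 0.0][Instance0 1.0 1.0][Instance1 0.0 1.0]";
        System.out.println(instanceNames(line)); // [Instance2, Instance0, Instance1]
    }
}
```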

Implementation details

The Cluster Viewer has been implemented by reusing and extending some code provided by Yuriy Guskov under the MIT License for displaying charts.

Example 83 : Visualizing instances using the Instance Viewer

How to run this example?

  • If you are using the graphical interface, (1) choose the "Vizualize_clusters_of_instances" algorithm, (2) select the input file "inputDBScan2.txt", and (3) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run Vizualize_clusters_of_instances inputDBScan2.txt
    in a folder containing spmf.jar and the example input file inputDBScan2.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestInstanceViewerFile.java" in the package ca.pfv.SPMF.tests.

What is the Instance Viewer?

The Instance Viewer is a tool offered in SPMF for visualizing a set of instances used as input for clustering algorithms. The Instance Viewer provides some basic functions like zooming in, zooming out, printing, and saving the picture as an image. It is useful for visualizing the instances that will be given to a clustering algorithm as input. Visualizing instances can help to decide which algorithm should then be applied.

What is the input of the Instance Viewer?

The input is a file containing several instances. The input file format is defined as follows.

The first lines (optional) specify the names of the attributes used for describing the instances. In this example, two attributes are used, named X and Y, but note that more than two attributes could be used. Each attribute is specified on a separate line by the keyword "@ATTRIBUTEDEF=", followed by the attribute name.

Then, each instance is described by two lines. The first line (which is optional) contains the string "@NAME=" followed by the name of the instance. Then, the second line provides a list of double values separated by single spaces.

An example of input is provided in the file "inputDBScan2.txt" of the SPMF distribution. It contains 31 instances, each described by two attributes called X and Y.

@ATTRIBUTEDEF=X
@ATTRIBUTEDEF=Y
@NAME=Instance1
1 1
@NAME=Instance2
0 1
@NAME=Instance3
1 0
@NAME=Instance4
11 12
@NAME=Instance5
11 13
@NAME=Instance6
13 13
@NAME=Instance7
12 8.5
@NAME=Instance8
13 8
@NAME=Instance9
13 9
@NAME=Instance10
13 7
@NAME=Instance11
11 7
@NAME=Instance12
8 2
@NAME=Instance13
9 2
@NAME=Instance14
10 1
@NAME=Instance15
7 13
@NAME=Instance16
5 9
@NAME=Instance17
16 16
@NAME=Instance18
11.5 8
@NAME=Instance20
13 10
@NAME=Instance21
12 13
@NAME=Instance21
14 12.5
@NAME=Instance22
14.5 11.5
@NAME=Instance23
15 10.5
@NAME=Instance24
15 9.5
@NAME=Instance25
12 9.5
@NAME=Instance26
10.5 11
@NAME=Instance27
10 10.5
@NAME=Instance28
9 3
@NAME=Instance29
9 4
@NAME=Instance30
9 5

For example, the first instance is named "Instance1", and it is a vector with two values: 1 and 1 for the attributes X and Y, respectively.

What is the result of running the Instance Viewer?

Running the Instance Viewer will display the instances visually. For example, for the above instances, the instances will be displayed as follows (note that this may vary depending on your version of SPMF).

 

Implementation details

The Instance Viewer has been implemented by reusing and extending some code provided by Yuriy Guskov under the MIT License for displaying charts.

Example 84 : Mining Frequent Sequential Patterns Using the PrefixSpan Algorithm

How to run this example?

To run the implementation of PrefixSpan by P. Fournier-Viger (PFV):

  • If you are using the graphical interface, (1) choose the "PrefixSpan" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run PrefixSpan contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestPrefixSpan_saveToMemory.java" in the package ca.pfv.SPMF.tests.

To run the implementation of PrefixSpan by A. Gomariz Peñalver (AGP):

  • If you are using the graphical interface, (1) choose the "PrefixSpan_AGP" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run PrefixSpan_AGP contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestPrefixSpan_AGP_saveToMemory.java" in the package ca.pfv.SPMF.tests

What is PrefixSpan?

PrefixSpan is an algorithm for discovering sequential patterns in sequence databases, proposed by Pei et al. (2001).

What is the input of PrefixSpan?

The input of PrefixSpan is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage).

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of distinct items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It means that item 1 was followed by items 1, 2 and 3 appearing together, which were followed by 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items in an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of PrefixSpan?

PrefixSpan discovers all frequent sequential patterns occurring in a sequence database (subsequences that occur in no less than minsup of the sequences of the database).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2... Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2... Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run PrefixSpan with minsup = 50% and a maximum pattern length of 100 items, 53 sequential patterns are found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50%). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and third sequences (it thus has a support of 50%). It also has a length of 3 because it contains 3 items.
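
The containment and support definitions above can be checked with a short Java sketch. This is illustrative only, not the SPMF implementation; the class and method names are invented, and the pattern "(1,2),(6)" from the example is used.

```java
import java.util.List;
import java.util.Set;

// Illustrative check of the containment relation defined above: a sequence SA
// occurs in SB if its itemsets can be mapped, in order, to itemsets of SB
// that contain them. Not the SPMF implementation.
public class SequenceContainment {

    public static boolean occursIn(List<Set<Integer>> sa, List<Set<Integer>> sb) {
        int i = 0; // index of the next itemset of SA to match
        for (Set<Integer> y : sb) {
            // Greedily match SA's next itemset against the earliest
            // itemset of SB that contains it.
            if (i < sa.size() && y.containsAll(sa.get(i))) i++;
        }
        return i == sa.size();
    }

    // Support as defined above: fraction of the database's sequences
    // in which the pattern occurs.
    public static double support(List<Set<Integer>> pattern, List<List<Set<Integer>>> database) {
        int count = 0;
        for (List<Set<Integer>> seq : database)
            if (occursIn(pattern, seq)) count++;
        return (double) count / database.size();
    }

    public static void main(String[] args) {
        // S1 = (1), (1 2 3), (1 3), (4), (3 6)
        List<Set<Integer>> s1 = List.of(Set.of(1), Set.of(1, 2, 3),
                Set.of(1, 3), Set.of(4), Set.of(3, 6));
        // Pattern "(1,2),(6)" from the example above.
        List<Set<Integer>> pattern = List.of(Set.of(1, 2), Set.of(6));
        System.out.println(occursIn(pattern, s1)); // true
    }
}
```

Running support over the four sequences of the example database gives 0.5 for this pattern, matching the 50% support stated above.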

Optional parameter(s)

The PrefixSpan implementation allows the user to specify additional optional parameter(s):

  • "maximum pattern length" allows the user to specify the maximum number of items that patterns found should contain.
  • "show sequences ids?" (true/false) This parameter allows the user to specify that the ids of the sequences containing a pattern should be output for each pattern found. For example, if the parameter is set to true, each pattern in the output file will be followed by the keyword #SID, followed by a list of sequence ids (integers separated by spaces). For example, a line terminated by "#SID: 0 2" means that the pattern on this line appears in the first and the third sequences of the sequence database (the sequences with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestPrefixSpan ... .java" provided in the source code of SPMF.

The parameter(s) can be also used in the command line with the Jar file. If you want to use these optional parameter(s) in the command line, it can be done as follows. Consider this example:
java -jar spmf.jar run PrefixSpan contextPrefixSpan.txt output.txt 50% 5 true
This command means to apply the algorithm on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 50%, and patterns must have a maximum length of 5 items, and sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer and items from the same itemset within a sequence are separated by single space. Note that it is assumed that items within a same itemset are sorted according to a total order and that no item can appear twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.
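
A line in this format can be decoded with a short Java sketch (illustrative only; SPMF has its own parser, and the class name is invented):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative parser for one line of the input format described above,
// where "-1" ends an itemset and "-2" ends the sequence.
// Not the SPMF source code.
public class SequenceLineParser {

    public static List<Set<Integer>> parse(String line) {
        List<Set<Integer>> sequence = new ArrayList<>();
        Set<Integer> itemset = new LinkedHashSet<>();
        for (String token : line.trim().split(" ")) {
            int value = Integer.parseInt(token);
            if (value == -2) break;            // end of the sequence
            if (value == -1) {                 // end of the current itemset
                sequence.add(itemset);
                itemset = new LinkedHashSet<>();
            } else {
                itemset.add(value);            // an item
            }
        }
        return sequence;
    }

    public static void main(String[] args) {
        List<Set<Integer>> s = parse("1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2");
        System.out.println(s.size()); // 5 itemsets, as in the first line above
    }
}
```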

Note that it is also possible to use a text file containing a text (several sentences) if the text file has the ".text" extension, as an alternative to the default input format. If the algorithm is applied on a text file from the graphical interface or command line interface, the text file will be automatically converted to the SPMF format, by dividing the text into sentences separated by ".", "?" and "!", where each word is considered as an item. Note that when a text file is used as input of a data mining algorithm, the performance will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword "#SUP:" appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences.
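
An output line can be split into its pattern part and its support with a short Java sketch (illustrative only; the class name is invented):

```java
// Illustrative parser for one output line such as "2 3 -1 1 -1 #SUP: 2",
// splitting it into the pattern part and the support count.
// Not the SPMF source code.
public class OutputLineParser {

    // The integer after the "#SUP:" keyword: the support as a number of sequences.
    public static int support(String line) {
        String[] parts = line.split("#SUP:");
        return Integer.parseInt(parts[1].trim());
    }

    // Everything before "#SUP:": the sequential pattern itself.
    public static String pattern(String line) {
        return line.split("#SUP:")[0].trim();
    }

    public static void main(String[] args) {
        String line = "2 3 -1 1 -1 #SUP: 2";
        System.out.println(pattern(line)); // 2 3 -1 1 -1
        System.out.println(support(line)); // 2
    }
}
```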

Performance

PrefixSpan is one of the fastest sequential pattern mining algorithms. However, the SPAM and SPADE implementations in SPMF can be faster than PrefixSpan (see the "performance" section of the website for a performance comparison).

Implementation details

Note that in the source code, we also provide examples of how to keep the result in memory instead of saving it to a file. This can be useful if the algorithms are integrated into another Java software.

  • For the AGP version of PrefixSpan, the file MainTestPrefixSpan_AGP_saveToMemory shows how to run the algorithm and keep the result in memory.
  • For the PFV version, the file MainTestPrefixSpan_saveToMemory.java shows how to run the algorithm and keep the result in memory.

Note also that in the source code, there is a version of PrefixSpan based on the PFV version that takes as input a dataset with strings instead of integers. It can be run by using the files MainTestPrefixSpan_WithStrings_saveToMemory.java and MainTestPrefixSpanWithStrings_saveToFile.java. In the graphical user interface version of SPMF, it is possible to use the version of PrefixSpan that uses strings instead of integers by selecting "PrefixSpan with strings" and to test it with the input file contextPrefixSpanStrings.txt. The version of PrefixSpan with strings was made to temporarily accommodate the needs of some users of SPMF. In the future, it may be replaced by a more general mechanism for using files with strings for all algorithms.

Where can I get more information about PrefixSpan?

The PrefixSpan algorithm is described in this article:

J. Pei, J. Han, B. Mortazavi-Asl, J. Wang, H. Pinto, Q. Chen, U. Dayal, M. Hsu: Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach. IEEE Trans. Knowl. Data Eng. 16(11): 1424-1440 (2004)

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 85 : Mining Frequent Sequential Patterns Using the GSP Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "GSP" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50%, and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run GSP contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestGSP_saveToMemory.java" in the package ca.pfv.SPMF.tests.

What is GSP?

GSP is one of the first algorithms for discovering sequential patterns in sequence databases, proposed by Srikant & Agrawal (1996). It uses an Apriori-like approach for discovering sequential patterns. Note that this version does not include the constraints proposed in the article.

What is the input of GSP?

The input of GSP is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage). Moreover, the implementation in SPMF adds another parameter, which is the maximum sequential pattern length in terms of items.

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of distinct items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It means that item 1 was followed by items 1, 2 and 3 appearing together, which were followed by 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items in an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of GSP?

GSP discovers all frequent sequential patterns occurring in a sequence database (subsequences that occur in no less than minsup of the sequences of the database).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2... Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2... Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run GSP with minsup = 50% and a maximum pattern length of 100 items, 53 sequential patterns are found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50%). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and third sequences (it thus has a support of 50%). It also has a length of 3 because it contains 3 items.

Optional parameter(s)

The GSP implementation allows the user to specify additional optional parameter(s):

  • "show sequences ids?" (true/false) This parameter allows the user to specify that the ids of the sequences containing a pattern should be output for each pattern found. For example, if the parameter is set to true, each pattern in the output file will be followed by the keyword #SID, followed by a list of sequence ids (integers separated by spaces). For example, a line terminated by "#SID: 0 2" means that the pattern on this line appears in the first and the third sequences of the sequence database (the sequences with ids 0 and 2).

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestGSP ... .java" provided in the source code of SPMF.

The parameter(s) can be also used in the command line with the Jar file. If you want to use these optional parameter(s) in the command line, it can be done as follows. Consider this example:
java -jar spmf.jar run GSP contextPrefixSpan.txt output.txt 50% 5 true
This command means to apply the algorithm on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 50%, that patterns must have a maximum length of 5 items, and that sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer and items from the same itemset within a sequence are separated by single space. Note that it is assumed that items within a same itemset are sorted according to a total order and that no item can appear twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that it is also possible to use a text file containing a text (several sentences) if the text file has the ".text" extension, as an alternative to the default input format. If the algorithm is applied on a text file from the graphical interface or command line interface, the text file will be automatically converted to the SPMF format, by dividing the text into sentences separated by ".", "?" and "!", where each word is considered as an item. Note that when a text file is used as input of a data mining algorithm, the performance will be slightly less than if the native SPMF file format is used because a conversion of the input file will be automatically performed before launching the algorithm and the result will also have to be converted. This cost however should be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences. The next lines follow the same format.

Performance

See the "performance" section of the website for a performance comparison with other sequential pattern mining algorithm.

Implementation details

The implementation is faithful to the article, except that the gap constraints and window constraints are currently not implemented (will be considered in future versions of SPMF).

Note also that in the source code, we provide an example of how to run GSP and keep the result in memory instead of saving it to a file ("MainTestGSP_saveToMemory.java"). This can be useful if the algorithm is integrated into another Java software.

Where can I get more information about GSP?

The GSP algorithm is described in this article:

R. Srikant and R. Agrawal. 1996. Mining Sequential Patterns: Generalizations and Performance Improvements. In Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology (EDBT '96), Peter M. G. Apers, Mokrane Bouzeghoub, and Georges Gardarin (Eds.). Springer-Verlag, London, UK, UK, 3-17.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 86: Mining Frequent Sequential Patterns Using the SPADE Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "SPADE" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run SPADE contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestSPADE_AGP_FatBitMap_saveToFile.java" in the package ca.pfv.SPMF.tests

What is SPADE?

SPADE is a popular sequential pattern mining algorithm proposed by Zaki (2001).

What is the input of SPADE?

The input of SPADE is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage). Moreover, the implementation in SPMF adds another parameter, which is the maximum sequential pattern length in terms of items.

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of distinct items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It indicates that item 1 appeared, followed by items 1, 2 and 3 appearing at the same time, followed by items 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items in an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of SPADE?

SPADE discovers all frequent sequential patterns occurring in a sequence database (i.e., subsequences that occur in at least minsup of the sequences of the database).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2, ... Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2, ... Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.
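This definition can be sketched in a few lines of Java (illustrative only, not SPMF code). Greedily matching each itemset of the pattern to the earliest remaining itemset of the sequence that contains it is sufficient to decide occurrence:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only (not SPMF code): checks whether a sequential pattern
// occurs in a sequence according to the definition above.
public class OccurrenceCheck {

    public static Set<Integer> itemset(Integer... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    public static boolean occursIn(List<Set<Integer>> pattern, List<Set<Integer>> sequence) {
        int pos = 0; // next position of the sequence to try
        for (Set<Integer> x : pattern) {
            while (pos < sequence.size() && !sequence.get(pos).containsAll(x)) {
                pos++;
            }
            if (pos == sequence.size()) return false; // no itemset left containing x
            pos++; // positions i1 < i2 < ... must be strictly increasing
        }
        return true;
    }

    public static void main(String[] args) {
        // S3 = (5 6), (1 2), (4 6), (3), (2) from the example database
        List<Set<Integer>> s3 = Arrays.asList(itemset(5, 6), itemset(1, 2),
                itemset(4, 6), itemset(3), itemset(2));
        // the pattern (1,2),(6): itemset {1, 2} followed by itemset {6}
        List<Set<Integer>> pattern = Arrays.asList(itemset(1, 2), itemset(6));
        System.out.println(occursIn(pattern, s3)); // true
    }
}
```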

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run SPADE with minsup = 50% and a maximum pattern length of 100 items, 53 sequential patterns are found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50%). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and third sequences (it thus has a support of 50%) and also has a length of 3 because it contains three items.

Optional parameter(s)

The SPADE implementation allows the user to specify the following additional optional parameter(s):

  • "show sequences ids?" (true/false) allows specifying that the ids of the sequences containing a pattern should be output for each pattern found. If the parameter is set to true, each pattern in the output file will be followed by the keyword #SID and a list of sequence ids (integers separated by spaces). For example, a line terminated by "#SID: 1 3" means that the pattern on this line appears in the first and the third sequences of the sequence database.

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestSPADE ... .java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the Jar file. Consider this example:
java -jar spmf.jar run SPADE contextPrefixSpan.txt output.txt 50% true
This command means to apply the algorithm on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 50%, and sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer, and items from the same itemset within a sequence are separated by single spaces. It is assumed that items within the same itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.
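As an illustration, the input format just described can be read with a few lines of Java (an illustrative sketch, not SPMF's actual parser; the class name is hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not SPMF code): reads one line of the input
// format, where "-1" closes an itemset and "-2" closes the sequence.
public class InputLineParser {

    public static List<List<Integer>> parseSequence(String line) {
        List<List<Integer>> sequence = new ArrayList<>();
        List<Integer> itemset = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (token.equals("-2")) break;       // end of the sequence
            if (token.equals("-1")) {            // end of the current itemset
                sequence.add(itemset);
                itemset = new ArrayList<>();
            } else {
                itemset.add(Integer.parseInt(token));
            }
        }
        return sequence;
    }

    public static void main(String[] args) {
        List<List<Integer>> s1 = parseSequence("1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2");
        System.out.println(s1.size()); // 5 itemsets
        System.out.println(s1);        // [[1], [1, 2, 3], [1, 3], [4], [3, 6]]
    }
}
```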

Note that it is also possible to use a text file containing text (several sentences) as an alternative to the default input format, provided that the file has the ".text" extension. If the algorithm is applied to such a text file from the graphical interface or command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered as an item. Note that when a text file is used as input of a data mining algorithm, performance will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result must also be converted back. This cost should however be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences. The next lines follow the same format.

Performance

See the "performance" section of the website for a performance comparison with other sequential pattern mining algorithms.

Implementation details

In the source code, we also provide examples of how to keep the result in memory instead of saving it to a file. This can be useful if the algorithms are integrated into another Java program. The examples that save the result in memory follow the naming convention "MainTest..._saveToMemory".

Also note that in the source code, there are three variations of the SPADE implementation that try different ways to perform the join of IdLists. The fastest implementation is the one named "FatBitMap". It is the one offered in the graphical user interface.

  • "MainTestSPADE_AGP_BitMap_saveToFile.java"
  • "MainTestSPADE_AGP_BitMap_saveToMemory.java"
  • "MainTestSPADE_AGP_EntryList_saveToFile.java"
  • "MainTestSPADE_AGP_EntryList_saveToMemory.java"
  • "MainTestSPADE_AGP_FatBitMap_saveToFile.java"
  • "MainTestSPADE_AGP_FatBitMap_saveToMemory.java"

Lastly, in the source code, a parallelized version of SPADE is also offered:

  • "MainTestSPADE_AGP_Parallelized_BitMap_saveToFile.java"
  • "MainTestSPADE_AGP_Parallelized_BitMap_saveToMemory.java"
  • "MainTestSPADE_AGP_Parallelized_EntryList_saveToFile.java"
  • "MainTestSPADE_AGP_Parallelized_EntryList_saveToMemory.java"
  • "MainTestSPADE_AGP_Parallelized_FatBitMap_saveToFile.java"
  • "MainTestSPADE_AGP_Parallelized_FatBitMap_saveToMemory.java"

Besides, note that an alternative input file contextSPADE.txt is provided. It contains the example used in the article proposing SPADE.

Where can I get more information about SPADE?

The SPADE algorithm is described in this article:

Mohammed J. Zaki. 2001. SPADE: An Efficient Algorithm for Mining Frequent Sequences. Mach. Learn. 42, 1-2 (January 2001), 31-60. DOI=10.1023/A:1007652502315 http://dx.doi.org/10.1023/A:1007652502315

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 87: Mining Frequent Sequential Patterns Using the CM-SPADE Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CM-SPADE" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CM-SPADE contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCMSPADE_saveToFile.java" in the package ca.pfv.SPMF.tests

What is CM-SPADE?

CM-SPADE is a sequential pattern mining algorithm based on the SPADE algorithm.

The main difference is that CM-SPADE utilizes a new technique named co-occurrence pruning to prune the search space, which makes it faster.

What is the input of CM-SPADE?

The input of CM-SPADE is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage). Moreover, the implementation in SPMF adds another parameter, which is the maximum sequential pattern length in terms of items.

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of distinct items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It indicates that item 1 appeared, followed by items 1, 2 and 3 appearing at the same time, followed by items 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items in an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of CM-SPADE?

CM-SPADE discovers all frequent sequential patterns occurring in a sequence database (i.e., subsequences that occur in at least minsup of the sequences of the database).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2, ... Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2, ... Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run CM-SPADE with minsup = 50% and a maximum pattern length of 100 items, 53 sequential patterns are found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50%). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and third sequences (it thus has a support of 50%) and also has a length of 3 because it contains three items.

Optional parameter(s)

The CM-SPADE implementation allows the user to specify the following additional optional parameter(s):

  • "show sequences ids?" (true/false) allows specifying that the ids of the sequences containing a pattern should be output for each pattern found. If the parameter is set to true, each pattern in the output file will be followed by the keyword #SID and a list of sequence ids (integers separated by spaces). For example, a line terminated by "#SID: 1 3" means that the pattern on this line appears in the first and the third sequences of the sequence database.

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestCMSPADE ... .java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the Jar file. Consider this example:
java -jar spmf.jar run CM-SPADE contextPrefixSpan.txt output.txt 50% true
This command means to apply the algorithm on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 50%, and sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer, and items from the same itemset within a sequence are separated by single spaces. It is assumed that items within the same itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that it is also possible to use a text file containing text (several sentences) as an alternative to the default input format, provided that the file has the ".text" extension. If the algorithm is applied to such a text file from the graphical interface or command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered as an item. Note that when a text file is used as input of a data mining algorithm, performance will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result must also be converted back. This cost should however be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences. The next lines follow the same format.

Performance

CM-SPADE is faster than SPADE and is also the best-performing sequential pattern mining algorithm in SPMF, according to the experiments reported in the CM-SPADE paper.

Implementation details

In the source code, we also provide an example of how to keep the result in memory instead of saving it to a file. This can be useful if the algorithm is integrated into another Java program. The example that saves the result in memory is "MainTestCMSPADE_saveToMemory.java".

Where can I get more information about CM-SPADE?

The CM-SPADE algorithm is described in this article:

Fournier-Viger, P., Gomariz, A., Campos, M., Thomas, R. (2014). Fast Vertical Mining of Sequential Patterns Using Co-occurrence Information. Proc. 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2014), Part 1, Springer, LNAI, 8443. pp. 40-52.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 88: Mining Frequent Sequential Patterns Using the SPAM Algorithm

How to run this example?

To run the implementation of SPAM by P. Fournier-Viger (PFV):

  • If you are using the graphical interface, (1) choose the "SPAM" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and maximum pattern length = 100, (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run SPAM contextPrefixSpan.txt output.txt 50% 100 in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestSPAM.java" in the package ca.pfv.SPMF.tests.

To run the implementation of SPAM by A. Gomariz Peñalver (AGP):

  • If you are using the graphical interface, (1) choose the "SPAM_AGP" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run SPAM_AGP contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestSPAM_AGP_FatBitMap_saveToFile.java" in the package ca.pfv.SPMF.tests. (other variations are also available in the source code)

What is SPAM?

SPAM is an algorithm for discovering frequent sequential patterns in a sequence database. It was proposed by Ayres et al. (2002).

What is the input of SPAM?

The input of SPAM is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage). Moreover, the implementation in SPMF adds another parameter, which is the maximum sequential pattern length in terms of items.

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of distinct items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It indicates that item 1 appeared, followed by items 1, 2 and 3 appearing at the same time, followed by items 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items in an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of SPAM?

SPAM discovers all frequent sequential patterns occurring in a sequence database (i.e., subsequences that occur in at least minsup of the sequences of the database).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2, ... Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2, ... Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run SPAM with minsup = 50%, 53 sequential patterns will be found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50%). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and third sequences (it thus has a support of 50%) and also has a length of 3 because it contains three items.

Optional parameters

The SPAM implementation allows the user to specify four optional parameters:

  • "minimum pattern length" allows specifying the minimum number of items that patterns found should contain.
  • "maximum pattern length" allows specifying the maximum number of items that patterns found should contain.
  • "max gap" allows specifying whether gaps are allowed in sequential patterns. For example, if "max gap" is set to 1, no gap is allowed (i.e. each pair of consecutive itemsets of a pattern must appear consecutively in a sequence). If "max gap" is set to N, a gap of N-1 itemsets is allowed between two consecutive itemsets of a pattern. If the parameter is not used, "max gap" is set to +∞ by default.
  • "show sequences ids?" (true/false) allows specifying that the ids of the sequences containing a pattern should be output for each pattern found. If the parameter is set to true, each pattern in the output file will be followed by the keyword #SID and a list of sequence ids (integers separated by spaces). For example, a line terminated by "#SID: 0 2" means that the pattern on this line appears in the first and the third sequences of the sequence database (the sequences with ids 0 and 2).
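To make the "max gap" semantics concrete, here is a small illustrative check (a sketch only, not SPMF code): with maxGap = 1, matched itemsets of the pattern must occupy consecutive positions in the sequence, so backtracking over candidate positions is needed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only (not SPMF code): checks whether a pattern occurs in a
// sequence under a "max gap" constraint on consecutive matched positions.
public class MaxGapCheck {

    public static Set<Integer> itemset(Integer... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    public static boolean occursWithMaxGap(List<Set<Integer>> pattern,
                                           List<Set<Integer>> seq, int maxGap) {
        return match(pattern, seq, 0, -1, maxGap);
    }

    // idx: index of the next pattern itemset to match;
    // last: position of the previous match in seq (-1 if none yet)
    private static boolean match(List<Set<Integer>> pattern, List<Set<Integer>> seq,
                                 int idx, int last, int maxGap) {
        if (idx == pattern.size()) return true;
        int from = (last < 0) ? 0 : last + 1;
        int to = (last < 0) ? seq.size() - 1 : Math.min(seq.size() - 1, last + maxGap);
        for (int pos = from; pos <= to; pos++) {
            if (seq.get(pos).containsAll(pattern.get(idx))
                    && match(pattern, seq, idx + 1, pos, maxGap)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // S2 = (1 4), (3), (2 3), (1 5) from the example database
        List<Set<Integer>> s2 = Arrays.asList(itemset(1, 4), itemset(3),
                itemset(2, 3), itemset(1, 5));
        List<Set<Integer>> pattern = Arrays.asList(itemset(4), itemset(2));
        System.out.println(occursWithMaxGap(pattern, s2, 1)); // false: (3) separates them
        System.out.println(occursWithMaxGap(pattern, s2, 2)); // true: one itemset of gap allowed
    }
}
```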

These parameters are available in the GUI of SPMF and also in the example "MainTestSPAM.java" provided in the source code of SPMF.

The parameters can also be used in the command line with the Jar file. Consider this example:
java -jar spmf.jar run SPAM contextPrefixSpan.txt output.txt 0.5 2 6 1 true
This command means to apply SPAM on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 0.5, that patterns must have a minimum length of 2 items and a maximum length of 6 items, that no gap is allowed between itemsets, and that the ids of the sequences where each pattern is found must be shown in the output.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer, and items from the same itemset within a sequence are separated by single spaces. It is assumed that items within the same itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that it is also possible to use a text file containing text (several sentences) as an alternative to the default input format, provided that the file has the ".text" extension. If the algorithm is applied to such a text file from the graphical interface or command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered as an item. Note that when a text file is used as input of a data mining algorithm, performance will be slightly lower than with the native SPMF file format, because the input file is automatically converted before launching the algorithm and the result must also be converted back. This cost should however be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences. The next lines follow the same format.

Performance

SPAM is one of the fastest sequential pattern mining algorithms. The SPAM implementation in SPMF is reported to be faster than PrefixSpan (see the "performance" section of the website for a performance comparison). However, CM-SPAM is faster than SPAM.

Implementation details

In the source code, we also provide examples of how to keep the result in memory instead of saving it to a file. This can be useful if the algorithms are integrated into another Java program. The examples that save the result in memory follow the naming convention "MainTest..._saveToMemory".

For the AGP implementation of SPAM, several versions are provided in the source code that show different ways to perform the join of IdLists. The fastest implementation is the one named "FatBitMap". It is the one offered in the graphical user interface.

  • "MainTestSPAM_AGP_BitMap_saveToFile.java"
  • "MainTestSPAM_AGP_BitMap_saveToMemory.java"
  • "MainTestSPAM_AGP_EntryList_saveToFile.java"
  • "MainTestSPAM_AGP_EntryList_saveToMemory.java"
  • "MainTestSPAM_AGP_FatBitMap_saveToFile.java"
  • "MainTestSPAM_AGP_FatBitMap_saveToMemory.java"

The AGP and PFV implementations of SPAM share some source code but also have some significant differences. See the performance section of the website for a performance comparison.

Where can I get more information about SPAM?

The SPAM algorithm was proposed in this paper:

J. Ayres, J. Gehrke, T.Yiu, and J. Flannick. Sequential Pattern Mining Using Bitmaps. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.

The implementation of the optional "maxgap" constraint is based on this paper:

Ho, J., Lukov, L., & Chawla, S. (2005). Sequential pattern mining with constraints on large protein databases. In Proceedings of the 12th International Conference on Management of Data (COMAD) (pp. 89-100).

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 89: Mining Frequent Sequential Patterns Using the CM-SPAM Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CM-SPAM" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CM-SPAM contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCMSPAM_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is CM-SPAM?

CM-SPAM (2014) is a sequential pattern mining algorithm based on the SPAM algorithm.

The main difference is that CM-SPAM utilizes a new technique named co-occurrence pruning to prune the search space, which makes it faster than the original SPAM algorithm.

What is the input of CM-SPAM?

The input of CM-SPAM is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage). Moreover, the implementation in SPMF adds another parameter, which is the maximum sequential pattern length in terms of items.

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of distinct items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It indicates that item 1 appeared, followed by items 1, 2 and 3 appearing at the same time, followed by items 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items in an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of CM-SPAM?

CM-SPAM discovers all frequent sequential patterns occurring in a sequence database (i.e., subsequences that occur in at least minsup of the sequences of the database).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2, ... Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2, ... Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run CM-SPAM with minsup = 50%, 53 sequential patterns will be found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50%). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and third sequences (it thus has a support of 50%) and also has a length of 3 because it contains three items.

Optional parameters

The CM-SPAM implementation allows specifying the following optional parameters:

  • "minimum pattern length": the minimum number of items that patterns found should contain.
  • "maximum pattern length": the maximum number of items that patterns found should contain.
  • "required items": a set of items that must appear in every pattern found.
  • "max gap": whether gaps are allowed in sequential patterns. For example, if "max gap" is set to 1, no gap is allowed (i.e., consecutive itemsets of a pattern must appear consecutively in a sequence). If "max gap" is set to N, a gap of up to N-1 itemsets is allowed between two consecutive itemsets of a pattern. If this parameter is not used, "max gap" is set to +∞ by default.
  • "show sequences ids?" (true/false): whether the ids of the sequences containing each pattern should be output. If this parameter is set to true, each pattern in the output file will be followed by the keyword #SID and a list of sequence ids (integers separated by single spaces). For example, a line terminated by "#SID: 0 2" means that the pattern on this line appears in the first and the third sequences of the sequence database (the sequences with ids 0 and 2).
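To make the "max gap" semantics concrete, here is an illustrative sketch (in Python, not the SPMF implementation) of an occurrence check under a max-gap constraint. Unlike the unconstrained case, a gap constraint requires trying several candidate positions, hence the small recursive search:

```python
def occurs_with_maxgap(pattern, sequence, maxgap):
    """True if `pattern` occurs in `sequence` with at most maxgap - 1
    itemsets between two consecutively matched itemsets (so maxgap = 1
    means the matched itemsets must be consecutive in the sequence)."""
    def match(i, prev):
        if i == len(pattern):
            return True
        if prev is None:
            positions = range(len(sequence))  # first itemset: anywhere
        else:
            positions = range(prev + 1, min(len(sequence), prev + 1 + maxgap))
        return any(pattern[i] <= sequence[j] and match(i + 1, j)
                   for j in positions)
    return match(0, None)

# Sequence S4 from the example database
s4 = [{5}, {7}, {1, 6}, {3}, {2}, {3}]
print(occurs_with_maxgap([{1}, {2}], s4, 1))  # False: (2) does not directly follow (1)
print(occurs_with_maxgap([{1}, {2}], s4, 2))  # True: a gap of one itemset is allowed
```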

These parameters are available in the GUI of SPMF and also in the example "MainTestCMSPAM.java" provided in the source code of SPMF.

These parameters can also be used in the command line with the jar file. Consider this example:
java -jar spmf.jar run CM-SPAM contextPrefixSpan.txt output.txt 0.5 2 6 1,3 1 true
This command applies CM-SPAM to the file "contextPrefixSpan.txt" and outputs the results to "output.txt". It specifies that the user wants to find patterns for minsup = 0.5, that patterns must have a minimum length of 2 items and a maximum length of 6 items, that they must contain items 1 and 3, and that no gap is allowed between itemsets. Moreover, sequence ids should be output for each pattern found.

Now, let's say that you want to run the algorithm again with the same parameters, except that you do not want to use the "required items" parameter. This can be done as follows:
java -jar spmf.jar run CM-SPAM contextPrefixSpan.txt output.txt 0.5 2 6 "" 1 true

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from the sequence database. Each item is a positive integer, and items from the same itemset are separated by a single space. It is assumed that items within an itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences):

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.
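For illustration, this format can be read with a few lines of code (a sketch in Python; SPMF performs this parsing internally in Java):

```python
def read_spmf_sequences(lines):
    """Parse the SPMF sequence format described above: items are positive
    integers, "-1" ends an itemset and "-2" ends the sequence."""
    database = []
    for line in lines:
        sequence, itemset = [], []
        for token in line.split():
            value = int(token)
            if value == -2:      # end of the sequence
                break
            elif value == -1:    # end of the current itemset
                sequence.append(set(itemset))
                itemset = []
            else:
                itemset.append(value)
        database.append(sequence)
    return database

lines = ["1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2",
         "1 4 -1 3 -1 2 3 -1 1 5 -1 -2"]
print(read_spmf_sequences(lines)[0])  # [{1}, {1, 2, 3}, {1, 3}, {4}, {3, 6}]
```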

Note that, as an alternative to the default input format, it is also possible to use a text file containing sentences if the file has the ".text" extension. If the algorithm is applied to such a file from the graphical interface or the command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered an item. Note that when a text file is used as input, performance will be slightly lower than with the native SPMF file format, because the input file is converted before launching the algorithm and the result must also be converted back. This cost should, however, be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences. The next lines follow the same format.
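For illustration, an output line in this format can be split back into a pattern and its support with a short sketch (not SPMF code):

```python
def parse_output_line(line):
    """Split one output line in the format above into (pattern, support)."""
    pattern_part, support_part = line.split("#SUP:")
    pattern, itemset = [], []
    for token in pattern_part.split():
        if token == "-1":        # end of the current itemset
            pattern.append(set(itemset))
            itemset = []
        else:
            itemset.append(int(token))
    return pattern, int(support_part)

print(parse_output_line("2 3 -1 1 -1 #SUP: 2"))  # ([{2, 3}, {1}], 2)
```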

Performance

CM-SPAM is faster than SPAM and is one of the best sequential pattern mining algorithms in SPMF, according to our experiments in the CM-SPAM paper (see the Performance section of this website for more details).

Where can I get more information about CM-SPAM?

The CM-SPAM algorithm is described in this article:

Fournier-Viger, P., Gomariz, A., Campos, M., Thomas, R. (2014). Fast Vertical Mining of Sequential Patterns Using Co-occurrence Information. Proc. 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2014), Part 1, Springer, LNAI, 8443. pp. 40-52.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 90: Mining Frequent Sequential Patterns Using the LAPIN Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "LAPIN" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run LAPIN contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestLAPIN_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is LAPIN?

LAPIN (2005) is a sequential pattern mining algorithm derived from the SPAM algorithm. It replaces join operations with border calculations (which are similar to a projected database) and uses a table called the "item-is-exist table" to determine whether an item can appear after a given position in a sequence. There are several variations of LAPIN. This implementation follows the main one, which is known as LAPIN, LAPIN-SPAM or LAPIN-LCI, depending on the paper in which the authors describe it.

What is the input of LAPIN?

The input of LAPIN is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage). Moreover, the SPMF implementation adds another parameter: the maximum sequential pattern length, in terms of items.

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of items. For example, the table shown below contains four sequences. The first sequence, named S1, contains five itemsets. It means that item 1 appeared first, followed by items 1, 2 and 3 at the same time, then by items 1 and 3, then by item 4, and finally by items 3 and 6. It is assumed that no item appears twice in the same itemset and that items within an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of LAPIN?

LAPIN discovers all frequent sequential patterns occurring in a sequence database (i.e., subsequences whose support is at least minsup).

To explain more formally what a sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ..., Xk, where X1, X2, ..., Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ..., Ym, where Y1, Y2, ..., Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ..., Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

For example, if we run LAPIN with minsup = 50 %, 53 sequential patterns will be found. The list is too long to be presented here. An example of a pattern found is "(1,2),(6)", which appears in the first and the third sequences (it therefore has a support of 50 %). This pattern has a length of 3 because it contains three items. Another pattern is "(4), (3), (2)". It appears in the second and the third sequences (it thus also has a support of 50 %) and also has a length of 3 because it contains three items.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from the sequence database. Each item is a positive integer, and items from the same itemset are separated by a single space. It is assumed that items within an itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences):

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that, as an alternative to the default input format, it is also possible to use a text file containing sentences if the file has the ".text" extension. If the algorithm is applied to such a file from the graphical interface or the command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered an item. Note that when a text file is used as input, performance will be slightly lower than with the native SPMF file format, because the input file is converted before launching the algorithm and the result must also be converted back. This cost should, however, be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

2 3 -1 1 -1 #SUP: 2
6 -1 2 -1 #SUP: 2
6 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {2, 3}, followed by the itemset {1} has a support of 2 sequences. The next lines follow the same format.

Performance

LAPIN is quite a fast algorithm. However, it is slower than CM-SPADE and CM-SPAM most of the time on the datasets that we used for comparison. The implementation is quite optimized, but additional optimizations could perhaps improve its speed further.

Where can I get more information about LAPIN?

The LAPIN algorithm is described in this article:

Z. Yang, Y. Wang, and M. Kitsuregawa. LAPIN: Effective Sequential Pattern Mining Algorithms by Last Position Induction. Technical Report, Info. and Comm. Eng. Dept., Tokyo University, 2005.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 91 : Mining Frequent Closed Sequential Patterns Using the ClaSP Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "ClaSP" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run ClaSP contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestClaSP_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is ClaSP?

ClaSP is a very efficient algorithm for discovering closed sequential patterns in sequence databases, proposed by Antonio Gomariz Peñalver et al. (2013). This is the original implementation of the algorithm.

What is the input of ClaSP?

The input of ClaSP is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage).

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of items. For example, the table shown below contains four sequences. The first sequence, named S1, contains five itemsets. It means that item 1 appeared first, followed by items 1, 2 and 3 at the same time, then by items 1 and 3, then by item 4, and finally by items 3 and 6. It is assumed that no item appears twice in the same itemset and that items within an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of ClaSP?

ClaSP discovers all frequent closed sequential patterns occurring in a sequence database.

To explain more formally what a closed sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ..., Xk, where X1, X2, ..., Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ..., Ym, where Y1, Y2, ..., Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ..., Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

A closed sequential pattern is a sequential pattern such that it is not strictly included in another pattern having the same support.

Why use ClaSP? It can be shown that the set of closed sequential patterns is generally much smaller than the set of all sequential patterns, and that no information is lost: the set of all sequential patterns and their supports can be recovered from the closed ones. Moreover, finding closed sequential patterns is often much more efficient than discovering all patterns.
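The definition of a closed pattern can be illustrated by a naive post-processing filter over a set of frequent patterns (a sketch only, with made-up supports; algorithms such as ClaSP instead prune non-closed patterns during the search):

```python
def occurs_in(sa, sb):
    """True if sequence sa is contained in sequence sb: each itemset of sa
    is a subset of a distinct itemset of sb, in the same order."""
    j = 0
    for x in sa:
        while j < len(sb) and not x <= sb[j]:
            j += 1
        if j == len(sb):
            return False
        j += 1
    return True

def closed_only(patterns):
    """Keep the patterns that are not strictly included in another pattern
    having the same support; `patterns` is a list of (sequence, support)."""
    return [(p, s) for p, s in patterns
            if not any(s == s2 and p != q and occurs_in(p, q)
                       for q, s2 in patterns)]

# (1) has the same support as its super-pattern (1),(2), so it is not
# closed; this is why (1) does not appear in the table of patterns below.
patterns = [([{1}], 4), ([{1}, {2}], 4), ([{1}, {3}], 4)]
print(closed_only(patterns))  # [([{1}, {2}], 4), ([{1}, {3}], 4)]
```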

For example, if we run ClaSP with minsup= 50 % on the sequence database, the following patterns are found.

ID Closed Sequential Pattern Support
S1 (6) 75 %
S2 (5) 75 %
S3 (2), (3) 75 %
S4 (1), (2) 100 %
S5 (1), (3) 100 %
S6 (1 2), (6) 50 %
S7 (4), (3) 75 %
S8 (1) (2), (3) 50 %
S9 (1), (2 3), (1) 50 %
S10 (1), (3), (2) 75 %
S11 (1), (3), (3) 75 %
S12 (1 2), (4), (3) 50 %
S13 (6), (2), (3) 50 %
S14 (5), (2), (3) 50 %
S15 (4), (3), (2) 50 %
S16 (5), (6), (3), (2) 50 %
S17 (5), (1), (3), (2) 50 %

For instance, the sequential pattern "(1,2),(6)" appears in the first and the third sequences (it therefore has a support of 50 %). Another pattern is "(4), (3)". It appears in the first, second and third sequences (it thus has a support of 75 %).

Optional parameter(s)

The ClaSP implementation allows specifying the following optional parameter:

  • "show sequences ids?" (true/false): whether the ids of the sequences containing each pattern should be output. If this parameter is set to true, each pattern in the output file will be followed by the keyword #SID and a list of sequence ids (integers separated by single spaces). For example, a line terminated by "#SID: 0 2" means that the pattern on this line appears in the first and the third sequences of the sequence database (the sequences with ids 0 and 2).

This parameter is available in the GUI of SPMF and also in the example(s) "MainTestClaSP ... .java" provided in the source code of SPMF.

This parameter can also be used in the command line with the jar file. Consider this example:
java -jar spmf.jar run ClaSP contextPrefixSpan.txt output.txt 50% true
This command applies the algorithm to the file "contextPrefixSpan.txt" and outputs the results to "output.txt". It specifies that the user wants to find patterns for minsup = 50 % and that sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from the sequence database. Each item is a positive integer, and items from the same itemset are separated by a single space. It is assumed that items within an itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences):

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that, as an alternative to the default input format, it is also possible to use a text file containing sentences if the file has the ".text" extension. If the algorithm is applied to such a file from the graphical interface or the command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered an item. Note that when a text file is used as input, performance will be slightly lower than with the native SPMF file format, because the input file is converted before launching the algorithm and the result must also be converted back. This cost should, however, be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

1 2 -1 4 -1 3 -1 #SUP: 2
1 2 -1 6 -1 #SUP: 2
1 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {1, 2}, followed by the itemset {4}, followed by the itemset {3} has a support of 2 sequences. The next lines follow the same format.

Performance

ClaSP is a very efficient algorithm for closed sequential pattern mining. See the article proposing ClaSP for a performance comparison with CloSpan and SPADE. Note that CM-ClaSP is generally faster than ClaSP.

Implementation details

In the source code version of SPMF, there is also an example of how to use ClaSP and keep the result in memory instead of saving it to a file ( MainTestClaSP_saveToMemory.java ).

An alternative input file contextClaSP.txt is also provided. It contains the example sequence database used in the article proposing ClaSP.

Where can I get more information about this algorithm?

The ClaSP algorithm was proposed in this paper:

A. Gomariz, M. Campos, R. Marín and B. Goethals (2013), ClaSP: An Efficient Algorithm for Mining Frequent Closed Sequences. Proc. PAKDD 2013, pp. 50-61.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 92 : Mining Frequent Closed Sequential Patterns Using the CM-ClaSP Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CM-ClaSP" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CM-ClaSP contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCMClaSP_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is CM-ClaSP?

ClaSP is a very efficient algorithm for discovering closed sequential patterns in sequence databases, proposed by Gomariz et al. (2013).

CM-ClaSP is a modification of the original ClaSP algorithm that uses a co-occurrence pruning technique to prune the search space (Fournier-Viger, Gomariz et al., 2014). It is generally faster than the original ClaSP.

What is the input of CM-ClaSP?

The input of CM-ClaSP is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage).

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of items. For example, the table shown below contains four sequences. The first sequence, named S1, contains five itemsets. It means that item 1 appeared first, followed by items 1, 2 and 3 at the same time, then by items 1 and 3, then by item 4, and finally by items 3 and 6. It is assumed that no item appears twice in the same itemset and that items within an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of CM-ClaSP?

CM-ClaSP discovers all frequent closed sequential patterns occurring in a sequence database.

To explain more formally what a closed sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ..., Xk, where X1, X2, ..., Xk are itemsets, is said to occur in another sequence SB = Y1, Y2, ..., Ym, where Y1, Y2, ..., Ym are itemsets, if and only if there exist integers 1 <= i1 < i2 < ... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ..., Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

A closed sequential pattern is a sequential pattern such that it is not strictly included in another pattern having the same support.

Why use CM-ClaSP? It can be shown that the set of closed sequential patterns is generally much smaller than the set of all sequential patterns, and that no information is lost: the set of all sequential patterns and their supports can be recovered from the closed ones. Moreover, finding closed sequential patterns is often much more efficient than discovering all patterns.

For example, if we run CM-ClaSP with minsup= 50 % on the sequence database, the following patterns are found.

ID Closed Sequential Pattern Support
S1 (6) 75 %
S2 (5) 75 %
S3 (2), (3) 75 %
S4 (1), (2) 100 %
S5 (1), (3) 100 %
S6 (1 2), (6) 50 %
S7 (4), (3) 75 %
S8 (1) (2), (3) 50 %
S9 (1), (2 3), (1) 50 %
S10 (1), (3), (2) 75 %
S11 (1), (3), (3) 75 %
S12 (1 2), (4), (3) 50 %
S13 (6), (2), (3) 50 %
S14 (5), (2), (3) 50 %
S15 (4), (3), (2) 50 %
S16 (5), (6), (3), (2) 50 %
S17 (5), (1), (3), (2) 50 %

For instance, the sequential pattern "(1,2),(6)" appears in the first and the third sequences (it therefore has a support of 50 %). Another pattern is "(4), (3)". It appears in the first, second and third sequences (it thus has a support of 75 %).

Optional parameter(s)

The CM-ClaSP implementation allows specifying the following optional parameter:

  • "show sequences ids?" (true/false): whether the ids of the sequences containing each pattern should be output. If this parameter is set to true, each pattern in the output file will be followed by the keyword #SID and a list of sequence ids (integers separated by single spaces). For example, a line terminated by "#SID: 0 2" means that the pattern on this line appears in the first and the third sequences of the sequence database (the sequences with ids 0 and 2).

This parameter is available in the GUI of SPMF and also in the example(s) "MainTestCMClaSP ... .java" provided in the source code of SPMF.

This parameter can also be used in the command line with the jar file. Consider this example:
java -jar spmf.jar run CM-ClaSP contextPrefixSpan.txt output.txt 50% true
This command applies the algorithm to the file "contextPrefixSpan.txt" and outputs the results to "output.txt". It specifies that the user wants to find patterns for minsup = 50 % and that sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from the sequence database. Each item is a positive integer, and items from the same itemset are separated by a single space. It is assumed that items within an itemset are sorted according to a total order and that no item appears twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences):

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that, as an alternative to the default input format, it is also possible to use a text file containing sentences if the file has the ".text" extension. If the algorithm is applied to such a file from the graphical interface or the command line interface, the file is automatically converted to the SPMF format: the text is divided into sentences separated by ".", "?" and "!", and each word is considered an item. Note that when a text file is used as input, performance will be slightly lower than with the native SPMF file format, because the input file is converted before launching the algorithm and the result must also be converted back. This cost should, however, be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

1 2 -1 4 -1 3 -1 #SUP: 2
1 2 -1 6 -1 #SUP: 2
1 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {1, 2}, followed by the itemset {4}, followed by the itemset {3} has a support of 2 sequences. The next lines follow the same format.

Performance

ClaSP is a very efficient algorithm for closed sequential pattern mining. CM-ClaSP is generally a few times faster than ClaSP on most datasets (see the CM-ClaSP paper for details).

Implementation details

In the source code version of SPMF, there is also an example of how to use CM-ClaSP and keep the result in memory instead of saving it to a file ( MainTestCMClaSP_saveToMemory.java ).

Where can I get more information about this algorithm?

The CM-ClaSP algorithm was proposed in this paper:

Fournier-Viger, P., Gomariz, A., Campos, M., Thomas, R. (2014). Fast Vertical Mining of Sequential Patterns Using Co-occurrence Information. Proc. 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2014), Part 1, Springer, LNAI, 8443. pp. 40-52.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 93 : Mining Frequent Closed Sequential Patterns Using the CloSpan Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "CloSpan" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run CloSpan contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestCloSpan_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is CloSpan?

CloSpan is a pattern-growth algorithm for discovering closed sequential patterns in sequence databases, proposed by Yan et al. (2003).

What is the input of CloSpan?

The input of CloSpan is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage).

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of items. For example, the table shown below contains four sequences. The first sequence, named S1, contains five itemsets. It means that item 1 appeared first, followed by items 1, 2 and 3 at the same time, then by items 1 and 3, then by item 4, and finally by items 3 and 6. It is assumed that no item appears twice in the same itemset and that items within an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of CloSpan?

CloSpan discovers all frequent closed sequential patterns that occur in a sequence database.

To explain more formally what a closed sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2... Xk are itemsets is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2... Ym are itemsets, if and only if there exists integers 1 <= i1 < i2... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

A closed sequential pattern is a sequential pattern such that it is not strictly included in another pattern having the same support.
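These definitions can be illustrated with a short, self-contained Java sketch (not part of the SPMF library) that checks whether a pattern occurs in a sequence, using the containment condition above, and computes its support on the example database:

```java
import java.util.*;

public class ClosedPatternDefs {

    // Returns true if pattern a = X1..Xk occurs in sequence b = Y1..Ym,
    // i.e., there exist indices i1 < i2 < ... < ik with Xj a subset of Y_ij.
    // Greedy matching is sufficient: if any embedding exists, greedy finds one.
    static boolean occursIn(List<Set<Integer>> a, List<Set<Integer>> b) {
        int j = 0;
        for (Set<Integer> y : b) {
            if (j < a.size() && y.containsAll(a.get(j))) {
                j++;
            }
        }
        return j == a.size();
    }

    // Support = fraction of database sequences in which the pattern occurs.
    static double support(List<Set<Integer>> pattern, List<List<Set<Integer>>> db) {
        int count = 0;
        for (List<Set<Integer>> seq : db) {
            if (occursIn(pattern, seq)) count++;
        }
        return count / (double) db.size();
    }

    // Helper to build a sequence from arrays of items.
    static List<Set<Integer>> seq(int[][] itemsets) {
        List<Set<Integer>> s = new ArrayList<>();
        for (int[] is : itemsets) {
            Set<Integer> set = new HashSet<>();
            for (int i : is) set.add(i);
            s.add(set);
        }
        return s;
    }

    public static void main(String[] args) {
        // The four sequences S1..S4 from the table above.
        List<List<Set<Integer>>> db = Arrays.asList(
            seq(new int[][]{{1}, {1, 2, 3}, {1, 3}, {4}, {3, 6}}),
            seq(new int[][]{{1, 4}, {3}, {2, 3}, {1, 5}}),
            seq(new int[][]{{5, 6}, {1, 2}, {4, 6}, {3}, {2}}),
            seq(new int[][]{{5}, {7}, {1, 6}, {3}, {2}, {3}}));

        // Pattern (1 2), (6): occurs in S1 and S3
        System.out.println(support(seq(new int[][]{{1, 2}, {6}}), db)); // 0.5
        // Pattern (4), (3): occurs in S1, S2 and S3
        System.out.println(support(seq(new int[][]{{4}, {3}}), db));    // 0.75
    }
}
```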

Why use CloSpan? It can be shown that the set of closed sequential patterns is generally much smaller than the set of all sequential patterns, and that no information is lost, since the support of any frequent sequential pattern can be derived from the closed patterns. Moreover, finding closed sequential patterns is often much more efficient than discovering all patterns.

For example, if we run CloSpan with minsup= 50 % on the sequence database, the following patterns are found.

ID Closed Sequential Pattern Support
S1 (6) 75 %
S2 (5) 75 %
S3 (2), (3) 75 %
S4 (1), (2) 100 %
S5 (1), (3) 100 %
S6 (1 2), (6) 50 %
S7 (4), (3) 75 %
S8 (1) (2), (3) 50 %
S9 (1), (2 3), (1) 50 %
S10 (1), (3), (2) 75 %
S11 (1), (3), (3) 75 %
S12 (1 2), (4), (3) 50 %
S13 (6), (2), (3) 50 %
S14 (5), (2), (3) 50 %
S15 (4), (3), (2) 50 %
S16 (5), (6), (3), (2) 50 %
S17 (5), (1), (3), (2) 50 %

For instance, the sequential pattern "(1 2), (6)" appears in the first and third sequences (it therefore has a support of 50%). Another pattern is "(4), (3)". It appears in the first, second and third sequences (it thus has a support of 75%).

Optional parameter(s)

The CloSpan implementation allows specifying additional optional parameter(s):

  • "show sequences ids?" (true/false) This parameter allows to specify that sequence ids of sequences containing a pattern should be output for each pattern found. For example, if the parameter is set to true, each pattern in the output file will be followed by the keyword #SID followed by a list of sequences ids (integers separated by space). For example, a line terminated by "#SID: 1 3" means that the pattern on this line appears in the first and the third sequences of the sequence database.

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestCloSpan ... .java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the jar file. For example:
java -jar spmf.jar run CloSpan contextPrefixSpan.txt output.txt 50% true
This command means to apply the algorithm on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 50%, and sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer and items from the same itemset within a sequence are separated by single space. Note that it is assumed that items within a same itemset are sorted according to a total order and that no item can appear twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.
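Reading this format can be sketched in a few lines of Java (an illustration of the format, not the SPMF parser):

```java
import java.util.*;

public class SpmfSequenceParser {

    // Parses one line of the SPMF sequence format: items are positive integers
    // separated by spaces, "-1" closes the current itemset, "-2" ends the sequence.
    static List<List<Integer>> parseLine(String line) {
        List<List<Integer>> sequence = new ArrayList<>();
        List<Integer> itemset = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            int item = Integer.parseInt(token);
            if (item == -2) {
                break;                    // end of the sequence
            } else if (item == -1) {
                sequence.add(itemset);    // end of the current itemset
                itemset = new ArrayList<>();
            } else {
                itemset.add(item);
            }
        }
        return sequence;
    }

    public static void main(String[] args) {
        // First line of contextPrefixSpan.txt
        List<List<Integer>> s1 = parseLine("1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2");
        System.out.println(s1); // [[1], [1, 2, 3], [1, 3], [4], [3, 6]]
    }
}
```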

Note that it is also possible to use a text file containing text (several sentences) as an alternative to the default input format, provided that the file has the ".text" extension. If the algorithm is applied to such a text file from the graphical interface or command line interface, the file is automatically converted to the SPMF format: the text is split into sentences at the characters ".", "?" and "!", and each word is treated as an item. Note that when a text file is used as input, performance will be slightly lower than with the native SPMF file format, because the input file must be converted before launching the algorithm and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

1 2 -1 4 -1 3 -1 #SUP: 2
1 2 -1 6 -1 #SUP: 2
1 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {1, 2}, followed by the itemset {4}, followed by the itemset {3} has a support of 2 sequences. The next lines follow the same format.
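A line of this output format can be split into its pattern and support parts as follows (an illustrative sketch, not SPMF code):

```java
public class SpmfOutputParser {

    // Returns the support count following the "#SUP:" keyword,
    // e.g. "1 2 -1 4 -1 3 -1 #SUP: 2" -> 2
    static int parseSupport(String line) {
        return Integer.parseInt(line.substring(line.indexOf("#SUP:") + 5).trim());
    }

    // Returns the pattern part before the "#SUP:" keyword,
    // e.g. "1 2 -1 4 -1 3 -1 #SUP: 2" -> "1 2 -1 4 -1 3 -1"
    static String parsePattern(String line) {
        return line.substring(0, line.indexOf("#SUP:")).trim();
    }

    public static void main(String[] args) {
        String line = "1 2 -1 4 -1 3 -1 #SUP: 2";
        System.out.println(parsePattern(line) + " | " + parseSupport(line));
        // 1 2 -1 4 -1 3 -1 | 2
    }
}
```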

Performance

CloSpan is an efficient algorithm for closed sequential pattern mining. However, it should be noted that some newer algorithms such as ClaSP have shown better performance on many datasets (see the ClaSP paper for a performance comparison).

Implementation details

In the source code version of SPMF, there is also an example of how to use CloSpan and keep the result in memory instead of saving it to a file (MainTestCloSpan_saveToMemory.java).

An alternative input file contextCloSpan.txt is also provided. It contains the example sequence database used in the article proposing CloSpan.

Where can I get more information about this algorithm?

The CloSpan algorithm was proposed in this paper:

Yan, X., Han, J., & Afshar, R. (2003, May). CloSpan: Mining closed sequential patterns in large datasets. In Proc. 2003 SIAM Int’l Conf. Data Mining (SDM’03) (pp. 166-177).

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 94 : Mining Frequent Closed Sequential Patterns Using the BIDE+ Algorithm

How to run this example?

  • If you are using the graphical interface, (1) choose the "BIDE+" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run BIDE+ contextPrefixSpan.txt output.txt 50% in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestBIDEPlus_saveToFile.java" in the package ca.pfv.SPMF.tests.

What is BIDE+?

BIDE+ is an algorithm for discovering closed sequential patterns in sequence databases, proposed by Wang et al. (2007).

What is the input of BIDE+?

The input of BIDE+ is a sequence database and a user-specified threshold named minsup (a value in [0,1] representing a percentage).

A sequence database is a set of sequences where each sequence is a list of itemsets. An itemset is an unordered set of items. For example, the table shown below contains four sequences. The first sequence, named S1, contains 5 itemsets. It means that item 1 appeared, followed by items 1, 2 and 3 appearing together, followed by items 1 and 3, followed by 4, and followed by 3 and 6. It is assumed that no item appears twice in the same itemset and that items within an itemset are sorted in lexicographical order. This database is provided in the file "contextPrefixSpan.txt" of the SPMF distribution.

ID Sequences
S1 (1), (1 2 3), (1 3), (4), (3 6)
S2 (1 4), (3), (2 3), (1 5)
S3 (5 6), (1 2), (4 6), (3), (2)
S4 (5), (7), (1 6), (3), (2), (3)

What is the output of BIDE+?

BIDE+ discovers all frequent closed sequential patterns that occur in a sequence database.

To explain more formally what a closed sequential pattern is, it is necessary to review some definitions.

A sequential pattern is a sequence. A sequence SA = X1, X2, ... Xk, where X1, X2... Xk are itemsets is said to occur in another sequence SB = Y1, Y2, ... Ym, where Y1, Y2... Ym are itemsets, if and only if there exists integers 1 <= i1 < i2... < ik <= m such that X1 ⊆ Yi1, X2 ⊆ Yi2, ... Xk ⊆ Yik.

The support of a sequential pattern is the number of sequences where the pattern occurs divided by the total number of sequences in the database.

A frequent sequential pattern is a sequential pattern having a support no less than the minsup parameter provided by the user.

A closed sequential pattern is a sequential pattern such that it is not strictly included in another pattern having the same support.

Why use BIDE+? It can be shown that the set of closed sequential patterns is generally much smaller than the set of all sequential patterns, and that no information is lost, since the support of any frequent sequential pattern can be derived from the closed patterns. Moreover, finding closed sequential patterns is often much more efficient than discovering all patterns.

For example, if we run BIDE+ with minsup= 50 % on the sequence database, the following patterns are found.

ID Closed Sequential Pattern Support
S1 (6) 75 %
S2 (5) 75 %
S3 (2), (3) 75 %
S4 (1), (2) 100 %
S5 (1), (3) 100 %
S6 (1 2), (6) 50 %
S7 (4), (3) 75 %
S8 (1) (2), (3) 50 %
S9 (1), (2 3), (1) 50 %
S10 (1), (3), (2) 75 %
S11 (1), (3), (3) 75 %
S12 (1 2), (4), (3) 50 %
S13 (6), (2), (3) 50 %
S14 (5), (2), (3) 50 %
S15 (4), (3), (2) 50 %
S16 (5), (6), (3), (2) 50 %
S17 (5), (1), (3), (2) 50 %

For instance, the sequential pattern "(1 2), (6)" appears in the first and third sequences (it therefore has a support of 50%). Another pattern is "(4), (3)". It appears in the first, second and third sequences (it thus has a support of 75%).

Optional parameter(s)

The BIDE+ implementation allows specifying additional optional parameter(s):

  • "show sequences ids?" (true/false) This parameter allows to specify that sequence ids of sequences containing a pattern should be output for each pattern found. For example, if the parameter is set to true, each pattern in the output file will be followed by the keyword #SID followed by a list of sequences ids (integers separated by space). For example, a line terminated by "#SID: 1 3" means that the pattern on this line appears in the second and the fourth sequences of the sequence database.

These parameter(s) are available in the GUI of SPMF and also in the example(s) "MainTestBIDEPlus ... .java" provided in the source code of SPMF.

The parameter(s) can also be used in the command line with the jar file. For example:
java -jar spmf.jar run BIDE+ contextPrefixSpan.txt output.txt 50% true
This command means to apply the algorithm on the file "contextPrefixSpan.txt" and output the results to "output.txt". Moreover, it specifies that the user wants to find patterns for minsup = 50%, and sequence ids should be output for each pattern found.

Input file format

The input file format is defined as follows. It is a text file where each line represents a sequence from a sequence database. Each item from a sequence is a positive integer and items from the same itemset within a sequence are separated by single space. Note that it is assumed that items within a same itemset are sorted according to a total order and that no item can appear twice in the same itemset. The value "-1" indicates the end of an itemset. The value "-2" indicates the end of a sequence (it appears at the end of each line). For example, the input file "contextPrefixSpan.txt" contains the following four lines (four sequences).

1 -1 1 2 3 -1 1 3 -1 4 -1 3 6 -1 -2
1 4 -1 3 -1 2 3 -1 1 5 -1 -2
5 6 -1 1 2 -1 4 6 -1 3 -1 2 -1 -2
5 -1 7 -1 1 6 -1 3 -1 2 -1 3 -1 -2

The first line represents a sequence where the itemset {1} is followed by the itemset {1, 2, 3}, followed by the itemset {1, 3}, followed by the itemset {4}, followed by the itemset {3, 6}. The next lines follow the same format.

Note that it is also possible to use a text file containing text (several sentences) as an alternative to the default input format, provided that the file has the ".text" extension. If the algorithm is applied to such a text file from the graphical interface or command line interface, the file is automatically converted to the SPMF format: the text is split into sentences at the characters ".", "?" and "!", and each word is treated as an item. Note that when a text file is used as input, performance will be slightly lower than with the native SPMF file format, because the input file must be converted before launching the algorithm and the result must also be converted back. This cost, however, should be small.

Output file format

The output file format is defined as follows. It is a text file. Each line is a frequent sequential pattern. Each item from a sequential pattern is a positive integer and items from the same itemset within a sequence are separated by single spaces. The value "-1" indicates the end of an itemset. On each line, the sequential pattern is first indicated. Then, the keyword " #SUP: " appears followed by an integer indicating the support of the pattern as a number of sequences. For example, a few lines from the output file from the previous example are shown below:

1 2 -1 4 -1 3 -1 #SUP: 2
1 2 -1 6 -1 #SUP: 2
1 -1 2 -1 3 -1 #SUP: 2

The first line indicates that the frequent sequential pattern consisting of the itemset {1, 2}, followed by the itemset {4}, followed by the itemset {3} has a support of 2 sequences. The next lines follow the same format.

Performance

BIDE+ is a very efficient algorithm for closed sequential pattern mining. This implementation includes all the optimizations described in the paper.

Implementation details

I have included three versions of BIDE+ in the SPMF distribution. The first one keeps the frequent sequential patterns in memory and prints the results to the console (MainTestBIDEPlus_saveToMemory.java). The second one saves the results directly to a file (MainTestBIDEPlus_saveToFile.java). The second version is faster.

The third version of BIDE+ accepts strings instead of integers. It is available under the name "BIDE+ with strings" in the GUI version of SPMF or in the package ca.pfv.spmf.sequential_rules.bide_with_strings for the source code version of SPMF. To run it, you should use the input file: contextPrefixSpanStrings.txt.

Where can I get more information about this algorithm?

The BIDE algorithm is described in this paper:

Wang, J., Han, J. (2004). BIDE: Efficient Mining of Frequent Closed Sequences. Proc. 20th Intl. Conf. on Data Engineering (ICDE 2004), pp. 79-90.

Besides, you may read this survey of sequential pattern mining, which gives an overview of sequential pattern mining algorithms.

Example 95 : Mining Frequent Closed Sequential Patterns by Post-Processing Using the PrefixSpan or SPAM Algorithm

What is this?

This example shows how to use the PrefixSpan and SPAM algorithms to discover all sequential patterns and keep only closed patterns by post-processing. This should be less efficient than using a dedicated algorithm for closed pattern mining like ClaSP, CloSpan and BIDE+.

How to run this example?

If you want to use SPAM with post-processing:

  • If you are using the graphical interface, (1) choose the "SPAM_PostProcessingClosed" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and maximum pattern length = 100, (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run SPAM_PostProcessingClosed contextPrefixSpan.txt output.txt 50% 100 in a folder containing spmf.jar and the example input file contextPrefixSpan.txt.
  • If you are using the source code version of SPMF, launch the file "MainTestSPAM_PostProcessingStepForClosedMining_saveToFile.java" in the package ca.pfv.SPMF.tests. (other variations are also available in the source code)

If you want to use PrefixSpan with post-processing:

  • If you are using the graphical interface, (1) choose the "PrefixSpan_PostProcessingClosed" algorithm, (2) select the input file "contextPrefixSpan.txt", (3) set the output file name (e.g. "output.txt") (4) set minsup = 50% and maximum pattern length to 100, (5) click "Run algorithm".
  • If you want to execute this example from the command line, then execute this command:
    java -jar spmf.jar run PrefixSpan_PostProcessingClosed contextPrefixSpan.txt output.txt 50% 100