SPMF: An Open-Source Data Mining Library

Contributors

Project leaders

Algorithms

  • Philippe Fournier-Viger
  • Jerry Chun-Wei Lin, Ting Li, Lu Yang et al. - implementations of the following algorithms: HAUI-Miner, HAUI-MMAU, FFI-Miner, MMFI-Miner, SFUPMiner, HUIM-BPSO, HUIM-BPSO-tree, HUIM_GA, HUIM-GA_Tree
  • Antonio Gomariz Peñalver - implementations of the following algorithms: Clasp, CloSpan, Spade (regular and parallelized version), GSP, SPAM (alternative implementation), PrefixSpan (alternative implementation), and other algorithms
  • Ted Gueniche - implementations of sequence prediction models: CPT+, CPT, PPM, DG, AKOM, TDAG, LZ78
  • Azadeh Soltani - implementations of the following algorithms: MISApriori, CFPGrowth++, estDec and estDecPlus
  • Souleymane Zida - implementation of the EFIM and HUSRM algorithms with P. Fournier-Viger
  • Hoang Thanh Lam, Toon Calders, Fabian Moerchen, Dmitriy Fradkin - implementations of the GoKrimp and SeqKrimp algorithms
  • Prashant Barhate - implementations of the IHUP, UP-Growth and UPGrowth+ algorithms
  • Zhihong Deng - implementations of the PrePost, PrePost+ and FIN algorithms
  • Alan Souza - implementations of the LCM algorithms
  • Sabarish Raghu - implementations of the text clusterer and the document classifier
  • Vikram Goyal, Ashish Sureka, Dhaval Patel, Siddharth Dawar - implementation of the SkyMine algorithm
  • Ahmed El-Serafy, Hazem El-Raffiee - implementation of the GCD algorithm
  • Ryan Panos - implementation of a version of CMDeo with the lift measure
  • Yuriy Guskov - the time series viewer of SPMF includes some code from the simple java plot viewer by Y. Guskov
  • Peng et al. - the code of mHUIMiner, released under the GPL license and obtained from the GitHub repository

Code optimizations

  • Dan Cappucio - suggested an important optimization of the FPGrowth implementation

User interface

  • Hanane Amirat - provided feedback to improve the user interface design

Performance evaluation

  • Rincy N. Thomas - compared the performance of sequential pattern mining algorithms on various datasets, and found several errors on the website

Datasets

  • Zhang Zhongjie - provided nine datasets with item labels in SPMF format, converted from the UCI repository (Skin, USCensus, PAMP, OnlineRetail, RecordLink, PowerC, SUSY, and kddcup99)
  • Ashwin Balani - provided MATLAB code for dataset generation (available on the "datasets" page)

Installation and command line interface

  • Antonio Sergio Ando - feedback and some code for the command line interface, a bug fix, and an ANT script (to be included)

Bug reports, bug fixes and other contributions

  • Matthieu Gousseff - reported a bug related to sequence identifiers in sequential pattern mining algorithms, and errors in the documentation
  • Benjamin Andow - reported a bug in closed association rule generation with the FPClose algorithm
  • Bima Haryanto Putra - reported a bug in the TopKRules algorithm
  • Muhammad Yasir Chaudhry - reported a bug in the Apriori algorithm with length constraint
  • Srikumar Krishnamoorthy - reported a problem in some utility mining datasets
  • Majdi Mafarja - reported a bug in the HUIM-BPSO algorithm
  • Tai Dinh - reported a bug in the USpan algorithm
  • Tin Truong Chi - reported a bug in the USpan algorithm
  • Tarannum Zaman - reported a bug in closed association rule mining with FPClose
  • Natalia Mord - reported a bug in the MaxSP algorithm
  • Andrey Shestakov - reported a bug in the command line interface
  • Antoine Pigeau - reported a bug in the VMSP algorithm
  • Himel Dev - reported a bug in the VMSP algorithm
  • Slimane Oulad Naoui - suggested better handling of incorrect algorithm parameters
  • Tin Truong Chui - fixed a bug in the ClaSP and CM-Clasp algorithms
  • Yimin Zhang - reported a problem in the FOSHU and TS-HOUN algorithms
  • Gehad Ahmed Soltan Abd-Elaleem - reported a bug in the FHSAR algorithm
  • Preethy Varma - reported a bug in the ClaSP and CM-Clasp algorithms
  • Mike Rostermund - reported a problem in the output of SPAM-based algorithms
  • Jaroslav Fowkes and Thomas Christie - reported a bug in the GoKrimp implementation
  • Jamshi Nazeer - reported a bug in the FPClose implementation
  • Insu Yun - reported a bug in the FPClose implementation
  • Dharmen Punjani - reported a bug in the text clusterer
  • Martin Böckle - reported a bug in the GUI of SPMF
  • Pierre-Emmanuel Leroy - reported a bug in the Cori algorithm
  • Masanori Akiyoshi - reported a bug in the FPGrowth algorithm
  • Asmaa - reported a bug in the Zart algorithm
  • Choong Shin Siang and Wong Li Pei - reported a bug in the use of the "maxgap" constraint for SPAM-based algorithms
  • Ryan G. Benton - reported and fixed a bug in the Itemset-Tree and Memory Efficient Itemset-Tree
  • G. Gutierrez - reported a bug in how ID3 trees are printed to the console
  • Mehran Memon - reported a bug in the BIDE+ implementation
  • Nahumi - reported a bug when running the "Fournier08-Closed+time" algorithm using the GUI
  • Wen Zhang - reported two bugs: one that generated headless exceptions, and one concerning command line arguments
  • Abdalghani Abujabal - reported bugs and suggested improvements for the ECLAT algorithm
  • C. Albert Thompson - reported a bug in association rule generation with CFPGrowth, and a bug in TNS/TopSeqRules
  • Manperta Negara Situmorang - reported a bug in association rule generation with CFPGrowth
  • Michael Witbrock - reported an issue with character encoding of source code files
  • Cheng Zhou - reported inconsistencies in the source code of sequential pattern mining algorithms
  • Vathsala.H - reported a bug in the ID3 implementation
  • Rai. A. - reported a bug when using the hierarchical clustering algorithm with the GUI
  • Arina Pramudita - reported some minsup rounding inconsistencies between some sequential pattern mining algorithms
  • Radhika Loombas - reported duplicated variables and unreachable code in FPGrowth
  • Srinivas K. - reported bugs in the sequence database generator and in the Charm algorithm
  • Faisal Feroz - reported bugs in the AbstractOrderedItemset and ItemsetTree classes
  • Peter Toth - reported a bug in association rule mining with FPGrowth, and a bug in TNS / TopSeqRules
  • saiph..@... - reported a bug in the GUI
  • Antonio Sergio Ando - reported a bug in FHSAR
  • Said Hamani - reported a bug in the INDIRECT algorithm
  • Faezeh Jafari - reported a bug in Cluster.java
  • Dvijesh Bhatt - reported a bug in SPAM
  • shouwangji@... - reported unused variables and other minor problems in PrefixSpan
  • Brock - reported a bug in BIDE+
  • G. Bruno - reported bugs in BIDE+ and FPGrowth
  • E. Schubert - reported a bug in DBScan
  • A. Pardeshi - reported a bug in ECLAT / CHARM

Antonio Gomariz Peñalver and Philippe Fournier-Viger

Copyright © 2008-2017 Philippe Fournier-Viger. All rights reserved.