An Open-Source Data Mining Library
Frequently Asked Questions
Please read the installation instructions on the download page.
For most data mining algorithms, memory usage depends on the parameters of the algorithm and the kind of data. For example, for the Apriori algorithm, memory usage and running time depend on (1) the number of transactions, (2) their length, (3) the number of items, (4) whether the dataset is dense or sparse, (5) the minsup parameter, etc.
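To get a feel for the dataset characteristics listed above, they can be measured before running an algorithm. The following is only a minimal sketch, not part of SPMF: the class name DatasetStats and its methods are hypothetical, and it assumes that each transaction is given as one line of integer item identifiers separated by spaces. Density is taken here as the average transaction length divided by the number of distinct items, which is one common convention among several.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of SPMF) that summarizes a transaction dataset.
public class DatasetStats {

    // Simple holder for the measured characteristics.
    public static final class Summary {
        public final int transactions;      // number of non-empty transactions
        public final int distinctItems;     // number of different items seen
        public final double averageLength;  // average number of items per transaction
        public final double density;        // averageLength / distinctItems (assumed definition)

        Summary(int transactions, int distinctItems, double averageLength, double density) {
            this.transactions = transactions;
            this.distinctItems = distinctItems;
            this.averageLength = averageLength;
            this.density = density;
        }
    }

    // Assumes each line holds integer item identifiers separated by whitespace.
    public static Summary summarize(List<String> lines) {
        Set<Integer> items = new HashSet<>();
        int transactions = 0;
        long totalLength = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            transactions++;
            for (String token : trimmed.split("\\s+")) {
                items.add(Integer.parseInt(token));
                totalLength++;
            }
        }
        double averageLength = transactions == 0 ? 0.0 : (double) totalLength / transactions;
        double density = items.isEmpty() ? 0.0 : averageLength / items.size();
        return new Summary(transactions, items.size(), averageLength, density);
    }

    public static void main(String[] args) {
        // Tiny made-up dataset used only for illustration.
        List<String> sample = Arrays.asList("1 2 3", "1 2", "2 3 4", "1 2 3 4");
        Summary s = summarize(sample);
        System.out.println("transactions=" + s.transactions
                + " distinctItems=" + s.distinctItems
                + " averageLength=" + s.averageLength
                + " density=" + s.density);
    }
}
```

A density close to 1 means that most items appear in most transactions, which is typically the harder case for algorithms such as Apriori at a given minsup.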
By default, the Java virtual machine limits the amount of RAM that a program can use. Depending on the Java version and the computer, this limit can be as small as 256 megabytes, so it is easy to run out of memory. To avoid this, you can increase the memory that the Java virtual machine is allowed to use.
If you are using the release version of SPMF, you can launch the software from the command line and use the -Xmx parameter of the java command (for example, -Xmx1024m for 1024 megabytes) to increase the memory that SPMF can use.
If you are using the source code version with Eclipse, you can increase the memory by adding the same -Xmx option to the VM arguments of the run configuration.
If you have increased the memory and the algorithm still runs out of memory, you should consider changing the parameters of the algorithm that you are using. For example, for Apriori, you may raise the minsup parameter if it is set too low.
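To check whether a memory setting was actually taken into account, it can help to ask the Java virtual machine how much memory it is allowed to use. The following is a minimal sketch, not part of SPMF: the class name MemoryCheck is hypothetical, and it relies only on the standard Runtime.maxMemory() method.

```java
// Hypothetical helper (not part of SPMF) that reports the JVM's maximum heap size.
public class MemoryCheck {

    // Returns the maximum amount of memory the JVM will try to use, in megabytes.
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Maximum heap available to the JVM: "
                + maxHeapMegabytes() + " MB");
    }
}
```

Running it once without options and once with, for example, java -Xmx1024m MemoryCheck should show a different limit. The reported value may be somewhat lower than the requested one, depending on the Java virtual machine.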
If the algorithm is not in the list of algorithms, then I don't have it.
I usually choose the algorithms that I implement according to my interests. If you would like me to implement a particular algorithm, you can send me a suggestion by e-mail with (1) the name of the algorithm and (2) the article describing the algorithm. I will read your suggestion and decide whether I am interested in implementing it. If I am interested, it can take days, weeks or months before I have time to implement it (it depends on my schedule). If I am not interested, I will not implement it.
Yes. If you are interested in participating, you can send me an e-mail. I am interested in source code for algorithms that I have not implemented. You can send me the source code and I will evaluate its quality. If it is good enough and I think that the algorithm is useful, I will include it in the next version of the software. If I include your code, I will add your name to the list of contributors on the website.
If you want to understand how the source code is organized in SPMF, you can read the developer's guide. It provides useful information for understanding and modifying the source code.
SPMF is licensed under the GNU GPL v3 license.
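The GPL license provides four freedoms: to run the program for any purpose, to study and modify its source code, to redistribute copies, and to distribute modified versions.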
But if you want to redistribute the source code, you must:
For more details about what you can and cannot do, please read the GNU GPL license.
If you want to know how a particular algorithm works, you should read the original article describing the algorithm. If you have questions that these articles do not answer, you can ask them in my data mining forum. I will try to answer if the question is simple and the answer is short.
The examples are in the documentation section of the website. You may also consider reading the article describing the algorithm for more information about the algorithm.
Do you have the C++, C# or <insert another programming language here> version of the XXXXXX algorithm?
No. If it is not on the website, then I don't have it.
You can check the "datasets" page of this website. It provides download links and information for obtaining several popular datasets used in the data mining literature that can be used with SPMF.
Please send me information about the bug by e-mail. I will try to fix it as soon as possible.
If you appreciate the software, the best way to say thank you is to cite the website in your thesis, papers or articles, to post a link to this website on the internet so that more people can find it, and to recommend it to your colleagues.
Please cite SPMF as follows:
If you are using LaTeX, the BibTeX reference is:
If you have a general question, please ask it in the data mining forum so that I can answer you and the answer can be shared with everyone else. You can also ask me by e-mail if the question has to be private. I try to answer as fast as possible, but if your question is long and requires a long answer, or if I am busy, it may take a few days before I answer.
Copyright © 2008-2017 Philippe Fournier-Viger. All rights reserved.