Frequently Asked Questions

  1. How can I install the software?
  2. The software ran out of memory. What should I do?
  3. Do you have the source code of the XXXXXXX algorithm?
  4. Could you implement the XXXXXXX algorithm for me?
  5. Could I participate in the development of your software?
  6. How is the source code organized?
  7. Can I use SPMF in commercial software? Can I include your source code in my software?
  8. Could you explain how the XXXXXX algorithm works?
  9. Could you give me some examples of how to use the XXXXXXX algorithm?
  10. Do you have the C++, C# or <insert another programming language here> version of the XXXXXX algorithm?
  11. Where can I find some large datasets?
  12. I have found a bug in your software!
  13. The software is very useful. How can I say thank you?
  14. How should I cite SPMF?
  15. Other questions

How can I install the software?

Please read the installation instructions on the download page.

The software ran out of memory. What should I do?

For most data mining algorithms, memory usage depends on the parameters of the algorithm and on the characteristics of the data. For example, for the Apriori algorithm, performance depends on (1) the number of transactions, (2) their length, (3) the number of items, (4) whether the dataset is dense or sparse, (5) the minsup parameter, etc.

By default, the Java virtual machine limits how much RAM a program can use, and this limit may be as low as 256 megabytes depending on the Java version and the machine. With such a small limit, it is easy to run out of memory. To avoid this, you can increase the maximum amount of memory that the Java virtual machine is allowed to use.

If you are using the release version of SPMF, you can launch the software from the command line and use the -Xmx parameter to increase the memory that SPMF can use:

java -Xmx1024m -jar spmf.jar

This indicates that the software can now use up to 1 GB of RAM.

If you are using the source code version with Eclipse, you can increase the memory by doing this:

Go to the menu Run > Run Configurations, select the class that you usually run (such as "MainTestApriori" for Apriori), open the "Arguments" tab, and paste the following text in the "VM arguments" field:

-Xmx1024m

Then press "Run".
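
To check that the new limit has taken effect, you can print the maximum heap size that the Java virtual machine reports. The following is only a minimal sketch: the class name HeapCheck and its two methods are hypothetical and are not part of SPMF. It relies on the standard Runtime.getRuntime().maxMemory() call.

```java
// HeapCheck is a hypothetical helper class for illustration; it is not part of SPMF.
final class HeapCheck {

    /** Converts a size in bytes to whole megabytes (1 MB = 1024 * 1024 bytes). */
    public static long bytesToMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Returns the maximum amount of heap memory the JVM will try to use, in megabytes. */
    public static long maxHeapMegabytes() {
        return bytesToMegabytes(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        // Print the limit so it can be compared with the -Xmx value that was given.
        System.out.println("Maximum heap: " + maxHeapMegabytes() + " MB");
    }
}
```

If you compile this class and run it with "java -Xmx1024m HeapCheck", the printed value should be close to 1024 MB. It can be slightly lower, because some virtual machines exclude part of the heap from the reported figure.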

If you have increased the memory and the algorithm still runs out of memory, you should consider changing the parameters of the algorithm that you are using. For example, for Apriori, you may raise the minsup parameter if it is set too low.

Do you have the source code of the XXXXX algorithm?

If the algorithm is not in the list of algorithms, then I don't have it.

Could you implement the XXXXXXX algorithm for me?

I usually choose the algorithms that I implement according to my interests. If you would like me to implement a particular algorithm, you can send me a suggestion by e-mail with (1) the name of the algorithm and (2) the article describing the algorithm. I will read your suggestion and decide whether I am interested in implementing it. If I am, it can take days, weeks or months before I have time to implement it, depending on my schedule. If I am not interested, I will not implement it.

Could I participate in the development of your software?

Yes. If you are interested in participating, you can write me an e-mail. I am interested in source code for algorithms that I have not implemented. You can send me the source code, and I will evaluate its quality. If the code is good enough and I think that the algorithm is useful, I will include it in the next version of the software. If I include your code, I will add your name to the list of contributors on the website.

How is the source code organized?

If you want to understand how the source code is organized in SPMF, you can read the developer's guide. It provides useful information for understanding and modifying the source code.

Can I use SPMF in commercial software? Can I include your source code in my software?

SPMF is licensed under the GNU GPL v3 license.

The GPL license provides four freedoms:

  • Obtain and run the program for any purpose
  • Get a copy of the source code
  • Modify the source code
  • Re-distribute the modified source code

But if you want to redistribute the source code, you must:

  • provide access to the source code,
  • license derived work under the same GPL v3 license

For more details about what you can and cannot do, please read the GNU GPL license.

Could you explain how the XXXXXX algorithm works?

If you want to know how a particular algorithm works, you should read the original article describing the algorithm. If you have a question and cannot find the answer in that article, you can ask it in my data mining forum. I will try to answer if the question is simple and the answer is short.

Could you give me some examples of how to use the XXXXXX algorithm?

The examples are in the documentation section of the website. You may also consider reading the article describing the algorithm for more information about the algorithm.

Do you have the C++, C# or <insert another programming language here> version of the XXXXXX algorithm?

No. If it is not on the website, then I don't have it.

Where can I find some large datasets?

You can check the "datasets" page of this website. It provides download links and information for obtaining several popular datasets used in the data mining literature that can be used with SPMF.

I have found a bug in your software!

Please send me information about the bug by e-mail. I will try to fix it as soon as possible.

The software is very useful. How can I say thank you?

If you appreciate the software, the best ways to say thank you are to cite the website in your thesis, papers or articles, to post a link to this website on the internet so that more people can find it, and to recommend it to your colleagues.

How should I cite SPMF?

Please cite SPMF as follows:

Fournier-Viger, P., Lin, C.W., Gomariz, A., Gueniche, T., Soltani, A., Deng, Z., Lam, H. T. (2016). The SPMF Open-Source Data Mining Library Version 2. Proc. 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III, Springer LNCS 9853, pp. 36-40.

Other questions

If you have a general question, please ask it in the data mining forum so that I can answer you and the answer can be shared with everyone else. You can also ask me by e-mail if the question has to be private. I try to answer as fast as possible, but if your question is long and requires a long answer, or if I am busy, it may take a few days before I answer.