fim is a collection of some popular frequent itemset mining algorithms implemented in Go.
fim contains the implementations of the following algorithms:
Execute the following commands to build the fim tool:
git clone https://github.com/paulfedorow/fim.git
cd fim
make
To see which arguments fim supports execute the following command:
build/fim -h
The following example finds all frequent itemsets with a minimal support of 1% in the dataset contained in
datasets/retail.dat:
build/fim -a fpgrowth -i datasets/retail.dat -s 0.01
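To make the meaning of the support threshold concrete, here is a minimal Go sketch that computes the relative support of an itemset, that is, the fraction of transactions containing all of its items. It is not taken from fim's source; the function names contains, support, and isFrequent are hypothetical and chosen only for illustration. With a threshold of 0.01, an itemset counts as frequent if it occurs in at least 1% of the transactions.

```go
package main

// contains reports whether transaction includes every item of itemset.
func contains(transaction, itemset []int) bool {
	present := make(map[int]bool, len(transaction))
	for _, item := range transaction {
		present[item] = true
	}
	for _, item := range itemset {
		if !present[item] {
			return false
		}
	}
	return true
}

// support returns the fraction of transactions that contain itemset.
// An empty dataset yields a support of 0.
func support(transactions [][]int, itemset []int) float64 {
	if len(transactions) == 0 {
		return 0
	}
	count := 0
	for _, t := range transactions {
		if contains(t, itemset) {
			count++
		}
	}
	return float64(count) / float64(len(transactions))
}

// isFrequent reports whether itemset reaches the minimal support,
// given as a fraction (0.01 corresponds to 1%).
func isFrequent(transactions [][]int, itemset []int, minSupport float64) bool {
	return support(transactions, itemset) >= minSupport
}
```

This brute-force check scans every transaction for each candidate itemset, so it only illustrates the definition. Algorithms such as fpgrowth exist precisely to avoid that cost.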
fim's dataset file format is as follows:
File = {Transaction}
Transaction = Item {" " Item} "\n"
Item = {"0" ... "9"}
Each line in the file is a transaction. A line is expected to be a series of integers separated by a single space. Each
integer is an item of the corresponding transaction.
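The format described above can be read with a few lines of Go. The following is a minimal sketch, not fim's actual parser; parseTransaction and readTransactions are hypothetical names. It assumes a strict reading of the grammar: items are unsigned decimal integers separated by exactly one space, so empty items, signs, and repeated or trailing spaces are rejected.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// parseTransaction splits one line into its integer items.
// Items must be separated by exactly one space; anything that is
// not an unsigned decimal integer results in an error.
func parseTransaction(line string) ([]int, error) {
	fields := strings.Split(line, " ")
	items := make([]int, 0, len(fields))
	for _, f := range fields {
		// ParseUint rejects signs and empty strings.
		n, err := strconv.ParseUint(f, 10, 31)
		if err != nil {
			return nil, fmt.Errorf("invalid item %q: %w", f, err)
		}
		items = append(items, int(n))
	}
	return items, nil
}

// readTransactions reads one transaction per line from r.
// It uses bufio.Scanner with its default buffer, so very long
// lines would need a larger buffer (an assumption of this sketch).
func readTransactions(r io.Reader) ([][]int, error) {
	var transactions [][]int
	scanner := bufio.NewScanner(r)
	for lineNo := 1; scanner.Scan(); lineNo++ {
		items, err := parseTransaction(scanner.Text())
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", lineNo, err)
		}
		transactions = append(transactions, items)
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return transactions, nil
}
```

readTransactions accepts any io.Reader, for example a file opened with os.Open. If a dataset contains trailing or repeated whitespace, splitting with strings.Fields instead of strings.Split would be a more lenient choice.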
To determine which algorithm is the most efficient, the runtime of each algorithm was measured on different datasets
with decreasing minimal support. The datasets used are retail.dat and chess.dat from the FIMI Dataset Repository;
retail.dat is sparse and chess.dat is dense. The results are shown below. In these measurements, fpgrowth has the
best runtime.