A Python implementation of the ECLAT algorithm for association rule mining.
This implementation mines hierarchy-based rules, which involve an item a that occurs in a transaction and an element of the hierarchy that a belongs to. Such a rule is mined only if there are transactions containing an itemset that belongs to that hierarchy element.
$ conda env create -f environment.yml
$ conda activate eclat
Execute with default parameters:
$ python main.py
To execute for a predefined dataset:
$ python main.py --dataset=<dataset_id>
Possible dataset_id values:
To execute for a custom dataset:
$ python main.py --data=<path/to/transactions.txt> --taxonomy=<path/to/taxonomy.txt>
The taxonomy file is optional. Rules based on the item hierarchy are not mined if no taxonomy is provided.
Example of transactions.txt file format:
1 2 3
1 2
1 3
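A minimal sketch of how such a file could be read, assuming one transaction per line with integer item ids separated by whitespace (as in the example above). The function names `parse_transactions` and `to_tidsets` are hypothetical and not taken from this repository. The second function builds the vertical layout (item to set of transaction ids) that ECLAT-style algorithms work on.

```python
from collections import defaultdict


def parse_transactions(text):
    """Parse transaction text into a list of frozensets of integer item ids.

    Assumption: one transaction per line, items separated by whitespace.
    """
    transactions = []
    for line in text.splitlines():
        items = line.split()
        if items:  # skip blank lines
            transactions.append(frozenset(int(item) for item in items))
    return transactions


def to_tidsets(transactions):
    """Map every item to the set of transaction ids (0-based) containing it."""
    tidsets = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets[item].add(tid)
    return dict(tidsets)


sample = "1 2 3\n1 2\n1 3\n"
transactions = parse_transactions(sample)
tidsets = to_tidsets(transactions)

for item in sorted(tidsets):
    print(item, sorted(tidsets[item]))
# 1 [0, 1, 2]
# 2 [0, 1]
# 3 [0, 2]
```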
Example of taxonomy.txt file format:
1,11
2,11
3,22
11,111
22,111
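The sketch below shows one way to load such a file. It assumes that each line is a `child,parent` pair and that the taxonomy is acyclic; neither is stated explicitly in this README, so treat both as assumptions. The helper names `parse_taxonomy` and `ancestors` are hypothetical.

```python
def parse_taxonomy(text):
    """Parse taxonomy text into a {child: parent} dictionary.

    Assumption: each non-empty line has the form "child,parent".
    """
    parent_of = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        child, parent = line.split(",")
        parent_of[int(child)] = int(parent)
    return parent_of


def ancestors(item, parent_of):
    """Return the ancestors of an item, nearest first.

    Assumption: the taxonomy contains no cycles, so the walk terminates.
    """
    result = []
    while item in parent_of:
        item = parent_of[item]
        result.append(item)
    return result


sample = "1,11\n2,11\n3,22\n11,111\n22,111\n"
parent_of = parse_taxonomy(sample)

print(ancestors(1, parent_of))  # [11, 111]
print(ancestors(3, parent_of))  # [22, 111]
```

Under this reading, items 1 and 2 share the hierarchy element 11, and every item eventually rolls up to 111.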
Example of execution with custom ECLAT parameters:
$ python main.py --min_sup=5 --min_conf=0.8 --min_len=3 --max_len=10
The options are:
- min_sup - minimum support of the base of mined rules (type=int, default=1),
- min_conf - minimum confidence of mined rules (type=float, default=0.5),
- min_len - minimum length of mined rules (type=int, default=1),
- max_len - maximum length of mined rules (type=int, default=None, i.e. unlimited).
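To illustrate how these options interact, here is a simplified, self-contained ECLAT sketch. It is not the code in `main.py`, and it ignores the taxonomy. It assumes that `min_sup` is an absolute transaction count for the itemset a rule is built from, and that rule length means the total number of items in the rule. The function names `eclat` and `generate_rules` are hypothetical.

```python
from itertools import combinations


def eclat(tidsets, min_sup=1, max_len=None):
    """Depth-first ECLAT over tidsets; returns {itemset: support count}."""
    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            if max_len is not None and len(itemset) >= max_len:
                continue  # do not grow beyond the length limit
            # Intersect with the tidsets of the remaining candidates.
            narrower = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_sup:
                    narrower.append((other, common))
            if narrower:
                extend(itemset, narrower)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_sup),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


def generate_rules(frequent, min_conf=0.5, min_len=1):
    """Return (antecedent, consequent, confidence) for confident rules."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < max(min_len, 2):  # a rule needs at least two items
            continue
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                antecedent = frozenset(subset)
                confidence = support / frequent[antecedent]
                if confidence >= min_conf:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Tidsets for the three example transactions: "1 2 3", "1 2", "1 3".
tidsets = {1: {0, 1, 2}, 2: {0, 1}, 3: {0, 2}}
frequent = eclat(tidsets, min_sup=2)
rules = generate_rules(frequent, min_conf=0.8)

for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), confidence)
# [2] -> [1] 1.0
# [3] -> [1] 1.0
```

With `min_sup=2`, the itemset {2, 3} is dropped because it appears in only one transaction, and with `min_conf=0.8` the rule 1 -> 2 is dropped because its confidence is only 2/3.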
To execute the unit tests, run the following command in the main directory:
$ python -m unittest test.test_eclat
To run efficiency experiments:
$ python -m test.test_efficiency