Comparison of Transaction Reduction and Dynamic Itemset Counting Implementations of the Apriori algorithm in Python 3.8
RAYNE CARL B. PARCON, JAMES EUGENE L. SETIAS, MARY THERESE M. HOLIPAS, and EISEN ED BRIONES
Philippine Science High School Western Visayas Campus – Department of Science and Technology (DOST-PSHSWVC), Brgy. Bito-on, Jaro, Iloilo City 5000, Philippines
Abstract
The Apriori algorithm is a level-wise algorithm designed to find frequent itemsets. Various improvements called implementations were made to improve efficiency. These are classified into five categories based on how the algorithm processes data. Existing studies focus on the comparison of the classical Apriori with a different implementation. With dynamic itemset counting (DIC) under dataset partitioning algorithm and transaction reduction under incremental update algorithm, a comparison of the running time between algorithms from different categories might not be accurate; thus, the time complexities in Big-O notation were solved. Both algorithms had the same time complexity, O(n2); thus, a comparison of the running times was made. The running times were recorded after processing three datasets of varying size and complexity. It was determined that DIC is faster by 8434% while having the same accuracy. Both algorithms slowed down at large and complex datasets; however, the DIC was still faster.
Keywords: Big-O notation, Time complexity, Running time, Frequent itemsets, Level-wise algorithm