Fast Cluster-learning with Prior Probability from Big Dataset

Resource type
Authors/contributors
Title
Fast Cluster-learning with Prior Probability from Big Dataset
Abstract
Association rule mining with the Apriori method has been one of the most popular data mining techniques for decades; it harvests knowledge in the form of item-association rules from a dataset. The quality of these rules nevertheless depends on the concentration of frequent items in the input dataset. When the dataset becomes large, the items are scattered far apart. Previous literature has shown that clustering helps produce data groups in which frequent items are concentrated. Among all the clusters generated by a clustering algorithm, there must be one or more that contain suitable, frequent items. In turn, the association rules mined from such clusters would be of better quality, in terms of high confidence, than those mined from the whole dataset. However, it is not known in advance which cluster is the suitable one until association rule mining has been tried on all of them, and testing them by brute force is time consuming. In this paper, a statistical property called prior probability is investigated as a means of selecting the best of the many clusters produced by a clustering algorithm, as a pre-processing step before association rule mining. Experimental results indicate a correlation between the prior probability of the best cluster and the relatively high quality of the association rules generated from that cluster. The results are significant because they make it possible to know which cluster is best used for association rule mining without testing all of them exhaustively.
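The two quantities the abstract relates can be illustrated with a minimal sketch. The confidence of a rule A → B is the standard ratio support(A ∪ B) / support(A). The abstract does not define prior probability, so the sketch assumes it is each cluster's share of all records, as with mixture-model cluster weights; that definition, the function names, the toy transactions and the cluster labels are all hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List, Sequence

Transaction = FrozenSet[str]


def support(transactions: Sequence[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions: Sequence[Transaction],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Standard rule confidence: support(A and B together) / support(A)."""
    denom = support(transactions, antecedent)
    if denom == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / denom


def cluster_priors(labels: Sequence[int]) -> Dict[int, float]:
    """ASSUMPTION: prior probability = cluster size / total number of records."""
    counts: Dict[int, int] = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return {label: n / len(labels) for label, n in counts.items()}


# Hypothetical toy data: eight transactions with made-up cluster labels.
transactions: List[Transaction] = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "milk"}),
    frozenset({"milk"}),
    frozenset({"tea"}),
    frozenset({"bread"}),
    frozenset({"tea", "milk"}),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

A, B = frozenset({"bread"}), frozenset({"butter"})
priors = cluster_priors(labels)

# Confidence of bread -> butter on the whole dataset and inside each cluster.
whole_conf = confidence(transactions, A, B)
per_cluster_conf = {
    c: confidence([t for t, l in zip(transactions, labels) if l == c], A, B)
    for c in sorted(priors)
}

print(f"whole dataset: confidence={whole_conf:.2f}")
for c, conf in per_cluster_conf.items():
    print(f"cluster {c}: prior={priors[c]:.2f} confidence={conf:.2f}")
```

On this toy data the rule is stronger inside cluster 0 than on the whole dataset, which is the effect the abstract describes. Whether and how the prior predicts the best cluster is the paper's empirical question and is not reproduced here.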
Date
2018-11
Proceedings Title
2018 5th International Conference on Soft Computing & Machine Intelligence (ISCMI)
Conference Name
2018 5th International Conference on Soft Computing & Machine Intelligence (ISCMI)
Pages
60-66
DOI
10.1109/ISCMI.2018.8703219
Library Catalog
IEEE Xplore
Extra
0 citations (Crossref) [2022-09-21] ISSN: 2640-0146
Citation
Li, T., Fong, S., Lobo Marques, J. A., & Wong, R. K. (2018). Fast Cluster-learning with Prior Probability from Big Dataset. 2018 5th International Conference on Soft Computing & Machine Intelligence (ISCMI), 60–66. https://doi.org/10.1109/ISCMI.2018.8703219