Effective Clustering Algorithm for Probabilistic Data Stream

    Abstract:

    This paper develops, for the first time, an effective clustering algorithm for probabilistic data streams, called “P-Stream”. To handle the uncertain tuples in a data stream, P-Stream introduces the concepts of strong clusters, transitional clusters, and weak clusters. Based on these concepts, an effective candidate-cluster selection strategy is designed that finds a suitable cluster for every continuously arriving data point. To support higher-level clustering and analysis of the evolving behavior of data streams, snapshots of the micro-clusters are stored at every checkpoint. Finally, an “aggressive” two-tier clustering model is introduced to judge whether the most recently arrived data point fits the first-level clustering model. The probabilistic data streams used in the experiments include the KDD-CUP’98 and KDD-CUP’99 real data sets and synthetic data sets with changing Gaussian distributions. Comprehensive experimental results demonstrate that P-Stream achieves high clustering quality and a fast processing rate, and that it adapts efficiently to the evolving behavior of data streams.
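
    The abstract does not define the three cluster types or the candidate selection rule, so the following is only a minimal illustrative sketch and not the authors' P-Stream. It assumes that each uncertain tuple carries an existence probability, that a micro-cluster's weight is the sum of those probabilities, and that fixed thresholds separate strong, transitional, and weak clusters. The names `MicroCluster`, `choose_candidate`, `process`, `strong_thr`, `weak_thr`, and `radius` are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MicroCluster:
    """Hypothetical micro-cluster summary for uncertain tuples."""
    dim: int
    weight: float = 0.0                      # sum of existence probabilities
    linear_sum: list = field(default_factory=list)

    def __post_init__(self):
        if not self.linear_sum:
            self.linear_sum = [0.0] * self.dim

    def add(self, point, prob):
        # Absorb one uncertain tuple, weighted by its existence probability.
        self.weight += prob
        for i, x in enumerate(point):
            self.linear_sum[i] += prob * x

    def centroid(self):
        return [s / self.weight for s in self.linear_sum]

    def kind(self, strong_thr=3.0, weak_thr=1.0):
        # Assumed rule: classify by accumulated probability mass.
        if self.weight >= strong_thr:
            return "strong"
        if self.weight < weak_thr:
            return "weak"
        return "transitional"


# Assumed preference order when several clusters are within range.
RANK = {"strong": 0, "transitional": 1, "weak": 2}


def choose_candidate(clusters, point, radius):
    """Return the preferred cluster within `radius` of `point`, or None."""
    in_range = [
        c for c in clusters
        if c.weight > 0 and math.dist(c.centroid(), point) <= radius
    ]
    if not in_range:
        return None
    # Prefer stronger clusters first, then the nearest centroid.
    return min(
        in_range,
        key=lambda c: (RANK[c.kind()], math.dist(c.centroid(), point)),
    )


def process(clusters, point, prob, radius=1.0):
    """Assign an arriving tuple to a candidate cluster or open a new one."""
    target = choose_candidate(clusters, point, radius)
    if target is None:
        target = MicroCluster(dim=len(point))
        clusters.append(target)
    target.add(point, prob)
    return target


clusters = []
process(clusters, [0.0, 0.0], 0.9)
process(clusters, [0.1, 0.0], 0.8)
process(clusters, [5.0, 5.0], 0.2)
```

    In this sketch a new tuple that falls outside every cluster's radius opens a new, initially weak cluster, which can later be promoted as more probability mass accumulates. The checkpoint snapshots and the two-tier model mentioned in the abstract are not modeled here.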

Citation

Dai Dongbo, Zhao Gang, Sun Shengli. Effective clustering algorithm for probabilistic data stream. Journal of Software, 2009, 20(5):1313-1328

History
  • Received: November 13, 2007
  • Revised: March 06, 2008