Journal of Software, 2007, 18(4): 905-918

Clustering Evolving Data Streams over Sliding Windows
CHANG Jian-Long, CAO Feng, ZHOU Ao-Ying
(Department of Computer Science and Engineering, Fudan University, Shanghai 200433, China)
Received: December 22, 2005    Revised: June 9, 2006
Abstract: To support clustering over sliding windows, this paper introduces two types of exponential histograms of cluster features, false positive and false negative, which support cluster analysis over windows with false-positive error and false-negative error, respectively. Building on these structures, a sliding-window clustering algorithm for data streams is proposed. Using memory that is sublinear in the window size, the algorithm keeps an up-to-date summary of the distribution of the most recent records and can therefore cluster the data within the sliding window. It can also be extended to cluster data over the N-n window, an extended model of the sliding window. The evolving data streams used in the experiments are built from the KDD-CUP'99 and KDD-CUP'98 real data sets and from synthetic data sets with changing Gaussian distributions. Theoretical analysis and comprehensive experimental results show that the proposed method achieves high clustering quality, low memory usage, and a fast processing rate.
Keywords: evolving data stream; clustering; sliding window
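The general idea of summarizing a sliding window with an exponential histogram of cluster features can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's definition of its false positive or false negative structures: the names `Bucket`, `ClusterFeatureHistogram`, and `max_per_size` are hypothetical, the window is counted in records, and the merge rule is the generic exponential-histogram rule (merge the two oldest buckets of a size once more than `max_per_size` buckets of that size exist). Each bucket stores an additive cluster feature (count, linear sum, sum of squares), so the number of buckets grows only logarithmically with the window size.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Bucket:
    n: int             # number of records summarized by this bucket
    ls: List[float]    # per-dimension linear sum of those records
    ss: List[float]    # per-dimension sum of squares of those records
    t: int             # arrival index of the newest record in the bucket


def merge(a: Bucket, b: Bucket) -> Bucket:
    # Cluster features are additive, so merging is a component-wise sum.
    return Bucket(
        a.n + b.n,
        [x + y for x, y in zip(a.ls, b.ls)],
        [x + y for x, y in zip(a.ss, b.ss)],
        max(a.t, b.t),
    )


class ClusterFeatureHistogram:
    """Hypothetical sketch: one histogram summarizing one cluster."""

    def __init__(self, window: int, max_per_size: int = 2):
        self.window = window              # window length, in records
        self.max_per_size = max_per_size  # allowed buckets per bucket size
        self.buckets: List[Bucket] = []   # ordered oldest first
        self.now = 0                      # arrival index of the latest record

    def insert(self, x: Sequence[float]) -> None:
        self.now += 1
        # Drop buckets whose newest record has left the window.
        self.buckets = [b for b in self.buckets if b.t > self.now - self.window]
        self.buckets.append(Bucket(1, list(x), [v * v for v in x], self.now))
        self._compact()

    def _compact(self) -> None:
        size = 1
        while True:
            idx = [i for i, b in enumerate(self.buckets) if b.n == size]
            if len(idx) <= self.max_per_size:
                break
            i, j = idx[0], idx[1]  # the two oldest buckets of this size
            self.buckets[i:j + 1] = [merge(self.buckets[i], self.buckets[j])]
            size *= 2              # the merge may overflow the next size

    def summary(self) -> Tuple[int, List[float]]:
        # Total count and per-dimension mean over all retained buckets.
        n = sum(b.n for b in self.buckets)
        dim = len(self.buckets[0].ls) if self.buckets else 0
        mean = [sum(b.ls[d] for b in self.buckets) / n for d in range(dim)]
        return n, mean
```

In this sketch a bucket is kept as long as its newest record is still inside the window, so the oldest retained bucket may also cover a few records that have already expired. The summary is therefore approximate at the window boundary, and the error is bounded by the size of that oldest bucket.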
Foundation items: Supported by the National Natural Science Foundation of China under Grant Nos. 60496325, 60496327
Reference text:

CHANG Jian-Long, CAO Feng, ZHOU Ao-Ying. Clustering Evolving Data Streams over Sliding Windows. Journal of Software, 2007, 18(4): 905-918