GPU-accelerated Clustering Algorithm for High-dimensional Vectors
Author:
Affiliation:

Clc Number:

TP311

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    Clustering serves as one of the critical technologies for large-scale, high-dimensional vector data analysis. Recently, a density-based clustering algorithm DBSCAN (density-based spatial clustering of applications with noise) has been widely adopted in data analysis due to their advantages of not requiring pre-specified cluster numbers, discovering complex cluster structures, and identifying noise points. However, existing density-based clustering algorithms suffer from high computational costs when processing high-dimensional vectors. Meanwhile, these methods also face challenges like the “curse of dimensionality”, restricting their practical applications. With the rapid growth of high-dimensional vector data in the era of information technology, CPU-based clustering approaches encounter increasing challenges in time efficiency and scalability. To address these issues, this study proposes a GPU-accelerated clustering algorithm for high-dimensional vector data, introducing the K-nearest neighbor (KNN) graph index to accelerate DBSCAN. First, a GPU-accelerated parallel KNN graph construction algorithm is developed, significantly reducing the index construction overhead. Furthermore, to enhance the pipeline of DBSCAN and achieve highly concurrent vector clustering, a K-means tree partitioning algorithm with inter-layer parallelism and a parallel clustering algorithm based on breadth-first search and a core KNN graph are designed. Finally, extensive experiments are conducted on real-world datasets, and the proposed method is compared against existing approaches. Experimental results show that the proposed algorithm improves the efficiency of large-scale vector clustering by 5.7–2822.5 times while maintaining clustering accuracy.

    Reference
    Related
    Cited by
Get Citation

李忠根,龚盛豪,于浩然,朱轶凡,柳晴,高云君. GPU加速的高维向量聚类算法.软件学报,2026,37(3):1037-1057

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:May 01,2025
  • Revised:June 30,2025
  • Adopted:
  • Online: September 02,2025
  • Published: March 06,2026
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063