GPU-accelerated Clustering Algorithm for High-dimensional Vectors

doi:10.13328/j.cnki.jos.007512

微信小程序

微信服务号

微信订阅号

Home > Archive>Volume 37, Issue 3, 2026 >1037-1057. DOI:10.13328/j.cnki.jos.007512

PDF HTML XML Export Cite reminder

GPU-accelerated Clustering Algorithm for High-dimensional Vectors
DOI:
                        10.13328/j.cnki.jos.007512
                    
Author:
                        
                        
                    
Affiliation:
Clc Number:TP311
Fund Project:

Article

Figures

Metrics

Reference

Cited by

Materials

Comments

Abstract:

Clustering serves as one of the critical technologies for large-scale, high-dimensional vector data analysis. Recently, a density-based clustering algorithm DBSCAN (density-based spatial clustering of applications with noise) has been widely adopted in data analysis due to their advantages of not requiring pre-specified cluster numbers, discovering complex cluster structures, and identifying noise points. However, existing density-based clustering algorithms suffer from high computational costs when processing high-dimensional vectors. Meanwhile, these methods also face challenges like the “curse of dimensionality”, restricting their practical applications. With the rapid growth of high-dimensional vector data in the era of information technology, CPU-based clustering approaches encounter increasing challenges in time efficiency and scalability. To address these issues, this study proposes a GPU-accelerated clustering algorithm for high-dimensional vector data, introducing the K-nearest neighbor (KNN) graph index to accelerate DBSCAN. First, a GPU-accelerated parallel KNN graph construction algorithm is developed, significantly reducing the index construction overhead. Furthermore, to enhance the pipeline of DBSCAN and achieve highly concurrent vector clustering, a K-means tree partitioning algorithm with inter-layer parallelism and a parallel clustering algorithm based on breadth-first search and a core KNN graph are designed. Finally, extensive experiments are conducted on real-world datasets, and the proposed method is compared against existing approaches. Experimental results show that the proposed algorithm improves the efficiency of large-scale vector clustering by 5.7–2822.5 times while maintaining clustering accuracy.

Reference

Cited by

Get Citation

李忠根,龚盛豪,于浩然,朱轶凡,柳晴,高云君. GPU加速的高维向量聚类算法.软件学报,2026,37(3):1037-1057

Copy

Article Metrics

Abstract:
PDF:
HTML:
Cited by:

History

Received:May 01,2025
Revised:June 30,2025
Adopted:
Online: September 02,2025
Published: March 06,2026

You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address：4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code：100190
Phone：010-62562563 Fax：010-62562533 Email：jos@iscas.ac.cn
Technical Support：Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063