Survey on Key Techniques of Approximate Nearest Neighbor Search in Vector Databases

CLC number: TP311

    Abstract:

    Approximate nearest neighbor search (ANNS) over high-dimensional vectors is one of the core foundations of vector databases. As artificial intelligence has advanced, vector databases have taken on an increasingly critical role and attracted wide attention, and efficient ANNS methods are key to optimizing their performance. Over decades of development, ANNS research has produced a substantial body of results and a number of excellent surveys. The field has moved quickly in recent years, however, and the new methods and findings that have emerged call for a systematic review. This survey first introduces the basic concepts of ANNS. Next, building on existing survey frameworks, it groups current methods by how they organize vector data into five categories: graph-based, hierarchical, quantization-based, hashing-based, and hybrid data organization, and discusses representative and recent work in each. Then, from the perspective of search optimization, it proposes an eight-part taxonomy for reviewing recent search methods: hardware acceleration, learning augmentation, distance comparison operations, hybrid disk-memory settings, data access optimization, distributed settings, hybrid queries, and theoretical analysis. Finally, based on current results and trends, it outlines directions for future research.

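Cite this article

宋子文, 王斌, 张喜瑞, 赵世豪, 杨晓春. Survey on Key Techniques of Approximate Nearest Neighbor Search in Vector Databases. Journal of Software, 2026, 37(3): 0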

History
  • Received: 2025-05-06
  • Last revised: 2025-06-30
  • Published online: 2025-09-02