Abstract:Approximate nearest neighbor search (ANNS) is a foundational technology supporting applications such as vector databases, recommendation systems, and large language models (LLMs). Among these, the hierarchical navigable small world (HNSW) graph indexing technique constructs a hierarchical structure to quickly locate results within the target region, thus achieving high retrieval recall at low computational cost. However, existing HNSW algorithms are primarily designed for static data retrieval scenarios and fail to account for the impact of data updates on retrieval performance. Through research on real-world datasets, it is found that data in vector databases is typically updated in batches, and their similar characteristics weaken the effectiveness of heuristic pruning in HNSW algorithms and lead to sparsification issues in the connections among similar vectors, collectively causing a significant decline in retrieval recall. To address these issues, this study proposes an adaptive fine-grained pruning strategy based on local adjustments to the graph structure and constructs a comprehensive optimization scheme that integrates an identification and repair mechanism. First, in the identification phase, the regional neighbor distance is calculated to quantify local topological density, thereby precisely locating the dense regions requiring intervention. Second, in the repair phase, for hub nodes in dense regions, a dual pruning neighbor selection strategy is adopted: native and modified heuristic pruning rules are applied synergistically, and the results of both rules are merged to enhance neighbor connection diversity while maintaining retrieval accuracy, effectively alleviating over-pruning and connection sparsification issues. Experimental results on multiple public datasets show that the proposed method demonstrates good adaptability in scenarios with frequent data updates, achieving a 1%–4% improvement in recall while maintaining stable query latency and throughput.