Unified Query Processing Mechanism over Large-scale Graph Data

CLC number: TP311

Fund project: National Key Research and Development Program of China (2022YFB2702100); National Natural Science Foundation of China (62225203, U21A20516)

    Abstract:

    Graph data can model many real-world application scenarios, and graph queries are widely used, including reachability, shortest path, keyword search, graph pattern matching, PageRank, SimRank, k-core, k-truss, and clique queries. For a specific query problem, existing work typically proposes a dedicated query processing algorithm and builds an index structure to accelerate it. However, diversifying application demands and the explosive growth of graph data pose two challenges to this approach. First, a single graph is usually subject to several types of queries, yet each query type has its own processing mechanism and index structure, so a graph database must build and maintain multiple indexes together with their query algorithms. Second, an index is often larger than the original graph, and keeping multiple indexes at the same time consumes a large amount of space, which sharply degrades the performance of the graph database and limits its practical use. To address these challenges, this study proposes a unified query processing mechanism: a unified and efficient index structure is built for large-scale graph data, and four query processing algorithms, for reachability, shortest path, keyword search, and graph pattern matching, are designed on top of it. To build the unified index, the graph is partitioned and important vertices are extracted according to the characteristics of these four queries. The resulting index is smaller than the original graph and supports all four queries efficiently. Finally, experiments on four real-world datasets verify the efficiency and scalability of the unified index structure and the four query processing algorithms.
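The abstract does not say how the graph is partitioned or which vertices count as important, so the following is only a minimal sketch of the general idea, under loud assumptions: the partition is given as input, "important vertices" are taken to be boundary vertices (endpoints of cross-partition edges), and edges are unweighted. The names `bfs_within`, `build_index`, `shortest_distance`, and `is_reachable` are hypothetical and are not taken from the paper. The sketch shows one small index answering two of the four query types, shortest path and reachability; it makes no claim about index size.

```python
from collections import deque
import heapq


def bfs_within(adj, part, src):
    """Hop distances from src along paths that stay inside src's partition."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if part[v] == part[src] and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_index(adj, part):
    """Overlay graph over boundary vertices (assumed stand-in for 'important vertices')."""
    boundary, overlay, rev = set(), {}, {}
    for u, nbrs in adj.items():
        for v in nbrs:
            rev.setdefault(v, []).append(u)       # reverse adjacency for target-side search
            if part[u] != part[v]:
                boundary.update((u, v))
                overlay.setdefault(u, {})[v] = 1  # keep each cross-partition edge
    for b in boundary:
        for v, d in bfs_within(adj, part, b).items():
            if v in boundary and v != b:
                overlay.setdefault(b, {})[v] = d  # intra-partition shortcut
    return {"boundary": boundary, "overlay": overlay, "rev": rev}


def shortest_distance(adj, part, index, s, t):
    """Hop distance from s to t, or None if t cannot be reached from s."""
    from_s = bfs_within(adj, part, s)             # s to vertices of its own partition
    to_t = bfs_within(index["rev"], part, t)      # vertices of t's partition to t
    heap = [(d, v) for v, d in from_s.items()
            if v in index["boundary"] or v == t]
    heapq.heapify(heap)
    settled = set()
    while heap:                                   # Dijkstra over the overlay graph
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if u in settled:
            continue
        settled.add(u)
        for v, w in index["overlay"].get(u, {}).items():
            heapq.heappush(heap, (d + w, v))
        if u in to_t:                             # final leg inside t's partition
            heapq.heappush(heap, (d + to_t[u], t))
    return None


def is_reachable(adj, part, index, s, t):
    """Reachability falls out of the same index: reachable iff a distance exists."""
    return shortest_distance(adj, part, index, s, t) is not None


# Toy directed graph: vertices 0-2 in partition 0, vertices 3-6 in partition 1.
adj = {0: [1], 1: [2], 2: [3], 3: [4], 4: [5], 5: [0], 6: [4]}
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1, 6: 1}
index = build_index(adj, part)
print(shortest_distance(adj, part, index, 1, 0))  # 5: leaves partition 0 and returns
print(is_reachable(adj, part, index, 0, 6))       # False: no edge enters vertex 6
```

The query from 1 to 0 is the interesting case: both endpoints lie in the same partition, but the only path runs through the other partition, which is why the search goes through the overlay graph instead of stopping at a local breadth-first search.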

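Cite this article

Chen Di, Yuan Ye, Pan Yani, Wang Guoren. Unified Query Processing Mechanism over Large-scale Graph Data. Journal of Software (软件学报), 2026, 37(5): 2235-2256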

History
  • Received: 2025-01-15
  • Revised: 2025-04-11
  • Published online: 2025-12-17
  • Publication date: 2026-05-06