PG-RAC: PostgreSQL-based Database with Shared Cache for Multi-write Transaction

Corresponding author:

胡卉芪, E-mail: hqhu@dase.ecnu.edu.cn

CLC number:

TP311

Funding:

National Natural Science Foundation of China (92270202); Natural Science Foundation of Shanghai (23ZR1418300); ZTE Research Fund (HC-CN-20220721010)

    Abstract:

    Mainstream cloud-native databases adopt a single-master, multi-slave architecture: slave nodes in the cluster offload read-only requests from the master node, while all write requests are handled by the master. To scale further to large transactional workloads, some cloud databases attempt to support multi-write transaction processing. One way to achieve multi-write scaling is to share a cache among compute nodes so that data can be accessed across nodes. In shared-cache database systems, cross-node remote access is far more expensive than local access, so the design of the cache protocol is a key factor in system performance and scalability. This study proposes two improvements to the cache protocol and implements PG-RAC, a PostgreSQL-based shared-cache database that supports multi-write transaction processing. First, PG-RAC proposes a new distributed chained routing strategy that disperses routing information across the compute nodes; compared with a routing strategy based on single-node directory management, it reduces average transaction latency by approximately 20%. Second, the study improves the replica page invalidation mechanism by moving invalidation operations off the transaction path, which reduces latency on the critical path of transaction processing. Building on this, PG-RAC exploits multi-version concurrency control (MVCC) to further defer the point at which replica pages are invalidated, which effectively improves cache utilization. TPC-C experiments show that, on a cluster with 4 compute nodes, PG-RAC achieves nearly 2 times the throughput of PostgreSQL and 1.5 times that of the distributed database Citus.
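
The abstract does not spell out how the chained routing works, so the following is only a toy sketch of one plausible reading, not PG-RAC's actual protocol. It assumes that each compute node keeps a per-page forwarding hint instead of consulting a central directory, and that a request follows these hints until it reaches the current owner, in the spirit of the probable-owner forwarding used in classic distributed shared memory systems. The names `Node`, `build_cluster`, and `acquire` are hypothetical and do not come from the paper or from PostgreSQL.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    owned: set = field(default_factory=set)       # pages this node currently owns
    next_hop: dict = field(default_factory=dict)  # page_id -> node to ask next (a hint)


def build_cluster(size, page_id, owner):
    """Create `size` nodes; `owner` holds the page, all others point at it."""
    nodes = [Node(i) for i in range(size)]
    nodes[owner].owned.add(page_id)
    for node in nodes:
        if node.node_id != owner:
            node.next_hop[page_id] = owner
    return nodes


def acquire(nodes, requester, page_id):
    """Follow forwarding hints from `requester` to the owner, then move
    ownership to `requester`. Returns the number of forwarding hops."""
    current, hops, visited = requester, 0, []
    while page_id not in nodes[current].owned:
        visited.append(current)
        current = nodes[current].next_hop[page_id]
        hops += 1
    owner = current
    if owner != requester:
        nodes[owner].owned.discard(page_id)
        nodes[requester].owned.add(page_id)
        # Assumption: every node on the chain redirects its hint to the
        # new owner, which keeps later chains short.
        for node_id in visited + [owner]:
            if node_id != requester:
                nodes[node_id].next_hop[page_id] = requester
        nodes[requester].next_hop.pop(page_id, None)
    return hops


cluster = build_cluster(size=4, page_id=7, owner=0)
hops = [acquire(cluster, requester, 7) for requester in (1, 2, 3, 1)]
print(hops)
```

In this sketch the routing state lives entirely in the `next_hop` hints of the individual nodes, so no single node has to track every page. The hop counts it prints describe only this toy model and say nothing about the latency figures reported in the abstract.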

Cite this article

印钰杰, 史浩洋, 范自豪, 周华辉, 刘晟驰, 胡卉芪, 魏星, 陈河堆, 屠要峰, 蔡鹏, 周烜. PG-RAC: PostgreSQL-based Database with Shared Cache for Multi-write Transaction. Journal of Software (软件学报), 2025, 36(3): 1-19

History
  • Received: 2024-05-27
  • Last revised: 2024-07-16
  • Published online: 2024-09-13