Multi-stage Method for Online Vertical Data Partitioning Based on Spectral Clustering
Author:
Affiliation:

Clc Number:

TP311

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    Vertical data partitioning technology logically stores database table attributes satisfying certain semantic conditions in the same physical block, so as to reduce the cost of data accessing and improve the efficiency of query processing. Every query is usually related only to the table’s some attributes in the database, so a subset of the table’s attributes can be used to get the accurate query results. Reasonable vertical data partitioning can make most queries answered without scanning the whole table, so as to reduce the amount of data accessing and improve the efficiency of query processing. Traditional database vertical partitioning methods are mainly based on heuristic rules set by experts. The granularity of partitioning is coarse, and it can not provide different partition optimizations according to the characteristics of workload. Besides, when the scale of workload or the number of attributes becomes large, the execution time of the existing methods are too long to meet the performance requirements of online real-time tuning of database. Therefore, a method called spectral clustering based vertical partitioning (SCVP) is proposed for the online environment. The idea of phased solution is adapted to reduce the time complexity of the algorithm and speed up partitioning. Firstly, SCVP reduces the solution space by increasing the constraint conditions, that is, generating initial partitions by spectral clustering. Secondly, SCVP designs the algorithm to search solution space, that is, the initial partitions are optimized by combining frequent itemset mining and greedy search. In order to further improve the performance of SCVP under high-dimensional attributes, a new method called special clustering based vertical partitioning redesign (SCVP-R) is proposed which is an improved version of SCVP. SCVP-R optimizes the partitions combiner component of SCVP by introducing sympatric-competition mechanism, double-elimination mechanism, and loop mechanism. The experimental results on different datasets show that SCVP and SCVP-R have faster execution time and better performance than the current state-of-the-art vertical partitioning method.

    Reference
    Related
    Cited by
Get Citation

刘鹏举,李好洋,王天一,刘欢,孙路明,任逸飞,李翠平,陈红.基于谱聚类的在线数据库垂直分区多阶段生成方法.软件学报,2023,34(6):2804-2832

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:June 25,2021
  • Revised:August 21,2021
  • Adopted:
  • Online: November 16,2022
  • Published:
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063