SPBSpark: Distributed Trajectory Indexing Method with High Query Efficiency
Author:
Affiliation:

CLC Number: TP311

Fund Project:

    Abstract:

    The spread of GPS-enabled mobile devices and 5G networks has led to rapid growth in trajectory data, and how to efficiently store, manage, and analyze massive trajectory data has become an active research problem. Traditional single-node trajectory indexes are limited by memory capacity, disk I/O speed, and other factors, and can no longer manage large-scale trajectory data. Spark, a distributed framework based on in-memory computing, is well suited to processing massive data. This study therefore proposes a distributed trajectory data indexing and query scheme based on the Spark platform. To improve both the data storage capacity of a single node in a distributed cluster and the efficiency of trajectory queries, a trajectory encoding technique, Z-order trajectory encoding (ZTE), is proposed. This technique encodes the minimum adjacent subspaces covered by the minimum bounding rectangle (MBR) of a trajectory; the resulting code can represent trajectories at different granularities together with their movement directions, and is used to determine the relationship between a trajectory and the query space. Based on this technique, the study organizes the ZTE codes of trajectories into a partial-order structure, the subspace partial-order branch (SPB), and combines it with the hash mapping table IDMap to construct a local index. This index avoids the inefficiency caused by the dead space formed by overlapping minimum bounding rectangles in R-tree-like indexes and enables fast pruning. To support efficient retrieval of massive trajectory data, the study designs a distributed trajectory index named SPBSpark based on the SPB local index. SPBSpark consists of three main components: data partitioning, a local index, and a global index. The proposed index supports three types of queries: spatiotemporal range queries, k-nearest neighbor queries, and moving object trajectory queries.
    Finally, the distributed trajectory indexes TrajSpark and LocationSpark, which are also built on the Spark framework, are selected as comparison systems. Comparative simulation experiments show that the spatial utilization of the SPBSpark index is about 15% higher than that of LocationSpark, and that SPBSpark achieves a 2–3 times improvement in query performance over TrajSpark and LocationSpark.
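    The idea of encoding the subspace covered by a trajectory MBR and using the code to relate a trajectory to the query space can be illustrated with a simplified sketch. The code below is not the ZTE scheme of the paper: it assumes coordinates normalized to the unit square, uses a single enclosing quadtree cell rather than a set of minimum adjacent subspaces, and ignores movement direction. The names `cell_path`, `mbr_code`, `may_intersect`, and `MAX_LEVEL` are hypothetical.

```python
from typing import Tuple

MAX_LEVEL = 16  # assumed finest grid resolution: 2**16 cells per axis

MBR = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in [0, 1]


def cell_path(x: float, y: float, level: int = MAX_LEVEL) -> str:
    """Z-order path of a point: one quadrant digit (0-3) per level, root first.

    Each digit is the y bit followed by the x bit, i.e. the two coordinate
    bits interleaved in Z-order.
    """
    n = 1 << level
    ix = min(int(x * n), n - 1)  # clamp so that x == 1.0 falls in the last cell
    iy = min(int(y * n), n - 1)
    digits = []
    for i in range(level - 1, -1, -1):
        digits.append(str((((iy >> i) & 1) << 1) | ((ix >> i) & 1)))
    return "".join(digits)


def mbr_code(mbr: MBR) -> str:
    """Code of the smallest quadtree cell that contains the whole MBR.

    This is the longest common prefix of the Z-order paths of the two
    extreme corners; a shorter code means a coarser cell.
    """
    xmin, ymin, xmax, ymax = mbr
    low = cell_path(xmin, ymin)
    high = cell_path(xmax, ymax)
    k = 0
    while k < len(low) and low[k] == high[k]:
        k += 1
    return low[:k]


def may_intersect(code_a: str, code_b: str) -> bool:
    """Two quadtree cells overlap only if one code is a prefix of the other."""
    return code_a.startswith(code_b) or code_b.startswith(code_a)


def prune(trajectory_mbrs: dict, query: MBR) -> list:
    """Return ids of trajectories whose enclosing cell may overlap the query cell."""
    q = mbr_code(query)
    return [tid for tid, mbr in trajectory_mbrs.items()
            if may_intersect(mbr_code(mbr), q)]
```

    In this sketch a prefix test replaces rectangle intersection: a trajectory whose code shares no prefix relation with the query code lies in a disjoint cell and can be discarded without inspecting its points. The surviving candidates still need an exact check, since the enclosing cell is generally larger than the MBR itself.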

Get Citation

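Tang Na, Cai Siqi, Li Jingjing, Luo Kaiyuan, Tang Yong. SPBSpark: Distributed Trajectory Indexing Method with High Query Efficiency. Journal of Software (软件学报),,():1-22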

History
  • Received: May 28, 2025
  • Revised: June 24, 2025
  • Adopted:
  • Online: January 07, 2026
  • Published:
Copyright: Institute of Software, Chinese Academy of Sciences