###
DOI:
Journal of Software:2007.18(4):919-932

XML信息检索中最小子树根节点问题的分层算法
孔令波,唐世渭,杨冬青,王腾蛟,高军
(北京大学,计算机科学技术系,北京,100871;北京大学,计算机科学技术系,北京,100871;北京大学,视觉与听觉信息处理国家重点实验室,北京,100871)
Layered Solution for SLCA Problem in XML Information Retrieval
KONG Ling-Bo,TANG Shi-Wei,YANG Dong-Qing,WANG Teng-Jiao,GAO Jun
()
Abstract
Chart / table
Reference
Similar Articles
Article :Browse 3757   Download 3724
Received:October 20, 2005    Revised:April 27, 2006
> 中文摘要: 最小子树根节点问题(smallest lowest common ancestor,简称SLCA)是实现XML信息检索研究中关键字查询的一个基本问题,其主旨就是求解所有包含给定关键字的紧致子树的根节点.XU等人给出了3种算法-基于索引的搜索算法(indexed lookup eager,简称ILE)、基于堆栈的算法以及基于扫描的算法(scan eager,简称SE),并通过实验证明ILE算法具有最好的表现.与基于B+树索引结构的ILE算法不同,所给出的新算法,称为LISA(layered intersec
中文关键词: XML索引  Dewey编码  XML信息检索  关键字查询  SLCA  ILE
Abstract:SLCA (smallest lowest common ancestor) problem is a basic task of keyword search in XML information retrieval. It means to find all the nodes corresponding to the tightest subtrees in XML data, which involves the given. Xu, et al., illustrate three algorithms-Indexed lookup eager (ILE), stack algorithm and scan eager (SE), and manifest that ILE has the best performance. Different from the complicated-B+-tree-based ILE algorithm, this paper proposes a layered solution for SLCA problem, named as LISA (layered intersection scan algorithm). It benefits from the distribution rule of SLCA nodes in XML tree, and calculates the SLCA nodes level by level (the deepest level runs first). That is, based on the retrieved Dewey codes corresponding to given keywords, the Dewey codes of SLCA nodes can be gotten by intersecting the prefix Dewey codes of each level. Compared with the ILE algorithm, LISA solutions need not sophisticated data structures, and have comparatively runtime performance. There are two instances following the LISA idea, called LISA I and LISA II respectively. They are distinguished from each other according to whether keeping Dewey codes in computation or transforming Dewey codes into integer sequences. Extensive experiments evaluate the performance of algorithms and prove the efficiency of LISA II.
文章编号:     中图分类号:    文献标志码:
基金项目:Supported by the National Natural Science Foundation of China under Grant No.60503037 (国家自然科学基金); the National High-Tech Research and Development Plan of China under Grant No.2005AA4Z307 (国家高技术研究发展计划(863)); the Beijing Natural Science Foundation of China under Grant No.4062018 (北京市自然科学基金) Supported by the National Natural Science Foundation of China under Grant No.60503037 (国家自然科学基金); the National High-Tech Research and Development Plan of China under Grant No.2005AA4Z307 (国家高技术研究发展计划(863)); the Beijing Natural Science Foundation of China under Grant No.4062018 (北京市自然科学基金)
Foundation items:
Reference text:

孔令波,唐世渭,杨冬青,王腾蛟,高军.XML信息检索中最小子树根节点问题的分层算法.软件学报,2007,18(4):919-932

KONG Ling-Bo,TANG Shi-Wei,YANG Dong-Qing,WANG Teng-Jiao,GAO Jun.Layered Solution for SLCA Problem in XML Information Retrieval.Journal of Software,2007,18(4):919-932