P.O.Box 8718, Beijing 100080, China Journal of Software,  February  2008,19(2):237-245
E-mail: jos@iscas.ac.cn ISSN 1000-9825,  CODEN RUXUEW,  CN 11-2560/TP
http://www.jos.org.cn  Copyright © 2008 by Journal of Software

Ontology-Based Annotation for Deep Web Data

Yuan Liu, Li Zhanhuai, Chen Shiliang



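(School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China)
About the authors: Yuan Liu (born 1979), female, from Xi'an, Shaanxi, Ph.D. candidate; main research areas: semantic Web and information retrieval. Li Zhanhuai (born 1961), male, Ph.D., professor, doctoral supervisor, CCF senior member; main research area: data management technology. Chen Shiliang (born 1968), male, Ph.D. candidate; main research area: multimedia information management.
Corresponding author:
Yuan Liu, Phn: +86-29-88495821 ext 112, E-mail: yuanl@mail.nwpu.edu.cn, http://www.nwpu.edu.cn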
Received 2007-08-31; Accepted 2007-10-19

Abstract
This paper proposes a semantic annotation method for Web database query results that adopts the deep annotation procedure from the semantic Web. Because a Web database should follow a global schema, a domain ontology is introduced into the annotation procedure to produce complete and consistent annotation results. The features of the query interface and of the query results are analyzed in detail, a strategy of reconfiguring query conditions is adopted, and the semantic markup of the query results is then determined. Experiments on Web databases collected from different domains indicate that, with the support of a domain ontology, the proposed approach annotates Web database query results correctly.
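The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way the idea of reconfiguring query conditions could work, not the authors' implementation. It assumes that each query-interface label can be mapped to a concept of a domain ontology, that the query is re-issued with one condition set at a time, and that the result column in which the probe value reappears receives that concept as its semantic label. The ontology contents, the names `concept_for_label`, `annotate_columns`, and `toy_query`, and the sample rows are all hypothetical and made up for illustration.

```python
from collections import Counter

# Hypothetical domain ontology: concept -> interface labels that denote it.
ONTOLOGY = {
    "Title": {"title", "book title"},
    "Author": {"author", "writer"},
    "Publisher": {"publisher", "press"},
}


def concept_for_label(label):
    """Map a query-interface label to an ontology concept, or None."""
    key = label.strip().lower()
    for concept, synonyms in ONTOLOGY.items():
        if key in synonyms:
            return concept
    return None


def annotate_columns(interface_labels, run_query, probes):
    """Label result columns by re-issuing the query one condition at a time.

    run_query takes a dict {interface label: value} and returns rows (tuples
    of strings). A column that contains the probe value gets one vote for the
    concept of the probed label; the most-voted concept wins per column.
    """
    votes = {}  # column index -> Counter of candidate concepts
    for label in interface_labels:
        concept = concept_for_label(label)
        if concept is None:
            continue  # label not covered by the ontology: leave unannotated
        for value in probes.get(label, []):
            for row in run_query({label: value}):
                for col, cell in enumerate(row):
                    if value.lower() in cell.lower():
                        votes.setdefault(col, Counter())[concept] += 1
    return {col: c.most_common(1)[0][0] for col, c in votes.items()}


# Toy stand-in for a Web database (made-up rows, no real site is queried).
ROWS = [
    ("Alpha Guide", "Ann Lee", "North Press"),
    ("Beta Manual", "Bob Ray", "South Press"),
    ("Alpha Notes", "Cy Orr", "North Press"),
]
FIELD_INDEX = {"Book Title": 0, "Writer": 1, "Press": 2}


def toy_query(conditions):
    """Return rows whose fields contain every condition value (substring)."""
    return [
        row for row in ROWS
        if all(value.lower() in row[FIELD_INDEX[label]].lower()
               for label, value in conditions.items())
    ]


annotations = annotate_columns(
    ["Book Title", "Writer", "Press"],
    toy_query,
    {"Book Title": ["Alpha"], "Writer": ["Lee"], "Press": ["North"]},
)
print(annotations)
```

Voting across several probe values is a design choice of this sketch: a single probe may reappear by coincidence in an unrelated column, and counting matches makes such accidents less likely to decide the label.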

Yuan L, Li ZH, Chen SL. Ontology-Based annotation for deep Web data. Journal of Software, 2008,19(2): 237-245. 
DOI: 10.3724/SP.J.1001.2008.00237
http://www.jos.org.cn/1000-9825/19/237.htm


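Abstract (translated from the Chinese)
Drawing on the idea of deep annotation from the semantic Web field, this paper proposes a method for semantically annotating the query results of Web databases. To obtain complete and consistent annotation results, a domain ontology is introduced into the annotation process as the global schema that Web databases follow. The features of query interfaces and query results are analyzed in detail, and a query-condition resetting strategy is adopted to determine the semantic markup of the query result data. Tests on Web databases from several different domains show that, with domain ontology support, the method adds correct semantic markup to Web database query results, which verifies its effectiveness.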

Foundation items: Supported by the National Natural Science Foundation of China under Grant No.60573096; the NSFC-JST Major International (Regional) Joint Research Project under Grant No.60720106001
