Semantic-Based Focused Crawling Approach
Author: Ye Yuxin, Ouyang Dantong

    Abstract:

    A semantic-based focused crawling approach is proposed to make efficient use of semantic resources. A domain ontology is used to describe the crawl topic. The words in the keyword list are mapped to the ontology, and their semantics are obtained through this mapping. Inference services for assertion-set expansion and for domain-range relations are defined, and the semantic relations among keywords are inferred through these services. The concept of a Web page is also defined. A semantic computational model is then built by combining these inference services. Finally, URLs are ordered according to the subsumption relation between the concepts of their corresponding Web pages and the topic concepts. The results show that this approach achieves a better harvest rate and crawling efficiency than some classical algorithms.
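
    The ordering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the toy ontology `IS_A`, the keyword lexicon `LEXICON`, the scoring rule in `page_score`, and the example URLs are all assumptions made for illustration. The sketch maps page keywords to ontology concepts, checks whether the topic concept subsumes each of them, and orders URLs by the fraction of subsumed concepts.

```python
import heapq

# Hypothetical toy ontology: child concept -> parent concept (is-a).
IS_A = {
    "sedan": "car",
    "car": "vehicle",
    "bicycle": "vehicle",
    "apple": "fruit",
}

# Hypothetical lexicon mapping page keywords to ontology concepts.
LEXICON = {
    "sedan": "sedan",
    "automobile": "car",
    "bike": "bicycle",
    "apple": "apple",
}


def subsumes(general, specific):
    """Return True if `general` equals `specific` or is one of its ancestors."""
    node = specific
    while node is not None:
        if node == general:
            return True
        node = IS_A.get(node)  # walk up the is-a chain; None at the root
    return False


def page_score(keywords, topic):
    """Fraction of a page's mapped concepts that the topic concept subsumes."""
    concepts = {LEXICON[k] for k in keywords if k in LEXICON}
    if not concepts:
        return 0.0
    return sum(subsumes(topic, c) for c in concepts) / len(concepts)


def order_urls(pages, topic):
    """Return URLs from most to least relevant, using a priority queue."""
    # heapq is a min-heap, so scores are negated to pop the best URL first.
    heap = [(-page_score(kws, topic), url) for url, kws in pages.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


pages = {
    "http://example.org/a": ["sedan", "bike"],
    "http://example.org/b": ["apple", "automobile"],
    "http://example.org/c": ["apple"],
}
ranked = order_urls(pages, "vehicle")
```

    In this sketch, a page whose concepts all fall under the topic concept is crawled first. The paper's model also relies on inference over assertion sets and domain-range relations, which this simplified scoring rule does not attempt to reproduce.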

Get Citation

Ye Yuxin, Ouyang Dantong. Semantic-based focused crawling approach. Journal of Software (软件学报), 2011, 22(9): 2075-2088

History
  • Received: May 13, 2009
  • Revised: August 26, 2009
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn