Template-Driven Semantic Parsing for Focused Web Crawler
[ 1 ] Instytut Automatyki i Inżynierii Informatycznej, Wydział Elektryczny, Politechnika Poznańska | [ P ] employee
2015
conference paper
English
- template
- parsing
- focused web crawler
- Semantic Web
- expression language
EN We present the Template-Driven Semantic Parser (TDSP), which is capable of representing, at least to some degree, the semantics of the Web pages it processes. The data extraction process carried out by TDSP is driven by a set of instructions stored in an easily modifiable XML-based template. To improve the precision of Web page data extraction, the TDSP template format allows the use of a specialized Expression Language (EL). A template can be easily created and modified with a tool called Visual Template Designer. TDSP produces an output document containing an RDF graph composed of triples that represent the resources of the website under exploration. In accordance with the Semantic Web paradigm, each resource has semantics assigned to it and is connected to other resources by one or more relations. The semantic types of the resources and the relations between them are predefined in an ontology of Web artifacts.
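The template-driven idea described in the abstract can be illustrated with a minimal sketch. It is not the TDSP implementation: the template format, the `ex:` vocabulary, the sample page, and the function `extract_triples` are all hypothetical, and the Expression Language is not modelled. The sketch only shows how an XML template might drive extraction and how the result might be expressed as subject-predicate-object triples.

```python
import xml.etree.ElementTree as ET

# Hypothetical template: the real TDSP template format is not given in the abstract.
TEMPLATE = """
<template type="ex:Article">
  <field predicate="ex:title" path=".//h1"/>
  <field predicate="ex:author" path=".//span[@class='author']"/>
</template>
"""

# Well-formed XHTML used as a stand-in for a crawled Web page (assumed input).
PAGE = """
<html><body>
  <h1>Focused crawling</h1>
  <span class="author">A. Author</span>
</body></html>
"""


def extract_triples(template_xml, page_xml, subject):
    """Apply each template field to the page and collect (s, p, o) triples."""
    template = ET.fromstring(template_xml.strip())
    page = ET.fromstring(page_xml.strip())
    # The template's type attribute assigns a semantic type to the resource.
    triples = [(subject, "rdf:type", template.get("type"))]
    for field in template.findall("field"):
        node = page.find(field.get("path"))
        if node is not None and node.text:
            triples.append((subject, field.get("predicate"), node.text.strip()))
    return triples


triples = extract_triples(TEMPLATE, PAGE, "http://example.org/page/1")
for triple in triples:
    print(triple)
```

Changing what is extracted only requires editing the template, not the parsing code, which is the property the abstract attributes to the XML-based template.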
351 - 358
WoS (15)