iDocument: Using Ontologies for Extracting and Annotating Information from Unstructured Text
2009; Springer Science+Business Media; Language: English
DOI: 10.1007/978-3-642-04617-9_32
ISSN: 1611-3349
Authors: Benjamin Adrian, J.J. van Hees, Ludger van Elst, Andreas Dengel
Topic(s): Topic Modeling
Abstract: Due to the huge amount of text data on the WWW, annotating unstructured text with semantic markup is a crucial topic in Semantic Web research. This work formally analyzes the incorporation of domain ontologies into information extraction tasks in iDocument. Ontology-based information extraction exploits domain ontologies, which hold formalized and structured domain knowledge, to extract domain-relevant information from un-annotated and unstructured text. iDocument provides a pipeline architecture, an extraction template interface, and the ability to exchange domain ontologies for performing information extraction tasks. This work outlines iDocument's ontology-based architecture, the use of SPARQL queries as extraction templates, and an evaluation of iDocument in an automatic document annotation scenario.