Webpage understanding
2009; Association for Computing Machinery; Volume: 37; Issue: 4; Language: English
DOI: 10.1145/1519103.1519111
ISSN: 1943-5835
Authors: Zaiqing Nie, Ji-Rong Wen, Wei-Ying Ma
Topic(s): Spam and Phishing Detection
Abstract: In this paper, we introduce the webpage understanding problem, which consists of three subtasks: webpage segmentation, webpage structure labeling, and webpage text segmentation and labeling. The problem is motivated by the search applications we have been working on, including Microsoft Academic Search, Windows Live Product Search, and Renlifang Entity Relationship Search. We believe that integrated webpage understanding will be an important direction for future research in Web mining.