Similarity Search over Personal Process Description Graph
2015; Springer Science+Business Media; Language: English
10.1007/978-3-319-26190-4_35
ISSN: 1611-3349
Authors: Jing Ouyang Hsu, Hye-Young Paik, Li-Ming Zhan,
Topic(s): Data Quality and Management
Abstract: People are involved in various processes in their daily lives, such as cooking a dish, applying for a job, or opening a bank account. With the advent of easy-to-use Web-based sharing platforms, many of these processes are shared online as step-by-step instructions in natural language (e.g., “how-to guides” on eHow and wikiHow). We refer to them as personal process descriptions. In our earlier work, we proposed a graph-based model named Personal Process Description Graph (PPDG) to concretely represent and query personal process descriptions. In practice, however, it is difficult to find identical personal processes or fragments for a given query because the descriptions are free text. Therefore, in this paper, we propose similarity search over “how-to guides” based on PPDG. We introduce the concept of “similar personal processes”, which defines the similarity between two PPDGs using features of both the PPDG nodes and the graph structure. We develop efficient and effective algorithms for similarity search over PPDGs, with novel pruning techniques that follow a filtering-refinement framework. We present a comprehensive experimental study over both real and synthetic datasets to demonstrate the efficiency and scalability of our techniques.
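To make the filtering-refinement idea concrete, the following is a minimal Python sketch and not the paper's algorithm. Everything in it is an assumption made for illustration: the `Graph` class, the weighted Jaccard similarity over node labels and edges, the `alpha` weight, and the size-based upper bound used for pruning. The abstract does not specify the actual PPDG similarity measure or pruning rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graph:
    """Hypothetical stand-in for a PPDG: step labels plus ordered step pairs."""
    nodes: frozenset
    edges: frozenset


def make_graph(steps):
    # Treat consecutive steps of a how-to guide as directed edges.
    return Graph(frozenset(steps), frozenset(zip(steps, steps[1:])))


def jaccard(a, b):
    # Two empty sets are treated as identical.
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def size_bound(a, b):
    # Jaccard(a, b) can never exceed min(|a|, |b|) / max(|a|, |b|).
    big = max(len(a), len(b))
    return 1.0 if big == 0 else min(len(a), len(b)) / big


def similarity(g, h, alpha=0.5):
    # Assumed measure: weighted mix of node-label and edge overlap.
    return alpha * jaccard(g.nodes, h.nodes) + (1 - alpha) * jaccard(g.edges, h.edges)


def upper_bound(g, h, alpha=0.5):
    # Cheap bound on similarity() that only looks at set sizes.
    return alpha * size_bound(g.nodes, h.nodes) + (1 - alpha) * size_bound(g.edges, h.edges)


def similarity_search(query, graphs, threshold, alpha=0.5):
    results = []
    for name, g in graphs.items():
        # Filtering: skip candidates whose upper bound is already too low.
        if upper_bound(query, g, alpha) < threshold:
            continue
        # Refinement: compute the exact score for the survivors.
        score = similarity(query, g, alpha)
        if score >= threshold:
            results.append((name, score))
    return sorted(results, key=lambda r: -r[1])


query = make_graph(["boil water", "add pasta", "drain"])
graphs = {
    "pasta": make_graph(["boil water", "add pasta", "drain", "serve"]),
    "job": make_graph(["write cv"]),
    "tea": make_graph(["boil water", "add tea", "pour"]),
}
matches = similarity_search(query, graphs, threshold=0.5)
```

In this toy run, `job` is discarded by the size-based filter without computing any set intersection, `tea` passes the filter but is rejected during refinement, and only `pasta` is returned. Because the bound never underestimates the exact score, the filter cannot drop a true match.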