Text summarization contribution to semantic question answering: New approaches for finding answers on the web

Article Open access Peer-reviewed

2011; Wiley; Volume: 26; Issue: 12; Language: English

10.1002/int.20502

ISSN

1098-111X

Authors

Elena Lloret, Héctor Llorens, Paloma Moreda, Estela Saquete, Manuel Palomar

Topic(s)

Service-Oriented Architecture and Web Services

Abstract

International Journal of Intelligent Systems, Volume 26, Issue 12 (December 2011), pp. 1125-1152. Research Article. First published: 27 July 2011. https://doi.org/10.1002/int.20502

Elena Lloret (corresponding author), Héctor Llorens, Paloma Moreda, Estela Saquete, Manuel Palomar. Department of Software and Computing Systems, University of Alicante, Apartado de correos 99, Alicante E-03080, Spain.

As the Internet grows, it becomes essential to find efficient tools to deal with all the available information. The question answering (QA) and text summarization (TS) research fields focus on presenting the information requested by users in a more concise way. In this paper, the appropriateness and benefits of using summaries in semantic QA are analyzed. For this purpose, a combined approach is developed in which a TS component is integrated into a Web-based semantic QA system. The main goal of this paper is to determine to what extent TS can help semantic QA approaches when summaries, instead of search engine snippets, are used as the corpus for answering questions.

In particular, three issues are analyzed: (i) the appropriateness of query-focused (QF) summarization rather than generic summarization for the QA task, (ii) the suitable summary length, comparing short and long summaries, and (iii) the benefits of using TS instead of snippets for finding the answers, tested within two semantic QA approaches (named entities and semantic roles). The results obtained show that QF summarization is better than generic summarization (58% improvement), short summaries are better than long ones (6.3% improvement), and the use of TS within semantic QA improves performance for both named-entity-based QA (10%) and, especially, semantic-role-based QA (47.5%).

© 2011 Wiley Periodicals, Inc.
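The contrast between query-focused and generic summarization can be illustrated with a minimal extractive sketch. This is not the system described in the paper: the scoring rules (overlap with the question terms for the query-focused case, average term frequency for the generic case), the tiny stopword list, and all function names (`tokenize`, `split_sentences`, `summarize`) are assumptions made purely for illustration.

```python
import re
from collections import Counter

# Illustrative stopword list only; a real system would use a fuller resource.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "is", "was", "to", "and",
             "who", "what", "when", "where", "which", "did", "for", "by"}


def tokenize(text):
    """Return lowercase alphanumeric tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences, query=None):
    """Pick up to max_sentences sentences, kept in their original order.

    With a query: rank sentences by how many distinct query terms they contain
    (a simple stand-in for query-focused summarization).
    Without a query: rank by average document-level term frequency
    (a simple stand-in for generic summarization).
    """
    sentences = split_sentences(text)
    doc_freq = Counter(tokenize(text))
    query_terms = set(tokenize(query)) if query else None

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        if query_terms is not None:
            return float(len(query_terms & set(tokens)))
        return sum(doc_freq[t] for t in tokens) / len(tokens)

    # Highest score first; ties go to the earlier sentence.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: (-score(sentences[i]), i))
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Under these assumptions, the `max_sentences` parameter plays the role of the short-versus-long summary length setting, and the output of `summarize` would replace search engine snippets as the text handed to the answer extraction step.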
