Peer-Reviewed Article

Contextual confusability leads to targeted hyperarticulation

2014; Wiley; Volume: 36; Issue: 36; Language: English

ISSN

1551-6709

Authors

Esteban Buz, T. Florian Jaeger, Michael K. Tanenhaus

Topic(s)

Phonetics and Phonology Research

Abstract

Contextual confusability leads to targeted hyperarticulation

Esteban Buz (ebuz@bcs.rochester.edu)
Department of Brain and Cognitive Sciences, Meliora Hall, Box 270268, Rochester, NY 14627-0268

T. Florian Jaeger (fjaeger@bcs.rochester.edu)
Department of Brain and Cognitive Sciences, Meliora Hall, Box 270268, Rochester, NY 14627-0268

Michael K. Tanenhaus (mtan@bcs.rochester.edu)
Department of Brain and Cognitive Sciences, Meliora Hall, Box 270268, Rochester, NY 14627-0268

Abstract

A central question in the field of language production is the extent to which the speech production system is organized for robust communication. One view holds that speakers’ decision to produce more or less clear signals or to speak faster or slower is primarily or even exclusively driven by the demands inherent to production planning. The opposing view holds that these demands are balanced against the goal to be understood. We investigate the degree of hyperarticulation in the presence of easily confusable minimal pair neighbors (e.g., saying pill when bill is contextually co-present and thus a plausible alternative). We directly test whether production difficulty alone can explain such hyperarticulation. The results argue against production-centered accounts. We also investigate how specific hyperarticulation is to the segment that contrasts the target against the contextually plausible alternative. Our evidence comes from a novel web-based speech recording paradigm.

Keywords: Psychology; Linguistics; Communication; Language understanding; Speech recognition; Human experimentation

Introduction

One of the central debates in the field of language production centers around the extent to which speech is designed for robust communication. For example, what determines how fast we talk and how clearly we articulate? Similarly, what determines speakers’ lexical and structural decisions, such as whether they articulate optional words or not (e.g., the optional that in I think (that) it is true)? One broadly held view states that the (implicit) decisions speakers make during language production are mostly or wholly dominated by the attentional and memory demands inherent to linguistic encoding (e.g., Arnold, 2008; Bard et al., 2000). Following the literature, we refer to this as the production-centered view.

This view is called into question by recent work on hyperarticulation. In a series of experiments, Baese-Berk and Goldrick (2009) found that speakers hyperarticulate the voiceless stop consonants of target words that have lexical neighbors which differ from the target only in voicing. For example, /p/ is longer in pill than in pipe, attributable to the fact that pill has the voice-contrastive neighbor bill while pipe has no minimal pair *bipe (see also Kirov & Wilson, 2012; Schertz, 2013). Moreover, hyperarticulation of voiceless stop consonants increases when the minimal pair neighbor (i.e., bill) is contextually co-present (e.g., when both words are presented on the same screen; Baese-Berk & Goldrick, 2009; Kirov & Wilson, 2012). One interpretation of these findings (though not necessarily shared by the authors of the above studies) appeals to the fact that one common and important goal of speaking is communication (e.g., Jaeger, 2013; Lindblom, 1990).
Just as task-relevant errors drive learning and behavior in non-linguistic motor tasks (Wei & Körding, 2009, among others), preferences during language production are taken to be the consequence of implicit learning with the goal to reduce task-relevant error (Jaeger & Ferreira, 2013). This allows the systems underlying language production to strike a balance between production ease and successful information transfer. This trade-off account provides a straightforward explanation for the results of Baese-Berk and Goldrick (2009) and Kirov and Wilson (2012): the likelihood of successful information transfer increases if more confusable words are produced with more distinguishable signals, and if hyperarticulation is further increased when the word would be even more confusable in its current context.

This interpretation seems to be supported by other studies finding that words with more phonological neighbors in the lexicon (words that differ from the target by one phoneme) tend to be hyperarticulated compared to words with fewer phonological neighbors (e.g., Scarborough, 2010). These latter studies found that words with a greater number of phonological neighbors are produced with longer vowel durations and with vowels that are further from the center of the first and second formant vowel space (greater vowel dispersion), both results suggesting that speakers provide a more distinguishable signal for (a priori) more confusable words.

However, alternative interpretations of the above results have been advanced under the production-centered view (e.g., Baese-Berk & Goldrick, 2009; Bell, Brenier, Gregory, Girand, & Jurafsky, 2009; Gahl, Yao, & Johnson, 2012). According to this view, the lexical or contextual presence of phonologically similar words increases production difficulty, which is reflected in hyperarticulation. For example, Baese-Berk and Goldrick (2009) argue that competition between phonologically similar forms increases the difficulty of phonological encoding (see also the discussion in Kirov & Wilson, 2013). The idea that difficulty during the planning of a word results
