Article Open access

Case-sensitive letter and bigram frequency counts from large-scale English corpora

2004; Springer Science+Business Media; Volume: 36; Issue: 3; Language: English

10.3758/bf03195586

ISSN

1532-5970

Authors

Michael N. Jones, D. J. K. Mewhort

Topic(s)

Neurobiology of Language and Bilingualism

Abstract

We tabulated upper- and lowercase letter frequency using several large-scale English corpora (∼183 million words in total). The results indicate that the relative frequencies for upper- and lowercase letters are not equivalent. We report a letter-naming experiment in which uppercase frequency predicted response time to uppercase letters better than did lowercase frequency. Tables of case-sensitive letter and bigram frequency are provided, including common nonalphabetic characters. Because subjects are sensitive to frequency relationships among letters, we recommend that experimenters use case-sensitive counts when constructing stimuli from letters.
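As an illustration of what a case-sensitive tabulation involves, the following minimal Python sketch counts single characters and bigrams without folding case. It is not the authors' procedure: the function names `count_letters_and_bigrams` and `relative_frequencies` are hypothetical, and the choices to count every non-whitespace character and to not let bigrams span whitespace are assumptions made here for simplicity.

```python
from collections import Counter


def count_letters_and_bigrams(text):
    """Return case-sensitive (character_counts, bigram_counts) for `text`.

    Assumptions (not taken from the article): every non-whitespace
    character is counted, and a bigram never spans whitespace.
    """
    letters = Counter()
    bigrams = Counter()
    prev = None  # previous non-whitespace character, if adjacent
    for ch in text:
        if ch.isspace():
            prev = None  # whitespace breaks the bigram chain
            continue
        letters[ch] += 1  # 'T' and 't' are kept as separate keys
        if prev is not None:
            bigrams[prev + ch] += 1
        prev = ch
    return letters, bigrams


def relative_frequencies(counts):
    """Convert raw counts to proportions that sum to 1 (empty dict if no data)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


if __name__ == "__main__":
    sample = "The cat sat. The hat fit."
    letter_counts, bigram_counts = count_letters_and_bigrams(sample)
    # Uppercase and lowercase forms are tallied independently.
    print(letter_counts["T"], letter_counts["t"])
    print(bigram_counts.most_common(3))
```

Keeping `T` and `t` as distinct keys is what allows uppercase and lowercase relative frequencies to be compared separately, which is the distinction the abstract recommends preserving when constructing letter stimuli.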
