Article Open access Peer-reviewed

A Note on the Statistical Analysis of Sentence-Length as a Criterion of Literary Style

1940; Oxford University Press; Volume: 31; Issue: 3/4; Language: English

10.2307/2332615

ISSN

1464-3510

Authors

C. B. Williams

Topic(s)

Authorship Attribution and Profiling

Abstract

Some years ago I made a number of calculations of the frequency distribution of words of different length in different books to see to what extent authors kept to a definite distribution and so perhaps might be identified by such a method. The results obtained, however, were not striking and the work was put at one side. Mr Udny Yule (1939), however, has attacked the problem of authorship from the angle of the variation in sentence length, and this appears to be a much more fertile method of approach. Mr Yule shows that the frequency distribution of sentence length (i.e. number of words between successive full stops) is of the skew type, and by comparing, in two different manuscripts, the mean, the median, quartiles and deciles he is able to produce convincing mathematical evidence on the identity or otherwise of their authorship.

Mr Yule does not comment on the skew distribution further than to state (p. 371) "they are not of the Poisson type, but of the type in which the square of the standard deviation largely exceeds the mean". When I converted some of Yule's tables into diagrams I was struck by their general resemblance to certain skew distributions with which I have recently been dealing in some Entomological problems, and which distributions, I found, became normal and symmetrical if the logarithm of the number was taken as a basis for subdivision into groups instead of the number itself (see Williams, 1927).

I was unable to test this transformation on Yule's figures as he unfortunately does not give the original data, but only the word length of sentences in groups of five; so it was necessary to obtain some new data. These I obtained by counting the number of words in each of 600 sentences from the following three books: (1) G. K. Chesterton, A Short History of England, 1917. (2) H. G. Wells, The Work, Wealth and Happiness of Mankind. (3) G. Bernard Shaw, An Intelligent Woman's Guide to Socialism. All three works deal with the exposition of somewhat similar sociological subjects and none of them are in the "conversational" style. The selection of the sentences was randomized as follows. Each of the books is divided up into chapters, sections or both. In Chesterton's book the first 30 ...
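The procedure outlined in the abstract (count the words between successive full stops, summarize the counts, then see whether taking logarithms removes the skew) can be illustrated with a short Python sketch. It is not Williams's own computation: the function names `sentence_lengths`, `skewness` and `summarize` are hypothetical, the sample text is invented, and the moment coefficient of skewness is assumed here as a convenient stand-in for the graphical comparison the abstract describes.

```python
import math
import statistics


def sentence_lengths(text):
    """Count words between successive full stops.

    Assumption: every '.' ends a sentence, so abbreviations such as
    'i.e.' would be split incorrectly; real text needs more care.
    """
    counts = [len(chunk.split()) for chunk in text.split(".")]
    return [n for n in counts if n > 0]  # drop empty trailing chunks


def skewness(values):
    """Population moment coefficient of skewness (0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


def summarize(lengths):
    """Summary statistics named in the abstract, plus raw and log skewness."""
    logs = [math.log10(n) for n in lengths]
    return {
        "mean": statistics.fmean(lengths),
        "median": statistics.median(lengths),
        "quartiles": statistics.quantiles(lengths, n=4),
        "deciles": statistics.quantiles(lengths, n=10),
        "raw_skewness": skewness(lengths),
        "log_skewness": skewness(logs),
    }


if __name__ == "__main__":
    # Invented sample text, used only to exercise the functions.
    sample = (
        "Short one. This sentence is a little longer than the first. "
        "Here is a considerably longer sentence that keeps adding words "
        "so that the distribution of lengths has a tail to the right. "
        "Brief again."
    )
    for name, value in summarize(sentence_lengths(sample)).items():
        print(name, value)
```

If the abstract's suggestion holds for a given sample, `log_skewness` should sit much closer to zero than `raw_skewness`; with only a handful of sentences the figures mean little, which is why the abstract speaks of 600 sentences per book.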

Reference(s)