A type-token identity in the Simon-Yule model of text
1989; Wiley; Volume: 40; Issue: 1; Language: English
10.1002/(sici)1097-4571(198901)40
ISSN: 1097-4571
Authors: Ye-Sho Chen, Ferdinand F. Leimkuhler
Topic(s): Linguistics and language evolution
Journal of the American Society for Information Science, Volume 40, Issue 1, p. 45-53 (Research)
Ye-Sho Chen (Corresponding Author), Department of Quantitative Business Analysis, Louisiana State University, Baton Rouge, LA 70803
Ferdinand F. Leimkuhler, School of Industrial Engineering, Purdue University, West Lafayette, IN 47907
First published: January 1989
https://doi.org/10.1002/(SICI)1097-4571(198901)40:1 3.0.CO;2-S
Abstract: There are three significant results in this paper. First, we establish a type-token identity relating the type-token ratio and the bilogarithmic type-token ratio. The plays of Shakespeare and other interesting texts serve as demonstrative examples. Second, the Simon-Yule model of Zipf's law is used to derive the type-token identity and provide a promising statistical model of text generation. Third, a realistic refinement of the Simon-Yule model is made to allow for a decreasing entry rate of new words. Simulation methods are used to show that the type-token identity is preserved with this change in assumptions. © 1989 John Wiley & Sons, Inc.
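The kind of simulation the abstract mentions can be illustrated with a minimal sketch. It assumes the usual reading of a Simon-style process: each new token is either a brand-new word (with some entry probability) or a copy of a uniformly chosen earlier token, so that a word is reused in proportion to its current frequency. The function names `simulate_simon_yule` and `type_token_ratios`, the constant rate 0.1, and the decreasing schedule `2 / sqrt(t)` are hypothetical choices made for illustration and are not taken from the paper. The abstract does not state the identity itself, so the sketch only computes the two quantities it relates: the type-token ratio V/N and the bilogarithmic ratio log V / log N.

```python
import math
import random


def simulate_simon_yule(n_tokens, entry_rate, seed=0):
    """Generate a token stream under a Simon-style process (illustrative).

    entry_rate(t) is the probability that token number t (1-based) is a
    new word. Otherwise an earlier token is copied uniformly at random,
    which reuses each word in proportion to its current count.
    """
    rng = random.Random(seed)
    tokens = [0]  # the first token is necessarily a new word (type 0)
    n_types = 1
    for t in range(2, n_tokens + 1):
        if rng.random() < entry_rate(t):
            tokens.append(n_types)  # introduce a new type
            n_types += 1
        else:
            tokens.append(rng.choice(tokens))  # copy an earlier token
    return tokens


def type_token_ratios(tokens):
    """Return (V/N, log V / log N) for a stream of at least two tokens."""
    n = len(tokens)
    v = len(set(tokens))
    return v / n, math.log(v) / math.log(n)


# Constant entry rate (assumed value 0.1).
constant = simulate_simon_yule(20000, lambda t: 0.1)
# Decreasing entry rate: a hypothetical schedule, not the paper's.
decreasing = simulate_simon_yule(20000, lambda t: min(1.0, 2.0 / math.sqrt(t)))

for name, toks in (("constant", constant), ("decreasing", decreasing)):
    ttr, log_ttr = type_token_ratios(toks)
    print(f"{name}: V/N = {ttr:.4f}, log V / log N = {log_ttr:.4f}")
```

Storing the full token list makes frequency-proportional reuse a single uniform draw, at the cost of memory linear in the text length; for much longer texts a count table with weighted sampling would be the natural alternative.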