Other; Open Access

References

2018; Wiley; Language: English

10.1002/9781119439868.refs

ISSN

1940-6347

Author(s)

Steven W. Knox

Topic(s)

Neural Networks and Applications

Abstract

First published: 25 March 2018
Book series: Wiley Series in Probability and Statistics

References

Agresti, A. (1984). Analysis of Ordinal Categorical Data. John Wiley & Sons, Inc.
Akaike, H. (1974). Information theory and an extension of the maximum likelihood principle. In: 2nd International Symposium on Information Theory; reprinted in Breakthroughs in Statistics, vol. 1, pp. 599–624, edited by S. Kotz and N. L. Johnson, 1992.
Anguita, D., Ghelardoni, L., Ghio, A., Oneto, L., and Ridella, S. (2012). The "K" in K-fold cross validation. In: Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, Bruges, Belgium, pp. 441–446.
Aristotle. (350 BCE). Politics, translated by B. Jowett in 1885. The Internet Classics Archive. Available at classics.mit.edu/Aristotle/politics.mb.txt
Arlot, S. and Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4, 40–79.
Armstrong, J. (2001). Principles of Forecasting: A Handbook for Researchers and Practitioners. Springer.
Ash, R. (1965). Information Theory. Interscience Tracts in Pure and Applied Mathematics, vol. 19. Interscience Publishers.
Austen, J. (2004). Pride and Prejudice. Dalmatian Press.
Bartholomew, D. (2013). Unobserved Variables: Models and Misunderstandings. Springer.
Battiti, R. (1989). Accelerated backpropagation learning: Two optimization methods. Complex Systems, 3, 331–342.
Bellman, R. (1957). Dynamic Programming. Princeton University Press.
Bengio, Y. and Grandvalet, Y. (2004). No unbiased estimator of the variance of K-fold cross-validation. Journal of Machine Learning Research, 5, 1089–1105.
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological), 57(1), 289–300.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 29(4), 1165–1188.
Bentley, J. L. (1975). Multidimensional binary search trees used for associative searching. Communications of the ACM, 18(9), 509–517.
Biau, G. and Devroye, L. (2010). On the layered nearest neighbor method, the bagged nearest neighbor estimate, and the random forest method in regression and classification. Journal of Multivariate Analysis, 101(10), 2499–2518.
Biau, G., Devroye, L., and Lugosi, G. (2008). Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research, 9, 2015–2033.
Billingsley, P. (1986). Probability and Measure, 2nd ed. John Wiley & Sons, Inc.
Borg, I. and Groenen, P. (1997). Modern Multidimensional Scaling: Theory and Applications. Springer.
Box, G. E. P. and Draper, N. R. (1987). Empirical Model-Building and Response Surfaces. John Wiley & Sons, Inc.
Bradbury, R. (1953). Fahrenheit 451. Ballantine Books.
Breiman, L. (1995). Better subset selection using the nonnegative garrote. Technometrics, 37(4), 373–384.
Breiman, L. (1996a). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (1996b). Bias, variance and arcing classifiers. Technical Report 460, Statistics Department, University of California at Berkeley.
Breiman, L. (1998). Arcing classifiers. Annals of Statistics, 26(3), 801–849.
Breiman, L. (2001a). Random forests. Machine Learning, 45, 5–32.
Breiman, L. (2001b). Statistical modeling: The two cultures. Statistical Science, 16(3), 199–231.
Breiman, L. (2004). Consistency for a simple model of random forests. Technical Report 670, Statistics Department, University of California at Berkeley.
Breiman, L., Friedman, J. H., Olshen, R. A., and Stone, C. J. (1984). Classification and Regression Trees. Wadsworth International Group.
Brown, C. (2013). hash: Full feature implementation of hash/associated arrays/dictionaries. R package version 2.2.6. Available at https://CRAN.R-project.org/package=hash
Buck, R. C. (1943, November). Partition of space. The American Mathematical Monthly, 50(9), 541–544.
Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2, 121–167.
Bylander, T. (2002). Estimating generalization error on two-class datasets using out-of-bag estimates. Machine Learning, 48(1–3), 287–297.
Carlisle, A. and Dozier, G. (2000). Adapting particle swarm optimization to dynamic environments. In: International Conference on Artificial Intelligence, Las Vegas, NV, vol. 1, pp. 429–434.
Casella, G. and Berger, R. L. (2002). Statistical Inference, 2nd ed. Duxbury.
Chawla, N. V., Hall, L. O., Bowyer, K. W., and Kegelmeyer, W. P. (2004). Learning ensembles from bites: A scalable and accurate approach. Journal of Machine Learning Research, 5, 421–451.
Clarke, A. C. and Kubrick, S. (1968). 2001: A Space Odyssey. MGM.
Conover, W. J. (1999). Practical Nonparametric Statistics, 3rd ed. John Wiley & Sons, Inc.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. (2009). Introduction to Algorithms, 3rd ed. MIT Press.
Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1), 21–27.
Cover, T. M. and Thomas, J. A. (2006). Elements of Information Theory. John Wiley & Sons, Inc.
Cox, D. R. and Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events. Chapman & Hall.
D'Agostino, R. B. and Stephens, M. A. (1986). Goodness-of-Fit Techniques. Marcel Dekker.
Dasgupta, S. and Gupta, A. (2003). An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures and Algorithms, 22(1), 60–65.
DeGroot, M. (1970). Optimal Statistical Decisions. McGraw-Hill.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B (Methodological), 39(1), 1–38.
Devroye, L., Györfi, L., and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer.
Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40, 139–157.
Domingos, P. (1999). The role of Occam's razor in knowledge discovery. Data Mining and Knowledge Discovery, 3(4), 409–425.
Domingos, P. (2000). A unified bias-variance decomposition for zero-one and squared loss. In: Proceedings of the 17th International Conference on Machine Learning, Stanford, CA, pp. 231–238.
Duda, R. O., Hart, P. E., and Stork, D. G. (2001). Pattern Classification, 2nd ed. John Wiley & Sons, Inc.
Eckart, C. and Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1(3), 211–218.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7, 1–26.
Efron, B. (1982). The Jackknife, the Bootstrap, and Other Resampling Plans, vol. 38. Society for Industrial and Applied Mathematics, Philadelphia.
Efron, B. (1983, June). Estimating the error rate of a prediction rule: Improvement on cross-validation. Journal of the American Statistical Association, 78, 316–331.
Efron, B. (1986, June). How biased is the apparent error rate of a prediction rule? Journal of the American Statistical Association, 81(394), 461–470.
Efron, B. (2004, September). The estimation of prediction error: Covariance penalties and cross-validation. Journal of the American Statistical Association, 99, 619–632.
Efron, B. and Tibshirani, R. (1997, June). Improvements on cross-validation: The .632+ bootstrap method. Journal of the American Statistical Association, 92(438), 548–560.
Epanechnikov, V. A. (1967). Non-parametric estimation of a multivariate probability density. Theory of Probability and Its Applications, 14(1), 153–158 (translated by B. Seckler).
Feller, W. (1967). An Introduction to Probability Theory and Its Applications, vols. 1 (3rd ed.) and 2 (2nd ed.). John Wiley & Sons, Inc.
Ferguson, T. (1996). A Course in Large Sample Theory. Chapman & Hall.
Feynman, R. (2001). What Do You Care What Other People Think? W. W. Norton & Co.
Fisher, R. A. (1937). The Design of Experiments, 2nd ed. Oliver and Boyd.
Fisher, R. A. (1938). Presidential address. Sankhyā: The Indian Journal of Statistics, 4(1), 14–17.
Fisher, R. A. (1950). Statistical Methods for Research Workers, 11th ed. Hafner Publishing Co.
Fix, E. and Hodges, J. L. (1951). Discriminatory analysis, nonparametric discrimination: Consistency properties. Technical Report 4, USAF School of Aviation Medicine.
Freund, Y. and Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
Friedman, J. H. (1997a). Data mining and statistics: What's the connection? In: Proceedings of the 29th Symposium on the Interface between Computer Science and Statistics, Houston, TX.
Friedman, J. H. (1997b). On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery, 1(1), 55–77.
Friedman, J. H. and Tukey, J. W. (1974). A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers, C-23(9), 881–890.
Friedman, J., Hastie, T., and Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. Annals of Statistics, 28(2), 337–407.
Fung, G. and Mangasarian, O. L. (2001). Proximal support vector machines. In: KDD 2001: Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, pp. 77–86.
Furnival, G. M. and Wilson Jr., R. W. (1974, November). Regression by leaps and bounds. Technometrics, 16(4), 499–511.
Galton, F. (1886). Regression towards mediocrity in hereditary stature. Journal of the Anthropological Institute, 15, 246–263.
Galton, F. (1907). Vox populi. Nature, 75, 450–451.
Gan, G., Ma, C., and Wu, J. (2007). Data Clustering: Theory, Algorithms, and Applications. SIAM.
Geman, S., Bienenstock, E., and Doursat, R. (1992). Neural networks and the bias/variance dilemma. Neural Computation, 4(1), 1–58.
Gentle, J. E. (1998). Numerical Linear Algebra for Applications in Statistics. Springer.
Geurts, P., Ernst, D., and Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3–42.
Golub, G. H. and Van Loan, C. F. (1996). Matrix Computations, 3rd ed. Johns Hopkins University Press.
Gonick, L. (1993). The Cartoon Guide to Statistics. Harper Perennial.
Green, P. J. (1995, December). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika, 82(4), 711–732.
Grenander, U. (1993). General Pattern Theory: A Mathematical Study of Regular Structures. Clarendon Press.
Grenander, U. (1996). Elements of Pattern Theory. Johns Hopkins University Press.
Hamming, R. (1995). n-Dimensional space. Lecture video, posted to YouTube by securitylectures, August 8, 2012.
Hand, D. J. and Yu, K. (2001). Idiot's Bayes—not so stupid after all? International Statistical Review, 69(3), 385–398.
Hart, P. (1968). The condensed nearest neighbor rule. IEEE Transactions on Information Theory, 14(3), 515–516.
Hartigan, J. A. (1975). Clustering Algorithms. John Wiley & Sons, Inc.
Hastie, T., Tibshirani, R., and Friedman, J. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
Ho, T. K. (1995). Random decision forests. In: Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, Quebec, Canada, pp. 278–282.
Howard, J. (2012, December). Down with experts. New Scientist.
Huber, P. J. (1985). Projection pursuit. Annals of Statistics, 13(2), 435–475.
Huff, D. (1954). How to Lie with Statistics. Norton.
Jacobs, R. A., Jordan, M. I., Nowlan, S. J., and Hinton, G. E. (1991). Adaptive mixtures of local experts. Neural Computation, 3, 79–87.
James, G. M. (2003). Variance and bias for general loss functions. Machine Learning, 51, 115–135.
James, G. M. and Hastie, T. (1997). Generalizations of bias/variance decompositions for prediction error. Technical Report, Department of Statistics, Stanford University.
Jeffreys, H. (1931). Scientific Inference. Cambridge University Press.
Jeffreys, H. (1948). Theory of Probability. Clarendon Press.
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1993). Discrete Univariate Distributions, 2nd ed. John Wiley & Sons, Inc.
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions, vol. 1, 2nd ed. John Wiley & Sons, Inc.
Johnson, N. L., Kotz, S., and Balakrishnan, N. (1997). Discrete Multivariate Distributions. John Wiley & Sons, Inc.
Johnson, W. B. and Lindenstrauss, J. (1984). Extensions of Lipschitz mappings into a Hilbert space. Contemporary Mathematics, 26, 189–206.
Kaufman, L. and Rousseeuw, P. J. (1990). Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Inc.
Kennedy, J. and Eberhart, R. (1995). Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, Perth, Western Australia, pp. 1942–1948.
Kernighan, B. W. and Ritchie, D. M. (1988). The C Programming Language. Prentice Hall.
Kleinberg, J. (2002). An impossibility theorem for clustering. In: Advances in Neural Information Processing Systems 15.
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the International Joint Conference on Artificial Intelligence, Montreal, Quebec, Canada, pp. 1137–1143.
Kohavi, R. and Wolpert, D. H. (1996). Bias plus variance decomposition for zero-one loss functions. In: Proceedings of the Thirteenth International Conference on Machine Learning, Bari, Italy, pp. 275–283.
Kohonen, T. (1986). Learning vector quantization for pattern recognition. Technical Report TKK-F-A601, Helsinki University of Technology.
Kolmogorov, A. N. (1956). Foundations of the Theory of Probability. Chelsea.
Kolmogorov, A. N. (1968). Three approaches to the quantitative definition of information. International Journal of Computer Mathematics, 2, 157–168.
Kong, E. B. and Dietterich, T. G. (1995). Error-correcting output coding corrects bias and variance. In: International Conference on Machine Learning, pp. 313–321.
Kotz, S., Balakrishnan, N., and Johnson, N. L. (2000). Continuous Multivariate Distributions, vol. 1, 2nd ed. John Wiley & Sons, Inc.
Kullback, S. and Leibler, R. A. (1951, March). On information and sufficiency. Annals of Mathematical Statistics, 22(1), 79–86.
Le Borgne, Y.-A. (2005). Bias–variance trade-off characterization in a classification problem: What differences with regression? Technical Report 534, Machine Learning Group, Université Libre de Bruxelles.
Liaw, A. and Wiener, M. (2002). Classification and regression by randomForest. R News, 2(3), 18–22.
Liu, F. T., Ting, K. M., Yu, Y., and Zhou, Z.-H. (2008). Spectrum of variable-random trees. Journal of Artificial Intelligence Research, 32, 355–384.
Mardia, K. V., Kent, J. T., and Bibby, J. M. (1979). Multivariate Analysis. Academic Press.
Meyer, D. (2015). svm() internals. In: E. Dimitriadou, K. Hornik, F. Leisch, D. Meyer, and A. Weingessel, e1071: Misc Functions of the Department of Statistics (e1071), TU Wien.
Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., and Leisch, F. (2017). e1071: Misc functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien. R package version 1.6-8. Available at https://CRAN.R-project.org/package=e1071
Miller, D. S. and Day, C. H. (1986). Berry Finder: A Guide to Native Plants with Fleshy Fruits, 2nd ed. Nature Study Guild Publishers.
Mosteller, F. and Wallace, D. L. (1963, June). Inference in an authorship problem. Journal of the American Statistical Association, 58(302), 275–309.
Pagano, M. and Anoke, S. (2013). Mommy's baby, daddy's maybe: A closer look at regression to the mean. Chance, 26(3), 4–9.
Paulos, J. A. (2001). Innumeracy: Mathematical Illiteracy and Its Consequences. Hill & Wang.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2, 559–572.
Press, W., Teukolsky, S., Vetterling, W., and Flannery, B. (1992). Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge University Press.
Radicati, S. and Levenstein, J. (2013). Email Statistics Report, 2013–2017. The Radicati Group.
Rawlings, J. O., Pantula, S. G., and Dickey, D. A. (2001). Applied Regression Analysis: A Research Tool, 2nd ed. Springer.
R Core Team. (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available at https://www.R-project.org/
Rebonato, R. (2007). Plight of the Fortune Tellers: Why We Need to Manage Financial Risk Differently. Princeton University Press.
Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14(5), 465–471.
Rissanen, J. (1983). A universal prior for integers and estimation by minimum description length. Annals of Statistics, 11(2), 416–431.
Rodriguez, J. D., Perez, A., and Lozano, J. A. (2010). Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 569–575.
Ross, S. M. (1996). Stochastic Processes. John Wiley & Sons, Inc.
Sakamoto, Y., Ishiguro, M., and Kitagawa, G. (1986). Akaike Information Criterion Statistics. D. Reidel Publishing Co.
Salamon, P., Sibani, P., and Frost, R. (2002). Facts, Conjectures, and Improvements for Simulated Annealing. SIAM.
Schapire, R. E. (2013). Explaining AdaBoost. In: Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik, pp. 37–52, edited by B. Schölkopf, Z. Luo, and V. Vovk. Springer.
Schapire, R. E. and Freund, Y. (2012). Boosting: Foundations and Algorithms. MIT Press.
Schclar, A. and Rokach, L. (2009). Random projection ensemble classifiers. Lecture Notes in Business Information Processing, 24, 309–316.
Schervish, M. (1995). Theory of Statistics. Springer.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6(2), 461–464.
Scott, D. W. (1992). Multivariate Density Estimation: Theory, Practice and Visualization. John Wiley & Sons, Inc.
Scrucca, L., Fop, M., Murphy, T. B., and Raftery, A. E. (2016). mclust 5: Clustering, classification and density estimation using Gaussian finite mixture models. The R Journal, 8(1), 289–317.
Searle, S. R. (1971). Linear Models. John Wiley & Sons, Inc.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423 and 623–656.
Silver, N. (2012). The Signal and the Noise. Penguin.
Smith, E. E. (1943). Second Stage Lensmen. Fantasy Press. Reprinted in Chronicles of the Lensmen, vol. 2, SFBC, 1999.
Solomonoff, R. J. (1964a). A formal theory of inductive inference: Part I. Information and Control, 7(1), 1–22.
Solomonoff, R. J. (1964b). A formal theory of inductive inference: Part II. Information and Control, 7(2), 224–254.
Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B (Methodological), 36(2), 111–147.
Stone, M. (1977). An asymptotic equivalence of choice of model by cross-validation and Akaike's criterion. Journal of the Royal Statistical Society, Series B (Methodological), 39(1), 44–47.
Strang, G. (1988). Linear Algebra and Its Applications, 3rd ed. Harcourt Brace Jovanovich.
Sugar, C. A. and James, G. M. (2003). Finding the number of clusters in a dataset: An information-theoretic approach. Journal of the American Statistical Association, 98(463), 750–763.
Suits, D. B. (1957, December). Use of dummy variables in regression equations. Journal of the American Statistical Association, 52(280), 548–551.
Taylor, H. M. and Karlin, S. (1994). An Introduction to Stochastic Modeling, revised ed. Academic Press.
Tenenbaum, J. B., De Silva, V., and Langford, J. C. (2000). A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500), 2319–2323.
Therneau, T., Atkinson, B., and Ripley, B. (2017). rpart: Recursive partitioning and regression trees. R package version 4.1-11. Available at https://CRAN.R-project.org/package=rpart
Thode Jr., H. C. (2002). Testing for Normality. Marcel Dekker.
Tibshirani, R. (1996a). Bias, variance, and prediction error for classification rules. Technical Report, Department of Statistics, University of Toronto.
Tibshirani, R. (1996b). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1), 267–288.
Tibshirani, R. and Knight, K. (1999). Model search by bootstrap "bumping." Journal of Computational and Graphical Statistics, 8(4), 671–686.
Torgerson, W. S. (1952). Multidimensional scaling: I. Theory and method. Psychometrika, 17(4), 401–419.
Tufte, E. R. (2001). The Visual Display of Quantitative Information, 2nd ed. Graphics Press.
Tufte, E. R. (2003a). The Cognitive Style of PowerPoint. Graphics Press.
Tufte, E. R. (2003b). Visual and Statistical Thinking. Graphics Press.
Tukey, J. (1962). The future of data analysis. Annals of Mathematical Statistics, 33(1), 1–67.
Tukey, J. (1986). Sunset salvo. The American Statistician, 40(1), 72–76.
Turing, A. (1950, October). Computing machinery and intelligence. Mind, LIX(236), 433–460.
van der Vaart, A. W. (2000). Asymptotic Statistics. Cambridge University Press.
Venables, W. N. and Ripley, B. D. (2001). Modern Applied Statistics with S-PLUS, 3rd ed. Springer.
Verhulst, P. F. (1838). Notice sur la loi que la population suit dans son accroissement. In: Mathématique et Physique de L'Observatoire de Bruxelles, Tome Quatrième, pp. 113–121. Hauman & Co.
Verhulst, P. F. (1844). Recherches mathématiques sur la loi d'accroissement de la population. Nouveaux Mémoires de l'Académie Royale des Sciences et Belles-Lettres de Bruxelles, Tome XVIII, pp. 1–42.
Voss, W. and Evers, L. (2005, August). Course Notes for M.Sc. in Bioinformatics Module 13: Statistical Data Mining, Oxford Bioinformatics Programme, University of Oxford.
Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference. Springer.
Wasserman, L. (2006). All of Nonparametric Statistics. Springer.
Watterson, B. (1998, June 2). Calvin and Hobbes.
Webb, G. I. (1996). Further experimental evidence against the utility of Occam's razor. Journal of Artificial Intelligence Research, 4, 397–417.
Wilks, S. S. (1938). The large-sample distribution of the likelihood ratio for testing composite hypotheses. Annals of Mathematical Statistics, 9(1), 60–62.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259.
Young, G. and Householder, A. S. (1938). Discussion of a set of points in terms of their mutual distances. Psychometrika, 3(1), 19–22.
Zamyatin, Y. (1924). We. Translated by M. Ginsburg (1972). Bantam Books.
Zhu, J., Rosset, S., Zou, H., and Hastie, T. (2009). Multi-class AdaBoost. Statistics and Its Interface, 2, 349–360.
