The Minimum Description Length Principle

Paperback
$80.00 US
On sale Mar 23, 2007 | 736 Pages | 9780262529631

A comprehensive introduction and reference guide to the minimum description length (MDL) principle that is accessible to researchers dealing with inductive inference in diverse areas including statistics, pattern classification, machine learning, data mining, biology, econometrics, and experimental psychology, as well as philosophers interested in the foundations of statistics.

The minimum description length (MDL) principle is a powerful method of inductive inference, the basis of statistical modeling, pattern recognition, and machine learning. It holds that the best explanation, given a limited set of observed data, is the one that permits the greatest compression of the data. MDL methods are particularly well suited for dealing with model selection, prediction, and estimation problems in situations where the models under consideration can be arbitrarily complex and overfitting the data is a serious concern. This extensive, step-by-step introduction to the MDL principle provides a comprehensive reference (with an emphasis on conceptual issues) that is accessible to graduate students and researchers in statistics, pattern classification, machine learning, and data mining, to philosophers interested in the foundations of statistics, and to researchers in other applied sciences that involve model selection, including biology, econometrics, and experimental psychology.

Part I provides a basic introduction to MDL and an overview of the concepts in statistics and information theory needed to understand MDL. Part II treats universal coding, the information-theoretic notion on which MDL is built, and Part III gives a formal treatment of MDL theory as a theory of inductive inference based on universal coding. Part IV provides a comprehensive overview of the statistical theory of exponential families with an emphasis on their information-theoretic properties. The text includes a number of summaries, paragraphs offering the reader a "fast track" through the material, and boxes highlighting the most important concepts.

Peter D. Grünwald is a researcher at CWI, the National Research Institute for Mathematics and Computer Science, Amsterdam, the Netherlands. He is also affiliated with EURANDOM, the European Research Institute for the Study of Stochastic Phenomena, Eindhoven, the Netherlands.
