PhD Thesis: When Data Compression and Statistics Disagree

Abstract

Unlike most statistical methods, which are based on assumptions about a "true" underlying probability distribution, Minimum Description Length (MDL) methods are designed to optimize an information-theoretic criterion. Although the two design criteria are known to lead to similar statistical performance in most settings, there are cases where they disagree. In my thesis, I analyze two such cases.

In the first case, it is found that a standard MDL method can be improved, from both an information-theoretic and a probabilistic point of view, after which the two criteria turn out to agree after all. In the second case, the disagreement turns out to be fundamental.

Contents

Chapter   Description
1         General introduction to the Minimum Description Length principle
2         The catch-up phenomenon in Bayesian model selection
3 & 4     Switching between prediction strategies (online learning, related to the catch-up phenomenon)
5         Convergence results for MDL parameter estimation
6         Overview of the basic properties of Rényi divergence