
Unlike most statistical methods, which are based on assumptions about a “true” underlying probability distribution, Minimum Description Length (MDL) methods are designed to optimize an information-theoretic criterion. Although it is known that both design criteria tend to lead to similar statistical performance, there do exist cases where they disagree. In my thesis, I analyse two such cases.

In the first case it is found that a standard MDL method can be improved, both from an information-theoretic and a probabilistic point of view, after which the two criteria turn out to agree after all. In the second case the disagreement turns out to be fundamental.

## Contents

Chapter | Description |
---|---|
1 | General introduction to the Minimum Description Length principle |
2 | The catch-up phenomenon in Bayesian model selection |
3 & 4 | Switching between prediction strategies (online learning, related to the catch-up phenomenon) |
5 | Convergence results for MDL parameter estimation |
6 | Overview of the basic properties of Rényi divergence |