Contemporary Analysis

Tadd Wood

Occam's Razor and Model Complexity

When using predictive analytics to develop a model, it is important to understand the principles of model complexity.  Occam's Razor is a concept that is frequently stated but not always fully understood.  The basic idea is that "All else being equal, simpler models should be favored over more complex ones."  It is a concept we both embrace and approach with caution so that it is not misused.

First, let's flesh out the concept of Occam's Razor beyond the simple aphorism given above, as it applies to predictive analytics.

Suppose I flip a coin ten times, and I get a run that goes "HHTTTHHTTT".  After observing the coin flips I assess that there are two possible models for the behavior of the coin:

(A) The coin is fair and has a 50/50 chance of getting either heads or tails on each flip.  The observed run was just one of 1024 possible results of the ten coin flips.

(B) The coin flips are deterministic and will land in a repeating pattern of "HHTTT" which perfectly fits with the results of our sample of coin flips.

Without further experimentation I have no certain way of knowing which model is actually true.  If I flipped the coin five more times and got anything other than "HHTTT", all confidence in (B) would be gone.  The same cannot be said for (A), which remains consistent with any outcome of those five flips.  This is because (B) is a much more complex model than (A).  In other words, it would take much more evidence to be confident in (B) over (A).
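The comparison can be made concrete with a small sketch.  It assumes we score each model by the probability it assigns to the exact sequence of flips observed (its likelihood), which is one common way to compare models and not something the example above requires.  The function names likelihood_fair and likelihood_pattern, and the five extra flips "HTHTT", are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Repeating pattern assumed by model (B).
PATTERN = "HHTTT"


def likelihood_fair(flips: str) -> Fraction:
    """Probability of this exact sequence under model (A): every flip is 50/50."""
    return Fraction(1, 2) ** len(flips)


def likelihood_pattern(flips: str, pattern: str = PATTERN) -> Fraction:
    """Probability of this exact sequence under model (B).

    Model (B) is deterministic, so the sequence either follows the
    repeating pattern (probability 1) or it does not (probability 0).
    """
    expected = "".join(pattern[i % len(pattern)] for i in range(len(flips)))
    return Fraction(1) if flips == expected else Fraction(0)


observed = "HHTTTHHTTT"
print(likelihood_fair(observed))     # 1/1024
print(likelihood_pattern(observed))  # 1

# Hypothetical follow-up: five more flips that break the pattern.
extended = observed + "HTHTT"
print(likelihood_fair(extended))     # 1/32768
print(likelihood_pattern(extended))  # 0
```

On the first ten flips (B) scores far better than (A), which is exactly why a perfect fit on a small sample can be misleading.  A single flip that breaks the pattern drives the score of (B) to zero, while the score of (A) only shrinks by the same factor it would for any other outcome.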

Keeping this concept in mind is important when developing predictive models.

