Hillstrom’s MineThatData Email Analytics Challenge: An Approach Using Uplift Modelling
Kevin Hillstrom, through his MineThatData blog, made available a dataset describing two email campaigns and a control group, and issued a challenge to analyse that dataset and answer various questions. This paper uses Uplift Modelling to take up (and, in fact, win) Hillstrom’s challenge. We look at three different formulations of the problem and use Uplift Models to analyse each of the campaigns under those formulations. Broadly, our conclusions are that both campaigns had a positive impact overall and that both can be modelled successfully, allowing us to identify subpopulations that are particularly suitable and particularly unsuitable for these campaigns. We also identified some segments for which the Women’s mailing, in particular, appeared to reduce rather than increase spending; such effects were less prominent with the Men’s campaign. In addition, while average spend among purchasers increased significantly with the Women’s campaign, it declined slightly for recipients of the Men’s campaign. Furthermore, an extremely small number of people (of the order of 50) were responsible for over half of the incremental spend, making modelling challenging. Indeed, the difficulty throughout this analysis was in estimating most statistics reliably, given the relatively small samples available and the low purchase rate. Despite these obstacles, by using fairly simple models and a variety of methods for controlling noise, we believe that the insights we present are reasonably robust.
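To make the notion of uplift concrete, the following is a minimal sketch, not the method used in the paper. It assumes that uplift for a segment is measured as the mean spend of mailed customers minus the mean spend of the control group in that segment. The function name `uplift_by_segment`, the record layout, and the toy figures are all hypothetical and are not taken from Hillstrom’s dataset.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Return {segment: mean treated spend - mean control spend}.

    `records` is an iterable of (segment, treated, spend) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(lambda: {"t_sum": 0.0, "t_n": 0, "c_sum": 0.0, "c_n": 0})
    for segment, treated, spend in records:
        key = "t" if treated else "c"
        totals[segment][key + "_sum"] += spend
        totals[segment][key + "_n"] += 1

    result = {}
    for segment, s in totals.items():
        if s["t_n"] == 0 or s["c_n"] == 0:
            continue  # uplift is undefined unless both groups are present
        result[segment] = s["t_sum"] / s["t_n"] - s["c_sum"] / s["c_n"]
    return result


# Made-up toy data: segment "A" spends more when mailed, segment "B" less.
toy = [
    ("A", True, 10.0), ("A", True, 0.0), ("A", False, 2.0), ("A", False, 0.0),
    ("B", True, 0.0), ("B", True, 2.0), ("B", False, 4.0), ("B", False, 4.0),
]
uplift = uplift_by_segment(toy)
print(uplift)
```

A negative value, as for segment "B" in the toy data, corresponds to the kind of segment described above in which a mailing appears to reduce rather than increase spending. With small samples and a low purchase rate, such raw differences are very noisy, which is why the abstract stresses noise control.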