Uplift models, which predict an individual’s change in behavior due to a specific intervention such as a marketing campaign, have become increasingly popular in both business and scientific fields because they allow direct comparison between the cost and potential return of a discretionary action. Because the change in behavior due to an action cannot be observed for any single individual (no one can be both subject to the action and not subject to it), traditional approaches to fitting models and to evaluating model quality do not apply directly. Although a variety of methods for fitting uplift models continue to be proposed, surprisingly little work has been done on metrics for assessing and comparing model quality.
This paper reviews potential metrics by analogy with traditional models, including nominal, ranked, and parametric measures. In particular, we recommend the Qini coefficient (Radcliffe, 2004) and establish a strong mathematical correspondence with the traditional Gini coefficient that inspired it. The Qini is shown to measure the strength of (anti-)correlation between uplift rate and targeting depth as ordered by model score. Several equivalent geometric interpretations of Gini and Qini are demonstrated, including a generalization of the traditional Lorenz curve. This yields a more effective practical approach for calculating the Qini coefficient and suggests an extension of the Kolmogorov-Smirnov statistic to uplift models.
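To make the idea of uplift as a function of targeting depth concrete, the following Python sketch computes a Qini-style curve and an unnormalized area-based coefficient. It is an illustration only, not the calculation developed in this paper. It assumes a randomized treatment/control split, a binary outcome, and one common formulation in which control responders are rescaled to the size of the treated group at each depth. The function names `qini_curve` and `qini_coefficient` are hypothetical.

```python
import numpy as np


def qini_curve(score, treated, outcome):
    """Cumulative incremental responders when targeting by descending score.

    Assumed formulation: gain(k) = R_t(k) - R_c(k) * N_t(k) / N_c(k),
    where all counts are taken over the top-k individuals.
    """
    score = np.asarray(score, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    # Sort individuals from highest to lowest model score.
    order = np.argsort(-score, kind="stable")
    t = treated[order]
    y = outcome[order]

    n_t = np.cumsum(t)        # treated individuals seen so far
    n_c = np.cumsum(~t)       # control individuals seen so far
    r_t = np.cumsum(y * t)    # treated responders seen so far
    r_c = np.cumsum(y * ~t)   # control responders seen so far

    # Rescale control responders to the treated group size.
    # Where no controls have been seen yet, the ratio is set to 0.
    ratio = np.divide(n_t, n_c, out=np.zeros(len(y)), where=n_c > 0)
    gain = r_t - r_c * ratio

    depth = np.arange(1, len(y) + 1) / len(y)
    # Prepend the origin so the curve starts at (0, 0).
    return np.concatenate([[0.0], depth]), np.concatenate([[0.0], gain])


def qini_coefficient(score, treated, outcome):
    """Area between the Qini-style curve and the random-targeting line.

    Unnormalized: no division by the area of an ideal curve.
    """
    depth, gain = qini_curve(score, treated, outcome)
    random_line = depth * gain[-1]   # straight line to the overall gain
    diff = gain - random_line
    # Trapezoidal rule, written out to avoid NumPy version differences.
    return float(np.sum((diff[1:] + diff[:-1]) / 2.0 * np.diff(depth)))
```

Under these assumptions, a model that ranks positively affected individuals first produces a curve above the random-targeting line and a positive coefficient, while reversing the ranking produces a negative one. Normalization conventions vary, so values from this sketch are not directly comparable with coefficients reported elsewhere.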