Why the retention profile is more useful than the churn metric in freemium


I’ve stated a number of times that, within the context of the freemium model, retention is the most important metric a product manager must track; in the Minimum Viable Metrics methodology, retention is the top-most of the top-line metrics. Retention is not only a proxy measurement for customer delight, but it’s also the input for calculating customer lifetime, which is a fundamental component of lifetime customer value, without which paid acquisition campaigns cannot be defensibly run.

Another popular metric used to quantify the same phenomenon – a user abandoning the product – is churn. Churn is most commonly employed in cohort analysis; it captures the percentage of new users in a cohort that left the product (known as churning) after some given amount of time — a more rigorous exploration of the metric is given at the Shopify blog. Churn is tracked as a function of the period of time over which it is calculated; for instance, one-week churn is the percentage of users that historically have abandoned the product on a rolling, 7-day basis.
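A minimal sketch of that cohort calculation, using hypothetical cohort counts (the function name and numbers are illustrative, not from the original):

```python
def one_week_churn(active_week0, active_week1):
    """Share of a cohort's users that abandoned the product between
    week 0 and week 1 -- the cohort's one-week churn rate."""
    churned = active_week0 - active_week1
    return churned / active_week0

# hypothetical cohort: 2,000 new users, 1,700 still active one week later
print(f"{one_week_churn(2000, 1700):.0%}")  # prints 15%
```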

Churn can be used to calculate a simulacrum of customer lifetime by dividing the churn rate into 1. For instance, a weekly churn rate of 5%, meaning that on average 5% of users have historically left the product each week, produces a 20-week total lifetime for the cohort (1 / 0.05 = 20). This means that, after 20 weeks, an entire cohort should be expected to have left the product, although which users churned at what point over those 20 weeks is left unknown.
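A sketch of both calculations, assuming (as the text does) that each week's leavers are measured as a fixed share of the original cohort; the function names are illustrative:

```python
def expected_lifetime(weekly_churn):
    """Lifetime simulacrum: divide the churn rate into 1.
    A 5% weekly churn rate implies a 20-week cohort lifetime."""
    return 1 / weekly_churn

def weeks_until_empty(cohort_size, weekly_churn):
    """Weeks until a cohort is exhausted when weekly_churn of the
    ORIGINAL cohort leaves each week (the simplification in the text)."""
    leavers_per_week = cohort_size * weekly_churn
    remaining, weeks = cohort_size, 0
    while remaining > 0:
        remaining -= leavers_per_week
        weeks += 1
    return weeks

print(expected_lifetime(0.05))         # 20.0
print(weeks_until_empty(1000, 0.05))   # 20
```

Note that both answers agree only because the model assumes a constant number of leavers per week; it says nothing about *which* weeks individual users actually churn in.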

Churn can be a useful metric under the auspices of recurring subscription revenues since every customer’s revenue contribution is uniform over the billing cycle. But in freemium, churn is useless: it describes average behavior, and average behavior isn’t capable of informing decisions in freemium product development. The “average user” doesn’t exist in freemium.

Churn approaches user abandonment from a broad-brush perspective that can’t be applied to the underlying mechanics of freemium, in which revenue, rather than being consistent, is anomalous, as are the behaviors of the users contributing that revenue. In a subscription model, such as enterprise SaaS software, the degree of fit between the product’s use case and a user’s needs is irrelevant, as users generally vet a paid subscription product before using it. Outside of a free promotional first period of use, churn in a subscription product probably does revert to a mean value that can be captured in a metric.

But in freemium, because a given product’s price point is $0, its user base can’t be expected to decay at a consistent pace. And while both a churn graph and a retention graph display negative power law curvature, the churn graph’s fixed rate of decay can’t accommodate the variability of freemium user behavior, which shifts to much less pronounced abandonment after the first few days of use (because a greater proportion of new users in freemium do not experience a fit between their needs and the product’s use case).

Observe, for example, the difference between how 25% Day 7 retention is treated in the Churn and Retention Profile models:

[Figure: churn model projection]

[Figure: retention profile projection]

(the retention graph is fitted to the 50% – 25% – 13% retention profile)
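The divergence between the two models can be sketched numerically. Below, the churn model is calibrated as constant-rate (geometric) daily decay hitting 25% on Day 7, while the retention profile is a power-law curve fitted through 50% on Day 1 and 25% on Day 7, roughly matching the 50% – 25% – 13% profile; the function names and the specific curve-fitting choice are illustrative assumptions, not the article's exact method:

```python
import math

def churn_model(day, day7_retention=0.25):
    """Constant-rate (geometric) decay calibrated to 25% Day 7 retention."""
    daily_survival = day7_retention ** (1 / 7)  # fixed per-day survival rate
    return daily_survival ** day

def retention_profile(day):
    """Power law r(t) = a * t^b fitted through 50% Day 1 and 25% Day 7."""
    a = 0.50                            # Day 1 retention
    b = math.log(0.5) / math.log(7)     # exponent so that r(7) = 0.25
    return a * day ** b

for day in (1, 7, 14, 30):
    print(f"Day {day:>2}: churn model {churn_model(day):6.1%}   "
          f"retention profile {retention_profile(day):6.1%}")
```

Both curves agree at Day 7 by construction, but the fixed-rate churn model drives the cohort toward zero within a month, while the power-law profile flattens out, reflecting the less pronounced abandonment of surviving users after the first few days.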

The Churn model doesn’t adapt to survivorship, or to a changing user profile over time, because it presumes that every user in the initial user base behaves according to the same average assumptions.

Not to mention the fact that cohort analysis defies the most fundamental principle of freemium, which is that predictions are only useful at the individual behavioral level. A cohort in freemium is a collection of data points across a broad spectrum of profiles, assumed to cluster near the Y-axis and exhibit a very long tail. Taking any cohort as a monolith in freemium won’t result in conservative predictions around a universal, actionable mean but rather wildly inaccurate, spurious predictions based on a descriptor – time of product adoption – that at best communicates very little about the behaviors of its users (and at worst is a highly misleading indicator of future behavior).