The problem with analyzing payer data in freemium

Given the inherently low percentage of users in a freemium product that convert into payers, as well as the skew of the distribution of lifetime values that freemium developers hope to achieve in their user bases, it might be tempting for a freemium developer to focus most of their product analysis on monetization.

But there are serious problems with putting too much emphasis on optimizing for monetization in freemium products, especially with analytical exercises that attempt to segment paying users. Beyond the practical issues with this type of analysis (some of which are addressed below), a development mentality that focuses myopically on monetization risks neglecting the priority of scale in freemium and the dynamic between monetization and retention. Especially for mobile apps, because of platform charts and the nature of discoverability on mobile, optimizing for monetization may be sub-optimal if it produces bottlenecks in user activity that increase churn (and thus decrease virality) and so limit the overall size of the user base.

But more concretely, the reason attempting to draw actionable cues from the data generated by paying users is problematic is that there simply aren’t a lot of them in a freemium product. Assuming a freemium product with 1MM DAUs has a conversion rate (to paying users) of 5%, which is high, and that users who have paid at least once are distributed somewhat evenly across days, on any given day only 50,000 revenue-contributing users are present in the user base.

50,000 users may seem like a lot, but unless the product is specific to some demographic or platform (e.g. it is available only on iOS in the US), the curse of dimensionality can quickly whittle such a group down into smaller sub-groups that aren’t statistically viable for analysis.
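To make the arithmetic concrete, here is a rough sketch in Python of how quickly 50,000 payers thin out once they are split by targeting dimensions. The DAU and conversion figures come from the example above; the dimension names and the number of values each can take are purely illustrative assumptions, as is the simplification that payers are spread evenly across every combination (real distributions are far more lopsided, which leaves most segments even smaller).

```python
from math import prod

DAU = 1_000_000          # daily active users, from the example above
CONVERSION_RATE = 0.05   # share of users who have paid at least once, from the example above

# Hypothetical targeting dimensions and how many distinct values each takes.
# These cardinalities are illustrative assumptions, not measured data.
dimensions = {
    "platform": 2,
    "device_type": 5,
    "country": 20,
    "acquisition_source": 10,
}


def average_segment_size(users, cardinalities):
    """Average users per segment if users were spread evenly over every combination."""
    return users / prod(cardinalities)


payers = round(DAU * CONVERSION_RATE)
segments = prod(dimensions.values())
payers_per_segment = average_segment_size(payers, dimensions.values())

print(f"Payers present per day: {payers:,}")
print(f"Distinct segments:      {segments:,}")
print(f"Average payers/segment: {payers_per_segment:.1f}")
```

Under these assumed cardinalities, four modest dimensions already produce 2,000 segments, leaving an average of 25 payers in each.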

And dimensionality is exactly what makes a data set around player activity valuable for the purposes of marketing: what use is, for instance, an LTV profile if it can’t be used to generate more users with that profile (in other words, to set targeting parameters for marketing campaigns)? In-product monetization data is interesting, of course — for instance, whether payers that pay in their first session ultimately end up spending more than payers that don’t — but unless it can be broken down by the dimensions that ad campaigns can be targeted with, such as device type, location, acquisition source, etc., it’s not really anything more than trivia, at least for the purposes of acquiring more users of a certain profile (there’s obviously a strong case for using monetization data for product optimizations, but the focus of this article is marketing).

This is what makes retention data such a powerful point of analysis: retention profiles can be built for every DAU a product has. And retention, of course, is a prerequisite for monetization anyway: users can’t pay if they’re not using the product.

Conventional wisdom generally holds that soft launches for freemium apps should be evaluated on the basis of retention (and not monetization) precisely because of this fact. In a soft launch, data is scarce and the product’s economy is considered in flux. Optimizing the product around what makes users stick around (to potentially pay later) makes sense in this context because, by definition, the product is being changed frequently and evaluated. But this shouldn’t really change after hard launch; a freemium product (“product-as-a-service”) should constantly be evaluated and improved upon, and doing so through the lens of monetization attaches an unnecessary immediacy to a process that should be viewed, at least in the abstract, as interminable.