Freemium conversion and the problem with classifiers


Given the fundamentally low conversion rate of the freemium model, optimizing for revenue is essentially a targeting exercise. Either marketing campaigns are targeted narrowly, reaching only the people deemed most likely to convert and shrinking the funnel, or they are targeted broadly, widening the funnel, and the users most likely to convert are identified early in their product tenures with some form of classification algorithm.

Because of network effects, virality, and product portfolio dynamics, the second option (a wider marketing funnel coupled with a method for classifying each user as a potential payer or non-payer) is in most cases preferable to the first. But classifiers are easy to implement haphazardly in freemium, especially given the distorting effects of low (<10%) conversion rates.

The most basic problem posed by classifiers in freemium is the ease with which conversion data lends itself to the base rate fallacy. Because freemium conversion rates are low, the probability that any given incoming user will contribute revenue is likewise low. If a product has a 5% conversion rate, a classification algorithm will be correct 95% of the time if it simply classifies every user as having no potential to pay, even though it never identifies a single payer.
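A minimal sketch of this effect, assuming a hypothetical cohort of 1,000 users with the 5% conversion rate from the example. The function names and the cohort size are illustrative only. It compares accuracy with recall (the share of actual payers the classifier finds):

```python
def always_non_payer(user_id):
    """Degenerate classifier: predicts 'non-payer' for every user."""
    return False


def evaluate(labels, predictions):
    """Return (accuracy, recall) for boolean labels and predictions."""
    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if y and p)
    true_neg = sum(1 for y, p in pairs if not y and not p)
    false_neg = sum(1 for y, p in pairs if y and not p)
    accuracy = (true_pos + true_neg) / len(labels)
    actual_payers = true_pos + false_neg
    recall = true_pos / actual_payers if actual_payers else 0.0
    return accuracy, recall


# Hypothetical cohort: 1,000 users, 5% of whom are payers (True).
labels = [True] * 50 + [False] * 950
predictions = [always_non_payer(user_id) for user_id in range(len(labels))]

accuracy, recall = evaluate(labels, predictions)
print(f"accuracy={accuracy:.2%} recall={recall:.2%}")
# prints: accuracy=95.00% recall=0.00%
```

Accuracy alone rewards the degenerate classifier; recall shows that it finds none of the payers it was built to find.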

A second, more subtle problem with classifiers is a manifestation of confusion of the inverse. Because such a small portion of a freemium user base monetizes, it’s easy to dismiss data from non-monetizing users as meaningless noise. But doing so invites a problem: the monetizing users come to be seen as the whole context of the user base rather than the tiny portion of it that they represent. That context matters because it forms the basis of the classification, which is the conditional probability of a user converting given some observed data about them.

Consider two conditional probability statements:


the probability of X being true given that Y was observed;


the probability of observing Y given that X is true.

These two conditional probabilities are not necessarily equal; that is, it can’t be assumed that P(X|Y) = P(Y|X). Bayes’ theorem states that P(X|Y) is calculated as:

P(X|Y) = P(Y|X) × P(X) / P(Y)

Most classification systems use demographic data to build profiles of users and group them into payer and non-payer buckets; on mobile, this demographic data often amounts to little more than the user’s geographic location and device type. Consider the following user base:


Applying the conditional probability format outlined above, a classification system trained on the above data set could build rules around two conditional probabilities:


the probability of a user becoming a payer given that they match some profile;


the probability of this profile being observed given that the user will become a payer.

The second conditional probability statement is seductive but ultimately useless: the 11 US iPhone payers represent 50% of the group of 22 payers, but that information has no bearing on whether any specific new US iPhone user will convert (the US iPhone conversion rate is lower, for instance, than the Canadian iPhone conversion rate).

In other words, by classifying users as payers or non-payers using only the paying users as its frame of reference (22 out of 500 total users, or 4.4%), the classification system, following what might appear to be reasonable logic, has completely distorted the probability of conversion.
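The gap between the two conditional probabilities can be sketched in a few lines. Only the totals (500 users, 22 payers) and the 11 US iPhone payers come from the text; every other count in `segments` is a hypothetical value chosen to be consistent with those figures and with the US iPhone conversion rate being lower than the Canadian iPhone rate.

```python
# (country, device) -> (users, payers).
# HYPOTHETICAL per-segment counts: only the 500-user total, the 22-payer
# total, and the 11 US iPhone payers are taken from the text.
segments = {
    ("US", "iPhone"): (280, 11),
    ("US", "Android"): (100, 4),
    ("Canada", "iPhone"): (60, 5),
    ("Canada", "Android"): (60, 2),
}

total_users = sum(users for users, _ in segments.values())
total_payers = sum(payers for _, payers in segments.values())
base_rate = total_payers / total_users  # P(payer)


def p_payer_given_profile(profile):
    """P(payer | profile): the segment's conversion rate."""
    users, payers = segments[profile]
    return payers / users


def p_profile_given_payer(profile):
    """P(profile | payer): the segment's share of all payers."""
    return segments[profile][1] / total_payers


def bayes_p_payer_given_profile(profile):
    """P(payer | profile), recovered from the inverse via Bayes' theorem."""
    p_profile = segments[profile][0] / total_users  # P(profile)
    return p_profile_given_payer(profile) * base_rate / p_profile


print(f"base rate: {base_rate:.1%}")
for profile in segments:
    print(
        profile,
        f"share of payers={p_profile_given_payer(profile):.1%}",
        f"conversion rate={p_payer_given_profile(profile):.1%}",
    )
```

Under these assumed counts, US iPhone users make up half of all payers yet convert below the 4.4% base rate, while the smaller Canadian iPhone segment converts at roughly twice the US iPhone rate. A rule built on share of payers would rank the segments in the wrong order.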

Using demographic data to build this kind of classifier invites base rate neglect (here, neglect of the base rate of freemium conversion) and can misdirect marketing spending. In the above scenario, examining the wrong conditional probability could lead to an unwarranted increase in the marketing budget apportioned to campaigns directed at US iPhone users.

Universally low conversion rates in freemium compound the issue: large user bases and small numbers of paying users can force flawed comparisons and specious conclusions about a product’s user base.
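One way to see how fragile such comparisons can be is to put an interval around each segment's conversion rate. The sketch below uses the Wilson score interval, a standard statistical technique that is not part of the original discussion. The segment sizes (280 and 60 users) and the 5 Canadian payers are hypothetical; only the 11 US iPhone payers appear in the text.

```python
import math


def wilson_interval(payers, users, z=1.96):
    """Wilson score interval for a conversion rate (z=1.96 is roughly 95%)."""
    if users <= 0:
        raise ValueError("users must be positive")
    p = payers / users
    denom = 1 + z**2 / users
    center = (p + z**2 / (2 * users)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / users + z**2 / (4 * users**2)) / denom
    )
    return center - half_width, center + half_width


# HYPOTHETICAL segment sizes; only the 11 US iPhone payers come from the text.
us_iphone = wilson_interval(payers=11, users=280)
ca_iphone = wilson_interval(payers=5, users=60)

# Two intervals overlap when each one starts before the other ends.
intervals_overlap = us_iphone[0] < ca_iphone[1] and ca_iphone[0] < us_iphone[1]

print(f"US iPhone:       {us_iphone[0]:.1%} to {us_iphone[1]:.1%}")
print(f"Canadian iPhone: {ca_iphone[0]:.1%} to {ca_iphone[1]:.1%}")
print(f"intervals overlap: {intervals_overlap}")
```

With payer counts this small, the intervals are wide and overlap under these assumed numbers, so even a segment that appears to convert twice as well may not be reliably different.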