One of the more frustrating and resilient conspiracy theories circulated in the wake of the Facebook / Cambridge Analytica revelations is that Facebook sells user data. The notion is aggravating because it is patently false to anyone who has ever advertised on Facebook; it is resilient, despite being fiction, because it serves to render Facebook -- and by the transitive property of digital advertising, most other large consumer technology companies -- an egregious bogeyman.
Facebook doesn't sell user data. This is not a nuanced syntactical or semantic differentiation; Facebook doesn't "rent" user data or "time-share" user data, either. What Facebook sells, as a publisher of owned advertising inventory, are impressions that can be targeted by non-individualized parameters. But what's even more valuable than that inventory -- because many companies have inventory to sell, and almost none of those companies are as successful as Facebook -- are its algorithms. Access to these algorithms is what an advertiser is buying when it chooses to spend its budget with Facebook instead of anywhere else. These algorithms are the real product; they are what imbue Facebook's inventory with so much value and drive Facebook's revenue ever higher quarter after quarter.
In understanding this, one must consider how advertisers interact with Facebook versus any other owned-inventory or brokered-inventory network. What does Facebook offer that other advertising inventory sources don't? Yes, it's true that Facebook possesses more user data than most other networks by virtue of the nature of the service: Facebook users fill out full, personal profiles and then click "like" on a bunch of pieces of content, user groups, pages, etc. All of that activity creates data, but very little of that data is useful for advertisers -- almost no one would pay for that data because it's not very valuable. Facebook stands apart from other inventory sources because of two advertising features:
Lookalike audiences. Facebook can parse through its massive trove of data and find users that "look" like -- have similar profiles to -- an advertiser's most valuable users. The significance of lookalike audiences can't be overstated: it's how most direct-response, value-driven advertisers build audiences on Facebook. For the unaware, this article does a good job of summarizing how lookalike audiences work.
Facebook is able to create lookalike audiences because it has so much behavioral and demographic data for such a wide swath of the world's population; Facebook has the volume of data that allows for the high dimensionality that enables the company to categorize people in many different ways.
These categories power lookalikes, but they are specific to Facebook: no other company would be able to do anything with those categories, and if advertisers were "sold" that categorization data, it'd be useless to them. The data used to create lookalike audiences is valuable but only to Facebook; Facebook is the only company that can use those categories to group users together in a way that's beneficial to advertisers.
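To make the idea concrete, here is a toy sketch of lookalike matching. It is emphatically not Facebook's method, which is not public; it assumes each user has already been reduced to a numeric profile vector and simply ranks non-seed users by cosine similarity to the average of the advertiser's seed users. The function names, the user IDs, and the three-dimensional vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lookalike_audience(seed_ids, profiles, size):
    """Return the `size` non-seed users whose profiles are closest to the seed centroid."""
    seed = set(seed_ids)
    target = centroid([profiles[uid] for uid in seed_ids])
    scored = [
        (cosine(vector, target), uid)
        for uid, vector in profiles.items()
        if uid not in seed
    ]
    # Highest similarity first; ties broken by user ID for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [uid for _, uid in scored[:size]]


# Hypothetical profile vectors: opaque numbers that mean something
# only to the (assumed) model that produced them.
profiles = {
    "u1": [1.0, 0.0, 1.0],
    "u2": [0.9, 0.1, 0.8],
    "u3": [0.0, 1.0, 0.0],
    "u4": [1.0, 0.1, 0.9],
    "u5": [0.1, 0.9, 0.2],
}

# The advertiser's "most valuable users" are u1 and u2.
audience = lookalike_audience(["u1", "u2"], profiles, size=1)
```

Note that the vectors themselves carry no labels: handed to anyone without the model that generated them, they are just lists of numbers, which is the point made above about why such data would be useless if "sold."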
Event-based bidding. Facebook and Google have in the past 18 months shifted their product focus for mobile advertisers to event-based bidding. While the duopoly's preoccupation with these products was received tepidly by some mobile marketers at first, it has become obvious that this approach represents a permanent paradigm shift: it's a better way of targeting users for performance-oriented marketers.
Event-based bidding means that advertisers define certain actions within their products -- like a tutorial completion in a game, a registration, or the addition of something to an e-commerce shopping cart -- and then set their advertising bid on the basis of that action being taken by the user (in other words, they commit to paying a certain amount for that "event" to happen). Facebook receives those events from the advertiser's product and then looks for commonalities across the users that complete those events; it then classifies those users and advertises the product to other users that look similar.
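A rough sketch of that loop, under loud assumptions: the platform is imagined to bucket users into segments (the labels "A" and "B" here are hypothetical stand-ins for whatever internal classification it actually uses), to estimate how often each segment completes the advertiser's event, and to value an impression at the advertiser's bid multiplied by that estimated rate. The bid-times-probability rule is a common way to reason about this kind of auction, not something the platforms have confirmed, and the function names and numbers are invented for illustration.

```python
from collections import defaultdict


def completion_rates(observations):
    """Estimate, per segment, the share of ad-exposed users who completed the event.

    `observations` is an iterable of (segment, did_complete) pairs.
    """
    shown = defaultdict(int)
    completed = defaultdict(int)
    for segment, did_complete in observations:
        shown[segment] += 1
        if did_complete:
            completed[segment] += 1
    return {segment: completed[segment] / shown[segment] for segment in shown}


def expected_value_per_impression(bid_per_event, rates):
    """Value of one impression per segment: the bid times the chance the event happens."""
    return {segment: bid_per_event * rate for segment, rate in rates.items()}


# Hypothetical events reported back by the advertiser's product,
# e.g. "tutorial completed", joined to the segment of the user who saw the ad.
observations = [
    ("A", True), ("A", False), ("A", True), ("A", True),
    ("B", False), ("B", False), ("B", True), ("B", False),
]

rates = completion_rates(observations)
# The advertiser commits to paying 4.00 each time the event happens.
values = expected_value_per_impression(4.0, rates)
best_segment = max(values, key=values.get)
```

In this toy data, segment "A" completes the event three times out of four, so an impression shown to an "A" user is worth three times as much to the advertiser as one shown to a "B" user, and that is where the ads would go.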
In understanding the importance of these two advertising systems, it becomes obvious that the underlying data that powers them is only useful to an extent: without the data, the models wouldn't work, but without the models, the data wouldn't be actionable. And the model coefficients certainly aren't commercially compelling: even if a competitor were able to get its hands on the model parameters for some company's value-optimized Facebook campaign, they wouldn't be able to do anything with them. Facebook couldn't sell this data, either: without Facebook's infrastructure, it simply can't be utilized to any meaningful effect.
Just as Coca-Cola doesn't sell sugar and Rolex doesn't sell Oystersteel, Facebook doesn't sell user data. The idea that Facebook (or Google) makes money from selling data is facile and dishonest: it preys on the sensitivities that have recently been activated by bad actors. Facebook's classification algorithms are what bring value to its advertising products; if the company's data centers were all simultaneously incinerated tomorrow, by next week its advertising systems would be matching ads to users with some degree of relevance.