The soft paywall: Some more numbers to chew on

OK, one more post about the "soft paywall" concept and then I'll move on to something else.

Paid-content discussions tend to be dominated by religious wars -- declarations of belief, not fact -- so I want to inject some facts whenever I can.

As I've pointed out repeatedly, averages are useless and segmentation is essential if we're going to understand human behavior and discover whether there is any real reader-revenue opportunity left in local journalism.

We're in the process of squeezing some more segment-based detail out of our metrics tools, but here's a preliminary bit of information.

On a wide array of local news websites, we're finding that heavy users -- people who log more than 20 sessions a month, roughly equivalent to one visit every workday -- account for a disproportionately large percentage of the pageviews delivered on the sites. These people are in that funny lump at the right side of the usage curve that I described in a December blog post.

In terms of generating pageviews (and therefore sellable advertising inventory), each of them is roughly ten times as important as the average site user. They might account for only 2 or 3 percent of the total unique users each month, but 20 to 30 percent of the pageviews.
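To make the "ten times" figure concrete, here is a minimal sketch of the arithmetic. It only divides the heavy users' share of pageviews by their share of unique users, which gives pageviews per heavy user relative to the average site user. The function name `pageview_concentration` is made up for illustration, and the inputs are the round numbers from the paragraph above, not measured data.

```python
def pageview_concentration(user_share, pageview_share):
    """Pageviews per user in a segment, relative to the site-wide average user.

    Both arguments are fractions (0.02 means 2 percent).
    """
    if not 0 < user_share <= 1:
        raise ValueError("user_share must be a fraction in (0, 1]")
    if not 0 <= pageview_share <= 1:
        raise ValueError("pageview_share must be a fraction in [0, 1]")
    return pageview_share / user_share


# Low and high ends of the range quoted in the post (illustrative figures).
for user_share, pageview_share in [(0.02, 0.20), (0.03, 0.30)]:
    ratio = pageview_concentration(user_share, pageview_share)
    print(
        f"{user_share:.0%} of users, {pageview_share:.0%} of pageviews: "
        f"about {ratio:.0f}x the average user"
    )
```

Plug in your own segment shares to see how concentrated your site's pageviews are; the comparison here is against the average user, not against light users only.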

And the data probably understates the picture. The cookie-clearing phenomenon that I described Saturday wrongly shifts a bunch of people from the right side (frequent) of a segmentation graph to the left (infrequent). So, for the sake of argument, let's just say that 3.5 percent of your audience accounts for 35 percent of your pageviews. Your site's actual figures might be different, but this is a reasonable number, so let's use it.

Let's apply this to a good-size newspaper site with, say, a million unique visitors each month.

If you're thinking about applying a soft paywall targeted at your heavy users, the potential market isn't a million people. It's 35,000.

Of those 35,000, what percentage do you realistically hope to persuade to pay? Ten percent would be 3,500. Fifty percent would be 17,500. Five percent would be ... oh, let's not go there.
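The market-sizing arithmetic above is easy to rerun with your own numbers. The sketch below is only that: the million uniques and the 3.5 percent heavy-user share are the for-the-sake-of-argument figures from this post, the conversion rates are guesses, and the function name `paying_readers` is invented for illustration.

```python
MONTHLY_UNIQUES = 1_000_000  # hypothetical good-size newspaper site (from the post)
HEAVY_USER_SHARE = 0.035     # assumed share of uniques who are heavy users


def paying_readers(monthly_uniques, heavy_user_share, conversion_rate):
    """Return (heavy users, paying readers) for an assumed conversion rate."""
    heavy_users = round(monthly_uniques * heavy_user_share)
    payers = round(heavy_users * conversion_rate)
    return heavy_users, payers


# Conversion rates are assumptions to play with, not predictions.
for rate in (0.10, 0.50):
    heavy_users, payers = paying_readers(MONTHLY_UNIQUES, HEAVY_USER_SHARE, rate)
    print(f"{rate:.0%} of {heavy_users:,} heavy users = {payers:,} paying readers")
```

Swap in your own site's uniques and heavy-user share; the point is that the ceiling is set by the heavy-user segment, not by total traffic.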

But that should help you understand the limits of your online reader revenue model.

At risk: 35 percent of your advertising inventory.

But it's worse than that. If you start cross-referencing with geolocation data, you'll find that these heavy users are far more likely to live in your market than your light users, especially the one-time visitors who spike your monthly metrics.

So pretty soon your risk doesn't look like 35 percent anymore. It might be 45 or 55 percent, or more. And these heavy users are your advertisers' best targets for campaigns (an effective ad campaign requires repeat exposures to a message). This is your primary revenue stream placed at significant risk.

The trouble with numbers is that they have to be interpreted, and it's hard to keep our biases out of our interpretations.

If you believe that readers should pay for content and that the people who built free news websites are all flaming new-age idiots, you're going to estimate high when you think about the percentage who might pay in this model. If you believe that readers will simply flee to alternative news sources and that the people who want to charge are all flaming luddite idiots, you're going to guess low.

The best I can do is encourage a healthy skepticism, a focus on the data instead of desires, and someone else to put their finger on the stove to discover whether it's hot.


Hey Steve, let's add to your numbers (helping push the debate intelligently forward). Using an amalgamation of US news media sites, I get the following numbers:
Month 1: 1.1% heavy users account for 20.1% of all page views
Month 2: 1.2% heavy users account for 18.5% of all page views
This very much supports the, perhaps obvious, idea of these heavy users being 15 to 20 times more valuable (at least as defined by advertising inventory increase) than light users.
*Heavy users defined as your 20 visits or more per month.
*Light users defined as 19 visits or fewer per month.
Perhaps I should do a full post on the findings. Well, anywho, here you have the data, which, if nothing else, supports that you are not necessarily crazy :-) Cheers @DennisMortensen

Very cool simulator at Hah-vahd lets you play with some assumptions and get visual feedback.

I was hoping you'd read it ;) Hopefully this will move the debate. And I hope it also demonstrates the power of an interactive story. The medium of the web isn't text-and-graphics, it's software. - Jonathan

Have you considered that the underlying principle of 'pay-to-read' that you're assuming is unnatural and entirely due to copyright? It is possible that people instinctively know they shouldn't have to pay to read something, or rather, they have been successfully indoctrinated contrary to instinct by copyright into believing that they should. When the Internet reveals the more natural basis on which knowledge is exchanged (freely), we arrive back (as the free software industry has already discovered) at the principle that authors are paid to write. The work is in the writing. No work is extracted from the author by their readers. That's why the future of business models for intellectual work is in payment for the WORK, the labour of production, not the far more apparently lucrative (but unnatural) prospect of selling the same product of that work over and over again. I suggest that hoping readers will pay to read is simply the begging phase of the old, unnatural model. The new, more natural model is to make a bargain with one's readers. "Your money in exchange for my labour". "I will produce X if you pay me Y". But once the labour has been paid for, the product, the writing, is free to all.