The Parks and Recreation model for measuring the economic value of Data
Could survey-based techniques borrowed from Choice Modelling help us understand the value of "Data"?
There's been a lot of buzz around justifying the ROI of Data Teams in the new economy.
Measuring a Data team's value is very difficult because the impact of Data on business outcomes is usually indirect. In this respect, it is akin to the impact of good software or people practices on a company’s revenue. It's hard to attribute business outcomes correctly to these underlying factors, Data being one of them.
As Nate Sooter puts it in this tweet
Barry McCardel, the CEO of Hex, echoes the same sentiment in his blog post. He says:
"...the best way to tell the ROI story is for other people to tell it."
In other words, Data has whatever value the customers (of data) perceive it to have.
Data organizations put an enormous amount of effort into ROI measurement. Many of the resulting metrics are based on cost reduction or productivity improvement, for example “number of requests served by Data teams” or “average time taken by an analyst to serve a request”. These are similar to the kind of metrics that Engineering orgs seek to measure.
But there is a fundamental difference between Engineering and Data.
Engineering is an essential function (and cost) for value creation in a product. In other words, the cost of Engineering can never be zero. It’s not a choice. This means that one could measure how practices like Lean manufacturing and DevOps bring down the cost of production over time.
Data, on the other hand, is largely a discretionary expense. As an analogy, it’s similar to how one might think of spending money on a vacation. Common sense dictates that vacations are good for our physical and mental well-being, but what’s the opportunity cost of a vacation vs other activities that may have a similar outcome (i.e., a net positive effect on one’s mental health)? As a vacationer, one may still want to monitor and optimize the cost of this expenditure. But tracking the cost says nothing about the “value” of the activity itself.
This, I think, is the fundamental problem with Data. It’s a discretionary component of value creation, thus its relative value must be weighed against other similar components. Investing in Data is a choice, not a necessity.
This sounds counter-intuitive (who runs a business without Data these days?), but from an economic perspective, Data is not a necessary building block for business value creation.
But some economists postulate that it is not the goods themselves that give utility to the consumer; the goods possess characteristics, and these characteristics give rise to utility. In other words, it is not Data that we care about, it’s the problems it helps solve. Seen this way, if the same utility is provided by another “good”, then as a consumer, I am confronted with a choice between the alternatives.
For businesses, Data is an aid for good decision making. Business leaders take several inputs into account to make meaningful business decisions - Data is one (but not the only - therefore discretionary) input in this process. The intention of all decision making is to maximize shareholder value and minimize cost. The value of Data in this process is to allow decision makers to optimize for this intention.
How does one measure the value of Knowledge?
It's therefore important that we "frame" the Data value problem correctly. A thought-experiment I like to perform is to think of Data as a Public Good such as a public library, city parks, free wifi access or street lighting.
Leslie Knope: Gentlemen, I realize that times are tough, and the budget is tight. But if the people of this town have nothing else to do but sit in their houses and play video games, then Pawnee will die. And we refuse to let that happen. [whispers] Now. This town was historically known for two things. [Chariots of Fire theme plays] Widespread obesity and the annual Pawnee Harvest Festival....We lost that festival a few years ago, due to another round of budget cuts. And I propose we bring the festival back. With ticket sales and corporate sponsorship, we'll earn all that money back. And believe me, people will come.
Ben: What if they don't?
Leslie Knope: Well... Then you eliminate the Parks Department.
Ben: And you guys are all on board with this.
All: [murmurs of agreement]
Leslie Knope: Look, we're not just pencil pushers. We are a reflection of the community. And we believe that we can strengthen that community. Because in the end, the reason why we're all here is to bring people together.
-- From Parks and Recreation, Season 3, Episode 1
Governments and public policy planners all over the world rack their brains to decide the economic value of non-market goods like a park, library or waste-water treatment plant; it helps them justify the cost of these products/services and decide how to allocate their annual budget. Budget allocation for public goods is a complex problem. In order to allocate taxpayers’ money equitably, they need to do a careful cost-benefit analysis.
It turns out that this is the subject of what economists call Non-Market valuation, an area of economics and choice-theory that is popular with public policy makers such as environmental policy planners.
There are primarily two ways to measure this value.
Revealed Preference Theory measures value by looking at a customer’s past purchase behaviour. If they “paid” for a good or service in the past, they probably see value in it. Economists study the past behaviour of customers and use it as a predictor of demand.
Stated Preference Theory - Another common way to measure non-market value (for example, that of a public library) is the stated preference approach, whose best-known technique is the Contingent Valuation Method (CVM). Here the "perceived" value of an asset is measured by how much money the users would be willing to pay (WTP) for that asset, or willing to accept (WTA) as compensation in lieu of the asset. CVM is a survey-based technique that is widely used in measuring the value of public goods.
Stated preference methods in particular sit close to Choice Experiments - i.e., survey-based approaches to study how consumers make choices and what value they put on them.
Although flawed (what models aren’t?), these are formally recognized and scientifically proven models for attributing values to a set of competing choices.
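To make the stated preference idea concrete, here is a minimal sketch of how WTP survey answers might be summarized. It is an illustration under loud assumptions, not a full contingent valuation study: the function name summarize_wtp, the sample answers and the population size are all made up, and a real study would also have to deal with survey design, sampling bias and protest responses.

```python
from statistics import mean, median


def summarize_wtp(responses, population_size):
    """Summarize stated willingness-to-pay (WTP) survey answers.

    responses: amounts (e.g. dollars per year) that each surveyed user
        says they would pay to keep access to the asset.
    population_size: number of users the sample is assumed to represent.
    """
    if not responses:
        raise ValueError("need at least one survey response")
    if any(r < 0 for r in responses):
        raise ValueError("WTP amounts must be non-negative")

    average = mean(responses)
    return {
        "mean_wtp": average,
        # The median is less sensitive to a few very large answers.
        "median_wtp": median(responses),
        # Naive aggregate: mean WTP scaled up to the whole user base.
        "aggregate_value": average * population_size,
    }


# Hypothetical answers from five users (made-up numbers).
sample = [0, 50, 100, 100, 250]
summary = summarize_wtp(sample, population_size=200)
print(summary)
```

Scaling the mean WTP by the number of users is the simplest possible aggregation; reporting the median alongside it is a common-sense guard against a handful of extreme answers dominating the estimate.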
Thought Experiment - Data as a public good
What if we could think of Data as a public good - like a library or a park?
Data in a business can be reasoned about as a collective resource (i.e., shared by all), non-excludable (i.e., within the boundaries of a business, it’s available to everyone equally) and non-rivalrous (i.e., one person’s use does not prevent others from using it simultaneously).
Seeing Data this way unlocks a different way to measure its value. Using established non-market valuation methods such as Revealed Preference or Stated Preference techniques, we could get a better understanding of the economic value of Data and its constituents.
If this sounds like mumbo-jumbo (it did to me when I first explored it), remember that we already employ industry-recognized, survey-based methods abundantly in other areas - NPS, customer interviews, employee engagement surveys and many more.
If we could apply formal choice experiments to Data by surveying users both within an organization and perhaps across organizations, we may be able to gather empirical evidence of the economic value that consumers of Data put on it, compared to the other tools they have at their disposal for making decisions. This could then help businesses and executives answer that very important question - What indeed is the ROI of Data?