TL;DR: Data science has moved beyond relying exclusively on data collection, and the marketing industry can too.
Fueling Digital Marketing
With GDPR set to take effect in the EU in a few days, digital marketers are understandably concerned about how the new law might impact targeted advertising. Under GDPR, user consent is required before any personal data is collected for processing, giving EU citizens greater power over their own data. This puts current data-driven marketing techniques in jeopardy.
Digital marketing is built on the concept of a single customer view, a combined record of all known interactions of a user with a brand. This allows brands to build one-to-one relationships with customers through personalized communications. Such hyper-targeting has drawn controversy in recent months as concerns over privacy have been raised. GDPR was adopted to protect privacy by restricting the processing of personally identifiable information - that is, data that can be linked back to a single person - without explicit consent.
In 2006, Tesco’s Clive Humby called data the new oil. More recently, PageFair’s Johnny Ryan took the metaphor further by likening personal data specifically to fossil fuels, citing the need for better, “cleaner” alternatives.
There is a clear need for an alternative to large-scale personal data collection that delivers results without infringing on citizens’ right to privacy. To find this alternative, the marketing industry must look back to how its love affair with data began and then turn to other industries to see how else data can transform business.
A Brief History of the Data Age
A few years ago, the phrases big data, analytics, data science, and machine learning were all the rage. These buzzwords were overused and were often (mis)used interchangeably as data rose in popularity.
The rise in importance of data coincided with the rise in Internet usage due to the popularity of social media. As people shared more of their lives online, resourceful marketers seized the opportunity to match people with products they would love.
The challenge was making sense of the deluge of data - users were sharing more data than could possibly be analyzed using traditional methods. The mathematics of statistics supplied the solution: quantitative models could be used to summarize large datasets. Good models only record useful characteristics of a dataset and rely on statistical laws to generalize the rest. The digital register of human behavior could be distilled to a mathematical function that could be optimized along a large but finite number of controllable axes. This was the scale of big data meeting the (data) science of machine learning. Mathematics provided a stopgap until hardware could cope with the magnitude of the data.
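As a minimal sketch of that idea, the Python snippet below fits a least squares line to 100,000 synthetic records and keeps only two numbers. The variable names (`visits`, `spend`), the linear relationship, and the noise level are assumptions made up for illustration, not real marketing data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data (assumed for illustration): spend grows with the
# number of site visits, plus random noise.
rng = random.Random(0)
visits = [rng.randint(1, 20) for _ in range(100_000)]
spend = [5.0 + 2.0 * v + rng.gauss(0, 3.0) for v in visits]

# The whole dataset is summarized by two parameters.
slope, intercept = fit_line(visits, spend)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The fitted slope and intercept land close to the values used to generate the data, so the two parameters stand in for the full table of records when predicting spend from visits.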
As the Internet mirrored human society more and more closely, data scientists realized that machines could be taught to act with near-human intelligence. A model that could predict human behavior could also be trained to behave like a human. Artificial intelligence paradigms that had previously been infeasible due to a lack of computational resources and training data came into vogue. While early models such as least squares regression had limited capacity to imitate patterns of human behavior, artificial neural networks - shown in theory to be universal function approximators - could model the nuances of human decision-making far more accurately.
Noted AI researcher Andrej Karpathy dubbed neural networks Software 2.0: whereas programmers previously had to hard-code intelligence into machines, data scientists now design programs to learn intelligent behavior from data.
Neural networks have since been used in countless artificial intelligence applications, from fraud detection to image recognition to self-driving cars and even automated content generation. In 2017, AlphaGo, an AI that uses an artificial neural network to learn, beat the world's number one ranked player in a series of Go matches. Go is a game so complex that it has more possible board configurations than there are atoms in the observable universe.
It is therefore no surprise that McKinsey estimates that AI techniques have the potential to create between $3.5 trillion and $5.8 trillion in value annually.
From Observation to Inference
Fast forward a couple of years, and the standard operating procedure for a typical digital marketer today is to rely primarily on cookies to track every single user interaction with a brand and build personalized journeys for each user. Advances in distributed computing mean that hardware has somewhat caught up with the magnitude of consumer data being collected. As a result, the mathematical stopgap of using quantitative models as a proxy for actual customer behavior is generally seen as an undesirable last resort.
This is in sharp contrast to industries with limited available data, such as aerospace engineering, where machine learning algorithms are used in lieu of expensive real-life experiments. Because of the costs attached to collecting data, data science has developed ways to work around a lack of quality data when generating quantitative models.
In fact, the latest iteration of AlphaGo, AlphaGo Zero, didn’t learn from human Go matches but instead developed its playing strategy purely by playing against itself.
And therein lies the answer.
The mathematics behind machine learning is agnostic to data granularity - it doesn’t care whether data is personally identifiable information or not, as long as the raw data is a representative sample of the full population.
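One way to see this is a small experiment, sketched here in Python under stated assumptions. The per-user records, the `visits` and `spend` fields, and the linear model are all hypothetical. The sketch fits the same line twice: once on individual records, and once on aggregates that keep only a count and an average spend per visit level, with user IDs dropped.

```python
import random
from collections import defaultdict


def fit_line(xs, ys, weights=None):
    """Weighted least squares for y = slope * x + intercept."""
    if weights is None:
        weights = [1.0] * len(xs)
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical per-user records: (user_id, visits, spend).
rng = random.Random(1)
records = []
for user_id in range(10_000):
    visits = rng.randint(1, 20)
    spend = 5.0 + 2.0 * visits + rng.gauss(0, 3.0)
    records.append((user_id, visits, spend))

# Fit on individual-level data.
individual = fit_line([r[1] for r in records], [r[2] for r in records])

# Aggregate: one count and one mean spend per visit level, no user IDs.
groups = defaultdict(list)
for _, visits, spend in records:
    groups[visits].append(spend)
levels = sorted(groups)
counts = [len(groups[v]) for v in levels]
means = [sum(groups[v]) / len(groups[v]) for v in levels]

# Fit on the aggregates, weighting each level by its group size.
aggregated = fit_line(levels, means, weights=counts)

print("individual:", individual)
print("aggregated:", aggregated)
```

The two fits agree up to floating-point rounding. The match is exact here only because the predictor is constant within each group; with coarser groupings the aggregated fit would be an approximation whose quality depends on how representative the groups are.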
When GDPR comes into effect, quality data will be harder to come by, so digital marketers should move from observation to inference. Fortunately, data science has developed techniques for working around a lack of data and gleaning insights through mathematical modeling.
Data science has more to offer than just data collection, and, hopefully, so does digital marketing.