What’s going on with AI in media?
Brendan Hughes
We were invited to participate in the discussion at the World News Media Network (WNMN) conference on Big Data and AI in Media this past week. This was a gathering of leading news media organisations from around the globe, discussing how big data, AI and machine learning can be leveraged to the benefit of the media sector.
Robot Wars
One of the underlying themes in many discussions around AI is whether the robots will replace us humans in the workforce. The more balanced discussions at the moment focus on how AI can supplement human cognition – doing the things that are too hard for us, or that we are very slow at processing. If you've got 5 minutes to spare, watch this video from former Google [X] executive Mo Gawdat on his view of the future of our relationship with machines:
When it comes to the role of editors at least, the algorithms are not yet replacing the humans. Bing News, led by Ting Cai, is very clear on the importance of editors in curating the major stories of the day. Once the editors decide what is important for a balanced mix of news, they let the personalisation algorithms kick in to present stories in the order each user is most likely to engage with. Ranking and personalisation are fully automated.
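To make that editors-first, algorithm-second flow concrete, here is a minimal sketch: editors fix the story mix, and a per-user model only re-orders it. The story fields, topic affinities and scoring function are invented for illustration – Bing's actual ranking system is not public.

```python
# Minimal sketch: editors choose the mix; personalisation only re-orders it.
# Data model and scoring are hypothetical, not Bing News' real system.

def personalised_order(curated_stories, user_topic_affinity):
    """Re-order an editor-curated list by predicted per-user engagement."""
    def predicted_engagement(story):
        # Sum the user's affinity for each of the story's topics.
        return sum(user_topic_affinity.get(t, 0.0) for t in story["topics"])
    return sorted(curated_stories, key=predicted_engagement, reverse=True)

# The editors' balanced mix for the day (hypothetical data).
stories = [
    {"id": 1, "topics": ["politics"]},
    {"id": 2, "topics": ["sport", "football"]},
    {"id": 3, "topics": ["technology", "ai"]},
]

# Per-user affinities, e.g. learned from past clicks (hypothetical data).
user = {"technology": 0.9, "ai": 0.7, "sport": 0.2}

print([s["id"] for s in personalised_order(stories, user)])  # [3, 2, 1]
```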
There are many great examples of how content creators are using big data analytics to bring more efficiency into the creative process. Natural Language Processing is already at work in many newsrooms around the globe to assist with identifying trends, contextual tagging and content recommendations. Parse.ly CTO Andrew Montalenti talked to me about their beta big data project, which is creating a universe of contextual tagging across their network of hundreds of publishers, capturing the attention of over 150 million people reading 850,000 articles each day. Get in touch with Andrew if you want access to this powerful data set.
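For a flavour of what NLP-based contextual tagging involves at its simplest, here is a toy TF-IDF keyword extractor: score each word by how frequent it is in the article and how rare it is across the corpus, and keep the top scorers as tag candidates. This is a textbook technique shown for illustration, not Parse.ly's actual pipeline.

```python
# Toy contextual tagger: TF-IDF scores words and keeps the top k as tags.
# Illustrative only -- production systems use far richer NLP.
import math
from collections import Counter

def suggest_tags(article, corpus, k=3):
    words = article.lower().split()
    tf = Counter(words)
    n_docs = len(corpus)

    def idf(word):
        # Words that appear in many documents get a low weight.
        containing = sum(1 for doc in corpus if word in doc.lower().split())
        return math.log((1 + n_docs) / (1 + containing)) + 1

    scores = {w: tf[w] * idf(w) for w in tf}
    return [w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:k]]

corpus = [
    "the election results are in and the count continues",
    "the champions league final kicks off tonight",
    "new ai models are transforming the newsroom workflow",
]
print(suggest_tags(corpus[2], corpus))  # e.g. ['new', 'ai', 'models']
```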
The Washington Post is now a SaaS house, but they do eat their own dog food. Joey Marburger outlined how they have built Heliograf – automated storytelling technology – that editors use to assist them in their storytelling. Heliograf is already writing automated match reports on, for example, high school football, allowing journalists to spend more time preparing in-depth analyses.
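The core idea of template-driven automated storytelling can be shown in a few lines: structured match data in, a readable report out. The data model and phrasing rules below are invented for the example; Heliograf itself is far more sophisticated.

```python
# Sketch of template-driven story generation: structured data to prose.
# Fields and phrasing rules are hypothetical, not Heliograf's internals.

def match_report(game):
    margin = abs(game["home_score"] - game["away_score"])
    winner, loser = ((game["home"], game["away"])
                     if game["home_score"] > game["away_score"]
                     else (game["away"], game["home"]))
    # Vary the verb with the scoreline so reports read less mechanically.
    verb = "edged" if margin <= 3 else "defeated"
    high = max(game["home_score"], game["away_score"])
    low = min(game["home_score"], game["away_score"])
    return f"{winner} {verb} {loser} {high}-{low} on {game['date']}."

print(match_report({"home": "Wilson", "away": "Einstein",
                    "home_score": 21, "away_score": 14, "date": "Friday"}))
# Wilson defeated Einstein 21-14 on Friday.
```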
Clavis is their automated content recommendation engine, which uses natural language processing to serve the right content to the right user at the right time. Their average daily click-through rate for users served content via Clavis is more than double that of their nearest competitor in the market.
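A stripped-down version of that kind of NLP recommendation: represent the user's reading history and each candidate article as bags of words, and recommend the candidate most similar to the history. The data is invented and the real Clavis engine is far more advanced, but the shape of the problem is the same.

```python
# Toy content recommender: cosine similarity between a user's reading
# history and candidate articles, on simple bag-of-words vectors.
import math
from collections import Counter

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(read_articles, candidates):
    # Build a single "interest profile" from everything the user has read.
    profile = Counter(w for text in read_articles for w in text.lower().split())
    scored = [(cosine(profile, Counter(c.lower().split())), c)
              for c in candidates]
    return max(scored)[1]

history = ["brexit talks stall in brussels", "eu leaders meet over brexit"]
candidates = ["brexit deadline looms for negotiators",
              "local team wins championship final"]
print(recommend(history, candidates))  # the brexit story
```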
One clever tool the Post is working on right now is an automated video creation tool that lifts both text and image highlights from a story and combines them into a meaningful motion-graphics representation of the story. This reduces video creation time from around half a day to as little as 5 minutes.
Zhiyi Liu of Toutiao in Beijing shared how they use AI to assist in combating rumours – or what we might call “fake news”. They run algorithms across news stories submitted by their network of hundreds of journalists and writers to find signals indicating that a story is more likely to be rumour than fact. The AI is used to trigger an alert, but the real work of discerning veracity is left to the human fact checkers.
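That alert-then-verify workflow might look something like the sketch below: score each story against a handful of weak "rumour signals" and queue high scorers for a human fact checker. The signals, weights and threshold are all invented; Toutiao's actual models are not public.

```python
# Toy rumour triage: weak textual signals raise an alert, humans verify.
# Signals, weights and threshold are hypothetical examples.

RUMOUR_SIGNALS = {
    "unnamed sources": 0.4,
    "you won't believe": 0.5,
    "shocking": 0.3,
    "reportedly": 0.2,
}
ALERT_THRESHOLD = 0.5

def rumour_score(story_text):
    text = story_text.lower()
    return sum(w for phrase, w in RUMOUR_SIGNALS.items() if phrase in text)

def triage(stories):
    # The algorithm only flags; discerning veracity stays with humans.
    return [s for s in stories if rumour_score(s) >= ALERT_THRESHOLD]

stories = [
    "Shocking claim from unnamed sources about the mayor",
    "Council publishes annual budget report",
]
for flagged in triage(stories):
    print("Send to fact checkers:", flagged)
```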
Unruly’s Kenneth Suh shared how big data is used to create video content that appeals to one of the 12 key emotions felt by most humans. They start with the personas they want to engage and send out more than 20k surveys every month to build those personas' personality profiles, including which attributes and emotions they over-index on. Videos are created to tap into the specific emotions identified within a target persona, and further analysis examines which emotions different video content elicits in different audiences, measured by an EQ score. The overall goal is to engage audiences on an emotional level through the creation of great video content.
Money Machines
We in INM and Canada's The Globe and Mail appear to be on the same trajectory when it comes to using big data to power advertising. Hibo Griffin, Director for Data Optimisation there, spoke about how they put data at the centre of pitches to clients. They start with the client’s specified target audience and, before making a pitch, get their data science team to uncover everything they can about that audience segment on their platform. The insights are then fed to the creative team to put together a data-led campaign that addresses the specific behavioural insights around those audiences.
They will sometimes create a seed campaign around a given audience segment in order to assess the behaviours of unfamiliar audience segments. They create what they call a Taste Graph, which visually represents the range of interests exhibited by the specific audience. Surprising results can often emerge. For example, a seed group built around the BMW brand uncovered an over-indexing interest in small business, which was fed into the full campaign when launched. By focusing campaign delivery on BMW’s highest-propensity responders, they were able to demonstrate a 3x improvement in engagement with the campaign.
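The over-indexing behind a finding like that is a simple ratio: an interest over-indexes in a seed audience when its incidence there is well above its incidence in the wider audience. The figures below are invented purely to show the arithmetic.

```python
# Over-index calculation: 100 = average incidence, 300 = three times it.
# The rates below are hypothetical, not The Globe and Mail's data.

def over_index(segment_rate, baseline_rate):
    return 100 * segment_rate / baseline_rate

# Share of users showing an interest in "small business":
bmw_seed_rate = 0.18   # within the BMW seed audience (hypothetical)
site_wide_rate = 0.06  # across the whole platform (hypothetical)

print(over_index(bmw_seed_rate, site_wide_rate))  # 300.0 -> strong over-index
```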
I spoke to the conference about our own experiences using machine learning to identify audiences with higher intent around a specific category. We worked with local company Heystaks to feed a seed group of “travel enthusiasts” – people who were highly engaged with travel content – into a machine learning engine. This engine was set up to identify broader behavioural patterns – around different categories of content, and around time, location and device – to build out wider audience cohorts. Over the course of the pilot campaign, the machine became much better at figuring out which user behaviours correlate with likelihood to engage with a specific ad campaign. The machine learning approach found an audience nearly 5 times larger, with a 3.9x improvement in engagement compared with the behavioural segmentation approach.
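In outline, that seed-and-expand approach looks like the sketch below: treat the seed group as positive examples, a random sample as negatives, train a classifier on broad behavioural features, and score the wider user base. The features, data and model choice are assumptions for illustration – this is not Heystaks' actual engine.

```python
# Seed-and-expand audience building: learn what distinguishes the seed
# group, then score everyone else. Data and features are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Behavioural features per user:
# [travel pageviews, evening visits, mobile share, weekend visits]
seed = np.array([[12, 8, 0.9, 5], [9, 7, 0.8, 4], [15, 9, 0.7, 6]])
random_sample = np.array([[1, 2, 0.3, 1], [0, 5, 0.6, 0], [2, 1, 0.2, 1]])

X = np.vstack([seed, random_sample])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = member of the travel-enthusiast seed
model = LogisticRegression().fit(X, y)

# Score the wider user base and keep the high-propensity lookalikes.
everyone = np.array([[10.0, 6, 0.8, 5], [1, 3, 0.4, 0]])
scores = model.predict_proba(everyone)[:, 1]
lookalikes = everyone[scores > 0.5]
print(scores.round(2), len(lookalikes))
```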
Ciarán Cody-Kenny (an Irishman in Barcelona) and Eivind Fiskerud, both of Schibsted, shared how they have built propensity models to predict the types of content most likely to convert users to subscriptions. In addition, they modelled the types of user behaviours that are the strongest indicators of a customer who might be willing to pay for content. With over 175k active subscribers on Schibsted’s main Norwegian news sites, they had a lot of data to sift through and needed machine learning to help. The machine learning looks at the behaviours of those who converted previously and then predicts which users will be most likely to convert over the next 7 to 14 days.
There were some surprises for the team in the insights that emerged about which people were more likely to convert. In addition to predictable patterns of behaviour, such as frequency and recency of activity, the machines started to identify high propensities amongst cohorts of users who visited from multiple devices and multiple sources – something the team hadn’t initially identified. By applying conversion propensity scores to logged-in users, the data team were able to provide much more qualified leads to the outbound sales teams, who saw a five-fold improvement in their conversion rates.
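A simplified sketch of that propensity scoring, under assumed features and invented data: train on users who did or did not subscribe in a past window, then score current logged-in users on their likelihood to convert. The feature list mirrors the signals mentioned above (frequency, recency, device and source counts), but everything else is illustrative.

```python
# Toy subscription-propensity model: past converters provide the labels,
# current logged-in users are scored and ranked as sales leads.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# [visits last 30 days, days since last visit, distinct devices, sources]
X_train = np.array([
    [25, 1, 3, 4],   # converted within the window
    [18, 2, 2, 3],   # converted
    [3, 20, 1, 1],   # did not convert
    [5, 14, 1, 2],   # did not convert
])
y_train = np.array([1, 1, 0, 0])

model = GradientBoostingClassifier().fit(X_train, y_train)

# Score logged-in users; the best leads go to the outbound sales team.
logged_in = {"user_a": [22, 1, 3, 3], "user_b": [4, 25, 1, 1]}
scores = {u: model.predict_proba([f])[0, 1] for u, f in logged_in.items()}
print(sorted(scores, key=scores.get, reverse=True))  # best leads first
```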
Styria Media in Croatia established a large in-house data science team under the watchful eye of data scientist Marko Velic. On their classifieds platforms, they used computer vision machine learning approaches to create unique and powerful image recognition tools. Now the machine can intelligently find products similar to the one you are looking at across the rest of the site – think how complex a challenge it is to find that perfect red evening dress. Their solution, which performs better even than Google’s image-matching toolkit, has won them awards and, more importantly, has greatly increased user engagement and reduced the time advertisers spend creating ads.
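Behind a similar-items feature like that is usually a nearest-neighbour search over image embeddings. The sketch below fakes the embeddings with tiny hand-made vectors – in practice a convolutional network produces them – and is not Styria's actual pipeline.

```python
# Similar-item search: embed images as vectors, return nearest neighbours.
# Embeddings here are hand-made stand-ins for real CNN outputs.
import numpy as np

catalogue = {
    "red evening dress A": np.array([0.90, 0.80, 0.90]),
    "red evening dress B": np.array([0.85, 0.90, 0.95]),
    "blue jeans":          np.array([0.10, 0.20, 0.60]),
}

def similar_items(query_name, k=2):
    q = catalogue[query_name]
    dists = {name: float(np.linalg.norm(q - v))
             for name, v in catalogue.items() if name != query_name}
    return sorted(dists, key=dists.get)[:k]  # closest first

print(similar_items("red evening dress A"))
# ['red evening dress B', 'blue jeans']
```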
Whose Data Is It?
As you might expect, there was much discussion at this particular conference around personal data. The theme was set by Tobias Bennett of the Local Media Consortium, which represents over 1,700 media sites across the U.S. Tobias talked about the shift to People-Based Marketing, where email is the unique identifier of a real person across the multitude of devices and platforms they use. With that in mind, an ambitious target is being set across these publishers: capturing the email addresses of 150 million people.
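One reason email works so well as a people-based identifier is that it reduces to a single stable key: normalise it, hash it, and the same person resolves to the same ID on every device and platform. The normalisation below is a common industry convention, shown for illustration, not a specific LMC implementation.

```python
# Email as a cross-device person identifier: normalise, then hash.
import hashlib

def person_id(email: str) -> str:
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

# The same person on two devices resolves to one identifier.
print(person_id("Jane.Doe@example.com "))
print(person_id("jane.doe@example.com"))  # identical hash
```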
LMC data shows that email newsletter subscribers are 3x more valuable than socially referred visitors. Furthermore, personalising the subject lines of email news digests based on stated and implied content interests drives a 50% uplift in email open rates.
Capturing email is the foundation of people-based marketing, as it is the single most useful identifier of a real person. Across the Local Media Consortium, a range of tactics to drive email capture are now being rolled out. The main message is to focus on capturing that one piece of data and to keep the experience as simple as possible for the user. Giving strong value back to the user, such as the ability to follow certain sections or topics, has the highest impact on email performance.
With the advent of GDPR, it was interesting to see how aware North American publishers are of this regulation and how it is changing their approaches to data capture and management. The Facebook Cambridge Analytica debacle is top of everyone’s mind right now, and while it may drift out of the popular news cycle quickly, it will certainly have a lasting impact on how everyone in media thinks about data.
It is very welcome that over the coming months more and more people will become aware of their personal data online and how it is being used by third parties. People will increasingly “own” their own data. Premium media companies are well positioned to offer consumers a strong privacy-value exchange – meaningful, relevant, personalised content services in return for access to personal or behavioural data. It’s time to kill off the murky underworld of clandestine harvesting of personal data to be sold to the highest bidder on opaque marketplaces.
As was nicely summed up by Jodi Hopperton, emcee of the event, there is “no downside to having a deep understanding of owned data”. Regulation will play a more important role in protecting privacy, but that won’t stop reputable organisations from sensitively analysing those large data sets to provide better services to their audiences.
Comments

Founder and CEO at Twipe · 6 years ago
Great summary, thanks for sharing Brendan Hughes! Interesting to learn about the various AI efforts and to see that email is not dead.

Senior Business Analyst at Premier Lotteries Ireland · 6 years ago
Great read Brendan! I suppose one of the questions I have around AI and GDPR is the regulation against profiling and the multitude of marketing and personalisation services that are tied into that practice.

Brendan Hughes (author) · 6 years ago
Hey David. The GDPR is going to force everyone to take stock of their practices in this area. If you are capturing and processing personal data, or data that can be traced back to an individual human, then you need positive opt-in consent. That's generally okay if people are registering to use your service, as you'll typically provide a consent mechanism as part of the sign-up process. But what happens if the customer doesn't consent to profiling? One possible approach would be to fully anonymise the data and ensure it can't be reverse-engineered. Otherwise, if I make a data access request, you're going to have to prove to me that you don't have a way to access that profile data.