Define Your Worth, Publishers
Sarah Thompson
Marketer; Publisher and Media Consultant; Lesbian Business Leader; Champion of Local Media; CMDC Media Leader of the Year 2020
A long time ago, before CASL and CAN-SPAM, businesses used to scrape the internet for any information they could find about individuals. They were specifically looking for email addresses and phone numbers to use for marketing and advertising to connect with new consumers. They would compile lists and sell them. Even though legislation has been introduced to prevent this practice, scraping for email addresses still occurs globally.
Today, bots are scraping for AI models, to build smarter tools for mass use and generate revenue, albeit mostly from advertising. This infringes on the copyright and valuation of our Canadian media companies and content creators. In one search, I can find a company to work with to scrape all the data from legitimate publishers in Canada and build a data product. In another search, I can find a company to work with to block as many bots as possible from scraping my site. And I can even put in place a data poisoning solution, like Nightshade and others, to ensure the scraping bots are confused.
People who love AI get upset that they cannot continue their work, because they demand ever more data. Media, publishers, and creators get upset because they don't get revenue for the work they have done. We know that, as humans, we typically place more value on newness and innovation than on creation, hard work and history. And typing into a bot feels cooler than a journalist working for years on an investigative story. Just today, I had an entire economic outlook for the US on one screen. This was cool to play with until I read, "Powered by proprietary AI technology, Zeta’s Data Cloud ingests trillions of consumer signals spanning all key drivers of the U.S. economy..."
Yet it all begs the question: what is the value of the history of a publication in Canada, including its images, text, and video content, and what is the value of the work of creators, journalists, and others who strive to provide factual information to the public? It also raises the question of who owns ideas, particularly in the context of AI-generated content. AI models cannot own copyrights, so the output of a generative AI model is not copyrightable, but it can be monetized through different business models, including advertising.
When unstructured data from different publishers is combined, the resulting product is owned by the entity that created it. That product's content is often very similar to the original, yet AI creators argue that using copyrighted works to train AI is not copyright infringement, because the AI is learning associations rather than copying data.
This came up in a recent podcast interview on Media Copilot, with the subhead "Perplexity needs healthy journalism to thrive," in which Dmitry Shevelenko, the Chief Business Officer of Perplexity, says, "We don't use the expression of journalism, but the facts.. truths from around the world." And the revenue from the generative AI answers that Perplexity provides comes from... advertising. Those advertising dollars could go directly to the creators and journalists. Perplexity does have a revenue share program for media (not that we need more middleware between advertiser and publisher).
The best response to this comes from Casey Newton at Platformer: "Perplexity’s core innovation is ethical rather than technical. In the recent past, it would have been considered bad form to steal and repurpose journalism at scale. Perplexity is making a bet that the advent of generative AI has somehow changed the moral calculus to its benefit."
And so, AI content licensing initiatives abound. More and more media companies have reached license agreements with AI companies individually. These deals are being announced daily; here is one from today, between Universal Music and Meta. But these deals can also jeopardize the value of this unstructured data. AI companies don't really want to negotiate with every publisher and media firm, so they will continue to just scrape and take while copyright lawsuits play out.
Courtney Radsch of the Center for Journalism and Liberty recently published an article in the Seattle Times that notes: "News publishers, along with creative industries more broadly, must actively define the worth of their content and data by understanding how and why value is created throughout the generative AI process, from developing foundation models to powering real-time search, if they want to obtain fair compensation."
In simpler terms, in Canadian media, your unstructured data, also known as content, is valuable. Journalism and content creation are valuable. The work and effort of reporters, staff, creators, and editors are valuable. However, properly valuing this content requires expertise in how to package, price, and monetize it. Unfortunately, many publishers and media firms lack this expertise. My concern is that everyone will rush to sign deals and value will be lost for all.
The moral calculus hasn't changed: there is more value in the art than the print, and there is more value in journalism than the AI merger of facts into a bland paragraph of text. Know your value, define your worth.
If you want to chat more about this... drop me a line.