What companies need to know about AI and content science

A few months ago, I became obsessed with this new field that I’ve started calling “content science”. I believe it will become even more crucial for companies than the already very popular data science.

We have ChatGPT to thank for that: it burst onto the scene at the end of November 2022 and then basically changed everything. The entire world became hugely excited about generative AI, which is pretty much everywhere now. Even Khan Academy, of which I’ve been a huge fan for a long time, has integrated some sort of AI-driven private tutor.

Knowledge dream or privacy nightmare?

The domain of content science is just as fascinating as it is fast moving. AWS recently launched the generative AI tool HealthScribe, which develops clinical notes on the basis of patient-clinician conversations. Doctors normally spend an enormous amount of time documenting these, which diverts their focus from their core, which is patient care. This is without any doubt a really useful tool. But just imagine the amount of information that AWS and Amazon are gathering by analyzing all these doctor-patient conversations. That could also be a privacy nightmare.

The wealth of knowledge being generated is truly fascinating. However, this rapid evolution can also be really scary for some. Just think about how the Writers Guild of America is currently on strike out of fear that studios will start using AI to create scripts for movies or TV shows. While I really don't believe this will become the norm, it's also evident that we're moving toward a new paradigm, giving rise to what I call "content science".

In fact, I’ve already seen many fascinating cases of companies capturing their internal knowledge and packaging that into their own in-house version of generative AI.

Faster, ChatGPT! Bill! Bill!

McKinsey, for instance, recently introduced Lilli, their very own generative AI tool, which packages all their in-house knowledge. So now, every time they receive a customer question, the first thing they do is run it past Lilli. I’m really curious to see what this will mean for their business model and their fees. They're used to selling warm bodies, for which they charge a lot of money. But now that it's a lot faster, more efficient and easier for them to generate answers, how will they charge for their services? Will they offer Lilli as a licensed “consulting-as-a-service”, for instance? There are plenty of opportunities for massive disruption here.

Another fantastic and very similar example is Harvey, which offers generative AI services to law firms. Harvey AI builds on OpenAI’s GPT-4, which is trained on general internet data, and adds general legal data. On top of that, it is trained on its customers' internal documents and data, which allows it to assist them with contract analysis, due diligence, litigation, regulatory compliance and much more. In short, it helps lawyers become faster and more efficient.

And then we have OpenAI, which launched its enterprise customer license. This version is more robust, but the most interesting part here is that you can train it on your very own company documents, files and content.

What will that do to IP and copyright, I wonder?

Proving a negative

This summer, I received an e-mail from a big tech firm, which I worked with quite often as a keynote speaker, asking me to prove that I have never uploaded any of their information - PowerPoints, Word documents, e-mails, etc. - into an LLM (large language model, like the ones used by ChatGPT and Bard). How can you possibly prove something like that?

But that’s probably just the beginning. I think we may need to brace ourselves for a very messy compliance situation, where we might need to show that we are not using somebody else's data to train our large language model. We will need to be very mindful of what goes in and what comes out.

So this is where I think the world of content science will be playing a crucial role.

Starving for knowledge

One of my favorite quotes comes from the late John Naisbitt, who used to be a popular author and public speaker in the area of futures studies: “We are drowning in information, but we are starved for knowledge”. He said that in the 1980s, way before the digital revolution. But his observation still resonates today. Individuals already have a ton of information on all of their devices, but think of how much companies have on their SharePoints, OneDrives, Google Drives, e-mail servers, Slack channels etc.

Today, every company of some size and significance has a data science department or at least one or more data scientists. The bank where I'm a board member has a department of 200 data scientists, for instance, who clean up and catalogue all of its structured information. Over the years, data science has become widely established, but I think we're now seeing the birth of a new field, which will focus less on that neatly structured data and more on the messy, unstructured data that is so abundant in companies: Word documents, PowerPoints, PDFs, emails.

That’s huge, if you realize that merely 20% of company data is neatly structured in a database while all the rest is unstructured.

You’re grounded!

Data science triggered the rise of a new type of technology player, offering data governance tools that help orchestrate all the databases in a company: from quality control to GDPR proofing. Now we will need content governance players. Companies will need to figure out how to orchestrate all their content sources: if and how they're going to use them to train these large language models, and how they're going to combine that with the ChatGPTs of this world.

The technical term for that is “grounding”, which is where you take an LLM like ChatGPT, but you connect it - ground it - with your own personal data, your own intellectual set of content. And to be able to manage and control that is going to be crucial in the world of content science.
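To make grounding a little more concrete, here is a minimal sketch in Python. It is not how ChatGPT or any vendor does it: the document names, their contents and the crude word-overlap scoring are all assumptions made up for illustration, and the sketch stops at building the prompt rather than calling a real model. The idea it shows is simply that you first look up the most relevant pieces of your own content and then hand them to the model together with the question.

```python
import re
from collections import Counter

# Hypothetical in-house documents; in practice these would come from
# SharePoint, shared drives, e-mail archives and so on.
DOCUMENTS = {
    "travel-policy.txt": "Employees book travel through the internal portal. "
                         "Flights over six hours may be booked in business class.",
    "expense-policy.txt": "Expenses must be submitted within 30 days. "
                          "Receipts are required for amounts above 25 euros.",
    "onboarding.txt": "New employees receive a laptop on their first day "
                      "and complete security training in week one.",
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(question, document):
    """Count how many word occurrences the question and document share."""
    q = Counter(tokenize(question))
    d = Counter(tokenize(document))
    return sum(min(q[token], d[token]) for token in q)

def retrieve(question, documents, top_k=2):
    """Return up to top_k (name, text) pairs with a non-zero overlap score."""
    ranked = sorted(documents.items(),
                    key=lambda item: score(question, item[1]),
                    reverse=True)
    return [(name, text) for name, text in ranked[:top_k]
            if score(question, text) > 0]

def build_grounded_prompt(question, documents, top_k=2):
    """Combine the retrieved company content and the question into one prompt."""
    sources = retrieve(question, documents, top_k)
    context = "\n".join(f"[{name}] {text}" for name, text in sources)
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt("When must expenses be submitted?", DOCUMENTS)
print(prompt)
```

A real setup would replace the word-overlap scoring with a proper search index or embeddings, and would send the resulting prompt to whichever model the company uses. The governance question is visible even in this toy version: whatever sits in the document collection is what ends up in front of the model.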

A tale of two predictions

That’s why I predict two things.

Where companies now have a data science department and data scientists, they will also need to hire content scientists and build content science departments in the future. These are not database nerds, but a whole new breed: “Conan the Librarians” who love to work with unstructured data in strategic ways. So that’s one prediction.

And the second one is that we will need to implement content governance tools because at some point, we will need to comply with AI regulation in this domain. Prepare for a lot of pressure and situations where you will need to prove that you're not misusing information to feed your large language models.
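What might the simplest possible content governance tool look like? The Python sketch below is one assumption-laden illustration, not a reference to any existing product: the `ContentRegister` class, its fields and the file names are all hypothetical. It records a fingerprint, an owner and an approval flag for each document, and then audits a collection of content before it is fed to a language model.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class ContentRecord:
    """One registered document (all fields are illustrative assumptions)."""
    name: str
    sha256: str
    owner: str
    approved_for_llm: bool

def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of the raw document bytes."""
    return hashlib.sha256(content).hexdigest()

class ContentRegister:
    """Hypothetical register of which content may be used with an LLM."""

    def __init__(self):
        self._records = {}

    def register(self, name, content, owner, approved_for_llm):
        record = ContentRecord(name, fingerprint(content), owner, approved_for_llm)
        self._records[record.sha256] = record
        return record

    def may_use(self, content: bytes) -> bool:
        """Only registered and explicitly approved content may be used."""
        record = self._records.get(fingerprint(content))
        return record is not None and record.approved_for_llm

    def audit(self, corpus):
        """Return the names of corpus items that are unregistered or not approved."""
        return [name for name, content in corpus.items()
                if not self.may_use(content)]

register = ContentRegister()
register.register("own-whitepaper.docx", b"our own text", "us", True)
register.register("client-deck.pptx", b"client slides", "client", False)

corpus = {
    "own-whitepaper.docx": b"our own text",
    "client-deck.pptx": b"client slides",
    "unknown.pdf": b"no idea where this came from",
}
blocked = register.audit(corpus)
print(blocked)
```

Exact fingerprints only catch byte-identical copies, so a lightly edited client slide would slip through. A register like this does not prove a negative either; at best it gives you a documented trail of what was allowed in and on whose authority.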

So be prepared to hear and think a lot about content science and content governance in the coming years. If you are already active in that area - generative AI for enterprises and content science - I’d love to hear what you have been developing. Also, keep an eye on my socials in the coming weeks, as I'm working on a content science manual!

Want to inspire your employees or customers with a keynote about what's next in business and technology? Check out the topics on my keynote page.

Paul Shearing -The Accidental Trainer

I help new trainers learn from my mistakes

1 year ago

Very interesting and thought-provoking

Tom Berx

Group Chief Information Officer at Credendo

1 year ago

Anna Z.

Some remarks about the generative AI hype: LLMs are probabilistic, not deterministic. They are trained on huge, selected datasets that carry a bias, and whoever controls the bias controls the narrative. If we use generative AI to generate content, and LLMs are in turn trained on content, expect content degradation over time. Generative AI is not AGI, and it probably will not lead to AGI. LLMs are limited to the data they are trained on, hence they don't know your context or data. Look at RAG (retrieval-augmented generation) for that. For a more complete view on the coming wave of technologies (and containment and regulation) I recommend "The Coming Wave" by Mustafa Suleyman (Google DeepMind) about what's coming and what impact it will have on society.

Dann Rogge

General Manager Blossom - Simplifying EV charging

1 year ago

Evan Kirstel B2B TechFluencer

Create, Publish, Amplify. TechInfluencer, Analyst, Content Creator w/600K Social Media followers, Deep Expertise in Enterprise, Cloud, 5G, AI, Telecom, CX, Cyber, DigitalHealth. TwitterX @evankirstel

1 year ago

Content science is definitely an intriguing concept! Would love to have you on my podcast to discuss this further!


More articles by Peter Hinssen

  • Ponies, Unicorns, Godzillas, Dinosaurs, King Kongs and Phoenixes

    My book 'The Phoenix and the Unicorn' offered a pretty binary visualisation of the world at large. You were either a…

  • Is Your Organization Surviving or Thriving?

    In the 1949 classic movie "The Third Man", a character called Harry Lime (played by Orson Welles) tells the protagonist…

  • Frank Verstraete on the Quantum Computing Revolution

    My conversation with prize-winning quantum physicist and engineer Frank Verstraete Since its inception in the early…

  • We need more rule breakers in an AI world

    My conversation with Bill Boorman, advisor to talent technology companies, keynote speaker, host, researcher and…

  • The Human Premium in the Age of AI

    A Conversation with Professor David De Cremer on Humanity, Leadership, and the Future of Work Having a meaningful…

  • AI and the End of Awful

    From a young age, educational systems worldwide familiarize us with the notion of grades and evaluation. Our academic…

  • Why we need more Yesterwork Hunting

    At the beginning of each budget year, business leaders tend to carefully draft a financial plan that’s often as absurd…

  • What Makes Us Human In The Never Normal

    My conversation with bestselling author, essayist and columnist Meghan O'Gieblyn The pace of change in AI has been so…

  • How to Adapt to the Never Normal

    My conversation with Professor Herminia Ibarra, academic and expert in leadership development, career transition, and…

  • Who should you hire to thrive in the age of GenAI?

    When I was developing Intranets more than 20 years ago, I hired a lot of engineers: brilliant technicians capable of…
