The Dignity of Data

This is an article about a Q&A film I watched called "Data Dignity and AI". Data in its rawest form is nebulous and meaningless, so how can we unify it with dignity, a word that implies human concepts like "worth" and "respect"? Fortunately, that's exactly what Jaron Lanier, the "Prime Unifying Scientist" at Microsoft, does.

Throughout my career, and perhaps even before I was being paid to work with tech, I have deeply admired Jaron Lanier and seen him as the personification of a “moral compass” for a specific movement and ideology in the digital age. There’s an anti-establishment tone to him that somehow works both inside and outside the established order. How does he embody this paradox? By pairing fierce criticism with an unshakeable sense of "human-centric" integrity and a good-natured sense of humor. He is, as far as I know, one of the last embodiments of the late-90s “technology for good” movement. I hope there are more veterans like him active in big tech (tips and links are welcome!).

A few months ago, this Yoda-like figure shuffled onto the stage at the University of California and talked about “Data Dignity”. As always with Lanier, there were perhaps 10 minutes of very specific and very valuable on-topic explanation of what data dignity means. Then came an hour of carpet bombing the tech sector: framing social media in the most negative light possible, offering a series of withering truths about the political bribery that sparked the internet revolution, abruptly describing the birth of bias in language models, and hammering the fallacy of the term "AI". All in a soft voice, combined with a disarming smile.

My spark of inspiration starts with a suggestion that we shift our current perception and framing of artificial intelligence. Lanier offers two inversions from our current state.

(Side note: As a musician I love inversions! In a musical inversion you shuffle the hierarchy and order of the notes (thereby changing the relationship of each note) in a chord and by doing so you create a variation, while retaining all the harmonic properties of the starting state - which is kind of what he is doing here).

Inversion 1: “The AI is a creature, and we view it as an entity or a pseudo person. We talk to it, and we ask it to do things, that is the most common way of thinking about things. I mean we have a chat interface, as if you’re chatting with a person. Why would you have that if you didn’t think of it in that way?”
Inversion 2: “It’s a form of social collaboration, so it's closer to something like Wikipedia, where you have a whole lot of people who provided example data/training data and what this thing has, is the ability to mash that data up in new ways. But it still is a collaboration of people and if you analyse it in those terms, you have a more actionable way of understanding it.”

The result?

"If you integrate it into society in these terms (inversion 2) then you have a brighter and more actionable set of paths open for the future of society."

Personally, I feel that what he is saying is important enough to zoom in on. It speaks to a small niche within the tech sector, but one that I feel is going to explode in the next year.

Curators.

These are the librarians of the tech sector, the people who carefully select and give meaning to the data that the next generation of AI technology is being built upon. These people embody a certain mindset, a match between curiosity, discipline, and a love of symmetrical coherence. A mindset that goes beyond a label of harmonic neurodivergence and enters the realm of a distilled artform. When I think of digital curators I see parallels with French horticulturalists, Japanese rock gardeners or professional LEGO builders. And as Lanier correctly emphasizes in this short Q&A: the new technology we are starting to integrate into our lives is not giving these people, these "data curators", the dignity they deserve. ChatGPT is viewed and presented as an entity, Copilot is an entity; they are products that we converse with. If we follow the more humanist approach of inversion 2, then Microsoft and OpenAI should also be giving the hundreds, perhaps thousands, of humans who worked on the data their moment of appreciation, their moment of “dignity”.

(Exactly the opposite has happened in many cases, where digital colonialism has manifested in a menial and mind-numbing "click factory" form with contractors in Africa, South America and Asia.)

Personally, with every step I take in implementing and understanding the new world of AI, I am reminded of the importance of high-quality data. In a recent project, augmenting a service desk with generative AI capabilities, we were blown away by the middleware, knowledge and infrastructure that is always just a few mouse clicks away to assist our work as engineers, project managers or architects. However, none of those amazing tools comes even close to the utility and impact that a high-quality dataset has on an AI project.

The idea that you can just drop a PDF into an LLM and it magically finds meaning, efficiently summarizes or assists you in creating similar content is a strategy that might work at home. In a Scope-Cost-Time paradigm, though, you’re not going to get very far dumping unedited, unsorted documents into a storage bucket and hoping that Harry Potter is behind the wheel of your favourite language model.

In an AI project, it’s generally a good idea (and unlike Lanier with his soft smile, I’m not saying anything even slightly controversial here) to spend a lot of time making sure that all the conversations, all the data in your dataset, are of exceptionally high quality; that they are organized vertically in your training dataset; and that you can incrementally tune and test your model on clear segments of data that connect to business processes.
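To make that a little more concrete, here is a minimal Python sketch of what "quality first, then clear segments" could look like. Everything in it is an assumption for illustration only: the record fields (`segment`, `quality`, `text`), the 0 to 1 quality scale, the 0.8 threshold and the service desk segment names are hypothetical and are not taken from the project described above.

```python
from collections import defaultdict

# Hypothetical service desk conversations. The field names, quality scores
# and segment names are invented for this illustration.
RECORDS = [
    {"segment": "password_reset", "quality": 0.90, "text": "How do I reset my password?"},
    {"segment": "password_reset", "quality": 0.95, "text": "My reset link has expired."},
    {"segment": "billing", "quality": 0.85, "text": "Why was I charged twice?"},
    {"segment": "billing", "quality": 0.40, "text": "bill??"},
    {"segment": "onboarding", "quality": 0.30, "text": "hi"},
]


def curate(records, min_quality=0.8):
    """Drop low-quality records and group the rest by business-process segment."""
    segments = defaultdict(list)
    for record in records:
        if record["quality"] >= min_quality:
            segments[record["segment"]].append(record)
    return dict(segments)


def incremental_rounds(segments):
    """Yield (segment_name, cumulative_training_set), adding one segment per round."""
    cumulative = []
    for name in sorted(segments):
        cumulative = cumulative + segments[name]
        yield name, list(cumulative)


if __name__ == "__main__":
    for name, training_set in incremental_rounds(curate(RECORDS)):
        # In a real project, this is where you would tune and test the model.
        print(f"after adding '{name}': {len(training_set)} curated records")
```

Each round adds exactly one business-process segment, so if the model's behaviour changes after a tuning step you know which slice of data to go and look at. The curation itself, deciding what counts as high quality, is still human work.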

A short, but sweet moment of appreciation for the data people out there.

Joris van Seijen

Transformation consultant @ Studio Winegum

1 yr

Nice to meet you Kai. Great topics we discussed. :)

Milan Meyberg

Founder of Emissary of GAIA and Keynote Speaker on Environmental AI, Ecological Narratives, and the transition into the Symbioscene.

1 yr

Nice to meet you! I hope we can continue our conversation in the near future.

Kai Bergin

Head of AI @MVR Digital Workforce

1 yr

Menno van de Lagemaat, this is a sort of follow-up to what I was talking to you about (but with a bit more structure)

Frank Mester

Chief Executive Officer @ MvR Digital Workforce | IT Solutions, New Business Development

1 yr

It was a great meeting, Kai Bergin. Thanks for inviting me, and thanks to Barend Jungerius for organizing this event.
