Privacy and AI #18
In this edition of Privacy and AI
AI REGULATION
- California AI Transparency
- ICO consultation on the application of privacy law to the development and use of GenAI
- Use of Generative AI tools in legal documents (Singapore)
- Recommendations for the use of AI coding assistants
- Deepfake guidance SDAIA
AI GOVERNANCE
- AI Operating Model (McKinsey)
- AI Operating Model (IBM)
- Organizational Digital Governance (IAPP survey)
- Choosing the Right Foundation Model (IBM)
- Operationalizing AI governance: Atlassian's Responsible Technology Review Template
- Evaluation of whole-of-government trial into generative AI (Australian Government)
RECOMMENDED READINGS
- What Is ChatGPT Doing ... and Why Does It Work? (Wolfram, 2023)
- Robot Ethics by Mark Coeckelbergh
- Yann LeCun on Human-Level AI
- What is AI? Karen Hao, MIT Technology Review
- People's attitudes towards AI (AISI)
AI REGULATION
California AI Transparency
To whom does it apply?
GenAI providers with more than 1 million monthly visitors
When will it become operative?
January 2026
What are the GenAI providers’ main obligations?
1) GenAI providers must make available, free of charge, an AI detection tool that allows users to assess whether content was artificially created. In addition, they must provide system provenance data regarding the content, subject to certain technical and privacy restrictions.
2) GenAI providers must also provide users with latent disclosures (metadata necessary to identify the provider) and give users the option to include manifest disclosures (a notice that the content was synthetically generated); see the illustrative sketch after this list.
3) Where the GenAI provider licenses the GenAI system, it must ensure that its customers maintain the system’s capability to make the disclosures (latent and manifest).
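Purely as an illustration of what a latent disclosure might look like in practice (the Act does not prescribe a specific schema, and every field name below is a hypothetical placeholder, not taken from the law), a provider could attach machine-readable provenance metadata to each generated output:

```python
import json
from datetime import datetime, timezone

def build_latent_disclosure(provider: str, system_name: str, content_id: str) -> str:
    """Return a hypothetical machine-readable provenance record for a piece of
    AI-generated content. Field names are illustrative only; the Act does not
    mandate this schema."""
    record = {
        "provider": provider,            # name of the GenAI provider
        "system": system_name,           # identifier of the generating system
        "content_id": content_id,        # reference to the specific output
        "ai_generated": True,            # flag that the content is synthetic
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record)

# Example usage with made-up values
print(build_latent_disclosure("ExampleAI Inc.", "examplegen-1", "img-0001"))
```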
What is out of the scope of the law?
The law does not apply to any product, service, internet website, or application that provides exclusively non-user-generated video game, television, streaming, movie, or interactive experiences.
What are the consequences for violating the Act?
The law establishes penalties of $5,000 per violation
Link here
ICO consultation on the application of privacy law to the development and use of GenAI
- Chapter one: The lawful basis for web scraping to train generative AI models
- Chapter two: Purpose limitation in the generative AI lifecycle
- Chapter three: Accuracy of training data and model outputs
- Chapter four: Engineering individual rights into generative AI models
- Chapter five: Allocating controllership across the generative AI supply chain
Follow this link to find all the chapters
Use of Generative AI tools in legal documents (Singapore)
The Supreme Court of Singapore issued a circular on the use of GenAI tools in Court Documents
While the guidance is specific to the Singaporean judicial system, with some adaptations it can be used as guidance for legal teams.
Below I have pasted the relevant parts and adapted them for legal teams generally.
Accountability
(1) You are fully responsible for the content in all your legal documents
- Assess whether the output produced by the GenAI tool is suitable to be used in your specific case.
- Ensure that any AI-generated output used in your legal documents:
i. is accurate;
ii. is relevant; and
iii. does not infringe IP rights (e.g., copyright).
- GenAI tools should not be used to generate any evidence that you wish to rely upon. For example, you cannot use GenAI to ask for evidence to be created, fabricated, embellished, strengthened or diluted. Asking a GenAI tool to generate a first-cut draft of an affidavit/statement can be done, but it is not acceptable to ask a GenAI to fabricate or tamper with evidence.
(2) Existing requirements for you to produce case law, legislation, textbooks and articles which you have referred to continue to apply.
Ensuring Accuracy
(3) To ensure accuracy in your legal documents:
- Fact-check and proof-read any AI-generated content
- Edit and adapt AI-generated content to suit your situation.
- Verify that any references to case law, legislation, textbooks or articles provided as AI-generated content actually exist and stand for the legal positions that are attributed to them. If the AI-generated content includes extracts or quotes, you must verify that these are extracted/quoted accurately and attributed to the correct source.
- When checking the case law, legislation, etc, use a source that is known to have accurate content.
- It is not sufficient verification for you to ask a GenAI tool for confirmation that the materials exist or contain the content that the AI generated content says it does. Do not use one GenAI tool to confirm the content generated from another GenAI tool.
(4) Be prepared to identify the specific portions of the legal documents which used AI-generated content, and explain how you have verified the output produced by a GenAI tool. You may be asked to explain this if there are any doubts about any of your legal documents or a lack of compliance with applicable guidelines.
Protecting intellectual property rights and confidential or sensitive information
(5) Always respect IP rights in the legal documents that you produce, file or submit.
- Ensure that proper source attribution is provided, where appropriate. Include in the legal documents the original source of any material that you use or reference. For example, if you have quoted from an article, you should state who the writer of the article was and the title and year of publication of the article.
(6) Ensure that there is no unauthorised disclosure of confidential or sensitive information when you use GenAI tools. All information you provide to GenAI chatbots may potentially be disclosed publicly. If you include personal information it is possible that the GenAI tool will store the information for various purposes. You must comply with confidentiality orders and laws, personal data protection laws, intellectual property laws and legal privilege when using GenAI tools.
Link here
Recommendations for the use of AI coding assistants
The French Cybersecurity Agency (ANSSI) and the German Federal Office for Information Security (BSI) released guidance for the secure use of AI coding assistants
One of the most salient aspects of the guidance is that it addresses particular measures that management should take for the secure use of AI coding assistants
For what purposes can AI code assistants be used?
AI can be used for:
- code generation
- debugging
- code explanation
- test case generation
- code formatting and documentation
- annotated code translation
What are the risks?
- missing confidentiality of inputs
- automation bias
- lack of output quality and security (including hallucinations, prompt injection, data/model poisoning)
- supply chain attacks and malicious code
What mitigations can be implemented?
At management level
- hire experienced developers
- discourage the use of shadow IT by providing alternatives to developers
- perform risk assessments
- develop security guidelines and guidelines about tools
- train employees
- evaluate the use of AI coding assistants
At employee (developer) level
- understand the limitations of AI coding assistants
- check outputs
- conduct trainings
- share experience and knowledge with colleagues
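As a hedged, minimal illustration of the "check outputs" mitigation in the list above (this is not part of the ANSSI/BSI guidance itself), a developer could run a lightweight static check over AI-suggested code before accepting it, for example flagging dynamic-execution calls that warrant manual review:

```python
import ast

RISKY_CALLS = {"eval", "exec", "compile", "__import__"}

def flag_risky_calls(source: str) -> list[str]:
    """Parse AI-suggested Python code and report calls that warrant manual review.
    This is a toy pre-acceptance check, not a substitute for code review,
    testing or established static analysis tooling."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings

# Example: a suspicious AI-generated snippet
suggested = "user_input = input()\nresult = eval(user_input)\n"
print(flag_risky_calls(suggested))  # ['line 2: call to eval()']
```

A toy check like this complements, rather than replaces, human code review, testing and established security tooling.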
Link here
Deepfake guidance SDAIA
The KSA data protection and AI authority (SDAIA) opened for public comments a guidance on deepfakes (or digital twins).
Importance
- It identifies deepfake use cases, permitted uses and malicious use cases.
- It provides guidance for relevant stakeholder groups
a) For GenAI developers:
- It requires them to conduct a risk assessment (a questionnaire is included in the guidance annex)
- It requires them to report any ‘unauthorised or unethical use’ of deepfake tech
b) For content creators:
- It requires express consent from the person. This goes beyond the KSA PDPL, which allows the processing of personal data for legitimate business purposes (legitimate interests). A consent form is attached to the guidance
- It requires watermarking the content as artificially generated
c) Interestingly, it also provides guidance for consumers, suggesting that they evaluate the message, assess the audio-visual elements, use content authentication tools, and report misuse of the technology
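As a hedged sketch of the watermarking requirement for content creators (the guidance does not prescribe a specific technique, the tool name below is hypothetical, and the example assumes the Pillow library is available), one simple approach is to embed an "AI-generated" label in the image's metadata. Note that metadata labels are easy to strip, so robust provenance would rely on standards such as C2PA:

```python
from PIL import Image, PngImagePlugin

def label_as_ai_generated(image: Image.Image, out_path: str) -> None:
    """Save an image with a PNG text chunk declaring it AI-generated.
    Metadata labels are easy to remove, so this only illustrates the
    disclosure idea; it is not a tamper-proof watermark."""
    meta = PngImagePlugin.PngInfo()
    meta.add_text("ai_generated", "true")
    meta.add_text("generator", "example-deepfake-tool")  # hypothetical tool name
    image.save(out_path, pnginfo=meta)

# Example usage with a placeholder image
img = Image.new("RGB", (64, 64), color="gray")
label_as_ai_generated(img, "labelled.png")
print(Image.open("labelled.png").text)  # {'ai_generated': 'true', 'generator': ...}
```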
Link here
AI GOVERNANCE
AI Operating Model (McKinsey)
What is an AI Operating Model?
An AI Operating Model is a representation of how an organization's components are organized and function together to execute a strategy. It is a blueprint for how a business puts strategy into action: a representation of how a company runs, including its structure (roles and responsibilities, governance, and decision making), processes (performance management, systems, and technology), and people (skills, culture, and informal networks).
1) Centralized
+ fastest skill and capability building for the genAI team
- gen AI team can be siloed from the decision-making process and distant from the business units and other functions, creating a possible barrier to influencing decisions.
2) Centrally led, BU executed
+ more integration between the BUs and the genAI team, reducing friction and easing support for enterprise-wide use of the technology.
- It can slow execution of the genAI team’s use of the technology because input and sign-off from the business units is required before going ahead.
3) BU led, centrally supported
+ easier buy-in from the BUs and functions, as genAI strategies bubble from the bottom up.
- Difficult to implement uses of genAI across various BUs, and different BUs can have varying levels of functional development on genAI.
4) Decentralized
+ Easier buy-in from the BUs and functions, and specialized resources can produce relevant insights quickly, with better integration within the unit or function.
- BUs that explore genAI risk lacking the knowledge and best practices that can come from a more centralized approach. They can also have difficulty going deep enough on genAI projects to achieve a significant breakthrough.
Organizations tend to have GenAI operating models that are highly centralized. It is expected that, as the technology, governance and employees' skills mature, operating models will become more hybrid or federated.
Link here
AI Operating Model (IBM)
IBM recently launched a whitepaper "An operating model for AI at scale"
They focus their attention on how the AI operating model can optimize the expected business value
One of the most interesting aspects of the paper is the comparison among the different types of AI operating models, their benefits and downsides.
At the extremes we find Centralized approaches (teams, resources, tools and data all sit in a single location and cannot be accessed by units outside the centralized unit) and Decentralized approaches (resources are dispersed across the organization in different silos, with no view of analytics activities outside their respective unit).
Between these two approaches there are many hybrid options, such as the Centre of Excellence (CoE) -which is, in my opinion, one of the most common mixed types-, Consulting, Factory, etc.
Details of each can be found in the attached table
Important to note is that there is no one-size-fits-all operating model for AI. The selection of the most appropriate OM for the organization should be based on:
- Business Size and Scope
- AI Maturity
- Geographic Spread
- Resource Availability
- Industry-Specific Needs
Link here
Organizational Digital Governance (IAPP survey)
In September, the IAPP published a report describing how organizations are defining and implementing their internal digital governance structures.
Highlights
- Evolving role of CPOs
Most CPOs have seen their remits expanded, in particular to include AI governance and data governance/ethics.
- Organizational governance
The survey provides interesting insights into the AI governance structures of companies and presents three AI governance models (analog, augmented, aligned).
A) Analog governance
- less mature organizations
- digital governance on top of existing structures
- no coordinated/coherent digital governance programme (eg, committees with overlapping remits and agendas)
- absence of risk ownership and subject-matter expertise (SME) in the first line (1L) → the second line (2L) is expected to design and operationalize controls
B) Augmented governance
- defined and structured approach to digital governance
- domain-specific committees chaired by domain leads
- emergence of AI Governance Committee to coordinate commercial and compliance functions
- risk ownership in 1L, monitoring risk 2L
- establishment of a Risk Committee (interpretation of law and risk-based decision-making) and a Data Governance Committee (value-based decision-making), and potentially an Ethics Advisory Committee (ethical decision-making: "should or should not" questions)
I think this should be the medium-term aspiration for data-driven companies
A note on AI Governance Committee
A potential structure of the AI Gov Committee is the following
1. Board-level AI Committee: ultimate accountability for decision-making and direction-setting regarding the organization's AI activities.
2.A) Innovation Board: may conduct research and development activities for novel AI uses
2.B) AI Strategy, Governance and Operations Committee: day-to-day decision-making on AI governance within the borders of the organization's broader AI strategy.
3) AI Governance Business Partner Network: AI governance experts in 1L/2L to support the design and operationalization of AI governance controls.
C) Aligned governance
- controls automation and coordination in governance activities
- SME centralization
- decision-making decentralization
One thing I would love to see in the report is which industries typically choose each of these governance models.
Benefits of a more coherent digital governance model
- improved clarity over digital strategy and compliance
- greater visibility and decision-making across the organization
- improved coordination across digital governance subdomains
Link here
Choosing the Right Foundation Model (IBM)
IBM developed a framework for AI foundation model selection.
1. IDENTIFY use case
- Clearly articulate the use case
- Craft the prompt and ideal answer, and then work backwards from there to find the data needed to provide the desired answer
- Work closely with product and engineering teams
2. LIST all model options and identify each model’s size, performance, risks and costs
- List all model options and identify each model’s size, performance and risks (eg 70B vs 13B model)
- Review model card, and evaluate whether the model was trained for specific use cases matching your needs
3. EVALUATE model size, performance, risks and costs
- Select the right-size model for your specific use case. Begin with the best-performing model and use a basic prompt to get your ideal performance. Scale down to a smaller model and use techniques such as prompt tuning to see if you can get the same results (cost-efficiency)
Criteria:
- Accuracy: how close the generated output is to the target output (against benchmarks and metrics)
- Reliability: how consistently the model generates the same output, and how well the model avoids undesired outputs (toxicity, hallucinations, discrimination, etc)
- Speed: how quickly the user gets the output. This is critical in real-time applications that demand low-latency responses (eg chatbots)
- Cost: how cost effective is the selected model?
- Larger models (eg 70B) tend to be more accurate and provide higher-quality outputs, but they are also generally slower and more costly and require more computing power
- Smaller models (eg 13B) tend to be less accurate but provide responses faster.
4. TEST options
- Use metrics to evaluate quality of the outputs. Select the metrics according to the use case.
- Try to achieve high performance with smaller models, using prompt engineering and model tuning. Improve accuracy with specific datasets
5. CHOOSE model that provides most value
- The tradeoff between accuracy and costs will depend to a large extent on the use case.
- Consider also how the use of larger models may impact ESG goals
- Evaluate compatibility with applications, vendors, and your AI and data platform, and whether the model will be deployed on premises, in the cloud or in a hybrid setup.
Summary
For low-stakes use cases, smaller and (slightly) less accurate models may be preferable when considering cost, latency (speed) and transparency. Smaller models can be scaled across the organization for multiple use cases. Techniques such as prompt tuning can improve the performance of these models
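To make the accuracy/latency/cost trade-off concrete, here is a minimal, hedged sketch of the kind of comparison steps 3 and 4 describe. The model names, prices and `generate` callables are hypothetical stand-ins, not IBM's framework or real benchmark figures:

```python
import time

# Hypothetical candidates: stand-ins for a larger and a smaller model
def big_model(prompt: str) -> str:    # e.g. a 70B-class model
    return "Paris" if "France" in prompt else "Tokyo"

def small_model(prompt: str) -> str:  # e.g. a 13B-class model
    return "Paris" if "France" in prompt else "unsure"

CANDIDATES = [("large-70b", big_model, 0.90), ("small-13b", small_model, 0.20)]

TEST_SET = [  # (prompt, target output) pairs for an exact-match style check
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

def evaluate(name, generate, cost_per_1k_tokens_usd):
    """Score one candidate model on accuracy, average latency and unit cost."""
    correct, latencies = 0, []
    for prompt, target in TEST_SET:
        start = time.perf_counter()
        output = generate(prompt)
        latencies.append(time.perf_counter() - start)
        correct += int(target.lower() in output.lower())
    return {
        "model": name,
        "accuracy": correct / len(TEST_SET),
        "avg_latency_s": sum(latencies) / len(latencies),
        "cost_per_1k_tokens_usd": cost_per_1k_tokens_usd,
    }

for name, fn, cost in CANDIDATES:
    print(evaluate(name, fn, cost))
```

In a real evaluation the test set would be use-case specific, the metrics chosen accordingly, and the cost figures taken from the provider's actual pricing.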
Link here
Operationalizing AI governance: Atlassian's Responsible Technology Review Template
Atlassian published their responsible technology principles and also made public their template for responsible technology review.
Link here
Evaluation of whole-of-government trial into generative AI (Australian Government)
The AU Digital Transformation Agency conducted a trial into GenAI. It made Microsoft 365 Copilot available to over 7,600 staff across 60+ government agencies.
The report is worth reading for leaders planning the implementation of GenAI tools, in particular those rolling out Microsoft 365 Copilot.
Recommendations from the DTA
Detailed and adaptive implementation
- Product selection: Consider which generative AI solutions are most appropriate for your overall operating environment and specific use cases, particularly for AI Assistant Tools.
- System configuration: Configure information systems, permissions, and processes to safely accommodate generative AI products.
- Specialised training: Offer specialised training reflecting specific use cases and develop general generative AI capabilities, including prompt training
- Change management: Effective change management should support the integration of generative AI by identifying ‘Generative AI Champions’ to highlight the benefits and encourage adoption.
- Clear guidance: Provide clear guidance on using generative AI, including when consent and disclaimers are needed, such as in meeting recordings, and a clear articulation of accountabilities.
Encourage greater adoption
- Workflow analysis: Conduct detailed analyses of workflows across various job families and classifications to identify further use cases that could improve generative AI adoption.
- Use case sharing: Share use cases in appropriate whole-of-government forums to facilitate the adoption of generative AI across the organization
Proactive risk management
- Impact monitoring: Proactively monitor the impacts of generative AI, including its effects on the workforce, to manage current and emerging risks effectively
Link here
What is AI? Karen Hao, MIT Technology Review
Link here
RECOMMENDED READINGS
What Is ChatGPT Doing ... and Why Does It Work? (Wolfram, 2023)
Stephen Wolfram is a renowned scientist who has been working on machine learning for more than 30 years.
In this book, Wolfram gives an easy-to-read and engaging explanation of how ChatGPT works. It is also suitable for those without a computer science background.
It is the clearest and most comprehensive explanation of neural networks and LLMs I've seen so far. I recommend it for those who want to know more about the inner workings of ChatGPT and LLMs in general.
"So how is it, then, that something like ChatGPT can get as far as it does with language? The basic answer, I think, is that language is at a fundamental level somehow simpler than it seems. And this means that ChatGPT—even with its ultimately straightforward neural net structure—is successfully able to “capture the essence” of human language and the thinking behind it. And moreover, in its training, ChatGPT has somehow “implicitly discovered” whatever regularities in language (and thinking) make this possible.
The success of ChatGPT is, I think, giving us evidence of a fundamental and important piece of science: it’s suggesting that we can expect there to be major new “laws of language”—and effectively “laws of thought”—out there to discover. In ChatGPT —built as it is as a neural net—those laws are at best implicit. But if we could somehow make the laws explicit, there’s the potential to do the kinds of things ChatGPT does in vastly more direct, efficient—and transparent—ways."
He posted the content of the book on his website, and the book is also available in paperback on Amazon
Paperback: https://amzn.eu/d/iqoROua
Wolfram's Website (free version HTML) https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
Robot Ethics by Mark Coeckelbergh
This book explores the ethical questions that emerge or may emerge from the development and use of robots, illustrated with different use cases, like robot companions, self-driving cars, robots in warfare, etc.
Questions addressed here are:
- the future of work and robotics, and how this would affect the job market, job displacement and the dignity attached to our jobs.
- robotic omnipresence and the privacy issues it raises
- the ethics of care robots and what happens to the dignity of the elderly: “when robots are used for adults, such as elderly people, there is a concern that this might not respect the dignity of people and instead infantilize them”
- self-driving cars, and whose lives should be prioritised in the case of an accident
- whether robots have agency and moral standing. He distinguishes between:
- “direct” moral standing: whether there are any intrinsic properties of the robot that warrant giving it any particular moral standing (are robots just mere things or goods?)
- “indirect” moral standing: whether humans should treat robots well not because of their particular properties but rather because we humans have moral standing and therefore need to be treated well. What counts here is the character and good of humans. “Kicking a robot is bad not because any harm is done to the robot but rather because … it trains a bad moral character on the part of the human, it is not virtuous”. What do you think?
- drone robots for war and autonomous weapons
In a nutshell, it is a book for those looking to explore questions beyond the operational problems of developing and using robots in our daily lives.
Coeckelbergh, Robot Ethics (MIT Press, 2022)
Link to the post here
Yann LeCun on Human-Level AI
LeCun argues that current LLMs are incapable of achieving human-level intelligence
Link here
People's attitudes towards AI
UK's AI Safety Institute (AISI) studied whether people want AI to be more human-like
Key findings
Most people
- agree that AI should transparently reveal itself not to be human, but many were happy for AI to talk in human-realistic ways.
- felt that AI systems should refrain from expressing emotions, unless they were idiomatic expressions (like “I’m happy to help”).
- were strongly opposed to the idea that people could or should form personal relationships with AI systems, and this view was stronger among those who did not have a college degree.
- were uncertain about whether AI systems should take the blame for their own actions, or whether it was possible for them to be immoral.
- were against being rude or insulting to an AI chatbot
People were generally neutral to
- blaming the AI provider for incorrect medical advice which causes harm to the end user
- considering an AI chatbot that discriminates against women when making hiring decisions to be sexist
Link here
Unsubscription
You can unsubscribe from this newsletter at any time. Follow this link to know how to do it.
ABOUT ME
I'm working as AI Governance Manager at Informa. Previously I worked as senior privacy and AI governance consultant at White Label Consultancy. I previously worked for other data protection consulting companies.
I'm specialised in the legal and privacy challenges that AI poses to the rights of data subjects and how companies can comply with data protection regulations and use AI systems responsibly. This is also the topic of my PhD thesis.
I have an LL.M. (University of Manchester), and I'm a PhD (Bocconi University, Milano).
I'm the author of “Data Protection Law in Charts. A Visual Guide to the General Data Protection Regulation“ and "Privacy and AI". You can find the books here