Revamping Third Party Vendor Assessments for the Age of Large Language Models

Introduction

The increasing adoption of Large Language Models (LLMs) in the supply chain presents a new challenge for traditional Third-Party Vendor Risk Assessments (TPVRAs). This blog explores how to adapt existing TPVRAs to gather critical information about the integration of LLMs within the organizational ecosystem and the associated risks. A subsequent blog will outline the specifics of updating Master Service Agreements (MSAs) to address LLM supply chain risks, providing a comprehensive approach to governing LLM risk across the supply chain.

To help you get started, Appendix 1 includes a sample set of questions specifically tailored to assessing LLM usage within vendor products. This "strawman" approach can be adapted to your specific environment and needs. While it's a work in progress and requires further refinement, it serves as a springboard for developing a more comprehensive LLM assessment framework through community collaboration.

The LLM Conundrum in the Supply Chain

The opaque nature of vendor products can make it difficult to ascertain whether they leverage LLMs. Traditional TPVRAs focus on infrastructure security, software vulnerabilities, network security posture, data protection practices, and integration overhead. This focus, however, often overlooks the risks associated with embedded LLMs, which may not be explicitly disclosed by vendors. Because existing TPVRAs rarely ask about the inner workings of vendor products, LLM components can escape assessment entirely.

The inclusion of LLMs can bring inherited risks, potentially impacting the entire ecosystem. Below are some of the main concerns:

  • Security Vulnerabilities in LLMs: Underlying LLM code might harbor vulnerabilities exploitable by attackers.
  • Data Privacy Concerns: LLMs may inadvertently process sensitive data during training or inference, raising privacy compliance issues.
  • Bias and Fairness: Unidentified biases within the LLM can lead to discriminatory outputs impacting your organization and its customers.

Adapting Your TPVRA Arsenal for LLM Detection

To effectively assess LLM-related risks within your supply chain, we need to adapt existing TPVRA practices. Consider the following possible modifications to your existing TPVRA strategy:

  1. Enhanced Vendor Questionnaires: Expand existing questionnaires to explicitly inquire about LLM usage in vendor products. Ask vendors to disclose:

  • Whether their products or services utilize LLMs
  • Whether they have undergone LLM-specific security testing
  • Whether they employ explainability techniques to understand LLM decision-making
  • The specific functionalities and purposes of any LLMs used
  • Their security measures for LLM development and deployment

  2. Focus on Data Provenance, Security and Retention: Request detailed information about the data used for training, including its origin and how it is utilized. Evaluate the vendor's data security practices throughout the LLM lifecycle, focusing on how they handle sensitive data during both training and inference. This evaluation should encompass measures to prevent unauthorized access and potential data leakage, as well as adherence to a well-defined data retention policy. Additionally, assess the vendor's practices for mitigating risks of data poisoning or bias.
  3. Request Transparency and Explainability: Seek assurances that the vendor provides some level of explainability for the LLM's outputs. This allows for a basic understanding of how the LLM arrives at its conclusions and facilitates potential bias detection.
  4. Third-Party Penetration Testing with LLM Expertise: Engage penetration testing firms with expertise in LLM security. These experts can conduct targeted attacks to evaluate the LLM's robustness against adversarial manipulation.
  5. Engaging with Specialized Security Firms: Consider partnering with security firms specializing in LLM security assessments. These firms possess advanced techniques for identifying and analyzing LLM-related vulnerabilities.
  6. Leveraging Code Analysis Tools: If access to the vendor code base is available, consider incorporating code scanning/analysis tools into the TPVRA process to detect the presence of LLM libraries or frameworks within vendor products, even when they are not explicitly disclosed.
  7. Industry Collaboration and Standardization: Advocate for industry-wide collaboration to develop standardized methods for assessing LLM risks within third-party products. Standardization will create a more consistent and effective approach across the supply chain.
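The code-analysis idea above can be sketched in a few lines. The following is a minimal, illustrative scanner (not a production tool) that checks Python dependency manifests for packages commonly associated with LLM usage; the package list is an assumption for illustration and is far from exhaustive:

```python
import re
from pathlib import Path

# Illustrative (not exhaustive) package names that suggest LLM usage.
LLM_PACKAGES = {
    "transformers", "openai", "anthropic", "langchain",
    "llama-cpp-python", "sentence-transformers", "vllm",
}

def scan_requirements(text: str) -> set[str]:
    """Return LLM-related packages found in a requirements.txt-style manifest."""
    found = set()
    for line in text.splitlines():
        line = line.split("#")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is everything before a version specifier or extra.
        name = re.split(r"[=<>!~\[;]", line, maxsplit=1)[0].strip().lower()
        if name in LLM_PACKAGES:
            found.add(name)
    return found

def scan_repo(root: str) -> set[str]:
    """Scan all requirements*.txt files under a repository root."""
    hits = set()
    for manifest in Path(root).rglob("requirements*.txt"):
        hits |= scan_requirements(manifest.read_text())
    return hits
```

In practice you would extend this idea to other ecosystems (package.json, pom.xml, go.mod) and pair it with binary and container-image scanning, since manifest names alone can miss vendored or obfuscated dependencies.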

Beyond Detection: Mitigating LLM Risks in the Supply Chain

Once organizations identify LLM usage within their supply chain, they should adopt a proactive and multi-faceted approach that may include:

  • Contractual Safeguards: Integrate LLM-specific risk mitigation clauses into vendor contracts. These clauses can address aspects like bias mitigation strategies, explainability requirements, and data security protocols.
  • Continuous Monitoring: Establish ongoing monitoring procedures to detect potential changes in LLM behavior or functionality within vendor products.
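To make "continuous monitoring" concrete, one lightweight pattern is to replay a fixed set of canary prompts against the vendor product on a schedule and flag responses that drift from an approved baseline. The sketch below is a simplified illustration: the string-similarity metric and the 0.6 threshold are assumptions for demonstration; a real deployment would likely use semantic similarity and vendor-specific thresholds.

```python
from difflib import SequenceMatcher

DRIFT_THRESHOLD = 0.6  # illustrative value; tune per use case

def drift_report(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return the canary prompt IDs whose current response diverges from baseline.

    `baseline` and `current` both map canary prompt IDs to model responses.
    """
    drifted = []
    for prompt_id, expected in baseline.items():
        observed = current.get(prompt_id, "")
        similarity = SequenceMatcher(None, expected, observed).ratio()
        if similarity < DRIFT_THRESHOLD:
            drifted.append(prompt_id)
    return drifted
```

Drifted prompt IDs would then feed an alerting workflow that triggers a re-assessment of the vendor's LLM behavior.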

Benefits of an LLM-Aware TPVRA

The main advantage of incorporating LLM questions into the TPVRA is the ability to address model risks earlier and more effectively. By integrating LLM awareness into our TPVRA, we can achieve:

  • Proactive Risk Mitigation: Early identification of LLM usage allows for proactive mitigation strategies, such as conducting deeper security assessments or negotiating limitations on LLM functionalities.
  • Enhanced Transparency and Trust: A comprehensive TPVRA approach fosters greater transparency between vendors and clients, building trust within the supply chain ecosystem.
  • Informed Decision-Making: By understanding LLM usage within third-party products, organizations can make informed decisions about their adoption and integration into their own systems.

The Future of LLM TPVRAs: Collaboration and Innovation

Adapting TPVRAs for LLM detection and evaluation is an ongoing and iterative process. Collaboration with vendors, security experts, and industry bodies is key to developing robust assessment methodologies. As the LLM landscape evolves, so too should our TPVRA strategy. By staying vigilant and continuously innovating, we can ensure a secure and reliable LLM-powered supply chain.

This blog provides an initial framework for adapting your TPVRAs. The specific questions and techniques employed will depend on your unique risk tolerance, the nature of your business and vendor relationships. As the LLM ecosystem matures, new tools and best practices will emerge. By proactively adapting your TPVRA strategy, you can stay ahead of the curve and navigate the exciting, yet risk-laden, world of LLMs within your supply chain.

Appendix 1

Sample LLM Risk Assessment for Third-Party Vendors

Overview

This assessment evaluates potential vendors providing Generative AI (GenAI) technologies, such as large language models for text generation, code generation, image creation, etc. The use of GenAI carries risks around data privacy, security, intellectual property, and ethical concerns that must be carefully reviewed before a decision is made to engage specific vendors.

These are some key areas that should be reviewed when assessing GenAI vendors. However, the specific criteria and weighting will depend on each organization's unique requirements, industry, and risk tolerance. The assessment can be further customized based on the specific GenAI use case being pursued.
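To show what "criteria and weighting" might look like in practice, here is a minimal weighted scoring sketch. The risk areas, weights, and risk-band thresholds below are illustrative assumptions only, not a recommended standard:

```python
# Illustrative weighted scoring for questionnaire results.
# Each area is scored 1 (low risk) to 5 (high risk) by the assessor.
WEIGHTS = {
    "data_privacy": 0.30,
    "security": 0.25,
    "intellectual_property": 0.15,
    "safety_integrity": 0.20,
    "ethics": 0.10,
}

def overall_risk(scores: dict[str, int]) -> tuple[float, str]:
    """Combine per-area scores (1-5) into a weighted score and a risk band."""
    weighted = sum(WEIGHTS[area] * scores[area] for area in WEIGHTS)
    if weighted < 2.0:
        band = "Low Risk"
    elif weighted < 3.5:
        band = "Moderate Risk"
    else:
        band = "High Risk"
    return round(weighted, 2), band
```

A scorecard like this makes vendor comparisons repeatable, but the band cutoffs should be calibrated against your own risk appetite rather than taken from this sketch.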

The following flowchart presents a conceptual model for assessing third-party vendors that employ LLMs, emphasizing the need for a dynamic and adaptive evaluation process that accounts for the distinctive risks and considerations surrounding LLMs. This proposed approach aims to inspire innovation and flexibility in LLM risk assessment while ensuring alignment with company policies and regulatory requirements.

Flowchart 1: Simplified assessment flow

Assessment Areas Questionnaire

Each of the risk areas below can be evaluated across multiple dimensions - the vendor's processes, the underlying model/system characteristics, output control features, documentation and transparency, and conformance to standards and regulations.

  1. Vendor Information

  • Vendor Name:
  • Product/Service Description:
  • GenAI Model(s) Used:
  • [Y/N] Does the vendor provide a Model Card for the model(s) utilized?

  2. Data Privacy & Security Risks

  • [Y/N] Does the vendor have robust data privacy policies and controls in place?
  • [Y/N] Will any of our proprietary or sensitive data be used to train/fine-tune the vendor's models? If yes, please provide details on:
  • Data residency and compliance with data localization laws
  • Potential for data leaks or misuse of training data
  • Robustness of data encryption and access controls
  • Exposure of sensitive personal or corporate information

  3. Security

  • [Y/N] Does the vendor follow security best practices (encryption, access controls, etc.)?
  • [Y/N] Have their models/systems undergone pen-testing and secure code review?
  • [Y/N] Do they have incident response and breach notification processes?

  4. Intellectual Property

  • [Y/N] Are there any potential copyright infringement risks from the training data?
  • [Y/N] Is there any potential risk of trade secret/confidential info exposure?
  • [Y/N] Do the vendor's terms allow for derivative use of generated outputs?
  • [Y/N] Are there any known constraints around commercialization of generated outputs?
  • How does the vendor handle potential copyright/licensing issues with training data?
  • How are the patentability and defensibility of generated inventions handled?

  5. Safety & Integrity Risks

  • [Y/N] Is there a potential for generating harmful, illegal, or explicit content?
  • [Y/N] Are there any safeguards against hate speech, violence, and misinformation?
  • How does the vendor ensure the consistency and coherence of outputs across contexts?
  • How are the risks of hallucinations or factual inaccuracies handled?

  6. Robustness & Reliability

  • What is the susceptibility to adversarial attacks or prompt injections?
  • [Y/N] Does the vendor practice failure mode analysis and defense against edge cases?
  • What are the processes for safe deployment, monitoring, and updates?
  • How are uptime, throughput, and scalability guaranteed?

  7. Ethical Considerations & Responsible Use

  • [Y/N] Does the vendor have ethical AI principles they follow?
  • What are the processes to analyze model outputs for unfair biases? E.g., benchmarking performance across different demographics.
  • How do they mitigate risks around bias, harmful outputs, and misinformation? E.g., mitigations for societal biases encoded in training data.
  • Does the vendor allow/restrict certain use cases based on ethical concerns? E.g., is there the ability to customize for culturally sensitive use cases?
  • How are human oversight and the ability to intervene implemented in model creation and output validation?
  • [Y/N] Are any external audits, certifications, or ethical reviews performed and available to review?

  8. Model Performance & Robustness

  • [Y/N] Has third-party testing/benchmarking been done on the vendor's models?
  • What are the failure modes and limitations of their GenAI systems? Provide details via an architecture diagram/flowchart or the corresponding Model Card (if applicable).
  • Does the vendor have processes to update and improve their models over time?

  9. Support & Services

  • What level of technical/integration support is provided for model adoption? Does this extend to managing/monitoring outputs?
  • [Y/N] Are professional services available for custom model fine-tuning/prompting?
  • [Y/N] Are there SLAs or uptime guarantees for their GenAI systems? Provide the schedule.

  10. Pricing & Costs

  • What are the pricing models (per-request, subscription, etc.)?
  • [Y/N] Are there any variable costs that could lead to pricing unpredictability?
  • How do costs compare to other vendors and/or in-house development?

Additional Generative AI Criteria

This section covers some of the key aspects that differentiate generative AI from other software products and services, with emphasis on data practices, model governance, output controls, and monitoring processes. The level of diligence may depend on the risk profile of the use case as well.

Here are some additional details that can be assessed specifically for Generative AI vendors and technologies:

  1. Data & Training:

  • Data sources and curation processes for training data (Does it have a model map?)
  • Techniques used to filter out toxic/biased data
  • Approach to data privacy/protection of training data
  • Approach to generating synthetic data (if applicable)
  • Ability to incorporate proprietary/customer data for fine-tuning

  2. Model Characteristics:

  • Model size, architecture, and training approach (e.g., constitutional AI)
  • Stated capabilities and limitations of the model
  • Performance benchmarks across different tasks/domains
  • Approaches to mitigate hallucinations and inconsistencies in output
  • Model monitoring and ability to handle sensitive topics (e.g., violence, hate speech) at the model level

  3. Output Controls:

  • Filtering options for unsafe/undesirable outputs
  • Customizable filters based on customer requirements
  • Watermarking, traceability of generated content
  • Retention policies for generated outputs
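To illustrate the kind of output-control hook an assessor might look for, here is a deliberately simple pattern-based filter. Real vendor filters rely on trained classifiers and policy engines; the patterns below are illustrative assumptions, not a recommended blocklist:

```python
import re

# Illustrative blocklist; real systems use trained classifiers, not keywords.
BLOCKED_PATTERNS = [
    re.compile(r"\b(password|api[_ ]?key)\s*[:=]", re.IGNORECASE),
]

def filter_output(text: str) -> tuple[str, bool]:
    """Redact matches of blocked patterns; return (filtered_text, was_filtered)."""
    filtered = text
    for pattern in BLOCKED_PATTERNS:
        filtered = pattern.sub("[REDACTED]", filtered)
    return filtered, filtered != text
```

When assessing a vendor, the useful questions are whether such a hook exists at all, whether its rules are customer-configurable, and whether filter hits are logged for audit.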

  4. Risks & Testing:

  • Potential risks (bias, misuse, IP violations, etc.)
  • Processes for risk assessments, red teaming, penetration testing, adversarial testing, etc.
  • Approach to model updates, breaking changes
  • Disclosure processes for vulnerabilities, weaknesses

  5. Monitoring & Assurance:

  • Monitoring of outputs for anomalies and policy violations
  • Mitigation techniques if faulty output is discovered
  • Human-in-the-loop approach
  • Provenance tracking and audit trails
  • Certifications, external audits, ethical reviews

Assessment Results

  1. Overall Risk Assessment

Based on the criteria evaluated above, the overall risk level is:

(Please provide a detailed assessment of the vendor's overall risk level)

  • [ ] Low Risk
  • [ ] Moderate Risk
  • [ ] High Risk

  2. Risk Mitigations

If moving forward, some potential mitigations for key risks include:

(Please describe the potential mitigations for key risks)

  • ...
  • ...
  • ...

For each line item above, please document:

  • Specific actions: Describe concrete steps to mitigate each risk (e.g., "Implement encryption for data in transit" or "Conduct regular security audits").
  • Responsibility: Assign ownership of each mitigation strategy to a specific team or individual (e.g., "IT Department" or "Vendor Manager").
  • Timeline: Provide a realistic timeline for implementing each mitigation strategy (e.g., "Within the next 6 weeks" or "Before contract renewal").
  • Metrics for success: Define how the effectiveness of each mitigation strategy will be measured (e.g., "Reduction in incident response time" or "Vendor achieves SOC 2 compliance").

Below are a few examples specific to GenAI risks that focus on mitigating potential issues like biased data, unexplainable models, harmful output, and the need for human oversight.

  • Data Quality Control:
  • Model Interpretability:
  • Output Filtering:
  • Human Oversight:

  3. Key Takeaways

  • Summary of main findings and implications
  • Clear statement of vendor's risk profile

  4. Action Items

  • List of specific tasks or next steps
  • Responsible personnel or teams for each action item

  5. Decision Rationale

  • Brief explanation of the reasoning behind the recommendation
  • Summary of key factors influencing the decision

  6. Recommendations

Based on this assessment, the recommendation is: (Please provide a recommendation based on this assessment)

  • [ ] Proceed with vendor
  • [ ] Do not proceed with vendor

  • [ ] Proceed with this vendor, but with the following conditions… (Please list all conditions)
