Revamping Third Party Vendor Assessments for the Age of Large Language Models

Introduction

The increasing adoption of Large Language Models (LLMs) in the supply chain presents a new challenge for traditional Third-Party Vendor Risk Assessments (TPVRAs). This blog explores how to adapt existing TPVRAs to gather critical information about the integration of LLMs within the organizational ecosystem and the associated risks. A subsequent blog will outline the specifics of updating Master Service Agreements (MSAs) to address LLM supply chain risks, providing a comprehensive approach to governing LLM risk across the supply chain.

To help you get started, Appendix 1 includes a sample set of questions specifically tailored to assessing LLM usage within vendor products. This "strawman" approach can be adapted to your specific environment and needs. While it's a work in progress and requires further refinement, it serves as a springboard for developing a more comprehensive LLM assessment framework through community collaboration.

The LLM Conundrum in the Supply Chain

The opaque nature of vendor products can make it difficult to ascertain whether they leverage LLMs. Traditional TPVRAs focus on infrastructure security, software vulnerabilities, network security posture, data protection practices, and integration overhead. This focus, however, often overlooks the risks associated with embedded LLMs, which may not be explicitly disclosed by vendors. Because existing TPVRAs rarely ask about the inner workings of vendor products, LLM components can escape assessment entirely.

The inclusion of LLMs can bring inherited risks, potentially impacting the entire ecosystem. Below are some of the main concerns:

  • Security Vulnerabilities in LLMs: Underlying LLM code might harbor vulnerabilities exploitable by attackers.
  • Data Privacy Concerns: LLMs may inadvertently process sensitive data during training or inference, raising privacy compliance issues.
  • Bias and Fairness: Unidentified biases within the LLM can lead to discriminatory outputs impacting your organization and its customers.

Adapting Your TPVRA Arsenal for LLM Detection

To effectively assess LLM-related risks within your supply chain, we need to adapt existing TPVRA practices. Consider the following possible modifications to your existing TPVRA strategy:

  1. Enhanced Vendor Questionnaires: Expand existing questionnaires to explicitly inquire about LLM usage in vendor products. Ask vendors to disclose:

  • Whether their products or services utilize LLMs
  • Whether they have undergone LLM-specific security testing
  • Whether they employ explainability techniques to understand LLM decision-making
  • The specific functionalities and purposes of any LLMs used
  • Their security measures for LLM development and deployment

  2. Focus on Data Provenance, Security and Retention: Request detailed information about the data used for training, including its origin and how it is utilized. Evaluate the vendor's data security practices throughout the LLM lifecycle, focusing on how they handle sensitive data during both training and inference. This evaluation should encompass measures to prevent unauthorized access and potential data leakage, as well as adherence to a well-defined data retention policy. Additionally, assess the vendor's practices for mitigating risks of data poisoning or bias.
  3. Request Transparency and Explainability: Seek assurances that the vendor provides some level of explainability for the LLM's outputs. This allows for a basic understanding of how the LLM arrives at its conclusions and facilitates potential bias detection.
  4. Third-Party Penetration Testing with LLM Expertise: Engage penetration testing firms with expertise in LLM security. These experts can conduct targeted attacks to evaluate the LLM's robustness against adversarial manipulation.
  5. Engaging with Specialized Security Firms: Consider partnering with security firms specializing in LLM security assessments. These firms possess advanced techniques for identifying and analyzing LLM-related vulnerabilities.
  6. Leveraging Code Analysis Tools: If access to the vendor code base is available, consider incorporating code scanning/analysis tools into the TPVRA process to detect the presence of LLM libraries or frameworks within vendor products, even when they are not explicitly disclosed.
  7. Industry Collaboration and Standardization: Advocate for industry-wide collaboration to develop standardized methods for assessing LLM risks within third-party products. Standardization will create a more consistent and effective approach across the supply chain.
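The code-analysis idea above can be sketched in a few lines. The following is a minimal, illustrative scanner (not a production tool) that checks Python dependency manifests for packages commonly associated with LLM usage; the package list is an assumption for illustration and is far from exhaustive:

```python
import re
from pathlib import Path

# Illustrative (not exhaustive) package names that suggest LLM usage.
LLM_PACKAGES = {
    "transformers", "openai", "anthropic", "langchain",
    "llama-cpp-python", "sentence-transformers", "vllm",
}

def scan_requirements(text: str) -> set[str]:
    """Return LLM-related packages found in a requirements.txt-style manifest."""
    found = set()
    for line in text.splitlines():
        line = line.split("#")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # The package name is everything before a version specifier or extra.
        name = re.split(r"[=<>!~\[;]", line, maxsplit=1)[0].strip().lower()
        if name in LLM_PACKAGES:
            found.add(name)
    return found

def scan_repo(root: str) -> set[str]:
    """Scan all requirements*.txt files under a repository root."""
    hits = set()
    for manifest in Path(root).rglob("requirements*.txt"):
        hits |= scan_requirements(manifest.read_text())
    return hits
```

In practice you would extend this idea to other ecosystems (package.json, pom.xml, go.mod) and pair it with binary and container-image scanning, since manifest names alone can miss vendored or obfuscated dependencies.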

Beyond Detection: Mitigating LLM Risks in the Supply Chain

Once organizations identify LLM usage within their supply chain, they should adopt a proactive and multi-faceted approach that may include:

  • Contractual Safeguards: Integrate LLM-specific risk mitigation clauses into vendor contracts. These clauses can address aspects like bias mitigation strategies, explainability requirements, and data security protocols.
  • Continuous Monitoring: Establish ongoing monitoring procedures to detect potential changes in LLM behavior or functionality within vendor products.
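To make "continuous monitoring" concrete, one lightweight pattern is to replay a fixed set of canary prompts against the vendor product on a schedule and flag responses that drift from an approved baseline. The sketch below is a simplified illustration: the string-similarity metric and the 0.6 threshold are assumptions for demonstration; a real deployment would likely use semantic similarity and vendor-specific thresholds.

```python
from difflib import SequenceMatcher

DRIFT_THRESHOLD = 0.6  # illustrative value; tune per use case

def drift_report(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return the canary prompt IDs whose current response diverges from baseline.

    `baseline` and `current` both map canary prompt IDs to model responses.
    """
    drifted = []
    for prompt_id, expected in baseline.items():
        observed = current.get(prompt_id, "")
        similarity = SequenceMatcher(None, expected, observed).ratio()
        if similarity < DRIFT_THRESHOLD:
            drifted.append(prompt_id)
    return drifted
```

Drifted prompt IDs would then feed an alerting workflow that triggers a re-assessment of the vendor's LLM behavior.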

Benefits of an LLM-Aware TPVRA

The main advantage of incorporating LLM questions into the TPVRA is the ability to address model risks earlier and more effectively. By integrating LLM awareness into our TPVRA, we can achieve:

  • Proactive Risk Mitigation: Early identification of LLM usage allows for proactive mitigation strategies, such as conducting deeper security assessments or negotiating limitations on LLM functionalities.
  • Enhanced Transparency and Trust: A comprehensive TPVRA approach fosters greater transparency between vendors and clients, building trust within the supply chain ecosystem.
  • Informed Decision-Making: By understanding LLM usage within third-party products, organizations can make informed decisions about their adoption and integration into their own systems.

The Future of LLM TPVRAs: Collaboration and Innovation

Adapting TPVRAs for LLM detection and evaluation is an ongoing and iterative process. Collaboration with vendors, security experts, and industry bodies is key to developing robust assessment methodologies. As the LLM landscape evolves, so too should our TPVRA strategy. By staying vigilant and continuously innovating, we can ensure a secure and reliable LLM-powered supply chain.

This blog provides an initial framework for adapting your TPVRAs. The specific questions and techniques employed will depend on your unique risk tolerance, the nature of your business and vendor relationships. As the LLM ecosystem matures, new tools and best practices will emerge. By proactively adapting your TPVRA strategy, you can stay ahead of the curve and navigate the exciting, yet risk-laden, world of LLMs within your supply chain.

Appendix 1

Sample LLM Risk Assessment for Third-Party Vendors

Overview

This assessment evaluates potential vendors providing Generative AI (GenAI) technologies, such as large language models for text generation, code generation, image creation, etc. The use of GenAI carries risks around data privacy, security, intellectual property, and ethical concerns that must be carefully reviewed before a decision is made to engage specific vendors.

These are some key areas that should be reviewed when assessing GenAI vendors. However, the specific criteria and weighting will depend on each organization's unique requirements, industry, and risk tolerance. The assessment can be further customized based on the specific GenAI use case being pursued.
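To show what "criteria and weighting" might look like in practice, here is a minimal weighted scoring sketch. The risk areas, weights, and risk-band thresholds below are illustrative assumptions only, not a recommended standard:

```python
# Illustrative weighted scoring for questionnaire results.
# Each area is scored 1 (low risk) to 5 (high risk) by the assessor.
WEIGHTS = {
    "data_privacy": 0.30,
    "security": 0.25,
    "intellectual_property": 0.15,
    "safety_integrity": 0.20,
    "ethics": 0.10,
}

def overall_risk(scores: dict[str, int]) -> tuple[float, str]:
    """Combine per-area scores (1-5) into a weighted score and a risk band."""
    weighted = sum(WEIGHTS[area] * scores[area] for area in WEIGHTS)
    if weighted < 2.0:
        band = "Low Risk"
    elif weighted < 3.5:
        band = "Moderate Risk"
    else:
        band = "High Risk"
    return round(weighted, 2), band
```

A scorecard like this makes vendor comparisons repeatable, but the band cutoffs should be calibrated against your own risk appetite rather than taken from this sketch.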

The following flowchart presents a conceptual model for assessing third-party vendors that employ LLMs, emphasizing the need for a dynamic and adaptive evaluation process that accounts for the distinctive risks and considerations surrounding LLMs. This proposed approach aims to inspire innovation and flexibility in LLM risk assessment while ensuring alignment with company policies and regulatory requirements.

Flowchart 1: Simplified assessment flow

Assessment Areas Questionnaire

Each of the risk areas below can be evaluated across multiple dimensions - the vendor's processes, the underlying model/system characteristics, output control features, documentation and transparency, and conformance to standards and regulations.

  1. Vendor Information

  • Vendor Name:
  • Product/Service Description:
  • GenAI Model(s) Used:
  • [Y/N] Does the vendor provide a Model Card for the model(s) utilized?

  2. Data Privacy & Security Risks

  • [Y/N] Does the vendor have robust data privacy policies and controls in place?
  • [Y/N] Will any of our proprietary or sensitive data be used to train/fine-tune the vendor's models? If yes, please provide details on:
  • Data residency and compliance with data localization laws
  • Potential for data leaks or misuse of training data
  • Robustness of data encryption and access controls
  • Exposure of sensitive personal or corporate information

  3. Security

  • [Y/N] Does the vendor follow security best practices (encryption, access controls, etc.)?
  • [Y/N] Have their models/systems undergone pen-testing and secure code review?
  • [Y/N] Do they have incident response and breach notification processes?

  4. Intellectual Property

  • [Y/N] Are there any potential copyright infringement risks from the training data?
  • [Y/N] Is there any potential risk of trade secret/confidential info exposure?
  • [Y/N] Do the vendor's terms allow for derivative use of generated outputs?
  • [Y/N] Are there any known constraints around commercialization of generated outputs?
  • How does the vendor handle potential copyright/licensing issues with training data?
  • How are the patentability and defensibility of generated inventions handled?

  5. Safety & Integrity Risks

  • [Y/N] Is there a potential for generating harmful, illegal, or explicit content?
  • [Y/N] Are there any safeguards against hate speech, violence, and misinformation?
  • How does the vendor ensure the consistency and coherence of outputs across contexts?
  • How are the risks of hallucinations or factual inaccuracies handled?

  6. Robustness & Reliability

  • What is the susceptibility to adversarial attacks or prompt injections?
  • [Y/N] Does the vendor practice failure mode analysis and defense against edge cases?
  • What are the processes for safe deployment, monitoring, and updates?
  • How are uptime, throughput, and scalability guaranteed?

  7. Ethical Considerations & Responsible Use

  • [Y/N] Does the vendor have ethical AI principles they follow?
  • What are the processes to analyze model outputs for unfair biases? E.g., benchmarking performance across different demographics.
  • How do they mitigate risks around bias, harmful outputs, and misinformation? E.g., mitigations for societal biases encoded in training data.
  • Does the vendor allow/restrict certain use cases based on ethical concerns? E.g., is there the ability to customize for culturally sensitive use cases?
  • How are human oversight and the ability to intervene implemented in model creation and output validation?
  • [Y/N] Are any external audits, certifications, or ethical reviews performed and available to review?

  8. Model Performance & Robustness

  • [Y/N] Has third-party testing/benchmarking been done on the vendor's models?
  • What are the failure modes and limitations of their GenAI systems? Provide details via an architecture diagram/flowchart or the corresponding Model Card (if applicable).
  • Does the vendor have processes to update and improve their models over time?

  9. Support & Services

  • What level of technical/integration support is provided for model adoption? Does this extend to managing/monitoring outputs?
  • [Y/N] Are professional services available for custom model fine-tuning/prompting?
  • [Y/N] Are there SLAs or uptime guarantees for their GenAI systems? Provide the schedule.

  10. Pricing & Costs

  • What are the pricing models (per-request, subscription, etc.)?
  • [Y/N] Are there any variable costs that could lead to pricing unpredictability?
  • How do costs compare to other vendors and/or in-house development?

Additional Generative AI Criteria

This section covers some of the key aspects that differentiate generative AI from other software products and services, with emphasis on data practices, model governance, output controls, and monitoring processes. The level of diligence may depend on the risk profile of the use case as well.

Here are some additional details that can be assessed specifically for Generative AI vendors and technologies:

  1. Data & Training:

  • Data sources and curation processes for training data (Does it have a model map?)
  • Techniques used to filter out toxic/biased data
  • Approach to data privacy/protection of training data
  • Approach to generating synthetic data (if applicable)
  • Ability to incorporate proprietary/customer data for fine-tuning

  2. Model Characteristics:

  • Model size, architecture, and training approach (e.g., constitutional AI)
  • Stated capabilities and limitations of the model
  • Performance benchmarks across different tasks/domains
  • Approaches to mitigate hallucinations and inconsistencies in output
  • Model monitoring and ability to handle sensitive topics (e.g., violence, hate speech) at the model level

  3. Output Controls:

  • Filtering options for unsafe/undesirable outputs
  • Customizable filters based on customer requirements
  • Watermarking, traceability of generated content
  • Retention policies for generated outputs
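To illustrate the kind of output-control hook an assessor might look for, here is a deliberately simple pattern-based filter. Real vendor filters rely on trained classifiers and policy engines; the patterns below are illustrative assumptions, not a recommended blocklist:

```python
import re

# Illustrative blocklist; real systems use trained classifiers, not keywords.
BLOCKED_PATTERNS = [
    re.compile(r"\b(password|api[_ ]?key)\s*[:=]", re.IGNORECASE),
]

def filter_output(text: str) -> tuple[str, bool]:
    """Redact matches of blocked patterns; return (filtered_text, was_filtered)."""
    filtered = text
    for pattern in BLOCKED_PATTERNS:
        filtered = pattern.sub("[REDACTED]", filtered)
    return filtered, filtered != text
```

When assessing a vendor, the useful questions are whether such a hook exists at all, whether its rules are customer-configurable, and whether filter hits are logged for audit.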

  4. Risks & Testing:

  • Potential risks (bias, misuse, IP violations, etc.)
  • Processes for risk assessments, red teaming, penetration testing, adversarial testing, etc.
  • Approach to model updates, breaking changes
  • Disclosure processes for vulnerabilities, weaknesses

  5. Monitoring & Assurance:

  • Monitoring of outputs for anomalies and policy violations
  • Mitigation techniques if faulty output is discovered
  • Human-in-the-loop approach
  • Provenance tracking and audit trails
  • Certifications, external audits, ethical reviews

Assessment Results

  1. Overall Risk Assessment

Based on the criteria evaluated above, the overall risk level is:

(Please provide a detailed assessment of the vendor's overall risk level)

  • [ ] Low Risk
  • [ ] Moderate Risk
  • [ ] High Risk

  2. Risk Mitigations

If moving forward, some potential mitigations for key risks include:

(Please describe the potential mitigations for key risks)

  • ...
  • ...
  • ...

For each line item above, please document:

  • Specific actions: Describe concrete steps to mitigate each risk (e.g., "Implement encryption for data in transit" or "Conduct regular security audits").
  • Responsibility: Assign ownership of each mitigation strategy to a specific team or individual (e.g., "IT Department" or "Vendor Manager").
  • Timeline: Provide a realistic timeline for implementing each mitigation strategy (e.g., "Within the next 6 weeks" or "Before contract renewal").
  • Metrics for success: Define how the effectiveness of each mitigation strategy will be measured (e.g., "Reduction in incident response time" or "Vendor achieves SOC 2 compliance").

Below are a few examples specific to GenAI risks that focus on mitigating potential issues like biased data, unexplainable models, harmful output, and the need for human oversight.

  • Data Quality Control:
  • Model Interpretability:
  • Output Filtering:
  • Human Oversight:

  3. Key Takeaways

  • Summary of main findings and implications
  • Clear statement of vendor's risk profile

  4. Action Items

  • List of specific tasks or next steps
  • Responsible personnel or teams for each action item

  5. Decision Rationale

  • Brief explanation of the reasoning behind the recommendation
  • Summary of key factors influencing the decision

  6. Recommendations

Based on this assessment, the recommendation is: (Please provide a recommendation based on this assessment)

  • [ ] Proceed with vendor
  • [ ] Do not proceed with vendor

  • [ ] Proceed with this vendor, but with the following conditions… (Please list all conditions)
