Customize LMs for Enterprise Use
Image generated using DALL-E via Microsoft Copilot Designer

In a previous article, I delved into the realm of “Trusted Enterprise AI & ML.” The bedrock of users’ trust in AI models lies in the reliability of the insights they generate. Enter Distributed Ledger Technologies (DLTs), which provide governance mechanisms crucial for responsible AI usage and seamless model lifecycle management.

Now, embarking on an AI and Machine Learning journey within any enterprise demands a disciplined approach. It involves carefully selecting and training language models that align with the organization’s specific use cases. However, the linchpin of success remains high-quality, diverse data sets. Without them, no AI program can thrive.

Over the past several weeks, I’ve immersed myself in exploring how AI can enhance our blockchain-enabled platform, aptly named “Trust Your Supplier.” Insights gleaned from conferences, customer feedback, and interactions with risk managers and procurement officers have underscored the myriad use cases that bolster supplier onboarding and third-party risk management.

The infographic below illustrates potential use case models that can benefit Trust Your Supplier (TYS). We have prototyped several of these task models to determine which ones will significantly enhance efficiency and user experience.

The primary objectives and goals of this endeavor are to enhance the Buyer and Supplier experience by harnessing the power of AI and ML. Specifically, we aim to simplify two critical processes:

  1. Supplier Onboarding Process: Streamlining the process through intelligent automation.
  2. Efficient Partner Risk Management: Leveraging data-driven insights to mitigate risks effectively.

In the ever-evolving landscape of artificial intelligence (AI), language models (LMs) play a pivotal role. These models, trained on vast amounts of text data, exhibit remarkable capabilities in understanding and generating human-like language. However, adopting these LMs effectively within enterprises requires strategic planning and customization.

The Enterprise Challenge

Enterprises seek to harness the power of LMs for various tasks, from customer support chatbots to risk assessment in supply chain management. But how can organizations ensure that LMs align precisely with their specific needs? Let’s explore some approaches.

Direct Utilization of Pre-Trained LMs:

The simplest path involves using pre-trained LMs directly for generative tasks. Models like GPT-4 come with extensive knowledge and can be applied out of the box. However, this approach lacks customization, limiting its relevance to enterprise contexts.

Embeddings and Prompt Engineering:

A more effective strategy involves embedding LMs with enterprise data. By integrating LMs with domain-specific information, we enhance their relevance. Prompt engineering further tailors the model’s behavior for specific use cases. For example, if the enterprise deals with supply chain risk assessment, the prompt can be designed to focus on risk-related queries.
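
As a hedged illustration of that kind of prompt design (the function, wording, and field names here are hypothetical, not TYS's actual prompts), a risk-focused template might look like this:

```python
# Hypothetical prompt template steering a general-purpose LM toward
# risk-related queries; the role framing and fields are illustrative.
def build_risk_prompt(supplier_profile: str, question: str) -> str:
    return (
        "You are a third-party risk analyst. Using only the supplier "
        "profile below, answer the question and flag any missing data.\n\n"
        f"Supplier profile:\n{supplier_profile}\n\n"
        f"Question: {question}\nAnswer:"
    )
```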

Fine-Tuning for Enterprise Context:

The most robust approach is fine-tuning pre-trained LMs. Fine-tuning allows organizations to customize an LM for their specific domain. By exposing the model to relevant data and adjusting its parameters, we optimize its performance. This path ensures that the model aligns precisely with business requirements.

After extensive exploration of publicly available literature and experimentation with various Proof of Concepts (POCs)—including techniques like Generative Adversarial Networks (GANs) and Retrieval Augmented Generation (RAG)—it became evident that a hybrid approach would yield superior results. Here’s the proposed path for enterprise customization and adoption:

  1. Select an Appropriate Language Model (LM): The foundation lies in choosing an LM that aligns with the organization’s specific needs. Customization is key, and tuning the LM for the domain context is essential.
  2. Add the Secret Sauce: During the tuning process, organizations infuse their secret sauce—domain-specific knowledge, business rules, and proprietary insights. This step differentiates them from competitors and ensures a strategic advantage.
  3. Create Use-Case Specific GenAI Models: Once the LM is in place, create use-case specific Generative AI (GenAI) models. These models serve as the workhorses, generating insights tailored to the enterprise’s unique requirements.

RAG and prompt engineering can then be applied on top of the fine-tuned LM.

In practical terms, consider this analogy: Imagine a person who has just graduated from college. They’ve completed a diverse range of courses and can now be put to general use. However, to truly excel, they pursue a master’s program, specializing in a specific domain—say, finance or marketing. This specialization fine-tunes their skills. Similarly, our GenAI models, once customized, become specialists—equipped to handle intricate tasks within an enterprise. For instance, they can manage financial data, create budgets, and forecast revenue, much like a seasoned finance professional.

Fine-tuning is a powerful technique that unlocks the full potential of pre-trained language models (LMs). It allows you to customize these models for specific tasks and domains, significantly improving their performance.

The fine-tuning process involves:

Define the Goals of Fine Tuning

Understanding the LM's intended use is key. For instance, an LM might analyze user inputs related to supply chains, identify potential risks, and suggest mitigation plans.

Here, the ideal LM would be trained on data that includes risk scenarios, details, categories, mitigation strategies, and the interpretation of third-party scores, impact, and so on.

This knowledge would allow the LM to not only react to user inputs but also proactively predict hidden risks. By considering these factors, you can select the most suitable LM for fine-tuning.

Building Strong Foundations: Training Data for Effective Risk Assessment

The quality of training data is paramount for fine-tuning an LM to excel at risk assessment. The data should equip the model to understand:

  • Risk Landscape: Different risk categories, scenarios, and their interpretations.
  • External Evaluation: How to analyze third-party risk scores.
  • Mitigation Strategies: Reviewing and recommending suitable mitigation plans for identified risks.
  • Proactive Risk Detection: Predicting potential hidden risks based on user inputs.

Here's what makes effective training data:

  • Diversity: A rich variety of scenarios and examples to ensure the LM generalizes well to unseen situations.
  • Bias Mitigation: Curated datasets that minimize bias and ensure fairness in risk assessment.
  • Human Expertise: Enriched with insights from domain experts to enhance the LM's understanding of risk factors and mitigation strategies.

Structure and Volume:

Consider gathering the right data. A risk training dataset could include input attributes (e.g., risk scenario, description, category, and others) and corresponding response attributes (e.g., risk level, interpretation, mitigation). Engaging data scientists here will make a huge difference.
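
For illustration, a single record pairing those attributes might look like the following sketch (the attribute names and values are hypothetical, not TYS's actual schema):

```python
# One hypothetical training record: input attributes the model sees,
# and the response attributes it should learn to produce.
record = {
    "input": {
        "risk_scenario": "Sole-source supplier in a flood-prone region",
        "description": "A critical component is sourced from a single plant.",
        "category": "Supply continuity",
    },
    "response": {
        "risk_level": "High",
        "interpretation": "One disruption at the plant halts production.",
        "mitigation": "Qualify a second supplier and hold safety stock.",
    },
}
```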

The amount of data needed for fine-tuning can vary depending on the complexity of the tasks. A good dataset for TYS could be between 150,000 and 200,000 sets of curated, unbiased, and diverse data points. Fine-tuning is usually accomplished with a relatively small corpus of good-quality data.

Choosing an appropriate LM

Choosing the right LM is crucial for success. Consider both cost and technical requirements. A variety of commercial and open-source options are available, ranging from large models with billions of parameters (like GPT-3 or Llama2) to smaller, more efficient models like GPT-2, flan-t5-small or llama-2-7b-hf.

Availability is another aspect. Hugging Face offers a vast library of open-source LMs. However, keep in mind that large language models require significant computational resources like GPUs and ample memory. Smaller language models, often sufficient for custom tasks, can sometimes run on CPUs with less memory.
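
For example, a small open-source model such as google/flan-t5-small can be pulled from the Hugging Face hub in a few lines (a minimal sketch using the Transformers library):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# flan-t5-small is a sequence-to-sequence model of roughly 80M parameters,
# small enough to experiment with on a CPU.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
```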

Tuning Approaches

Fine-tuning LMs has various approaches. Full parameter tuning of all weights is resource-intensive and time-consuming. Parameter Efficient Fine Tuning (PEFT) emerges as a preferred method. PEFT employs deep learning techniques that reduce the number of trainable parameters, limiting the need for expensive compute and memory resources while still maintaining performance comparable to full fine-tuning.

LoRA and QLoRA are two highly efficient techniques that support PEFT:

  1. LoRA (Low-Rank Adaptation) trains only a small set of additional parameters, conserving resources. During fine-tuning, the original weights are frozen, and the changes are captured in separate low-rank matrices (see the sketch after this list).
  2. QLoRA (quantized LoRA) is a quantized version of LoRA that reduces numerical precision while maintaining performance. LM weights normally use high floating-point precision; by reducing that precision, QLoRA improves efficiency with some impact on accuracy.
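
Here is a minimal numeric sketch of the LoRA idea (illustrative only, not the PEFT library internals): the original weight matrix W stays frozen, and only the small low-rank factors A and B are trained.

```python
import numpy as np

d, k, r = 512, 512, 8               # layer dimensions and LoRA rank (r << d, k)
W = np.random.randn(d, k)           # pre-trained weight, frozen during tuning
A = np.random.randn(r, k) * 0.01    # trainable low-rank factor
B = np.zeros((d, r))                # trainable; zero-initialized so W is unchanged at start
alpha = 16                          # scaling hyperparameter

W_effective = W + (alpha / r) * (B @ A)   # adapted weight used in the forward pass

# Trainable parameters drop from d*k = 262,144 to r*(d + k) = 8,192 per layer.
print(W.size, A.size + B.size)
```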

The diagram below illustrates how the original weight matrix remains frozen, while the trained (tuned) matrices are kept separate.

Adapted from the LoRA paper and a presentation on LoRA by Umar Jamil on YouTube

Notably, catastrophic forgetting could still be an issue during the fine-tuning process. A recent paper introduces I-LoRA (interpolation-based LoRA), a pioneering approach that leverages two independent modules functioning as fast and slow learners, respectively, to reduce the problem in PEFT.

In summary, LoRA and QLoRA offer efficient fine-tuning, making optimal use of resources while preserving model effectiveness. We at Chainyard conducted experiments with LoRA using an open-source LM and a sample set of manually manipulated risk scenarios.

PEFT/LoRA Supervised Fine Tuning in Python

To fine-tune the google/flan-t5-small model using PEFT and LoRA (Low-Rank Adapters) with the SFT Trainer, follow these steps:

Dataset Preparation: Have the train.json and test.json datasets ready. These datasets should contain structured, curated, real-world scenarios. The example below is synthetic data generated using Copilot.
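
As a hedged sketch of that preparation (the field names mirror the hypothetical record shown earlier, and the flattened "text" field is an assumption, not TYS's actual format), the files can be loaded with the Hugging Face datasets library:

```python
from datasets import load_dataset

# Each line of train.json / test.json holds one synthetic record flattened
# into a "text" field, e.g.:
# {"text": "Scenario: Sole-source supplier ... Risk level: High. Mitigation: ..."}
data = load_dataset("json", data_files={"train": "train.json", "test": "test.json"})
print(data["train"][0])   # inspect one record before training
```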

Model Selection: You can use the google/flan-t5-small model as your base model for fine-tuning, or any other, such as gpt-2 or NousResearch/Llama-2-7b-hf.

Fine-Tuning Process: Fine-tune the model with PEFT and LoRA using the Hugging Face Transformers library. The Python PEFT library provides modules for both LoRA and QLoRA.

We can use Supervised Fine-Tuning (SFT) to train the model. TRL provides easy-to-use APIs to create SFT models and train them on the dataset with a few lines of code.

The snippet from my experiment that set up the trainer follows a pattern explained in the library documentation and many articles; a representative sketch appears below.
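
This is a minimal sketch of such a setup, assuming a causal base model like gpt-2 and a single "text" field per record; it is not the exact code from the experiment, and keyword arguments vary across TRL versions:

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

data = load_dataset("json", data_files={"train": "train.json", "test": "test.json"})

# LoRA configuration: only the small adapter matrices are trained.
peft_config = LoraConfig(
    r=16,               # rank of the low-rank update
    lora_alpha=32,      # scaling applied to the update
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="output",               # the fine-tuned model lands here
    per_device_train_batch_size=4,     # lower this to reduce memory use
    num_train_epochs=3,
    learning_rate=2e-4,
)

trainer = SFTTrainer(
    model="gpt2",                      # or another base, e.g. NousResearch/Llama-2-7b-hf
    args=training_args,
    train_dataset=data["train"],
    eval_dataset=data["test"],
    peft_config=peft_config,
    dataset_text_field="text",         # assumes each record exposes one "text" field
    max_seq_length=512,
)

trainer.train()
trainer.save_model("output")
```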

Refer to the documentation for each of the parameter definitions.

Training Environment: The training can be executed on Google Colaboratory using an A100 GPU for better performance. Adjust parameters such as batch size to reduce memory consumption if needed. The resulting fine-tuned model will be found in the "output" directory, ready to be used with RAG.

Conclusion and Summary

Enterprises can use fine-tuning to customize LMs for their specific uses. The most critical activity in the process is the preparation of training data. Enterprises will need to create a family of fine-tuned models that can execute various tasks.

RAG and fine-tuning are two techniques that can be combined to improve how language models handle certain topics. This combination can be effective for providing a better user experience and addressing the needs of industry-specific inquiries. RAG is used to generate contextually relevant responses from diverse documents, while fine-tuning is used to tailor the language model to perform well on specific tasks or domains.
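
As a rough sketch of that combination (the retriever and generate callables are placeholders, not a specific library API): retrieve supporting passages first, then let the fine-tuned model generate a grounded answer.

```python
# Hedged sketch: RAG layered on a fine-tuned model. The retriever is any
# passage lookup (e.g., a vector store); generate() runs the tuned LM.
def answer_with_rag(question, retriever, generate):
    passages = retriever(question, top_k=3)      # fetch supporting documents
    context = "\n".join(passages)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)                      # fine-tuned LM inference
```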

Though we discussed supervised fine-tuning here, I still need to experiment with other techniques, such as SFT combined with reinforcement learning. More about the results in the next post. If you have any suggestions or comments, email me.



