Unleashing the Power of Llama 2: Unveiling the Secrets of Generative AI with Snowflake!
Shivani Paunikar, MSBA
Data Engineer @Tucson Police Department | ASU Grad Medallion | Snowflake Certified | BGS Member
Whether we are cognizant of it or not, Large Language Models (LLMs) have seamlessly integrated into various aspects of our daily routines. Whether used for debugging code, organizing events, or sharing content on social media, the influence of LLMs is pervasive. Behind the scenes, numerous companies are competing to deliver cutting-edge LLMs, a trend that gained momentum with the introduction of OpenAI's GPT-3. Among these innovators is Meta, which recently unveiled its latest achievement, Llama 2, in July 2023. This LLM, not only Meta's most robust but also open source, comprises model weights and starting code for both pre-trained and fine-tuned Llama language models (Llama Chat, Code Llama), spanning from 7B to 70B parameters.
Llama 2's pre-trained models stand out, having been trained on a massive 2 trillion tokens and offering double the context length of their predecessor, Llama 1. Additionally, the fine-tuned models have been trained on over 1 million human annotations, giving users a choice of 7B, 13B, and 70B parameter model sizes. This release underscores Meta's commitment to advancing the field of Large Language Models and empowering users with state-of-the-art language understanding capabilities.
Event Summary:
I recently participated in Snowflake's Build 2023-24 event, an insightful gathering where industry leaders shared their perspectives and expertise on applications of artificial intelligence (AI). The event highlighted diverse approaches to integrating AI into real-world scenarios, emphasizing best practices that can be applied to enhance individual workflows. Notably, much of the progress so far has been on the natural-language side of LLM implementation, with a projected shift towards leveraging LLMs for computer vision applications, a development crucial to addressing current global needs.
The collaborative efforts of researchers and organizations focusing on open-source LLMs are a valuable asset to society. They empower individuals across various industries to address contextual business challenges by fine-tuning these LLMs. LLMs developed by commercial organizations are naturally tailored to specific business objectives, but their adaptability through further fine-tuning for industry-specific needs or particular challenges enhances their utility. An illustrative example shared during the event was how AI could transform wheat cultivation. While this may not be a billion- or trillion-dollar problem initially targeted by proprietary LLMs, the availability of open-source LLMs enables industry experts to fine-tune them to address such challenges. The objective is not to replace human effort but to assist it, enabling more efficient task execution with the power of AI.
Snowflake and its functionalities equip us with the capacity to run large language models within our own environment, eliminating the need to manage infrastructure for such tasks. In this article, we explain how you can deploy and fine-tune Large Language Models to create customized AI solutions tailored to your business context.
Implementation:
Business Scenario Goal: Leveraging OSS LLMs on Proprietary Data
Example company: Frosty Toys
At Frosty Toys, we handle toy requests via phone calls, and our objective is to securely leverage Meta's Llama 2 to analyze call transcripts and extract essential information. The call transcripts, typically extensive, contain valuable details pertinent to Frosty Toys' needs. Summarizing these transcripts into concise form is a challenge, as it is inherently difficult to distill the meaningful information effectively.
(Disclaimer: The Frosty Toys demo is a hypothetical intended for illustrative purposes only. Results are not endorsements of any specific toy or company.)
Technical Scenario Breakdown:
Raw Data: A table housing hundreds of thousands of call transcript texts.
LLM: Harnessing the power of Llama 2 7B.
Task 1: Deploy the LLM from Hugging Face for inference inside Snowflake.
Output: Summarize the call transcript in fewer than 200 words.
Task 2: Fine-tune and deploy the LLM.
Output: Extract name, location, and toy list.
Prerequisites:
Working knowledge of Python and SQL
Experience working with Docker
Pre-work: Configure your environment by ensuring you have the following:
- Laptop with WiFi and the ability to download libraries and Python packages
- Docker Desktop installed (Get it here: Docker Desktop)
- VS Code (recommended) or another IDE
- Hugging Face account to access the Llama 2 model
- Snowflake account with the required data repository
Access to Llama 2:
Approval is required to access Llama 2 on Hugging Face (may take several hours).
Task 1: LLM Deployment:
Environment Configuration:
We ensured a smooth start by creating a conda environment tailored to the project's requirements. By installing essential packages through conda and supplementing them with additional libraries via pip, we established a solid foundation for the tasks ahead.
Establishing a Secure Connection:
Creating a secure connection was of utmost importance. We created a Snowflake Session object using connection parameters read from a connection JSON file.
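A minimal sketch of that step, assuming a local connection.json file whose keys match Snowflake's standard connection parameters (the file name and keys are illustrative):

```python
# Create a Snowpark Session from a local connection.json file.
import json
from snowflake.snowpark import Session

with open("connection.json") as f:
    connection_parameters = json.load(f)  # account, user, password, role, warehouse, database, schema

session = Session.builder.configs(connection_parameters).create()
print(session.get_current_warehouse(), session.get_current_database())
```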
Integration of Llama 2:
A noteworthy step was the incorporation of Llama 2 from Hugging Face. We integrated the model using LLMOptions, supplying the Hugging Face access token and batch-size configuration.
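The sketch below assumes the preview LLM wrapper that snowflake-ml-python exposed at the time (snowflake.ml.model.models.llm); the token placeholder and option values are illustrative, not the exact configuration used:

```python
# Sketch assuming the snowflake-ml-python (preview) LLM wrapper; the token and
# option values are illustrative placeholders.
from snowflake.ml.model.models import llm

options = llm.LLMOptions(
    token="<HF_TOKEN>",   # Hugging Face token with approved Llama 2 access
    max_batch_size=1,     # keep batches small for a single-GPU deployment
)
llama_model = llm.LLM(
    model_id_or_path="meta-llama/Llama-2-7b-chat-hf",
    options=options,
)
```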
Model Registration, Logging, and Deployment:
Navigating the Snowflake Model Registry was a crucial step. We logged our model securely, paving the way for deployment within Snowpark Container Services (SPCS). Notably, the deployment process, a critical milestone in our exploration, took approximately 25-30 minutes.
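A hedged sketch of that registration-and-deployment flow, assuming the preview model-registry API from the same library; the database, model, deployment, and compute-pool names are placeholders:

```python
# Sketch assuming the (since superseded) preview model_registry API in
# snowflake-ml-python; all names below are illustrative.
from snowflake.ml.registry import model_registry
from snowflake.ml.model import deploy_platforms

registry = model_registry.ModelRegistry(
    session=session, database_name="LLM_DB", schema_name="PUBLIC", create_if_not_exists=True
)
# Log the Llama 2 wrapper to the registry, then deploy it to SPCS.
model_ref = registry.log_model(
    model_name="LLAMA2_7B", model_version="v1", model=llama_model
)
# In our run, deployment to Snowpark Container Services took roughly 25-30 minutes.
model_ref.deploy(
    deployment_name="llama2_7b_deployment",
    platform=deploy_platforms.TargetPlatform.SNOWPARK_CONTAINER_SERVICES,
    options={"compute_pool": "LLM_COMPUTE_POOL", "num_gpus": 1},
)
```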
Data Management in Snowflake:
Our next step involved efficient data management. We read data from a JSON file stored in Snowflake into a pandas DataFrame, then wrote the DataFrame back to a Snowflake table, consolidating our data management within the Snowflake ecosystem.
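A sketch of that round trip, assuming the transcripts live in a staged JSON file; the stage, file, column, and table names are illustrative:

```python
# Read the staged JSON file into a Snowpark DataFrame (VARIANT rows), pull out
# the transcript field, convert to pandas, and persist it as a table.
from snowflake.snowpark.functions import col

raw = session.read.option("strip_outer_array", True).json("@LLM_STAGE/transcripts.json")
transcripts_pd = raw.select(
    col("$1")["transcript"].cast("string").as_("TRANSCRIPT")
).to_pandas()

session.write_pandas(
    transcripts_pd, "CALL_TRANSCRIPTS", auto_create_table=True, overwrite=True
)
```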
Prompt Engineering:
We then prompted the base model to assess whether its predictions aligned with our specific requirements, employing both straightforward and intricate prompts to gauge its proficiency. The outcomes were inconsistent across transcripts: the foundational model struggled to follow specific or complex instructions, reinforcing the need for fine-tuning to align the model with the precise demands of our business requirements.
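For reference, a sketch of one of the simpler summarization prompts in the Llama 2 chat format, run against the SPCS deployment; the column names and predict signature reflect the preview registry API and are assumptions:

```python
# Build a summarization prompt per transcript and run batch inference against
# the deployment created earlier; "input" as the expected column name is an
# assumption based on the preview LLM signature.
def summarize_prompt(transcript: str) -> str:
    return (
        "[INST] Summarize the following customer call transcript "
        f"in fewer than 200 words.\n\n{transcript} [/INST]"
    )

input_df = transcripts_pd[["TRANSCRIPT"]].rename(columns={"TRANSCRIPT": "input"})
input_df["input"] = input_df["input"].map(summarize_prompt)

summaries = model_ref.predict("llama2_7b_deployment", input_df)
print(summaries.head())
```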
Output:
Task 2: LLM Inference and Fine-Tuning:
Setup:
Docker Desktop served as the containerization solution for packaging and deploying the Snowpark application. We built a Docker image containing the Snowpark application code along with its dependencies and required libraries; this image encapsulates the complete environment needed to run the application. We then used the Snowflake UI to review the existing compute pools and SPCS services in the current database and warehouse, and terminated the active services. Using SPCS, we created a new Jupyter Notebook service and accessed it through the endpoint returned by the service's SHOW ENDPOINTS command. Finally, we opened the Jupyter notebook, installed the required libraries, and loaded the tokenizer and the Llama 2 pre-trained model using the Hugging Face CLI login token.
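Inside that notebook, the tokenizer and base model can be loaded roughly as follows (the token value is a placeholder):

```python
# Authenticate to Hugging Face and load the pre-trained Llama 2 7B checkpoint
# inside the SPCS-hosted Jupyter notebook.
import torch
from huggingface_hub import login
from transformers import AutoTokenizer, AutoModelForCausalLM

login(token="<HF_TOKEN>")  # same gated-access token used earlier

base_model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 2 ships without a pad token

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,  # half precision to fit a single A10 GPU
    device_map="auto",
)
```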
Dataset Loading (Training, evaluation, and testing):
To evaluate the model, we load only 100 rows for each split (training, evaluation, and testing) from the transcript.json dataset stored in Snowflake. Once the data is loaded, we apply prompts to the training and evaluation splits and tokenize them.
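A sketch of that loading-and-tokenization step; the table, column, and prompt wording are assumptions:

```python
# Pull 100 rows per split from the transcripts table, wrap each transcript in an
# instruction prompt, and tokenize the train/eval splits.
from datasets import Dataset

raw = session.table("CALL_TRANSCRIPTS").limit(300).to_pandas()
splits = {"train": raw.iloc[:100], "eval": raw.iloc[100:200], "test": raw.iloc[200:300]}

def to_prompt(transcript: str) -> str:
    return (
        "[INST] Extract the caller's name, location, and requested toys "
        f"from this call transcript.\n\n{transcript} [/INST]"
    )

def tokenize(batch):
    return tokenizer(batch["prompt"], truncation=True, max_length=1024)

hf_datasets = {}
for name in ("train", "eval"):
    df = splits[name].assign(prompt=splits[name]["TRANSCRIPT"].map(to_prompt))
    hf_datasets[name] = Dataset.from_pandas(df[["prompt"]]).map(tokenize, batched=True)

train_dataset, eval_dataset, test_df = hf_datasets["train"], hf_datasets["eval"], splits["test"]
```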
Fine-tuning the Llama 2 Model:
The fine-tuning procedure entails starting from an existing model and modifying a subset of its parameters to improve performance on a specific task. Using LoRA (Low-Rank Adaptation), a parameter-efficient technique, we adjust only a fraction of the model's roughly 7B parameters, which significantly economizes compute and memory. Running LoRA on a single A10 GPU demonstrates that this works under resource constraints. It is crucial to provide both the train_dataset and eval_dataset so that loss can be calculated during fine-tuning. We also specify output_weights_dir as the directory where the fine-tuned weights are stored once the job completes, with num_epochs set to a minimum of 1.
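A minimal LoRA fine-tuning sketch using the peft and transformers libraries; the hyperparameters and output directory name are illustrative rather than the exact values used, and train_dataset / eval_dataset are the tokenized splits prepared above:

```python
# Parameter-efficient fine-tuning of the Llama 2 base model with LoRA adapters.
from peft import LoraConfig, get_peft_model
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adapt only the attention projections
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()   # only a small fraction of the ~7B weights train

training_args = TrainingArguments(
    output_dir="output_weights_dir",      # where the fine-tuned weights are saved
    num_train_epochs=1,                   # minimum of 1 epoch
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    evaluation_strategy="epoch",
    fp16=True,                            # fits on a single A10 GPU
)
trainer = Trainer(
    model=peft_model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,            # used for the evaluation loss
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("output_weights_dir")
```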
Strategic Model Referencing, Logging, and Deployment:
Leveraging the Snowflake session environment, we logged the fine-tuned Llama 2 model to the model registry as 'FINETUNED_LLAMA2'. This streamlined operation completed in around 15 minutes. The model can then be deployed by referencing the registry entry through Snowpark Container Services.
Inference and Prediction:
Subsequently, the fine-tuned Llama 2 model was run on the evaluation dataset: we applied prompts to the data, tokenized it, and executed inference. Following this successful evaluation, the model was run on the test dataset, demonstrating that the deployed model delivers the required business information.
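A sketch of that inference step on a single held-out transcript; the generation settings are illustrative:

```python
# Generate an extraction from the fine-tuned model for one test transcript.
import torch

sample_prompt = to_prompt(test_df["TRANSCRIPT"].iloc[0])
inputs = tokenizer(sample_prompt, return_tensors="pt").to(peft_model.device)

with torch.no_grad():
    output_ids = peft_model.generate(**inputs, max_new_tokens=200, do_sample=False)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```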
Output:
Conclusion:
We effectively fine-tuned Meta's openly accessible Llama 2 model to meet our specific business requirements. This successful adaptation illustrates how freely available open-source Large Language Models can contribute to diverse industries. Furthermore, it highlights how Snowflake and tools such as Snowpark Container Services empower us to creatively find effective solutions to challenges by harnessing the capabilities of AI.
Check out the GitHub link for more information.
This article was co-authored by Aayush Singh (GitHub).