Running Llama-3.2-based Chatbot on Intel Core Ultra Processor using OpenVINO-GenAI
Ramesh Perumal PhD
AI Solution Architect | SMIEEE | Edge AI | Computer Vision | GenAI | MLOps | Taiwan Employment Gold Card Recipient | Healthcare & Life Sciences
This article presents the steps to quantize the Llama-3.2-3B-Instruct model using Optimum-Intel and run a chatbot with OpenVINO-GenAI on the integrated GPU (iGPU) of an Intel Core Ultra 7 165H processor running Windows 11.
Steps
1. Request access to the Llama-3.2 models with your account on Hugging Face (the models are gated)
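Because the models are gated, the export in step 5 needs an authenticated Hugging Face session. Once the environment from the next steps is set up, log in; a minimal sketch, where the token value is a placeholder for your own access token:
# Interactive login (huggingface-cli ships with the huggingface_hub dependency)
huggingface-cli login
# Or non-interactively from Python; <YOUR_HF_TOKEN> is a placeholder
python -c "from huggingface_hub import login; login(token='<YOUR_HF_TOKEN>')"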
2. Create a Python virtual environment to install the required dependencies
# Create the virtual environment
python -m venv ov_genai_venv
# Activate the virtual environment
.\ov_genai_venv\Scripts\activate
# Upgrade pip inside the environment
python -m pip install pip --upgrade
3. Install the openvino-genai package
pip install openvino-genai==2024.4.0.0
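A quick sanity check that the package is importable; this assumes the wheel exposes __version__, as the 2024.4 release does:
# Verify the install; note the Python module is named openvino_genai
python -c "import openvino_genai; print(openvino_genai.__version__)"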
4. Clone the OpenVINO-GenAI repository and install the sample dependencies
git clone https://github.com/openvinotoolkit/openvino.genai.git
cd openvino.genai\samples
pip install -r requirements.txt
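The samples' requirements pull in Optimum-Intel, which provides the exporter used in the next step; you can confirm it is on the path before exporting (if it is missing, pip install optimum[openvino] adds it):
# List the exporter's options to confirm Optimum-Intel is installed
optimum-cli export openvino --help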
5. Convert the model into OpenVINO intermediate representation (IR) format and quantize its weights to INT4 precision
optimum-cli export openvino --model meta-llama/Llama-3.2-3B-Instruct --task text-generation-with-past --weight-format int4 --group-size 64 --ratio 1.0 --sym --awq --scale-estimation --dataset "wikitext2" --all-layers llama-3.2\Llama-3.2-3B-Instruct-INT4
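For reference, the same export can be scripted from Python with Optimum-Intel. This is a hedged sketch, assuming OVWeightQuantizationConfig accepts parameters mirroring the CLI flags above; names may differ slightly across optimum-intel versions:
# Hedged sketch: Python-side equivalent of the optimum-cli export above.
# Verify the parameter names against your installed optimum-intel version.
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

quant_config = OVWeightQuantizationConfig(
    bits=4,                 # --weight-format int4
    sym=True,               # --sym
    group_size=64,          # --group-size 64
    ratio=1.0,              # --ratio 1.0
    all_layers=True,        # --all-layers
    quant_method="awq",     # --awq
    scale_estimation=True,  # --scale-estimation
    dataset="wikitext2",    # --dataset "wikitext2"
)

model = OVModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    export=True,
    quantization_config=quant_config,
)
model.save_pretrained("llama-3.2/Llama-3.2-3B-Instruct-INT4")
Note that the CLI route also converts the tokenizer into the OpenVINO format the chat sample expects, so the command above remains the simpler path.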
6. Replace the tokenizer_config.json file in Llama-3.2-3B-Instruct-INT4 with this patch to run the chatbot sample
7. Run the chatbot with chat_sample.py on the iGPU
# Change the device variable in chat_sample.py to 'GPU'
device = 'GPU'
# Run the chatbot
cd python\chat_sample
python chat_sample.py ..\..\llama-3.2\Llama-3.2-3B-Instruct-INT4
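For context, chat_sample.py is a thin wrapper around openvino_genai.LLMPipeline; the sketch below mirrors its chat loop (the 256-token limit is illustrative):
# Minimal sketch of the chat loop inside chat_sample.py
import openvino_genai

# Load the INT4 model on the integrated GPU
pipe = openvino_genai.LLMPipeline("llama-3.2/Llama-3.2-3B-Instruct-INT4", "GPU")

config = openvino_genai.GenerationConfig()
config.max_new_tokens = 256  # illustrative limit

def streamer(subword):
    # Print tokens as they are generated; returning False keeps generation going
    print(subword, end="", flush=True)
    return False

pipe.start_chat()
while True:
    try:
        prompt = input("question:\n")
    except EOFError:
        break
    pipe.generate(prompt, config, streamer)
    print("\n----------")
pipe.finish_chat()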
Sample Output on iGPU