Ollama with LangChain for Local LLMs Applications
Rany ElHousieny, PhD
SENIOR SOFTWARE ENGINEERING MANAGER (EX-Microsoft) | Generative AI / LLM / ML / AI Engineering Manager | AWS SOLUTIONS ARCHITECT CERTIFIED | LLM and Machine Learning Engineer | AI Architect
Ollama is a versatile platform for running and interacting with large language models (LLMs) like Llama, Gemma, Phi, Zephyr, Code Llama, and many more. It allows users to pull, run, and create models easily on local machines, ensuring privacy and control over data. With compatibility for macOS, Linux, and Windows, Ollama also supports API functionalities that align with OpenAI standards, allowing for seamless integration and usage of various tools and applications locally.
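Because the local server exposes an OpenAI-compatible endpoint, existing OpenAI client code can usually be pointed at Ollama with nothing more than a base-URL change. The sketch below assumes Ollama is already installed and running locally (installation steps follow) and that the gemma model has been pulled; the api_key value is a placeholder because the local server does not check it.

from openai import OpenAI

# Point the standard OpenAI client at the local Ollama server.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

response = client.chat.completions.create(
    model='gemma',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
)
print(response.choices[0].message.content)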
For Python developers, Ollama provides a library that can be installed with a simple pip command. The library enables straightforward interaction with models for tasks like chat completions, text generation, and even handling multimodal inputs such as images. This flexibility is showcased in the ability to handle streaming data, use different models for specific tasks, and create custom models tailored to unique requirements.
!pip install ollama
Ollama also integrates with LangChain, allowing developers to build complex applications such as retrieval augmented generation (RAG). These applications leverage embedding models to create vector embeddings from texts, which can be used to retrieve and generate relevant responses based on the input queries. This capability is particularly useful for building sophisticated AI-powered search and response systems.
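As a minimal sketch of that idea, the snippet below embeds a few short texts with a local embedding model and retrieves the closest match for a query using cosine similarity. It assumes an embedding model such as nomic-embed-text has already been pulled (ollama pull nomic-embed-text); any Ollama embedding model would work the same way, and a full RAG pipeline would typically add a vector store and a generation step on top of this retrieval.

from langchain_community.embeddings import OllamaEmbeddings
import numpy as np

embeddings = OllamaEmbeddings(model='nomic-embed-text')

docs = [
    'Ollama runs large language models on your local machine.',
    'LangChain provides building blocks for LLM applications.',
    'Rayleigh scattering explains why the sky is blue.',
]

# Embed the documents and the query into the same vector space.
doc_vectors = np.array(embeddings.embed_documents(docs))
query_vector = np.array(embeddings.embed_query('How can I run an LLM on my laptop?'))

# Cosine similarity between the query and each document; print the best match.
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
print(docs[int(scores.argmax())])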
Ollama Installation
Go to https://ollama.com/ and follow the steps below:
Unzip the downloaded file and click on Ollama
Open a terminal and run the command
ollama run gemma
pulling manifest
pulling ef311de6af9d... 10% ▕█ ▏ 525 MB/5.0 GB 37 MB/s 1m58s
It will pull the model first and then give you a prompt where you can chat with the LLM.
ollama run gemma
pulling manifest
pulling ef311de6af9d... 100% ▕███████████████████▏ 5.0 GB
pulling 097a36493f71... 100% ▕███████████████████▏ 8.4 KB
pulling 109037bec39c... 100% ▕███████████████████▏ 136 B
pulling 65bb16cf5983... 100% ▕███████████████████▏ 109 B
pulling 0c2a5137eb3c... 100% ▕███████████████████▏ 483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
>>> Send a message (/? for help)
You can get the list of downloaded models with the command ollama list
ollama list
NAME ID SIZE MODIFIED
gemma:latest a72c7f4d0a15 5.0 GB 52 minutes ago
llama2:latest 78e26419b446 3.8 GB 25 hours ago
Python Example:
!pip install ollama
import ollama

response = ollama.chat(model='gemma', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
The sky is blue due to a phenomenon called **Rayleigh scattering**.
* Sunlight is composed of all the colors of the rainbow, each with a specific wavelength.
* When sunlight interacts with molecules in the atmosphere, like nitrogen and oxygen, the molecules scatter the light in all directions.
* Different wavelengths of light are scattered differently.
**How it works:**
- Shorter wavelengths of light (like blue light) scatter more efficiently than longer wavelengths (like red light).
- Since the molecules in the atmosphere are much smaller than the wavelengths of visible light, they preferentially scatter the shorter wavelengths in all directions.
- The scattered blue light is dispersed in all directions, but our eyes are primarily facing upwards, so we see more blue light from the sky than any other color.
**Additional factors:**
- The amount of sunlight that reaches the Earth's surface also affects the color of the sky. The closer you are to the equator, the more direct sunlight you receive, resulting in a slightly whiter sky.
- Clouds and dust in the atmosphere can also scatter light and change the color of the sky.
**Result:**
- The combination of Rayleigh scattering and other factors results in the sky appearing predominantly blue during clear weather conditions.
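The library also supports streaming, as mentioned earlier. Here is a minimal sketch of the same chat with streaming enabled, printing tokens as they arrive:

import ollama

# stream=True returns an iterator of partial chunks instead of one final message.
stream = ollama.chat(
    model='gemma',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)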
Ollama with LangChain
!pip install langchain-community
from langchain_community.llms import Ollama
llm = Ollama(model='gemma')
llm.invoke('tell me a joke?')
'What did the ocean say to the beach?\n\nNothing, it just waved!'
You can also wrap the call in print() to display the response directly. Let's load Llama2 and see if it has better jokes:
llm = Ollama(model='llama2')
print(llm.invoke('tell me a joke?'))
Sure, here's one:
Why don't scientists trust atoms?
Because they make up everything!
I hope that brought a smile to your face! Do you want to hear another one?
You can check all the available models at ollama.com by clicking Models.
Let's try Mixtral
From the command terminal, write:
Rany ~ > ollama run mixtral
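Once the pull finishes, Mixtral can be used from LangChain exactly like the models above; a minimal sketch, assuming the default mixtral tag downloaded successfully:

from langchain_community.llms import Ollama

llm = Ollama(model='mixtral')
print(llm.invoke('tell me a joke?'))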
Llama 3.1
Meta AI has unveiled Llama 3.1, their most capable AI model to date. This new release includes the flagship Llama 3.1 405B, an open-source model that rivals top proprietary models in general knowledge, tool use, and multilingual translation.
Download Llama 3.1 405B:
ollama pull llama3.1:405b
After that, you can invoke it like any other model.
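For example, here is a minimal sketch with LangChain, assuming the pull completed. On consumer hardware, a smaller tag such as llama3.1:8b works the same way and is far more practical, as the comments at the end of this article explain.

from langchain_community.llms import Ollama

llm = Ollama(model='llama3.1:405b')  # swap in 'llama3.1:8b' on smaller machines
print(llm.invoke('Summarize in one sentence why the sky is blue.'))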
Comments

AI and GenAI, Data Science, Machine Learning, and Data Engineering: Teach, Train, Write and Learn. (1 month ago)
For most users, especially those without access to enterprise-level hardware, leveraging cloud-based solutions or using smaller, more manageable models like Llama 3.1 70B is recommended. These smaller models offer a balance between performance and resource requirements, making them suitable for a wider range of applications while still delivering impressive capabilities.
AI and GenAI, Data Science, Machine Learning, and Data Engineering: Teach, Train, Write and Learn. (1 month ago)
However, quantized versions of the model can reduce the hardware burden. For instance, quantizing to lower precision (like 4-bit) significantly decreases memory and compute requirements, making it more accessible on less powerful hardware. Even then, running a quantized version of Llama 3.1 405B on a Mac M1 Pro is still impractical, but smaller models like Llama 3.1 8B or 70B in a quantized form might be viable options for consumer-grade devices.
AI and GenAI, Data Science, Machine Learning, and Data Engineering: Teach, Train, Write and Learn. (1 month ago)
Hello, thanks for the informative article, including Meta's latest Llama 3.1 405B model. But one thing we need to add is the hardware requirements to run such huge models; in fact, even a 4-bit quantized version cannot be run on powerful Macs like M1/M2 Pros. Here are some details about the hardware requirements to run the 405B model.
Running Meta's Llama 3.1 405B model requires substantial hardware resources due to its massive size and complexity. Here are the key specifications:
1. Storage: Approximately 820 GB of storage space is needed.
2. RAM: At least 1 TB of RAM is required to load the model into memory.
3. GPU: Multiple high-end GPUs, preferably NVIDIA A100 or H100 series, are necessary.
4. VRAM: A total of at least 640 GB of VRAM across all GPUs is essential for handling the model efficiently.
Given these requirements, running Llama 3.1 405B on consumer-grade hardware, such as a Mac M1 Pro laptop, is not feasible. The hardware limitations of consumer devices, particularly in terms of RAM and GPU capacity, make them unsuitable for such a large-scale model.