Serve Llama 2 with FastAPI using a Colab GPU and Ngrok for free
I have always liked the idea of using open-source products, which can be useful for many companies, whether for budget reasons or to keep their sensitive data secure.
In our case, we will test serving Llama 2, an open-source LLM by Meta, using FastAPI on Colab. Ngrok is used to expose Colab's local web server on a public URL.
We use Colab because it gives us free GPU allocation.
First of all, we change the runtime type to T4 GPU in Runtime > Change runtime type:
Then we install the necessary packages:
llama-cpp-python, fastapi[all], uvicorn, python-multipart, transformers, pydantic, tensorflow
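In a Colab cell, the installation could look like this (a minimal sketch; package versions are not pinned):

# Colab cell: install the packages needed for serving Llama 2 with FastAPI
!pip install llama-cpp-python "fastapi[all]" uvicorn python-multipart transformers pydantic tensorflow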
Next, we install Ngrok in our Colab session:
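One way to get Ngrok inside the notebook is through the pyngrok wrapper (an assumption; the Ngrok binary could also be installed directly):

# Colab cell: install the pyngrok wrapper around the Ngrok client
!pip install pyngrok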
Now we need to create a free Ngrok account. Next, we set our authentication token as shown in the screenshot; it will be saved in the Ngrok configuration.
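With pyngrok, registering the token could look like the following sketch (YOUR_AUTHTOKEN is a placeholder for the token shown in the Ngrok dashboard):

from pyngrok import ngrok

# Save the authtoken to the local Ngrok configuration (placeholder value)
ngrok.set_auth_token("YOUR_AUTHTOKEN")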
After installing the packages and setting the authentication token, we create our FastAPI app in the app.py file; you can find the full version in the Colab link.
In our FastAPI app, we created a POST route '/generate'.
The route uses the Llama 2 model to generate a response, as simple as that.
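A minimal sketch of such an app.py is shown below; the model path, request field names, and default values are assumptions, and the full version is in the Colab link:

# app.py -- minimal FastAPI app serving a quantized Llama 2 model via llama-cpp-python
from fastapi import FastAPI
from pydantic import BaseModel
from llama_cpp import Llama

app = FastAPI()

# Path to a locally downloaded quantized Llama 2 model file (assumed filename)
llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf")

class Parameters(BaseModel):
    temperature: float = 0.1
    max_tokens: int = 400

class GenerateRequest(BaseModel):
    inputs: str
    parameters: Parameters = Parameters()

@app.post("/generate")
def generate(request: GenerateRequest):
    # Run the prompt through the model with the requested sampling parameters
    output = llm(
        request.inputs,
        temperature=request.parameters.temperature,
        max_tokens=request.parameters.max_tokens,
    )
    return {"generated_text": output["choices"][0]["text"]}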
Finally, we run this code to expose Colab's localhost through our Ngrok account:
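A sketch of that step using pyngrok and Uvicorn inside the notebook (port 8000 and the nest_asyncio workaround are assumptions; nest_asyncio may need to be installed first):

import nest_asyncio
import uvicorn
from pyngrok import ngrok

# Open a public HTTP tunnel to the port Uvicorn will listen on
public_url = ngrok.connect(8000)
print("Public URL:", public_url)

# Allow Uvicorn to run inside the notebook's already-running event loop
nest_asyncio.apply()
uvicorn.run("app:app", host="0.0.0.0", port=8000)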
Documentation of our REST API (FastAPI automatically serves interactive Swagger documentation at the /docs path of the public URL):
We use this route by sending a prompt text along with the temperature (higher means more creative responses, lower means more precise ones) and max_tokens (the maximum number of tokens, roughly "words", allowed in the generated output) parameters:
{
"inputs": "Who is elon musk ?",
"parameters": {"temperature":0.1, "max_tokens":400}
}
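For example, calling the route with the requests library could look like this (the Ngrok URL below is a placeholder for the public URL printed earlier):

import requests

# Replace with the public URL printed by Ngrok
url = "https://<your-ngrok-subdomain>.ngrok-free.app/generate"
payload = {
    "inputs": "Who is elon musk ?",
    "parameters": {"temperature": 0.1, "max_tokens": 400},
}
response = requests.post(url, json=payload)
print(response.json())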
Prompt result:
URL to the Colab file: