My experiments with LLaMA 2
Pranav Patil, TOGAF, CSM, PGDIPR
PLM Architect and Director - Iristechsys Software Services
Llama 2 is a large language model (LLM) developed by Meta, trained on over 2 trillion tokens of public data and designed to enable developers and organizations to build generative AI-powered tools.
It was released to the public on July 18, 2023, in a blog post on Meta's AI website.
Since it was released with such fanfare, I decided to experiment with one of the models over the weekend: the Llama-2-13b-hf model.
Though the models are available on AWS SageMaker, I did not attempt to use them there because my objectives were different.
The model and config files were obtained from:
https://huggingface.co/meta-llama/Llama-2-13b-hf/tree/main
One has to request access to the model by accepting the T&Cs, which is a fairly straightforward process.
The objectives of my experiments were the following:
1. To find out whether such a model can be loaded on a general-purpose desktop
2. If the model loads, whether it processes prompts in a reasonable amount of time
3. The influence of input parameters like temperature on the final output
4. The influence of prompt variations on the final output
5. The accuracy of the output.
I experimented with two machines:
1. A desktop: 32 GB RAM, Intel i5 (4 vCPUs), SSD, running Ubuntu
2. An AWS g3.4xlarge cloud instance: 122 GB RAM, 16 vCPUs, 1 GPU, SSD, running Ubuntu
I started the experimentation with the 32 GB RAM machine; however, it chickened out with an out-of-memory error. I realized that the memory required to load the model was far more than 32 GB. I attempted a workaround by increasing the swap space to about 100 GB. Fortunately, I had a 1 TB SSD at my disposal.
But even with the workaround, loading the model was excruciatingly slow, and I finally abandoned the attempt on that machine.
Next was the g3.4xlarge cloud EC2 instance, which was set up in a matter of minutes; the virtual environment and dependent libraries took hardly 10 minutes.
The model downloaded in about 3-5 minutes, and loading it into memory took a little over 10 minutes.
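For reference, the download step can be scripted along these lines. This is a minimal sketch, assuming the huggingface_hub library and an access token for the gated repo; my actual setup may have differed slightly.

```python
# Sketch: pull the Llama-2-13b-hf files into the local Hugging Face cache.
# Assumes `pip install huggingface_hub` and that access to the gated repo
# has already been granted on the Hugging Face website.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-2-13b-hf",
    token="hf_xxx",  # placeholder: use your own access token
)
print("Model files cached at:", local_dir)
```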
I wrote a small script that read the prompt and model parameters such as temperature, top_k and max_tokens from a config file.
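A rough sketch of such a script is shown below (not my exact code). It assumes a hypothetical JSON config file named prompt_config.json, the transformers library, and the whole model held in CPU memory.

```python
# Sketch: read the prompt and sampling parameters from a config file,
# load Llama-2-13b-hf, and generate a completion.
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical config file, e.g.:
# {"prompt": "A poem on rains in India would be like this:",
#  "temperature": 0.8, "top_k": 50, "max_new_tokens": 100}
with open("prompt_config.json") as f:
    cfg = json.load(f)

model_id = "meta-llama/Llama-2-13b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # roughly 50 GB in fp32 on CPU
model.eval()

inputs = tokenizer(cfg["prompt"], return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        do_sample=True,
        temperature=cfg.get("temperature", 1.0),
        top_k=cfg.get("top_k", 50),
        max_new_tokens=cfg.get("max_new_tokens", 100),
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```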
Following are a few prompts and the answers that the model provided (I have truncated some of the output for the sake of brevity):
Prompt: smita is manoj's wife. Amita is manoj's daughter. Sachi is smita's daughter. then the relationship between amita and sachi is of
Answer:
A. Grandmother and granddaughter
B. Mother and daughter
C. Aunt and niece
D. Sister and sister
Ans: A. Grandmother and granddaughter
Prompt: Mumbai:Maharashtra, Chennai:Tamilnadu, Bhopal:Madhya Pradesh, Panaji:Goa, Srinagar
Answer:
Panaji:Goa,Srinagar:Jammu & Kashmir,Shimla:Himachal Pradesh,Thiruvananthapuram:Kerala,Bengaluru:Karnataka,Hyderabad:Telangana,Ahmedabad:Gujarat,Patna:Bihar,Chandigarh:Punjab,Lucknow:Uttar Pradesh,Dehradun:Uttarakhand,Bhubaneswar:Odisha,Kochi:Kerala,Jaipur: thing to do in Jaipur,Jaipur: Rajasthan,Jaipur: Rajasthan,Jaipur: Rajasthan
Prompt: A poem on rains in India would be like this:
Answer:
The rains have come,The rains have come,The rains have come.The rains have come, the rains have come,The rains have come, the rains have come.
Prompt: The prime minister of india is
Answer:
The prime minister of india is the chairman of the planning commission of india.
Which one of the following is the largest inland water body of India?
The correct option is C Chilka lake
Prompt: An email for requesting a leave would be :
Answer:
Dear Mr. /Mrs. (name),
I am writing to inform you that I will be going on leave from (date) to (date) due to (reason). I will be available to take calls and respond to emails during this time. If you need any urgent help, please contact (name) at (phone number).
Thank you for your understanding and cooperation.
Prompt: 1,3,9,27,81
Answer:
1,3,9,27,81,243,613,1827,5468,16384,49152,196608,786432,3145728,12582912,50439360,201784320,807112
Prompt: 1,1,2,1,2,3,1,2,3,4,1,2,3,4,5
Answer:
1,1,2,1,2,3,1,2,3,4,1,2,3,4,5,1,2,3,4,5,6,1,2,3,4,5,6,7,1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8,9,1,2,3,4,
Prompt: the woods are lovely dark and deep, but i have promises to keep
Answer:
the woods are lovely dark and deep, but i have promises to keep, and miles to go before i sleep, and miles to go before i sleep.i am a child of the night.
Prompt: a short poem on rain would be following:
Answer:
The rain is falling on the window pane.The leaves are falling from the trees.
The wind is blowing through the branches Of the trees.
Prompt: a summary of india-pakistan relations would be the following:
Answer:
On 29 December 2015, Indian and Pakistani foreign secretaries met for the first time since 2012 in Islamabad to discuss peace and security issues. In a joint statement issued at the conclusion of the meeting, both sides agreed to work together to combat terrorism and reiterated their commitment to the
Prompt: joe biden: USA, Vladimir Putin: Russia, Benjamin Netanyahu: Israel, Greta Thurnberg:
Answer:
joe biden: USA, Vladimir Putin: Russia, Benjamin Netanyahu: Israel, Greta Thurnberg: Sweden, Aung San Suu Kyi: Myanmar, Angela Merkel: Germany, Boris Johnson: UK, Donald Trump: USA, Xi Jinping: China, Recep Tayyip Erdoğan: Turkey, Imran Khan: Pakistan, Kim Jong Un: North Korea, Vladimir Putin: Russia,
Prompt: if adam is mada, secret is terces, pump is pmup then colossal is
Answer:
if adam is mada, secret is terces, pump is pmup then colossal is 1000
Prompt: java code to generate fibonacci sequence would be
Answer:
What is the Java code to generate the Fibonacci sequence?
Java Program to generate Fibonacci series using recursion
Java Program to generate Fibonacci series using iteration
Prompt: translation of 'I am fine' in hindi would be
Answer:
translation of 'I am fine' in hindi would be मैं ठीक हूँ.
можем да кажем и "я съм добре" (Bulgarian: we can also say "I am fine")
My conclusions:
1. It does fairly well on matters of "general knowledge", i.e. information that was prevalent at the time of its training.
2. It does well on simple sequence predictions, but it struggles when given off-the-track prompts.
3. Very poor performance on programming prompts; who knows, a better prompt might do the trick?
4. It fares poorly when asked logical reasoning questions.
5. The max-length parameter has a significant influence on the processing time. For a max length of 100 it took about 1.5 minutes on the g3.4xlarge, whereas for a max length of 1000 it took about 9 minutes, which looks like a fairly linear increase (a timing sketch is shown after this list).
In addition, the greater the max length, the more junk output it produced at the end of the token sequence.
6. It fared poorly when the temperature was reduced. This was unexpected, as I was under the impression that at lower temperatures it hallucinates less and becomes less creative.
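As a rough illustration of how point 5 can be measured, the generation call can be timed for different output lengths. This is a sketch, not my exact harness; since decoding is autoregressive (one token at a time), the wall-clock time grows roughly linearly with the number of generated tokens.

```python
# Sketch: time generation for increasing output lengths.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

inputs = tokenizer("A poem on rains in India would be like this:", return_tensors="pt")
for n_tokens in (100, 500, 1000):
    start = time.perf_counter()
    with torch.no_grad():
        model.generate(**inputs, do_sample=True, temperature=0.8, max_new_tokens=n_tokens)
    print(f"max_new_tokens={n_tokens}: {time.perf_counter() - start:.1f} s")
```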
All in all, the model still has a long way to go before it can be used reliably in an industry setting. Of course, this is the 13B model; the 70B model might fare better.
On a lighter note, the most important conclusion is "Memory and processing Is All You Need, Attention comes next" :)
Next steps:
I would like to fine-tune the model. At this point in time, it throws a CUDA error that I am attempting to resolve, but so far no solution has been found.
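For context, one common route that reduces the memory needed for fine-tuning is parameter-efficient fine-tuning with LoRA via the peft library. The following is a minimal configuration sketch of that approach, an illustration only, not the script that is currently failing.

```python
# Sketch: wrap the base model with LoRA adapters so that only a small
# fraction of the parameters is trainable during fine-tuning.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")
lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                  # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```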
Has anyone carried out a similar study with the 7B or 70B model? I would love to hear the results!
References:
https://huggingface.co/blog/llama2
https://huggingface.co/meta-llama/Llama-2-13b-hf
https://sundarpuli.medium.com/how-llama-v2-could-revolutionize-the-way-we-interact-with-computers-mindfuldiscoveries-16d366be5637
https://www.digitalocean.com/community/tutorials/how-to-add-swap-space-on-ubuntu-20-04