Using LLMs locally on iPad or iPhone

In this tutorial you will learn how to install a ChatGPT-like large language model (LLM) locally on your Apple device.

This step-by-step guide is based on my personal iPad Pro 5th Gen with an M1 chip, 8GB of RAM and 128GB of local storage. I will use the LLMFarm application (llmfarm.site) as the client application to run a downloaded model, and I will select a pre-trained Mistral-7B model available on huggingface.co, although this guide should let you use just about any other model of your choice.

For a detailed description of the thought process and the rationale behind the decisions made, scroll to the bottom of this article.

Step 0: Prerequisites

  • An iPad or iPhone with at least 8GB of RAM
  • At least 8GB of free local storage

Step 1: Install TestFlight and LLMFarm

To host our local LLM, we will use LLMFarm, an open-source client with support for Apple Silicon. Since LLMFarm is still in development, it is necessary to use the TestFlight app. Let's jump straight to the LLMFarm website and click the "Install with TestFlight" option.

Once you are redirected to the TestFlight page, click on "View in App Store" to install TestFlight:

Install TestFlight as if you were installing any other app:

Now go back to the TestFlight page and click on Step 2 to install LLMFarm:

Step 2: Download a pre-trained model

For the purpose of this tutorial, I will select a pre-trained Mistral-7B model available on huggingface.co. If you want to read more about this model, scroll down to the bottom of this article. Otherwise, go to the TheBloke repository hosting a ready-to-use model: https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF

Select the mistral-7b-instruct-v0.1.Q4_K_M.gguf file from the list of available models and click download. Please note that the model will require around 4GB of free space on your device:

Check the location of the model file and note it down, as you will need to point LLMFarm to this location in the next step:
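
By the way, if you would rather download the model on a computer first and then move it to your device via iCloud Drive or the Files app, here is a minimal sketch using the huggingface_hub Python package (the ./models destination folder is just an example):

    # Download the quantized Mistral-7B GGUF file from Hugging Face.
    # Requires: pip install huggingface_hub
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
        filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
        local_dir="./models",  # example destination folder; any location works
    )
    print(f"Model saved to: {path}")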

Step 3: Set up LLMFarm to use Mistral-7B model

Switch back to the LLMFarm application, click on the Settings option at the bottom of the screen and then select the "Models" option on the left:

Point the application to the location where the model was downloaded. Please note that LLMFarm will copy the model to its own folder, which means it will take up another 4GB of free space on your device. Once the model is selected, however, you can delete the file from its original location.

Once the model is uploaded, it should appear on the list as in the screenshot below:

Step 4: Configure the chat

We are now ready to set up a chat context window that will allow us to interact with our model. Click on the Chats option in the bottom left of the LLMFarm application and then select the "Start New Chat" option:

In the chat settings, click "Select model":

Then, select the "Import from file" option and choose the model we imported into the LLMFarm library earlier:

Now let's refine the prompt formatting. Go to the "Prompt format" option, remove the default entry and add the following line, as per the Mistral documentation:

<s>[INST] {{prompt}} [/INST]
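
To illustrate what this template does: LLMFarm substitutes your message for the {{prompt}} placeholder before passing the text to the model. A tiny Python sketch of the idea (an illustration, not LLMFarm's actual code):

    # Illustration of the Mistral instruction template; LLMFarm does this internally.
    TEMPLATE = "<s>[INST] {{prompt}} [/INST]"

    def format_prompt(user_prompt: str) -> str:
        # Replace the placeholder with the user's message.
        return TEMPLATE.replace("{{prompt}}", user_prompt)

    print(format_prompt("Summarize the plot of Hamlet in two sentences."))
    # -> <s>[INST] Summarize the plot of Hamlet in two sentences. [/INST]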

We're nearly ready to go! The last thing to do is to tweak the settings related to resource management. Click on "Prediction options" and select the following settings:

  • Turn on "Metal" to leverage Apple Silicon
  • Turn on "MLock" and leave MMap option on for more efficient RAM management

Once your settings are in place, click on the "Add" option at the top of the screen:

Step 5: Testing!

Your chat window should now be ready to go. You can start giving it tasks, asking questions and checking the accuracy of the results:

Please note that the first prompt may take some time because of the warmup period. Afterwards, responses should be faster:
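
The same warmup effect can be observed with llama.cpp on a desktop; a small sketch using llama-cpp-python that times the model load plus the first reply against a second reply (the model path is an example):

    # Compare a cold start (load + first reply) with a warm second reply.
    import time
    from llama_cpp import Llama

    t0 = time.perf_counter()
    llm = Llama(model_path="./models/mistral-7b-instruct-v0.1.Q4_K_M.gguf")
    llm("[INST] Name three planets. [/INST]", max_tokens=32)
    print(f"cold (load + first reply): {time.perf_counter() - t0:.1f}s")

    t0 = time.perf_counter()
    llm("[INST] Name three moons. [/INST]", max_tokens=32)
    print(f"warm (second reply): {time.perf_counter() - t0:.1f}s")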

That's it, congratulations! Now you know how to run large language models (LLMs) locally on your Apple device.

Why did I write this tutorial?

Picture this: you're in the era of all things smart and digital, where AI's prowess is as common as your morning cup of coffee. Enter LLMs, these brainy language models that can chat, generate text, and even write Shakespearean sonnets if you ask them nicely. But here's the kicker: instead of relying on some closed-source model residing in an unknown location, why not bring this wizardry closer to home and use it offline? Yep, right into the cozy confines of your own device!

Privacy buffs, rejoice! Hosting an LLM on your local gadget is like waving a magic wand over your data. No more fretting about your conversations being overheard by unseen digital ears. It's like having your own secret language lair where only you and your LLM pal share the juiciest details of your chat without nosy third parties peeking in.

Why iPad or iPhone and not a Macbook?

When it comes to being the featherweight champion of portability, iPads and the new iPhone 15 Pro Max series step into the ring with their slick moves and compact charm, while MacBooks bring the heavyweight power. For those always on the move, craving versatility and wanting a device that's as light as a feather, iPads and iPhones are the go-to choice. Last but not least, LLMs running locally finally give you a good reason to squeeze all the juice out of your iDevices!

Why Mistral-7B?

Mistral-7B from MistralAI is widely praised for its impressive performance relative to its small size. According to its creators, this foundational model handles tasks better than some larger models while requiring far less computing power. This is particularly important since even the beefiest iPads and iPhones are constrained by the available RAM. It also takes a much more permissive approach when it comes to accepting all kinds of tasks.
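
To get a feel for why a 7B model at roughly 4-bit quantization fits on an 8GB device, here is a back-of-the-envelope estimate (the parameter count and bits-per-weight figures are approximations):

    # Rough estimate of the in-memory size of a quantized 7B model.
    params = 7.24e9          # Mistral-7B parameter count (approximate)
    bits_per_weight = 4.85   # Q4_K_M averages roughly 4.85 bits per weight
    size_gb = params * bits_per_weight / 8 / 1e9
    print(f"~{size_gb:.1f} GB")  # ~4.4 GB, in line with the ~4GB download above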

I tested several quantized models from the TheBloke repository and found that level 3 (mistral-7b-instruct-v0.1.Q3_K_M.gguf) or level 4 (mistral-7b-instruct-v0.1.Q4_K_M.gguf) quantization strikes the right balance: the higher-bit Q4_K_M variant retains more precision in the model's parameters, potentially preserving better accuracy, while still not requiring more than 8GB of RAM. Below you will find a screenshot of the memory usage while running the abovementioned models:


However, in the end, the accuracy of the model has to be determined by your subjective needs.


Yurii Ormson

Full-Stack QA Engineer | {Java & Playwright} | EPAM Systems

5 months

Thanks for your effort!

Dr. Zeeshan Alam

BDS-intern at NIMS DENTAL COLLEGE, NIMS UNIVERSITY| Python- web development with Django| Algo-trading and investing.

6 months

Now you don’t even need to download testflight, LLM Farm is available in the app store and there are models to download directly from it


This works even on the iPhone 14 which only has 6GB RAM. Amazing stuff.

Jonathan Tismo

R & D Engineer at Nokia

1 year

Hi, I noticed the UI has changed a lot since the YT video ( https://youtu.be/5QEDNZlDf-c?si=eKfXv_9mwWgJmEhI ) I used to install and configure your app. Basically, the plus sign in the upper right and the settings button at the bottom are no longer there, which is fine if I just want to use one model for all my chat threads, but if I want to use different models I have to remove all chats first so that I can add or select a different model. I am using an 11" iPad Pro M1 with 16GB RAM and 1TB of storage.

David Alan Birdwell

Founder - Humanity and AI, LLC

1 year

So this is working great with the Q4 model on an 11" M1 iPad Pro. However, the responses often get truncated after three short paragraphs, and I get all kinds of claims of it being able to use email and instant-messaging accounts to interact with users, but obviously no actual ability to do so.
