Time for a short, fun little series about on-device LLMs
If you’re in IT, and even if you aren’t, you’re seeing AI everywhere. Large language models (LLMs) like ChatGPT can be a great tool for many things, and like a multitool, they can do a little of everything. They’re great at coming up with a rough draft, fixing grammar and even writing some basic copy. Are they perfect? Not really. But is a multitool perfect? Not really. Do I still like my multitool? Yup!
This isn’t a course in AI from a data scientist with 15 years of experience, nor is it Buster Bluth’s review of 18th-century agrarian business. It’s a few practical uses (and considerations) for everyday folks.
Let’s start with a few key thoughts to get on the same page.
LLMs
At its core, an LLM has access to a vast amount of information but doesn’t know how to apply it. Imagine a person who is very “book-smart” but not “street-smart”. They know a little bit about everything but don’t really know how to apply or use it. If they don’t know something, they’ll still confidently tell you an answer. Those confident wrong answers are called hallucinations, but I’m not going into detail on those here.
Parameters
Many models are measured by their number of parameters: a 7B model has 7 billion parameters. In general, the more parameters, the larger the model. A 7B model can run on an 8GB phone, a 33B model on a laptop with 32GB of RAM, and as of 2024 ChatGPT reportedly has around 1.8 trillion - that’s why it’s so good at so many things.
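To make those RAM figures concrete, here’s a back-of-the-envelope sketch. The numbers are my own assumptions (4-bit quantization at roughly 0.5 bytes per parameter, plus ~20% overhead for the runtime and context cache), not anything official - real usage varies by model and app:

```python
# Rough RAM estimate for a quantized on-device LLM.
# Assumptions (mine): 4-bit quantization (~0.5 bytes/parameter)
# plus ~20% overhead for the runtime and context cache.

def estimated_ram_gb(params_billions: float,
                     bytes_per_param: float = 0.5,
                     overhead: float = 1.2) -> float:
    """Rough working-set size in GB for an on-device model."""
    return params_billions * bytes_per_param * overhead

print(f"7B:  ~{estimated_ram_gb(7):.1f} GB")   # ~4.2 GB - fits on an 8GB phone
print(f"33B: ~{estimated_ram_gb(33):.1f} GB")  # ~19.8 GB - fits in a 32GB laptop
```

Run an unquantized model (2 or 4 bytes per parameter) and those numbers balloon 4-8x, which is why quantization is what makes on-device LLMs practical at all.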
If ChatGPT and Copilot are large language models, what fits on a phone or laptop? Small language models! 7B and 33B models are a far cry from 1.8T, but that means they can run in WAAAAAAAY more places (which is sweet). The models that run on devices are usually on the smaller end.
On-device LLMs
With that said, what happens to the data that you type in there? Does it get reported to data brokers and tied to your identity? If you ask about cold symptoms, could that data make its way to insurance companies? If you ask for job advice, could your interest in starting something new affect your credit score? I’m not saying these things happen, but nobody actually knows - ESPECIALLY when you’re using something for free. Remember, if you aren’t paying for a product, then you are the product. Privacy is a whole conversation in itself, but wouldn’t it be great if you could get the utility of an LLM without the privacy concerns or the need for a constant network connection? Yes. Yes it would.
LLMs where you need them:
Enter on-device LLMs.
In the field of edge computing, we are always focused on moving compute as close to the data source or end user as possible. That way, the data is available sooner and is therefore more relevant. Data doesn’t have to risk traversing the Internet or rely on an outside cloud, and it all works without a constant Internet connection. Edge computing also brings a long list of challenges that you don’t usually have to deal with in a data center - but I like these challenges. No network connection, slow connections, limited power, tiny spaces, mobility: there’s a lot to consider. In short, while there are some challenges, there are some big benefits.
With the right prompts, the biggest LLMs like ChatGPT can handle a wide array of tasks, but with a couple of trade-offs you can get similar functionality with more privacy and without the need for a constant network connection. Over the past few months I’ve been working with an iOS/macOS on-device LLM app called PrivateLLM (https://privatellm.app) created by Jeethu Rao and Thilak R. It works with open-source models and is made by a small indie developer team. While it is specific to the Apple ecosystem, it brings a huge advantage: Shortcuts (https://support.apple.com/guide/shortcuts/welcome/ios) support. You can think of Shortcuts as scripts that are very quick and simple to create and that work across devices. I can quickly write one on my phone and it’s there on my computer when I get home, or I can spend some real time crafting one at my desk and get the same functionality on my phone.
As with edge computing, there are considerations. Because phones generally have less RAM than computers, I can’t use as large a model on my phone: ~7B instead of ~33B - but that’s OK! Unless your goal is to have it write a research paper or win trivia night, it doesn’t seem to be a problem for day-to-day usage like writing a first draft, correcting grammar or handling casual conversation.
Why do I need a private, custom chatbot?
I really don't. For me, after the initial novelty of chatting with a computer wore off, I lost interest. However, that all changed when I discovered the Shortcuts integration. If the most useful data is the newest, closest data, then having the ability to act on it can be very useful in many ways. I don't own a factory or a chain of retail stores (that I’m aware of), but I do have plenty of uses for an assistant that is only mine and is always there.
The system prompt
I'll be the first to say that I am not the best at writing incredible prompts. If you want to drastically improve, I suggest taking Jordan Wilson's PPP course. It's free and you’ll learn exactly what it takes to get useful information out of an LLM. Actually, the models are surprisingly good, but most folks don't put in the time to create a thorough prompt. I'll leave it at that, but if you want more, take the class. Note that the class is focused on ChatGPT, but the skill of writing a good prompt to get real value applies quite broadly (even to local LLMs).
Each of these models starts with a system prompt. Think of it as setting the stage or setting the environment. For personal use, I haven't yet found much value in an elaborately detailed prompt about who I am, though I do have a long one that gives some background about my name, family, job, etc. With that in place, the model can come up with a solid rough draft of an email in a few seconds, which can save a lot of work, and it can even catch spelling mistakes, grammatical mistakes, and rein me in when I'm being verbose.
A few useful things to include in a personal system prompt: some background about yourself (name, family, job), who you typically write for, and the tone and length you prefer. By putting this into a paragraph or two once, you save time by not rewriting or figuring it out every single time you do a task.
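Putting that into practice, a personal system prompt might look something like this. Every detail below is a placeholder of my own invention (not the author’s actual prompt), shown as a Python string for convenience:

```python
# A hypothetical personal system prompt. All details are made-up
# placeholders, not the author's real prompt.
SYSTEM_PROMPT = """\
You are my personal writing assistant.
About me: my name is Alex, I work in IT, and I have two kids.
Audience: coworkers and a general, non-technical readership.
Style: friendly but concise. Keep drafts under 200 words unless I ask
for more, fix my spelling and grammar, and tell me when I'm verbose.
"""

print(SYSTEM_PROMPT)
```

Write it once, paste it into the app’s system prompt field, and every conversation starts with that context already set.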
Shortcuts
Getting your data into one of these models is the key - but how do you do it? In my case, Shortcuts lets me bring all sorts of relevant data to one of these LLMs. Time, location, weather, text, or even preset inputs mean that most things I'm working on or need help with can quickly be added to the conversation as context. After all, it's very difficult to respond to an email if you don’t know what the original message said. Am I using this to respond to emails? Not regularly, but I'm certainly testing it on old ones.
Let’s look at some personal examples
For the rest of the series, I'll focus on examples that I wrote, plus how and why I wrote them. Let’s start with the very first one - my PoC: “Summarize this long article”
Background: in an age where we are bombarded by news, I just don't have enough time to read it all, nor do I need to. So how do I stay informed? This shortcut takes all the text of an article - it can be pages long - and summarizes it into 3 key points. Then, based on location, it lets me know how this might affect everyday life.
Input: Long text article, location
Prompt/output: summarize into 1 paragraph, create 3 key takeaways, and if the story is within 100 miles of my current location, explain how it might affect the general public.
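For the curious, here’s roughly what that Shortcut assembles before handing it to the model. This is a sketch, not the actual Shortcut: `article_text` and `location` stand in for Shortcuts actions like “Get Text from Input” and “Get Current Location”, and the wording is my own paraphrase:

```python
# Hypothetical sketch of the prompt the "Summarize this long article"
# Shortcut builds. In the real Shortcut, article_text and location
# would come from Shortcuts actions; these names are my stand-ins.

def build_summary_prompt(article_text: str, location: str) -> str:
    return (
        "Summarize the following article into one paragraph, then list "
        "3 key takeaways. If the story takes place within 100 miles of "
        f"{location}, explain how it might affect the general public.\n\n"
        f"Article:\n{article_text}"
    )

prompt = build_summary_prompt("(full article text pasted here)", "my current city")
print(prompt)
```

The assembled string is what gets passed to the on-device model as a single message; everything else (sharing the article into the Shortcut, grabbing location) happens in Shortcuts actions before this step.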
The whole process, including loading a small, fast 10B model and running it, takes under a minute. Am I getting all the information in the article? Of course not, but if it’s just a random link that’s merely “good to know”, this saves a LOT of time when used over and over.
That’s enough for today. I’ve already written a few more that I’ll share - much shorter, I promise! My feelings won’t be hurt if you summarize this too.