Time for a short, fun little series about on-device LLMs

If you’re in IT, and even if you aren’t, you’re seeing AI everywhere. Large language models (LLMs) like ChatGPT can be a great tool for many things, and like a multitool, they can do a little of everything. They’re great at coming up with a rough draft, fixing grammar and even writing some basic copy. Are they perfect? Not really. But is a multitool? Not really. But do I like my multitool? Yup!


This isn’t a course in AI from a data scientist with 15 years of experience, nor is it Buster Bluth’s review of 18th-century agrarian business. It’s a few practical uses (and considerations) for everyday folks.


Let’s start with a few key thoughts to get on the same page.


LLMs

At its core, an LLM has access to a vast amount of information but doesn’t know how to apply it. Imagine a person who is very “book-smart” but not “street-smart”. They know a little bit about everything, but don’t really know how to apply it or use it. If they don’t know something, they’ll just confidently tell you what they think anyway. Those are hallucinations, but I’m not going into detail on them here.


Parameters

Many models are measured by their number of parameters: a 7B model has 7 billion parameters. In general, the more parameters, the larger the model. A 7B model can run on an 8GB phone, a 33B model on a laptop with 32GB of RAM, and as of 2024 ChatGPT reportedly has around 1.8 trillion - that’s why it’s so good at so many things.
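If you want a rough feel for why those parameter counts map to those devices, a back-of-the-envelope estimate is just parameters times bytes per parameter. The ~4 bits per weight below is an assumption on my part (typical of the quantized models these apps ship), not a number from any specific app:

```python
def estimate_model_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """Back-of-the-envelope: parameters x bytes per parameter.
    Real apps also need headroom for the KV cache, the app, and the OS."""
    bytes_per_param = bits_per_param / 8
    return params_billion * bytes_per_param  # billions of params x bytes = GB

# A 7B model quantized to ~4 bits per weight is roughly 3.5 GB,
# which is why it can (just barely) fit on an 8GB phone.
print(estimate_model_memory_gb(7, 4))    # ~3.5

# A 33B model at the same quantization is roughly 16.5 GB,
# hence the 32GB-RAM laptop.
print(estimate_model_memory_gb(33, 4))   # ~16.5
```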


If ChatGPT and Copilot are large language models, what fits on a phone or laptop? Small language models! 7B and 33B models are a far cry from 1.8T, but that means they can run in WAAAAAAAY more places (which is sweet). The models that run on devices are usually on the smaller end.


On-device LLMs

With that said, what happens to the data that you type in there? Does it get reported to data brokers and tied to your identity? If you ask it about cold symptoms, could that data make its way to insurance companies? If you ask for job advice, could your interest in starting something new affect your credit score? I’m not saying that these things happen, but nobody actually knows - ESPECIALLY when you’re using something for free. Remember, if you aren’t paying for something, then you are the product. Privacy is a whole conversation in itself, but wouldn’t it be great if you could get the utility of an LLM without the privacy concerns or the need for a constant network connection? Yes. Yes it would.


LLMs where you need them:

Enter on-device LLMs.

In the field of edge computing, we are always focused on moving compute as close to the data source or end user as possible. That way, the data is available sooner and is therefore more relevant. Data doesn’t have to risk traversing the Internet or rely on an outside cloud, and it all works without a constant Internet connection. Edge computing also brings a long list of challenges that you don’t usually have to deal with in a data center - but I like these challenges. Things like no or slow network connections, limited power, tiny spaces, and mobility mean there is a lot to consider. In short, while there are some challenges, there are some big benefits.


With the right prompts, the biggest LLMs like ChatGPT can handle a wide array of tasks, but with a couple of trade-offs you can get similar functionality with more privacy and without the need for a constant network. Over the past few months I’ve been working with an iOS/macOS on-device LLM app called PrivateLLM (https://privatellm.app) created by Jeethu Rao and Thilak R. It works with open-source models and is made by a small indie developer team. While it is specific to the Apple ecosystem, it brings a huge advantage: Shortcuts (https://support.apple.com/guide/shortcuts/welcome/ios) support. You can think of Shortcuts as scripts that are very quick and simple to create and that work across devices. I can quickly write one on my phone and it’s there on my computer when I get home, or I can spend some real time crafting one at my desk and get the same functionality on my phone.


As with edge computing, there are considerations. Because phones generally have less RAM than computers, I can’t use as large a model on my phone: ~7B instead of ~33B - but that’s ok! Unless your goal is to have it write a research paper or win trivia night, it doesn’t seem to be a problem for day-to-day usage like writing a first draft, correcting grammar or handling casual conversation.




Why do I need a private, custom chatbot?

I really don't. For me, after the initial novelty of chatting with a computer wore off, I did lose interest. However, that all changed when I discovered the Shortcuts integration. If the most useful data is the newest, closest data, then having the ability to act on it can be very useful in many ways. I don't own a factory or a chain of retail stores (that I’m aware of), but I do have plenty of uses for an assistant that is only mine and is always there.


The system prompt

I'll be the first to say that I am not the best at writing incredible prompts. If you want to drastically improve, I suggest taking Jordan Wilson's PPP course. It's free, and you’ll learn exactly what it takes to get useful information out of an LLM. Actually, the models are surprisingly good, but most folks don't put in the time to create a thorough prompt. I'll leave it at that, but if you want more, take the class. Note that the class is focused on ChatGPT, but the skill of writing a good prompt to get real value applies quite broadly (even to local LLMs).


Each of these models starts with a system prompt. Think of it as setting the stage or setting the environment. For personal use, I haven't yet found much value in establishing a detailed prompt about who I am (though I do have a long one that gives some background about my name, family, job, etc.). Even so, it can come up with a solid rough draft of an email in a few seconds, which can save a lot of work, and it can even catch spelling mistakes, grammatical mistakes, and rein me in when I'm being verbose.


Here are a few useful things to include for a personal system prompt:

  • Your name
  • Your age
  • Where you live
  • Your family members
  • Your job
  • Things that you find interesting, good/bad or high-level opinions


By coming up with a paragraph or two like this once, you save time by not rewriting or re-explaining it every single time you do a task.
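For example, a personal system prompt might look something like this (all of the details here are made up):

“My name is Alex. I’m 40 years old and live in Chicago with my wife and two kids. I work in IT as a systems engineer. I like hiking and home automation, I’m skeptical of hype, and I prefer short, plain-English answers. When drafting emails, keep the tone friendly and concise.”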


Shortcuts

Getting your data into one of these models is the key, but how do you do it? In my case, Shortcuts lets me bring all sorts of relevant data to one of these LLMs. Time, location, weather, text, or even preset inputs mean that most things I'm working on or need help with can quickly be added to the conversation as context. After all, it's very difficult to respond to an email if you don’t know what the original message said. Am I using this to respond to emails? Not regularly, but I'm certainly testing it on old ones.
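To make the pattern concrete, here is a rough sketch of what a Shortcut does, written in Python rather than actual Shortcut actions, and with a hypothetical send_to_local_llm call standing in for the app: gather a few pieces of context, stitch them into a prompt, and hand it to the model.

```python
from datetime import datetime

def build_prompt(task_text: str, location: str, weather: str) -> str:
    """Bundle local context together with the thing I'm actually asking about."""
    context = (
        f"Current time: {datetime.now():%Y-%m-%d %H:%M}\n"
        f"Location: {location}\n"
        f"Weather: {weather}\n"
    )
    return f"{context}\nTask:\n{task_text}"

# In a real Shortcut, the "Get Current Location" and current-weather actions
# would fill these in; here they are hard-coded placeholders.
prompt = build_prompt(
    "Draft a polite reply to the email below...",
    "Somewhere, USA",
    "4°C, light rain",
)
# send_to_local_llm(prompt)  # hypothetical call into the on-device model
```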


Let’s look at some personal examples

For the rest of the series, I'll focus on examples that I wrote and how and why I wrote them. Let’s start with the very first one - my PoC: “Summarize this long article”


Background: in an age where we are bombarded by news, I just don't have enough time to read it all, nor do I need to. So how do I stay informed? This shortcut takes all the text of an article - it can be pages long - and summarizes it into 3 key points, then, based on location, lets me know how this might affect everyday life.


Input: Long text article, location

Prompt/output: Summarize into one paragraph, create 3 key takeaways, and, if the story is within 100 miles of my current location, explain how it might affect the general public.
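As a sketch, the prompt the Shortcut assembles looks roughly like this. I'm showing it as a Python string template for illustration; the real Shortcut builds the same text out of actions, and letting the model judge the 100-mile question (rather than computing the distance in the Shortcut) is just one way to do it:

```python
SUMMARY_PROMPT = """Summarize the following article in one paragraph.
Then list the 3 most important takeaways.
I am currently near {my_location}. If the story takes place within about
100 miles of me, add a short note on how it might affect the general
public here; otherwise, skip that part.

Article:
{article_text}
"""

def make_summary_prompt(article_text: str, my_location: str) -> str:
    # The Shortcut supplies the article text (shared from the browser or a
    # read-later app) and the device's current location.
    return SUMMARY_PROMPT.format(article_text=article_text, my_location=my_location)
```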


The whole process - including loading a small, fast 10B model and running it - takes under a minute. Am I getting all the information in the article? Of course not, but if it’s a random link that’s just “good to know” then it can save a LOT of time when used over and over.


That’s enough for today. I’ve written a few already that I’ll share - much shorter, I promise! My feelings won’t be hurt if you summarize this too.

