AI on a laptop
1. I am fascinated by the topic of running generative AI models on PCs and smartphones.
2. OpenAI's business model is to provide the ChatGPT and GPT-3 APIs for a small fee. The model lives in the cloud; developers can easily connect to it and query it through the API, which lets them embed GPT into their own software. You pay for every thousand tokens processed.
3. OpenAI recently opened up the ChatGPT API at a cost of $0.002 per thousand tokens. That is very cheap, roughly 10x cheaper than the previous GPT-3 access price.
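Roughly, the integration looks like the sketch below. It is a minimal illustration using the openai Python package (the pre-1.0 interface); the API key and the prompt are placeholders, and the cost line is just back-of-the-envelope arithmetic at the $0.002 per 1,000 tokens price.

```python
# Minimal sketch: embedding ChatGPT via the cloud API (openai < 1.0 interface).
# The model runs on OpenAI's servers; you pay per thousand tokens processed.
import openai

openai.api_key = "sk-..."  # placeholder API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize why local LLMs are interesting."}],
)

print(response["choices"][0]["message"]["content"])

# Back-of-the-envelope cost at $0.002 per 1,000 tokens.
tokens_used = response["usage"]["total_tokens"]
print(f"~${tokens_used / 1000 * 0.002:.5f} for {tokens_used} tokens")
```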
4. The decreasing price makes it easier to experiment and test ideas, which will result in more applications in specific products.
5. However, there is a way to make using a large language model even cheaper: run it locally on your own computer. Then 1,000 tokens cost $0.
6. Until now, this was impossible, because large language models need a lot of RAM and many GPUs. For example, the open source LLM BLOOM needs 352 GB of RAM and 8 GPUs (around PLN 22k each). This is not the specification of a typical laptop.
7. However, engineers at Meta managed to create a model that supposedly has GPT-3-level capabilities and can be run on a MacBook Pro with an M2 chip and 64 GB of RAM.
8. The LLaMA model is open source, unfortunately under a license that excludes commercial use. But you can look at it and play with it.
9. Programmer Georgi Gerganov looked at it, played with it, and wrote a C++ port of LLaMA that runs on a laptop: llama.cpp. It runs locally, without the internet and without other vendors' APIs. LLaMA runs on the user's computer. For free, without limits.
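For a feel of what local inference looks like, here is a minimal sketch using the llama-cpp-python bindings built on top of Gerganov's llama.cpp. The model path and quantized file name are illustrative assumptions; you have to supply your own converted LLaMA weights.

```python
# Minimal sketch: local inference with llama.cpp via the llama-cpp-python bindings.
from llama_cpp import Llama

# Hypothetical path to locally converted and quantized LLaMA weights.
llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")

output = llm(
    "Building a website can be done in 10 simple steps:",
    max_tokens=128,
)

# Everything above ran on the local machine: no internet, no API fee, $0 per 1,000 tokens.
print(output["choices"][0]["text"])
```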
10. A significant drawback of LLaMA is that it doesn't follow instructions. Instruction-following is a major OpenAI innovation; thanks to it, the language model's responses are more intuitive and useful for the user.
11. As the example above shows, following instructions is important for getting the expected results.
12. LLaMA doesn't follow instructions.
13. Very quickly, however, came the Stanford Alpaca project, created by a group of Stanford researchers who took LLaMA and fine-tuned it to follow instructions.
14. Interestingly, Stanford Alpaca was tuned with data generated through the OpenAI API. The researchers paid for API access, fed it 175 seed tasks/prompts to generate a much larger set of instruction-following examples, and fine-tuned LLaMA on the results. It works. Beautiful!
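Below is a simplified sketch of that data-generation idea: seed tasks go into the OpenAI API, and the completions become instruction-following training examples. This is an illustration of the concept, not the actual Stanford Alpaca pipeline; the prompt template, file name, and seed tasks shown are assumptions.

```python
# Simplified sketch of Alpaca-style data generation: collect (instruction, output)
# pairs from the OpenAI API, then use them to fine-tune LLaMA.
import json
import openai

openai.api_key = "sk-..."  # placeholder API key

seed_tasks = [
    "Give three tips for staying healthy.",
    "Explain what a large language model is to a 10-year-old.",
]  # Alpaca started from 175 human-written seed tasks

examples = []
for task in seed_tasks:
    completion = openai.Completion.create(
        model="text-davinci-003",  # the OpenAI model the Alpaca team reportedly used
        prompt=f"Instruction: {task}\nResponse:",
        max_tokens=256,
    )
    examples.append({
        "instruction": task,
        "output": completion["choices"][0]["text"].strip(),
    })

# The collected pairs become the fine-tuning dataset for LLaMA.
with open("alpaca_style_data.json", "w") as f:
    json.dump(examples, f, indent=2)
```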
15. Why is it fascinating? It turns out that you can have a large language model that is not resource intensive, runs locally, and has a similar UX to an industry leader, for free.
16. This means that it will probably be possible to embed such a model at the level of the computer's operating system, or maybe soon the smartphone’s operating system.
17. Imagine an alternate reality where whenever you take a photo with your phone, it has to be uploaded to the internet for some digital service to enhance brightness, contrast and other computational photography tricks that allow you to take good photos on your phone. Then every photo capture would be slow and costly for the owner of the operating system.
18. A cruder example: if the calculator on your phone had to ask a central calculation server for the result every time you entered 2+2, using the calculator would be less convenient, slower and more expensive.
19. This is how LLMs work in the model proposed by OpenAI. We have to pay for every request.
20. Alpaca and LLaMA are a promise that in the future these operations will be able to happen locally, at the level of the operating system.
21. Who makes money from photo editing on the phone? Who makes money from calculations on a calculator? Only the manufacturer of the operating system, or of the phone/computer that wraps that operating system.
22. I think this is the future of this technology. A large language model will be part of the operating system of our phones and personal computers.
Did you enjoy this edition of my newsletter? Feel free to share it with your connections!