Switching LLMs - The Invisibles
Opinions are personal, do not reflect my employer.
Switching from a proprietary to an open-source LLM is more than just a replacement of models. Here are a few factors to consider in making "informed choices" amid changing and growing complexity (not an exhaustive list):
Do we understand the LLM patterns and problems?
What are some of the most open challenges in LLMs today?
Comparison of Self-hosted LLMs vs OpenAI:
New Foundation Models: https://magazine.sebastianraschka.com/p/ahead-of-ai-11-new-foundation-models
In the LLM context: is "fine-tuning" for form, not facts?
Fine Tuning Guide:
Guide: What does it take to run a Llama model on a GPU machine?
Which model options should a customer be exploring?
Does the organization have the budget, time, skills, and resources to create a fine-tuning dataset, and the engineering to optimize "serving" of the model?
Serving is a discipline by itself: https://betterprogramming.pub/frameworks-for-serving-llms-60b7f7b23407
How to build a dataset for finetuning? https://platypus-llm.github.io/ & https://huggingface.co/datasets?sort=trending&search=llama2
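Before pointing a trainer at raw data, the pairs usually need to be reshaped into an instruction format. A minimal sketch, assuming Alpaca-style `instruction`/`input`/`output` records written as JSONL (the field names and sample pairs here are illustrative, not from the linked guides):

```python
import json

# Hypothetical raw Q/A pairs; field names are illustrative only.
raw_pairs = [
    {"question": "What is LoRA?", "answer": "A parameter-efficient fine-tuning method."},
    {"question": "What is quantization?", "answer": "Reducing numeric precision of weights."},
]

def to_instruction_record(pair):
    """Convert one Q/A pair into an Alpaca-style instruction record."""
    return {
        "instruction": pair["question"],
        "input": "",
        "output": pair["answer"],
    }

# One JSON object per line (JSONL) is the shape most finetuning
# toolchains accept for instruction datasets.
jsonl = "\n".join(json.dumps(to_instruction_record(p)) for p in raw_pairs)
print(jsonl)
```

The real work is in curation and deduplication, not the reshaping; this only shows the target shape.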
What datasets are out there? A 3-trillion-token open corpus: https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-corpus-9a0ff4b8da64
Cost of training data exceeds cost of compute: https://www.dhirubhai.net/feed/update/urn:li:activity:7087771674959310849/
LLM inference: why it matters - continuous vs. static batching
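The difference is easy to see in a toy simulation: with static batching the whole batch waits for its longest request, while continuous batching refills a freed slot from the queue immediately. The request lengths and batch size below are made-up numbers, and "steps" stand in for decode iterations:

```python
def static_batching_steps(lengths, batch_size):
    """Each batch runs until its LONGEST request finishes."""
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps

def continuous_batching_steps(lengths, batch_size):
    """A finished request's slot is immediately refilled from the queue."""
    queue = list(lengths)
    active = []
    steps = 0
    while queue or active:
        # Refill free slots before the next decode step.
        while queue and len(active) < batch_size:
            active.append(queue.pop(0))
        steps += 1  # one decode iteration: every active request emits a token
        active = [t - 1 for t in active if t > 1]
    return steps

requests = [3, 9, 4, 8]  # output lengths (tokens) of four requests
print(static_batching_steps(requests, 2), continuous_batching_steps(requests, 2))
```

The gap widens as output lengths get more skewed, which is why serving stacks built around continuous batching get better GPU utilization.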
How to apply LoRA, QLoRA, PEFT, and other refinements to an open-source LLM?
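The arithmetic behind LoRA is small enough to sketch without a framework: the frozen weight W is left untouched, and a low-rank update scaled by alpha/r is added on top. A minimal pure-Python sketch with made-up toy matrices:

```python
# LoRA math sketch: W_eff = W + (alpha / r) * B @ A
# Shapes: W is (d_out, d_in); A is (r, d_in); B is (d_out, r).
# Only A and B are trained; W stays frozen.

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    scale = alpha / r
    delta = matmul(B, A)  # low-rank update, rank r
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (2x2)
A = [[0.5, 0.5]]              # r=1 down-projection
B = [[2.0], [0.0]]            # up-projection
W_eff = lora_effective_weight(W, A, B, alpha=1.0, r=1)
```

QLoRA follows the same idea but keeps W in a quantized format; in practice one would reach for a library such as Hugging Face PEFT rather than hand-rolling this.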
Example: Platypus, No. 1 on the leaderboard
Llama inference primer, and the case for why GPT-3.5 is cheaper?
You probably don't need to fine-tune an LLM
Which cloud GPU to use for LLMs?
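A back-of-the-envelope VRAM estimate is often enough to shortlist GPUs: parameters times bytes per parameter, plus headroom for activations and KV cache. A rough sketch; the 1.2x overhead factor is a guess to tune per workload, not a measured constant:

```python
def vram_gb(n_params_billions, bits_per_param, overhead=1.2):
    """Rough VRAM (GB) to serve a model.

    n_params_billions: parameter count in billions (e.g. 7 for a 7B model)
    bits_per_param:    16 for fp16/bf16, 8 or 4 when quantized
    overhead:          assumed fudge factor for activations / KV cache
    """
    return n_params_billions * (bits_per_param / 8) * overhead

# e.g. a 7B model in fp16 vs 4-bit quantized
print(vram_gb(7, 16), vram_gb(7, 4))
```

By this estimate a 7B model in fp16 overflows a 16 GB card but fits comfortably at 4-bit, which is usually the first fork in the GPU-selection decision.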
Optimize for latency?
LLMOps - road ahead - data drift, model drift, everything drifts
Llama 2 with n-bit quantization on a CPU with huge RAM? For when quality and latency are not prime concerns.
A great explanation of how quantization is done, from deeplearning.ai:
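The core of the simplest scheme, symmetric "absmax" int8 quantization, fits in a few lines: scale values so the largest magnitude maps to 127, round to integers, and divide by the scale to recover approximate floats. A minimal sketch with made-up weights:

```python
def quantize_int8(xs):
    """Absmax (symmetric) int8 quantization: scale by 127 / max|x|, then round."""
    scale = 127.0 / max(abs(x) for x in xs)
    return [round(x * scale) for x in xs], scale

def dequantize(qs, scale):
    """Recover approximate floats; the error is the quantization noise."""
    return [q / scale for q in qs]

weights = [0.1, -0.5, 0.25, 1.0]   # toy weight values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Real schemes (e.g. GPTQ, 4-bit NF4) quantize per-block and handle outliers, but the scale-round-rescale idea is the same.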
Understanding the size tradeoffs of LLMs & a decision tree
Offline batch inference over 200 TB
How do I evaluate?
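Evaluation can start far simpler than a benchmark harness: for tasks with a single right answer, normalized exact match against references is a reasonable first metric. A minimal sketch (the normalization rule, lowercase plus whitespace collapsing, is an assumption to adjust per task):

```python
def exact_match(predictions, references):
    """Fraction of predictions equal to their reference after light normalization."""
    def norm(s):
        return " ".join(s.lower().split())  # lowercase, collapse whitespace
    return sum(norm(p) == norm(r) for p, r in zip(predictions, references)) / len(references)

print(exact_match(["Paris", "berlin "], ["paris", "Munich"]))
```

Open-ended generation needs different tooling (human review, model-graded rubrics), but a cheap deterministic metric like this is worth wiring up first as a regression check.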
Case Study: Tailoring models to unique applications?
The dilemma: Generalization, Evaluation and Deployment
The new business of AI: how is it different from traditional software? (2020)
Back to prompt tuning: what else is there?