Switching LLMs - The Invisibles

Opinions are my own and do not reflect those of my employer.

Switching from a proprietary to an open-source LLM is more than a drop-in replacement of models. Here are some factors to consider for making "informed choices" amid changing and growing complexity (not an exhaustive list):


Do we understand the common LLM patterns & problems?

https://eugeneyan.com/writing/llm-patterns/ & https://eugeneyan.com/writing/llm-problems/


What are some of the biggest open challenges in LLMs today?

https://arxiv.org/pdf/2307.10169.pdf &

https://huyenchip.com/2023/08/16/llm-research-open-challenges.html


A comparison of self-hosted LLMs vs. OpenAI:

https://betterprogramming.pub/you-dont-need-hosted-llms-do-you-1160b2520526


New Foundation Models: https://magazine.sebastianraschka.com/p/ahead-of-ai-11-new-foundation-models


In the LLM context: "fine-tuning" is for form, not facts?

https://www.anyscale.com/blog/fine-tuning-is-for-form-not-facts
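
A minimal sketch of the implied division of labor, assuming hypothetical `retrieve` and `llm_complete` functions: new facts reach the model through the prompt at request time, while fine-tuning mainly shapes tone and format:

```python
# Sketch: retrieval puts *facts* into the prompt at request time;
# fine-tuning only teaches the model *form* (tone, format, style).
# `retrieve` and `llm_complete` are hypothetical stand-ins.

def retrieve(query: str) -> list[str]:
    # Hypothetical: look up relevant passages in your document store.
    return ["Acme's Q2 revenue was $12M (internal report, 2023)."]

def llm_complete(prompt: str) -> str:
    # Hypothetical: call whichever base or fine-tuned model you serve.
    return "..."

def answer_with_facts(question: str) -> str:
    context = "\n".join(retrieve(question))
    prompt = (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm_complete(prompt)
```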


Fine Tuning Guide:

https://platform.openai.com/docs/guides/fine-tuning &

Training: https://learn.deeplearning.ai/finetuning-large-language-models
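
For the chat-model flow in the OpenAI guide, training data is a JSONL file of example conversations. A minimal sketch against the 2023-era `openai` Python SDK (the example content and model name are placeholders; check the guide for current endpoints):

```python
# Sketch of the 2023-era OpenAI fine-tuning flow (pre-1.0 SDK style).
# Each JSONL line is one training example in chat format.
import json
import openai

examples = [
    {"messages": [
        {"role": "system", "content": "You answer in pirate voice."},
        {"role": "user", "content": "Where is my order?"},
        {"role": "assistant", "content": "Arr, yer parcel be sailin' in!"},
    ]}
]
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

uploaded = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTuningJob.create(training_file=uploaded.id, model="gpt-3.5-turbo")
print(job.id)
```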


Guide: What does it take to run a LLaMA model on a GPU computer?

https://www.hardware-corner.net/guides/computer-to-run-llama-ai-model/
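
The guide's hardware sizing boils down to arithmetic: weight memory is roughly parameter count times bytes per weight, plus runtime overhead. A back-of-envelope sketch (the 1.2x overhead factor is an assumption, not a measurement):

```python
# Back-of-envelope VRAM estimate: weights + rough overhead for
# KV cache and activations. Overhead factor is an assumption.
def vram_gb(n_params_billion: float, bits_per_weight: int, overhead: float = 1.2) -> float:
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

for bits in (16, 8, 4):
    print(f"Llama-2-7B @ {bits}-bit: ~{vram_gb(7, bits):.1f} GB")
# ~16.8 GB at 16-bit, ~8.4 GB at 8-bit, ~4.2 GB at 4-bit (estimates only)
```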


Which model options should customers be exploring?

https://arxiv.org/abs/2303.18223


Does the organization have the budget, time, skills, and resources to create a fine-tuning dataset and the engineering to optimize "serving" of the model?

Serving is a discipline by itself: https://betterprogramming.pub/frameworks-for-serving-llms-60b7f7b23407

How to build a dataset for fine-tuning (a minimal record sketch follows below)? https://platypus-llm.github.io/ & https://huggingface.co/datasets?sort=trending&search=llama2

What datasets are out there? Dolma, a 3-trillion-token open corpus: https://blog.allenai.org/dolma-3-trillion-tokens-open-llm-corpus-9a0ff4b8da64
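
On the dataset question above, a minimal sketch of one instruction-style training record in JSONL; the field names follow the common Alpaca-style convention and are illustrative, not a fixed standard:

```python
# Sketch: one instruction-tuning record per JSONL line.
# Field names follow the common Alpaca-style convention; adjust to
# whatever schema your training script expects.
import json

record = {
    "instruction": "Summarize the ticket in one sentence.",
    "input": "Customer reports login fails after password reset...",
    "output": "User cannot log in following a password reset.",
}
with open("finetune_data.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")
```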


Cost of training data exceeds cost of compute: https://www.dhirubhai.net/feed/update/urn:li:activity:7087771674959310849/


LLM inference: why it matters - continuous vs. static batching

https://www.anyscale.com/blog/continuous-batching-llm-inference
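
vLLM is one open-source engine built around continuous batching; a minimal sketch of using it (the model name is a placeholder and the package needs a supported GPU):

```python
# Sketch: vLLM schedules these prompts with continuous batching
# rather than padding them into one static batch.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # placeholder model
params = SamplingParams(temperature=0.8, max_tokens=128)

prompts = ["Explain batching in one line.", "What is a KV cache?"]
for out in llm.generate(prompts, params):
    print(out.outputs[0].text)
```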


How to apply LoRA, QLoRA, PEFT, and other refinements to open-source LLMs?

Example: Platypus, No. 1 on the leaderboard at the time of writing.

https://arxiv.org/pdf/2308.07317.pdf
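
A condensed QLoRA-style sketch using Hugging Face `transformers`, `peft`, and `bitsandbytes`; the hyperparameters and target modules are illustrative defaults, not the Platypus recipe:

```python
# Sketch: 4-bit base model + LoRA adapters (QLoRA-style).
# Hyperparameters are illustrative, not a tuned recipe.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", quantization_config=bnb, device_map="auto"
)
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small adapter weights train
```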


A LLaMA inference primer, and the case for why GPT-3.5 is cheaper:

https://www.cursor.so/blog/llama-inference
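
The heart of such cost comparisons is simple arithmetic: GPU rental price divided by token throughput. In the sketch below every number is an assumption; substitute your own measurements:

```python
# Back-of-envelope $/1K tokens. All numbers are assumptions --
# substitute your own GPU price and measured throughput.
gpu_cost_per_hour = 2.0    # assumed A100 rental price, $/hr
tokens_per_second = 30.0   # assumed single-stream decode speed

cost_per_1k_tokens = gpu_cost_per_hour / 3600 / tokens_per_second * 1000
print(f"~${cost_per_1k_tokens:.4f} per 1K generated tokens")
# Batching many concurrent requests raises effective tokens/sec,
# which is how hosted APIs amortize the same hardware.
```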


You probably don't need to fine-tune an LLM:

https://www.tidepool.so/2023/08/17/why-you-probably-dont-need-to-fine-tune-an-llm/


Which cloud GPU to use for LLMs?

https://gpus.llm-utils.org/cloud-gpu-guide/


Optimize for latency?

https://hamel.dev/notes/llm/inference/03_inference.html


LLMOps - the road ahead: data drift, model drift, everything drifts

https://wandb.ai/iamleonie/Articles/reports/Understanding-LLMOps-Large-Language-Model-Operations--Vmlldzo0MDgyMDc2 &

https://github.com/tensorchord/Awesome-LLMOps


Llama 2 with n-bit quantization on a CPU with plenty of RAM? An option when quality and latency are not prime concerns.

GGML: https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML

GPTQ: https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ
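
For the GGML route, `llama-cpp-python` is a common wrapper around llama.cpp; a minimal sketch (the model path is a placeholder for a downloaded quantized file):

```python
# Sketch: CPU inference on a quantized GGML model via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(model_path="./llama-2-70b-chat.ggmlv3.q4_0.bin", n_ctx=2048)
out = llm("Q: Why quantize a model?\nA:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```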


A great explanation of how quantization is done, from DeepLearning.AI:

https://youtu.be/g68qlo9Izf0?t=783
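
The core idea in the video reduces to affine quantization: map a float range onto integers with a scale and zero point. A self-contained NumPy sketch of an 8-bit round trip:

```python
# Sketch: affine (asymmetric) 8-bit quantization round trip.
import numpy as np

x = np.random.randn(8).astype(np.float32)
scale = (x.max() - x.min()) / 255.0      # float range -> 256 int levels
zero_point = np.round(-x.min() / scale)  # integer that represents 0.0

q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
x_hat = (q.astype(np.float32) - zero_point) * scale  # dequantize

print("max abs error:", np.abs(x - x_hat).max())
```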


Understanding the size tradeoffs of LLMs & a decision tree

https://newsletter.victordibia.com/p/understanding-size-tradeoffs-with


Offline batch inference over 200 TB of data:

https://www.anyscale.com/blog/how-bytedance-scales-offline-inference-with-multi-modal-llms-to-200TB-data
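
The pattern described in the post is Ray Data batch inference over an actor pool; a condensed sketch where paths, pool size, and the model loader are placeholders, and exact `map_batches` arguments vary across Ray versions:

```python
# Sketch of Ray Data batch inference with a GPU actor pool.
# API details vary across Ray versions; treat this as the shape,
# not a drop-in script.
import ray

def load_model():
    # Hypothetical stand-in for loading your (multimodal) LLM.
    return lambda text: text.upper()

class Predictor:
    def __init__(self):
        self.model = load_model()  # loaded once per actor, reused per batch

    def __call__(self, batch):
        batch["output"] = [self.model(t) for t in batch["text"]]
        return batch

ds = ray.data.read_text("s3://bucket/prompts/")  # placeholder path
out = ds.map_batches(
    Predictor,
    batch_size=64,
    num_gpus=1,                                  # one GPU per actor
    compute=ray.data.ActorPoolStrategy(size=4),  # 4 model replicas
)
out.write_parquet("s3://bucket/outputs/")        # placeholder path
```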


How do I evaluate?

https://github.com/onejune2018/Awesome-LLM-Eval
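
Beyond the harnesses collected there, the simplest useful evaluation is exact match over a held-out set; a sketch with a hypothetical `generate` standing in for whichever model you are testing:

```python
# Sketch: tiny exact-match eval loop. `generate` is a hypothetical
# stand-in for whichever model/API you are comparing.
def generate(prompt: str) -> str:
    return "paris"  # placeholder model output

eval_set = [
    {"prompt": "Capital of France? One word.", "answer": "paris"},
    {"prompt": "2 + 2 = ? Digits only.", "answer": "4"},
]

hits = sum(
    generate(ex["prompt"]).strip().lower() == ex["answer"]
    for ex in eval_set
)
print(f"exact match: {hits}/{len(eval_set)}")
```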


Case study: tailoring models to unique applications

https://www.anyscale.com/blog/fine-tuning-llama-2-a-comprehensive-case-study-for-tailoring-models-to-unique-applications


The dilemma: Generalization, Evaluation and Deployment

https://arxiv.org/ftp/arxiv/papers/2308/2308.08061.pdf


The new business of AI, and how it's different from traditional software (2020):

https://a16z.com/2020/02/16/the-new-business-of-ai-and-how-its-different-from-traditional-software-2/


Back to prompt tuning. What else is there?

https://amatriain.net/blog/prompt201
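
Before reaching for weight updates at all, few-shot prompting remains the cheapest lever; a minimal sketch of assembling a few-shot prompt:

```python
# Sketch: assembling a few-shot prompt -- often worth trying
# before considering any fine-tuning.
shots = [
    ("The movie was dull.", "negative"),
    ("Loved every minute!", "positive"),
]
query = "Not bad at all."

prompt = "Classify the sentiment.\n\n"
prompt += "".join(f"Review: {t}\nSentiment: {s}\n\n" for t, s in shots)
prompt += f"Review: {query}\nSentiment:"
print(prompt)
```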
