Generative AI - Learnings 2023
@Just my VSCode view

Generative AI - Learnings 2023

This year 2023 has been the year of Generative AI using Large Language Models both closed source and open source.

Like many of us I have been learning via blogs, courses and using prompt engineering for building Generative AI apps and key takeaways are here:


1. Use LLM as a thought partner

  • It’s a new way to find creative information but be careful to check on incorrect information (hallucinations)


2. Examples of tasks LLMs can carry out

  • Writing - Brainstorming, Press Release, Translation
  • Reading - Proofreading, Summarizing, Reputation Monitoring (Sentiment Analysis), Topic Modelling, Extract entities, Moderation for harmful content
  • Chatting - Specialized customer service chatbot with internal company data

3. What LLMs can and cannot do?

Ask this question, and if the fresh grad can do the following task, then an LLM can also do it.

4. Prompting best practices

  • Be detailed and specific
  • Give sufficient context for LLM to complete the task
  • Guide the model to think through the answer
  • Chain of thought reasoning”, is the step-by-step reasoning to give the model time to think to get to a final conclusive answer 1. Fast thinking vs slow thinking for complex thinking 2. Give Steps 1, 2, 3, etc and how you want the answer and in which format
  • Experiment and iterate - No perfect prompt for every person or situation Instead develop a process for improving prompts through iteration
  • Be careful with confidential information which you post as part of your prompt - Check how the prompt provider deals with the privacy of the information you post
  • Double check if you can trust the output of the LLM

5. Iterative Prompt Engineering

  • Idea
  • Iteration 1
  • Implementation
  • Experimental result, Error Analysis
  • Iteration 2 - Repeat

6. Design considerations

Lifecycle of a Generative AI project

  • Scope project
  • Estimate volume of tokens for Cost estimation
  • Choose a model (based on size, closed or open source, and cost) - Refer list of LLMs in the Leaderboards section below
  • Build/improve system
  • Internal Evaluation
  • Deploy and monitor
  • Repeat Internal evaluation
  • Repeat Build/improve system


7. Advanced Technologies – Beyond Prompting

- Retrieval Augmented Generation (RAG) - Ground model on additional internal proprietary data

- Part 1

Take Knowledge base, break to chunks

Create embeddings for each chunk

For each chunk -> store the embedding with the corresponding chunk in a Vector Database

- Part 2

Take a prompt and create its embedding

Compare the embedding with the embeddings stored in Vector Database

Get the chunks for the matched embedding and send the chunks as part of the prompt to the LLM


- 3 stages of LLM training

  • Pretraining - via Supervised Learning on trillions of tokens on all public data on internet
  • Fine-tuning - via Supervised Learning Adapt LLM to your task by fine tuning on high quality data
  • Reinforcement Learning from Human Feedback (RLHF) - via Classification, Reward Model and Reinforcement Learning


- Cost Considerations - Think through the Cost considerations of using a cloud based LLM to power software applications.

8. Process of building an application

  • Tune prompts on some examples
  • Add additional "tricky" examples
  • Develop metrics to measure performance on examples
  • Collect randomly selected set of examples to tune to (development set, hold out cross validation set)
  • Collect and use a hold out test set


9. Challenges with LLM

  • Ambiguous inputs and outputs
  • Hallucination vs facts
  • Compatibility amongst models
  • Maintaining Data Privacy
  • Safeguarding against Prompt Injection


Code

Refer my Github repo for few GenAI projects on

  1. https://github.com/mahtabsyed/ChatGPT-API
  2. https://github.com/mahtabsyed/Building-Systems-ChatGPT-API
  3. https://github.com/mahtabsyed/LangChain-for-LLM-App


Leaderboards

Check these Leaderboards to compare against various LLMs like OpenAI GPT, Meta LLaMa, Google PaLM, Microsoft Phi-2, Mistral SC56

Acknowledgement:

  1. Andrew Ng and DeepLearning.ai for so many wonderful courses
  2. Chip Huyen for her blogs in https://huyenchip.com/blog/

?

Mahtab Syed - Melbourne - 21 Dec 2023

Absolutely fascinating insights on Generative AI! ?? As Albert Einstein once said, "The true sign of intelligence is not knowledge but imagination." Your exploration into #genai and #llms really showcases the power of imagination in driving innovation. Keep pushing the boundaries! ?

回复
Manish Choudhary

#Digital Transformation - #Machine Learning, #AI, #Big Data, #Mobility #Deeplearning | #Investment Banking| #Insurance #P&C #F&A#ScaledAgilePractioner

1 年

Well laid down Mahtab!

要查看或添加评论,请登录

Mahtab Syed的更多文章

  • AI Agents or Agentic Systems

    AI Agents or Agentic Systems

    In the new year 2025 we see everyone talking about “Agents” or Agent like systems called “Agentic Systems”. I recently…

    1 条评论
  • Develop your career in AI in 2025

    Develop your career in AI in 2025

    The hype of AI, especially in 2023 and continuing in 2024 and now in 2025, has created a supply of various courses. And…

    1 条评论
  • On Emotional Intelligence

    On Emotional Intelligence

    From my old archives - published on Tue 02 Nov 2010 in https://mahtabsyed.blogspot.

    1 条评论
  • What is Data Governance? And why is it necessary especially now?

    What is Data Governance? And why is it necessary especially now?

    With the advent of Machine Learning and Artificial Intelligence for Predictions (Business metrics like Inventory…

  • Its end of year again… And I have no new year resolutions…

    Its end of year again… And I have no new year resolutions…

    Its 31 Dec 2022, an end of a year again… And I am quite happy and contented. ?? I have a clear vision of what I will do…

    3 条评论
  • Machine Learning Blog – 9

    Machine Learning Blog – 9

    Machine Learning using 3 ways - Full code vs. No Code vs.

    3 条评论
  • Winning with life which keeps throwing new challenges every day...

    Winning with life which keeps throwing new challenges every day...

    I had written this self care tip few months back which I thought its better to be published as an article..

    2 条评论
  • The Silence within

    The Silence within

    Its peak winter in Melbourne and early morning of Wed 29 May 2019, and so far it’s the coldest day this year. I am at…

  • This year 2021… was in the trenches of worries

    This year 2021… was in the trenches of worries

    This year 2021… was in the trenches of worries due to Covid lockdowns, number of daily cases, economic slowdown…

    1 条评论
  • Machine Learning Blog – 8

    Machine Learning Blog – 8

    Multi-Layer Stacking Ensemble and Optuna Hyperparameter Tuning In this blog I will illustrate and link to the code of a…

    1 条评论

社区洞察

其他会员也浏览了