Generative AI with LLM : application challenges on deployment

Deploying any generative AI application that uses a Large Language Model (LLM) needs some special consideration. In this article, I will introduce some issues that need to be considered during deployment of a generative AI application, and give some hints (with references) about frameworks and libraries that are candidates for the deployment architecture of such a solution.

Important abbreviations

LLM: Large Language Model

RAG: Retrieval-Augmented Generation

DL: Deep Learning

BNLP: Bangla Natural Language Processing

Basic deployment architecture of ML project

To serve some extraordinary generative AI features to your customers, you have done all the hard work of preparing the LLM. You did a lot of research to select one, prepared a lot of data, and did data analysis work to fine-tune the selected LLM. You saw exciting results that satisfied all of your investors. Now everybody is excited to see how it is going to affect the business. Perhaps you are already planning to deploy a beta version.

Therefore, you deployed it for alpha testing with the following architecture, which is a typical application architecture with an additional ML access layer in front of the LLM. You must plan to keep logs and follow the other best practices of solution architecture. You also need to collect user feedback and retrain the model; in this diagram, I ignored these parts. (In our DL program for BNLP, we followed this structure.)
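The ML access layer described above can be sketched in a few lines of Python. This is only an illustration, not a production design: `fake_llm` is a hypothetical stand-in for the real fine-tuned model call, and the logging shows where the "keep logs" best practice plugs in.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("ml-access-layer")

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the real fine-tuned LLM call."""
    return f"Completion for: {prompt}"

def handle_request(user_prompt: str) -> str:
    """ML access layer: log the request, call the model, log the response."""
    logger.info("prompt received: %s", user_prompt)
    completion = fake_llm(user_prompt)
    logger.info("completion sent: %s", completion)
    return completion

print(handle_request("Summarise today's sales report"))
```

In a real deployment this layer would also handle authentication, rate limiting, and queuing of requests for retraining data collection.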


Simple deployment diagram

Issues with this architecture

However, one of your team members found some serious issues.

The LLM is:

(a) Not giving correct results about the latest news (i.e., the LLM cannot answer the question, "Who won the football World Cup in 2022?" In fact, it is not giving any recent news at all. How can it get information from the latest news?)

(b) Not able to answer reasoning problems.

(c) Giving close but not exact results for simple math.

In a team discussion on these issues, one of your team members pointed out that the LLM alone will never answer them (at least as of the current understanding), because the main task of an LLM is to predict related words (the completion) for given words (the prompt). Therefore, if we need to solve the above problems, we need to do more work.

There is another issue you need to notice: your LLM will never stop giving answers. Because it always predicts the next token (another word), it will always produce some answer. In practice, that answer may be no real answer at all, or just some funny words. This is hallucination.
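The always-produces-something behaviour can be illustrated with a toy generation loop. The "model" below is deliberately fake (it just picks from a tiny vocabulary), but the loop structure is the real point: generation only stops on an end-of-sequence token or a length limit, never because the model "knows" it has nothing to say.

```python
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

def next_token(context):
    """Toy 'model': always returns *some* token, mimicking how an LLM
    always predicts a next token even when it has nothing useful to say."""
    random.seed(len(context))  # deterministic for illustration only
    return random.choice(VOCAB)

def generate(prompt, max_tokens=10):
    tokens = prompt.split()
    for _ in range(max_tokens):
        tok = next_token(tokens)
        if tok == "<eos>":     # end-of-sequence: the only principled stop signal
            break
        tokens.append(tok)
    return " ".join(tokens)

print(generate("the cat"))
```

Real decoders add sampling temperature, top-k/top-p filtering, and repetition penalties, but the stopping logic is essentially the same.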

Modified architecture

To fix this, you need to retrieve related data from external resources and augment the LLM's input with it to generate the desired output. This is the Retrieval-Augmented Generation (RAG) framework. RAG generates a query, retrieves related data from external resources (any document, API data, or internally stored data), and concatenates it with the prompt before sending it to the LLM. The LLM then generates the completion. There are different versions of RAG and different implementations; you need to find the specific one that fits your need (in this article, I am not focusing on RAG itself). Different libraries are available to support RAG, along with other helping tools and frameworks such as LangChain and PromptChainer. These libraries also address other problems that are not directly solvable by the LLM alone (like reasoning). You can find many more, both free and paid, and you will select the one that serves your purpose. The updated deployment can then look like the following diagram.

Modified architecture, Collecting data from external resource using RAG
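The retrieve-augment-generate flow can be sketched end to end. Everything here is a placeholder: the two documents, the naive word-overlap retriever, and `fake_llm` are all hypothetical stand-ins. A real system would use a vector store, embeddings, and an actual LLM call (for example via LangChain), but the shape of the pipeline is the same.

```python
# Hypothetical document store; a real system would query a vector store or an API.
DOCUMENTS = [
    "Argentina won the FIFA World Cup in 2022, beating France on penalties.",
    "The 2018 FIFA World Cup was won by France.",
]

def retrieve(query: str, docs=DOCUMENTS) -> str:
    """Naive retrieval: pick the document sharing the most words with the query."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the LLM call; echoes the augmented prompt."""
    return f"[LLM completion for]: {prompt}"

def rag_answer(question: str) -> str:
    # Retrieve external data, concatenate with the prompt, then call the LLM.
    context = retrieve(question)
    augmented = f"Context: {context}\nQuestion: {question}"
    return fake_llm(augmented)

print(rag_answer("Who won the football world cup in 2022?"))
```

Because the retrieved context is injected into the prompt, the LLM can now answer questions about events (like the 2022 World Cup) that were not in its training data.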

Conclusion:

Generative AI uses an LLM to generate completions, but to get applicable output it requires some extra effort depending on the requirements. It is doing many things that were almost impossible before. In addition, some vendors are offering to automate regular work (like coding applications), which is becoming a threat to some jobs, while also opening the door to new opportunities. You will see that the libraries may need wrappers to fit, along with work on collecting data for further training, troubleshooting issues, and so on. Malicious actors are not out of the race either: new cyber threats will appear, and hackers will find new ways of committing cyber-crime against generative AI applications, such as prompt injection. Therefore, new opportunities will also come for cyber security professionals. It is time for organizations to prepare for all of this before starting mass deployment of generative AI applications.

References

https://boost.ai/blog/llms-large-language-models/

https://www.promptingguide.ai/research/rag

https://www.analyticsvidhya.com/blog/2023/12/langchain-alternatives/#:~:text=industries%20and%20domains.-,PromptChainer,could%20achieve%20on%20its%20own.

https://python.langchain.com/docs/get_started/introduction

https://slashdot.org/software/p/LangChain/alternatives

