Shorticle 988 – Reinforcement Learning from Human Feedback and the autoregressive model behind ChatGPT's design strategy

In the last two decades, I have not seen a solution or technology trend attract as much hype and attention as ChatGPT, which I first noticed at its official launch as a free preview release in November/December 2022. In fact, some friends asked me when I would publish a shorticle on ChatGPT. Their curiosity surprised me, both about Shorticle (and their expectations from it) and about ChatGPT (and its reach). I wanted to take some time to understand ChatGPT technically before writing anything about it.


Artificial Intelligence (AI) has been growing by leaps and bounds for more than two decades, and chat services have long been a strong use case for AI solutions. The large language model (LLM) is an important breakthrough in Natural Language Processing (NLP): a model trained on large volumes of text to learn the structure of a language and the relationships between its letters and words.


This kind of LLM design leads to advanced AI solutions, including web-crawler-based search engines (e.g. Google and Bing) that provide suggestive text and order search results based on the user's locality, interests and frequent searches. For a chat-based querying service, we need both supervised learning and reinforcement learning to prepare large domain models.

Autoregressive models are built on a time-varying process and forecast future values from past values of the series; in a language model, this means predicting the next word from the words that came before it (in this case, the user's question and the answer generated so far). The supervised learning model is very popular in AI: the user trains the system on labelled examples to build the knowledge base, and the trained model is later used to develop a focussed AI service such as signature verification on cheques deposited in banks or handwritten character recognition, to name a few. The reinforcement learning model, in contrast, improves a system through feedback on its own outputs. For example, a web-crawling solution that collects results from one or more sources (web pages) can use feedback on previous results to improve how it handles future queries from users.
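To make the autoregressive idea concrete, here is a minimal sketch in Python. It is a toy word-pair (bigram) model, not how ChatGPT is actually built: the corpus, the function names `train_bigram` and `generate`, and the greedy choice of the most frequent next word are all assumptions made for illustration. What it shows is the feedback loop itself: each predicted word becomes the input for the next prediction.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_words=5):
    """Greedy autoregressive generation: predict the next word from the
    previous output, then feed that prediction back in as the next input."""
    out = [start]
    for _ in range(max_words - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # no known continuation for this word
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram(corpus)
print(generate(model, "cat", max_words=4))
```

A real LLM replaces the word-pair counts with a neural network that looks at a long window of previous tokens, but the generation loop follows the same pattern.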

ChatGPT builds on the earlier InstructGPT (OpenAI describes it as a sibling model), which is an autoregressive model for responding to user instructions. New-age digital assistant devices like Alexa and Google Home work on a comparable idea, in which human voice intents are interpreted by AI engines to prepare answers. OpenAI was started as a non-profit research organization for AI research and development, and one of its projects is the Generative Pre-trained Transformer (GPT) series, which can be applied to, among other things, intelligent search engine optimization (SEO) tools.

ChatGPT is definitely a breakthrough solution using Reinforcement Learning from Human Feedback (RLHF), and in the near future it could be used as an intelligent technology advisor to suggest design, business and sales approaches for many industrial applications.
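The two core steps of RLHF can be sketched in a few lines of Python. This is a heavily simplified illustration under stated assumptions, not OpenAI's implementation: the three candidate responses, the preference pairs, and the function names `train_reward_model` and `improve_policy` are hypothetical, and a plain policy-gradient update stands in for the more elaborate optimization used in practice. Step one fits a reward score per response from pairwise human preferences (a Bradley-Terry style model); step two shifts the policy towards responses that score above its current average reward.

```python
import math

def train_reward_model(preferences, items, epochs=200, lr=0.1):
    """Fit one scalar reward per response from pairwise human preferences.
    Model: P(a preferred over b) = sigmoid(reward[a] - reward[b])."""
    reward = {item: 0.0 for item in items}
    for _ in range(epochs):
        for preferred, rejected in preferences:
            diff = reward[preferred] - reward[rejected]
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient ascent on the log-likelihood of the human choice.
            reward[preferred] += lr * (1.0 - p)
            reward[rejected] -= lr * (1.0 - p)
    return reward

def improve_policy(logits, reward, steps=100, lr=0.5):
    """Gradient ascent on expected reward for a softmax policy over responses."""
    logits = dict(logits)
    for _ in range(steps):
        z = sum(math.exp(v) for v in logits.values())
        probs = {k: math.exp(v) / z for k, v in logits.items()}
        baseline = sum(probs[k] * reward[k] for k in logits)
        for k in logits:
            # d(expected reward)/d(logit_k) = p_k * (r_k - expected reward)
            logits[k] += lr * probs[k] * (reward[k] - baseline)
    return logits

# Hypothetical candidate responses and human rankings (preferred, rejected).
responses = ["helpful", "vague", "rude"]
preferences = [("helpful", "vague"), ("helpful", "rude"), ("vague", "rude")]

reward = train_reward_model(preferences, responses)
new_logits = improve_policy({r: 0.0 for r in responses}, reward)
print(max(new_logits, key=new_logits.get))
```

In a real system the reward model and the policy are both large neural networks and the policy produces free-form text rather than choosing from a fixed list, but the division of labour is the same: humans rank outputs, a reward model learns those rankings, and the policy is tuned against that reward.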

#magtechbytes #wipro #shorticle #shorticleaiml #shorticlegeneral
