What every CIO needs to know about OpenAI ChatGPT
While many AI enthusiasts have been experimenting with GPT-3 since 2020, the public release of OpenAI ChatGPT on 30 November 2022 created waves of excitement, garnering over 1 million sign-ups in the first week. OpenAI ChatGPT is a natural language processing platform that lets users communicate with artificial intelligence in a conversational manner. The platform uses advanced machine learning algorithms to understand user input and respond in a way that resembles how a human would.
I will not get into the rave examples of ChatGPT responses, as the Internet is already bubbling with them. However, using an AI system that is fit for its intended purpose requires an understanding of how the technology works, its capabilities and limitations, and how to achieve good results in a cost-effective manner. I will therefore focus on these aspects.
What is GPT and how does OpenAI ChatGPT use it?
GPT is a Large Language Model (LLM). The name is an acronym for Generative Pre-trained Transformer, where:
Generative means the model is capable of generating new content. Training converts a large data set into mathematical structures (numerical weights) that capture its patterns, and the model then uses those patterns to iteratively predict one word at a time, building up the best response for the given prompt. It is important to understand that the Machine Learning model is simply generating new content based on patterns it has learnt from the training data, without any meaningful understanding of the content.
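To make the "predict one word at a time" idea concrete, here is a minimal sketch in Python. It is a toy bigram model, not how GPT works internally: it only counts which word follows which in a tiny made-up corpus and then extends a prompt greedily. The corpus and the function names (train_bigram_counts, generate) are illustrative assumptions.

```python
from collections import Counter, defaultdict

# Tiny stand-in for the "large training data set"; real models use billions of words.
TRAINING_TEXT = (
    "the model predicts the next word . "
    "the model learns patterns from text . "
    "the next word depends on the previous word ."
)

def train_bigram_counts(text):
    """Count how often each word follows each other word (the learned 'pattern')."""
    counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

def generate(counts, prompt_word, max_words=8):
    """Extend the prompt one word at a time, always picking the most frequent follower."""
    output = [prompt_word]
    for _ in range(max_words):
        followers = counts.get(output[-1])
        if not followers:
            break  # the last word never appeared in training, so nothing to predict
        # Greedy choice; ties are broken alphabetically so the result is deterministic.
        best = min(followers, key=lambda w: (-followers[w], w))
        output.append(best)
        if best == ".":
            break  # stop at the end of a sentence
    return " ".join(output)

counts = train_bigram_counts(TRAINING_TEXT)
print(generate(counts, "the"))
```

The sketch produces fluent-looking text purely from word statistics, which illustrates the point above: the output follows past patterns without any understanding of what the words mean. A real LLM replaces the count table with a neural network and samples from a probability distribution rather than always taking the top word.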
Pre-trained refers to the fact that the model has already been trained using a large dataset of text before it is used for a specific task. Pre-trained models are useful as they are trained on broader concepts and then they can be further fine-tuned for a specific task (such as translation, text summarization, question answering) using a smaller dataset, which can be much faster and more cost-effective than training a model from scratch.
Transformer refers to the self-attention Neural Network architecture. Here the model assigns a weight to each input element, indicating the importance of that element in relation to the other elements in the input. This allows the model to selectively attend to certain parts of the input, rather than considering all elements equally. Hence it is able to process input sequences of variable length and is able to capture long-term dependencies in the data. This makes the transformer very effective at understanding the context and meaning of words in a sentence or paragraph.
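The weighting step described above can be sketched in a few lines of Python. This is a simplified illustration, assuming the input vectors themselves serve as queries, keys and values; a real transformer first applies learned projection matrices and uses many attention heads. The embeddings and function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs.

    Simplification (assumption): no learned projection matrices are applied.
    """
    scale = math.sqrt(len(vectors[0]))
    all_weights, outputs = [], []
    for query in vectors:
        # One weight per input element: how important it is for this query.
        weights = softmax([dot(query, key) / scale for key in vectors])
        # The output is the weighted mix of all input vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(len(query))]
        all_weights.append(weights)
        outputs.append(mixed)
    return all_weights, outputs

# Hypothetical 2-dimensional embeddings for a three-word input.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
weights, outputs = self_attention(embeddings)
for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row holds the weights one word assigns to every word in the input. The first word gives more weight to the similar second word than to the dissimilar third one, and the same function works unchanged for an input of any length, which is the property the paragraph above refers to.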
GPT-style models are now being used in a variety of applications, including natural language processing, image generation, video generation and even music composition.
OpenAI ChatGPT is a chatbot built on top of a large language model (GPT version 3.5) that OpenAI has trained on a vast amount of data, including text from books, articles, and websites, to generate human-like responses to user input. ChatGPT was optimized using a method called Reinforcement Learning from Human Feedback (RLHF), which combines supervised learning on human demonstrations with reinforcement learning on human preference rankings to guide the model towards desired behavior. ChatGPT has been trained on data up to Q3 2021. It cannot connect to the internet and browse for the latest information. While a large amount of data was used in training to build generic language communication capabilities, it is difficult to predict the model's output for specific questions, because how well a given topic or task is covered in the training data varies.
ChatGPT has been made available to the public as a research preview, allowing researchers to study its strengths and weaknesses. The underlying GPT-3 models are offered through the OpenAI API in four versions: Ada, Babbage, Curie, and Davinci. Of these, Davinci is the most advanced and the most expensive to use, but it is known for producing impressive results.
Who else is there in the Large Language Model market?
Microsoft has invested more than $1 billion in OpenAI, and in return OpenAI runs all its Machine Learning models on Microsoft Azure cloud servers. The partnership also gives Microsoft access to ChatGPT for its own search engine, Bing, and to OpenAI Codex (a descendant of GPT-3) for code generation in GitHub Copilot.
While many people call OpenAI ChatGPT a Google Search killer, in my opinion Google's LaMDA (Language Model for Dialogue Applications) is way ahead. ChatGPT is trained largely on websites, so its responses read almost like a Quora Q&A answer, whereas LaMDA has been trained on human dialogues, so its responses are friendlier and, in real terms, “conversational”, as if talking to another person. Also, ChatGPT produces shallow, verbose content (replicated from other websites) and can produce incorrect information, fake quotes, and non-existent references. LaMDA, on the other hand, is fine-tuned on three metrics: Quality, Safety, and Groundedness. More on LaMDA in another blog.
BlenderBot by Meta is similar to ChatGPT and also has 175 billion parameters, and it can connect to the internet for the latest information. However, both Google and Meta are keeping their chatbots under wraps for now, with very limited availability in the USA, hence public awareness is very low.
ChatSonic is also built on the GPT-3.5 architecture, but it is further enhanced with additional proprietary preprocessing and postprocessing algorithms that feed in real-time information from Google and produce contextual responses. ChatSonic also generates images using the open source Stable Diffusion model. However, compared to ChatGPT, the responses of ChatSonic are less impressive (for content up to Q3 2021).
And then there is Bloom, ready to fight the heavyweights above, banking on being open source, transparent and accessible to everyone as its strengths. Bloom has 176 billion parameters and was trained on 500 billion curated words; however, it also pales in comparison to ChatGPT, primarily because of the smaller training data set.
Competition is always good, and it is good to see many LLMs competing to take the crown. While almost all of these models (except LaMDA) have 175+ billion parameters, the quality of the extensively curated training data behind OpenAI ChatGPT sets it apart from the rest of the competition.
Capabilities of OpenAI ChatGPT
With OpenAI's ChatGPT, CIOs can leverage the power of artificial intelligence to improve various areas within their organization, such as:
· Software development assistance – generating code from prompts, debugging code, code explanation, generating code documentation
· Assistance in content writing – documentation, summarization, translation, parsing unstructured data and keyword extraction, content classification, tagging
· Knowledge management – information retrieval, document summarization to understand key points and main ideas, creating content to populate a knowledge base
· Customer interactions – chatbot services such as answering FAQs about products, product support, resolving issues, providing personalized recommendations
· Employee experience – employee communication within the organization, answering FAQs (especially for new employees), guidance on company policies and procedures
· Marketing and lead generation – promoting products, services and special offers, interacting with customers to identify needs and generate leads
Limitations & Challenges for Organizations
While OpenAI ChatGPT may seem very appealing, before diving in CIOs also need to keep in mind a few limitations that exist as of today:
1. Cost – Large Language Models (LLMs) like OpenAI ChatGPT need very powerful infrastructure, which can very quickly become expensive to run. Further, while these models are trained on general conversations, fine-tuning, pre-training and hosting them for your own organization is an expensive and highly skilled activity in itself. There are also limitations in the training methods: the training data is typically provided as curated prompt-completion pairs (a show-and-tell approach), which is not as simple as providing dumps of documents. To fine-tune OpenAI's GPT models, a minimum of 200 training examples is recommended per intent, after which each doubling of the training examples leads to a linear increase in model quality.
2. Accuracy – OpenAI ChatGPT is a conversation model. It can generate impressive human-like responses, but its responses are not fact-checked, its content is not verified, and it often produces stereotyped results. These models are infamous for hallucinating knowledge and can very confidently present incorrect information. Also, simple changes in a prompt can lead to very different responses, so prompt engineering becomes a specialized task in itself for even slightly sophisticated scenarios (such as coding). Recently Stack Overflow, the popular programming forum, temporarily banned all answers created by ChatGPT, citing a high degree of inaccuracy in the bot's responses.
3. Static knowledge – Training a huge LLM is a very expensive and time-consuming exercise (it takes a couple of months). OpenAI ChatGPT models are hence batch trained once in a while and are not up to date with the latest information; any long-term memory they have is static and limited to what they have previously been trained on. ChatGPT is trained only up to Q3 2021 and has no knowledge of the world after that. Its memory of anything new learnt in a conversation is very short term, like a goldfish's memory, and does not pass into its long-term memory (i.e. the model doesn't learn).
4. Responsible AI – There is no transparency about the (large amount of) public data that has been used to train these AI models, and they often tend to reflect general societal biases relating to race, gender, religion, age, and other groups of people, as well as other undesirable content. Such models can potentially behave in ways that are unpredictable, unfair and unreliable. Hence they are not suitable for scenarios where the stakes are high or where use or misuse of the system could result in significant loss.
5. Legal implications – This is a new, disruptive form of AI, and governments and law enforcement authorities are either still unaware of it or still trying to comprehend how to deal with it. There are many IP and licensing related issues that are yet to be resolved. A class action lawsuit has already been filed against Microsoft, GitHub and OpenAI over GitHub Copilot and OpenAI Codex for open source license violations. Similarly, end users of OpenAI ChatGPT can be held accountable for IP infringement if they unknowingly end up using copyrighted content provided by OpenAI ChatGPT. Finally, organizations must also be careful about using their sensitive and proprietary data to train these LLMs. Conversations with OpenAI ChatGPT are not private; they may be read by the OpenAI team and used for training the model.
6. Performance and Robustness – The ChatGPT model may take many seconds to complete a response and often fails to respond.
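The prompt-completion pairs mentioned under Cost can be pictured with a short Python sketch. It assumes a JSON Lines layout with "prompt" and "completion" fields, as used by OpenAI's GPT-3 fine-tuning data format at the time of writing; the example questions, answers and helper names are hypothetical, and the current OpenAI documentation should be checked before preparing real data.

```python
import json

# Minimum number of examples per intent, as quoted in the Cost item above.
MIN_EXAMPLES_PER_INTENT = 200

def to_jsonl(pairs):
    """Serialize (prompt, completion) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps({"prompt": prompt, "completion": completion})
        for prompt, completion in pairs
    )

def has_enough_examples(pairs, minimum=MIN_EXAMPLES_PER_INTENT):
    """Check the curated set against the recommended minimum size."""
    return len(pairs) >= minimum

# Hypothetical curated examples for a single "password reset" intent.
pairs = [
    ("How do I reset my password?",
     "Open Settings, choose Security, then select Reset password."),
    ("I forgot my password.",
     "Use the Forgot password link on the sign-in page."),
]

print(to_jsonl(pairs))
print("Enough examples to start fine-tuning:", has_enough_examples(pairs))
```

The point of the sketch is the curation effort: every line is a hand-checked question with the exact answer the organization wants, which is very different from uploading a folder of documents, and a couple of hundred such lines are needed for each intent before fine-tuning is worthwhile.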
Future ahead
These generative AI models are very disruptive in nature. The public introduction of ChatGPT and DALL·E 2 by OpenAI has generated significant interest in society at large. AI enthusiasts all over the world are playing around with the models and giving real-time feedback, which is helping the researchers evolve them very fast. As a consequence, what is barely acceptable today will evolve into the state of the art within a few years. This will open up opportunities worth billions of dollars for new startups to capitalize on. Soon we will move from SaaS (Software as a Service) to MaaS (Model as a Service). These models are likely to be available in 3 broad categories:
· Base general intelligence models – Large companies with large resources, like OpenAI, Google and Meta, will create Large Language Models like these, which are trained on vast resources from the internet and understand the overall language and general world context. It will be easier to take this model
· Domain specific models – A few large companies (like Microsoft) and a new set of startups will take the above generic LLMs and train them for specific domains, like Law, Medicine, Sports, Finance, etc.
· Super specialized models – A few more startups will in turn take the above domain specific models and train them to super specialize in certain niche areas, like Criminal law, Corporate law, etc.
Each category above will need to curate large amounts of training data and use it to fine-tune the model (on high-end machines with powerful GPUs and large RAM) for its target segment, adding its own value-added premium to the cost of model subscriptions. As enterprise consumers, CIOs will have the option to evaluate and pick models from any of the 3 categories, either to use as-is or to fine-tune specifically for their organization and purpose.
Conclusion
In this blog we covered what ChatGPT is, its capabilities, its limitations and its future in the context of an enterprise. While we are still at an early stage of Generative AI, we are today standing at the cusp of a breakout which, over the next few years, will fundamentally change how complex white-collar jobs are done. Currently cost per query is a major deterrent to mass public adoption of this technology, and it will become a focus of research in the coming years. Meanwhile, in the enterprise context, a tool like OpenAI ChatGPT can significantly reduce the effort required and provide a very good starting point for humans in areas like software development, content writing, knowledge management, customer interactions, employee experience, etc. With the right selection of use cases and correct implementation, it has the potential to amplify human productivity manyfold. Many of its current limitations relating to accuracy and responsible use can be addressed by putting a human in the loop to provide the necessary guard rails. While OpenAI APIs are available for use in restricted regions (certain parts of North America) through Azure OpenAI API services, CIOs should start thinking about how they would ride this disruptive wave of Generative AI to bring efficiencies and competitive differentiation to their organizations.