Would OpenAI’s GPT-3 ever work for direct enterprise use?

OpenAI’s GPT-3 offers different features to build custom applications. However, it remains to be seen whether GPT-3 can deliver enterprise-level solutions for direct consumption, or whether it only enables point solutions after heavy customization and training. Enterprises should be able to derive value directly, for internal consumption, from any technology they invest in. Contrary to the hype surrounding GPT-3 since its release in 2020, I believe it is not yet ready to deliver significant value. I tried out GPT-3 myself to check whether it is ready for direct enterprise consumption, and it fell short on a number of counts.

In this blog, I will cover my experience and explain with simple examples where deep learning language models like GPT-3 fall short for direct enterprise use cases.

What is GPT-3 and what does it promise?

GPT-3 is primarily a transformer model that generates human-like structured text based on a given text input sequence. It is a sequence-to-sequence deep learning model, comprising 175 billion parameters and trained on large text datasets. It is designed to find application in text summarization, language translation, question answering and other text-based processes.

With its promised ability to handle a wide range of natural language tasks, GPT-3 claims to be useful for the following:

  • Keyword extraction
  • Text generation
  • SQL translation
  • Grammar corrections
  • and many more...
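For context, all of these tasks are exercised the same way: you send a plain-text prompt to the model and read back its completion. Below is a minimal sketch of such a call using OpenAI's Python client as it existed at the time of writing; the engine name, prompt wording, and parameters are illustrative assumptions, not the exact inputs from my tests.

import os
import openai

# Assumes the pre-1.0 OpenAI Python client and an API key in the environment.
openai.api_key = os.environ["OPENAI_API_KEY"]

# Illustrative keyword-extraction style prompt (not the exact text I used).
prompt = (
    "Extract the keywords from the following text:\n\n"
    "Customer survey responses about checkout delays and coupon usage.\n\n"
    "Keywords:"
)

response = openai.Completion.create(
    engine="davinci",   # assumed engine name; substitute whichever engine you have access to
    prompt=prompt,
    max_tokens=60,
    temperature=0,
)
print(response["choices"][0]["text"].strip())

Every task discussed below is essentially a variation of this prompt-and-complete pattern, with a different prompt per use case.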

My experience with GPT-3

I picked some of the above tasks to check how GPT-3 delivers on its promise. The primary aim of my analysis was to check whether GPT-3 delivers any value directly to enterprises in non-technology industries such as healthcare, financial services, retail, or manufacturing, rather than through an integrator like Microsoft Office or Gmail.

I chose this industry user profile for a few important reasons:

  • access to training data
  • efforts needed for customization
  • user base ranging into millions for point solutions

Let me elaborate with some examples:

Keyword extraction

Keyword extraction is the automated identification of terms that best describe a specific piece of content. It is an important problem in text mining and information extraction. Think of use cases where an enterprise user is analyzing user surveys, product reviews, team correspondence, articles, product documentation, and so on.

Context is everything when it comes to extracting keywords.

To test its summarization capabilities, I provided the text of several survey questions whose responses an enterprise wanted to view as a pivot table. I picked this specific example because the data is open and commonly understood from public news coverage. The intention was to see how GPT-3 would surface the key summary of the content. Unfortunately, the “keywords” it extracted and displayed contained words that were not even remotely hinted at in the provided text. Look at keywords like erythrocyte sedimentation rate and synovitis and check for yourself whether they really summarize the provided text.

[Screenshot: GPT-3 keyword extraction output]


Text generation

This is supposed to be one of the most popular applications of GPT-3. Text generation can help enterprises narrate data observations, target customer segments, or explain usage patterns. A list of key-value pairs can be hard to read and understand because of the various sorting orders, but a text narration can be super easy. Think of the difference between a restaurant presenting its menu as a set of key-value pairs vs. as a paragraph describing each item.

Providing value-added content should be the key focus of text generation.

Text generation is expected to work like this: you provide some key ideas and context, and you get a ready-made paragraph of content. I got some questionable and irrelevant output while trying out this use case. The provided text was something we often extract from business reports or jot down during monthly review meetings. My expectation was to get a comprehensive sentence or two describing the sales figures for the product. We would expect generated text like: “The June 2020 sales for the Orange Coffee product were 2.9M in San Jose” or “In June 2020, Orange Coffee sales in San Jose were 2.9M”.

But look at how GPT-3 adds outright misleading and totally out-of-context descriptions in its generated text:

  • Sales in June were 2.9 million, a decrease of 8% from May: where has it picked up the reference to May sales?
  • The reason for the decrease was warmer weather: umm... really?
  • Should be on shelves by the end of the year: says who?
  • I am meeting with a representative from the company Orange Coffee: this is pure fiction!

[Screenshots: GPT-3 text generation input and output]
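For contrast, the narration we expected needs no language model at all. Here is a minimal sketch, assuming a hypothetical record structure for the same sales figures, that produces exactly the kind of sentence described above with nothing fabricated:

# Hypothetical record shape for the sales note; the field names are assumptions.
record = {
    "product": "Orange Coffee",
    "month": "June 2020",
    "city": "San Jose",
    "sales": "2.9M",
}

# A fixed template narrates the provided facts without inventing comparisons or causes.
narration = "In {month}, {product} sales in {city} were {sales}.".format(**record)
print(narration)  # In June 2020, Orange Coffee sales in San Jose were 2.9M.

The point is not that templates are the answer; it is that a generative model adds value only if it narrates the provided facts more fluently without inventing new ones.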

SQL translation

SQL translation can help automate tasks, but it needs to be solved as an end-to-end problem. If you think about it, expert users who have access to the data source don’t really need the SQL to be generated for them. SQL generation is intended for someone who doesn’t have access and who doesn’t know the schema of the data. Even when this whole thing works, SQL generation alone is not going to cut it, since the resulting data needs to be visually presented and analyzed as well. I am keeping the end-to-end story aside for this blog.

Let me explain with a couple of examples:

  • Input: how many users signed up in the past month?
  • GPT-3 Response: SELECT COUNT(*) FROM users WHERE signup_time > now() - interval '1 month'

There are a few serious issues with this SQL response from GPT-3:

The generated SQL would find “signup_time” dates between Nov 19th and Dec 18th, assuming it is executed on Dec 18th. This is a wrong interpretation of “past month”, though it could have been an answer for “past 30 days”. The correct answer for an enterprise would be to compute the time period based on a business calendar, such as Nov 15th to Dec 14th assuming the fiscal calendar starts on Oct 15th. If the business calendar is not defined or not available, at least choose calendar months. All I am saying is that “a month” in the business world is not the same as “any 30-day moving window”.
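As a rough sketch of the calendar-month fallback I am describing, the boundaries can be computed explicitly and passed as parameters instead of relying on a rolling interval. The execution date and column names below are taken from the example above; the parameter-binding style is an assumption.

from datetime import date, timedelta

# Assume the query runs on Dec 18th, as in the example above.
today = date(2020, 12, 18)

# "Past month" interpreted as the previous calendar month, not a rolling 30-day window.
first_of_this_month = today.replace(day=1)                                       # 2020-12-01
first_of_prev_month = (first_of_this_month - timedelta(days=1)).replace(day=1)   # 2020-11-01

# Half-open range [2020-11-01, 2020-12-01) covers exactly November.
sql = "SELECT COUNT(*) FROM users WHERE signup_time >= %s AND signup_time < %s"
params = (first_of_prev_month, first_of_this_month)

A business-calendar version would replace these two boundaries with a fiscal-period lookup (for example, Nov 15th to Dec 14th), but the principle is the same: the window is derived from a calendar, not from now() minus an interval.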

I changed the scenario by adding another column called “signup_exit_time” to the data source. When I tried the same query again, the “signup_exit_time” column got picked up instead of “signup_time”. SQL generation can’t be demonstrated with such simplistic scenarios, because there is no practical value in it. A data engineer would have written this code in just a few seconds; what is needed is the handling of complex scenarios.

SQL generation as a feature must be an actual enabler and ambiguity solver, not just a demo.

While I could list hundreds of issues with this SQL generation feature, my intention is not to point fingers, but simply to identify whether this technology is enterprise-ready or not. What enterprises need most is the ability to handle ambiguity and incompleteness in queries. Think of a typical business scenario where a sales manager wants to know “sales in Alabama”. In the data, there is the state of Alabama, an Alabama county in 10 states, and an Alabama city in 25 counties across America. The query first needs to be understood for its ambiguity, then the ambiguity needs to be conveyed back explicitly, and the system should still be able to proceed with the most appropriate choice.
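To make the idea concrete, here is a minimal sketch of the kind of ambiguity check I mean, assuming a hypothetical location dimension loaded into memory; none of this reflects how GPT-3 or any specific product works.

# Hypothetical location dimension: (name, level) pairs an enterprise data model might hold.
locations = [
    ("Alabama", "state"),
    ("Alabama", "county"),   # appears as a county in multiple states
    ("Alabama", "city"),     # appears as a city in multiple counties
]

def resolve_location(term: str):
    """Return all candidate interpretations of a location term."""
    return [(name, level) for name, level in locations if name.lower() == term.lower()]

candidates = resolve_location("Alabama")
if len(candidates) > 1:
    # The ambiguity must be surfaced explicitly, with a default choice proposed.
    levels = ", ".join(level for _, level in candidates)
    print(f"'Alabama' is ambiguous ({levels}); defaulting to the state unless told otherwise.")

Whether the default is the state, the county, or the city is a business decision; the point is that a production-grade system has to detect the collision and say so, rather than silently picking one.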

GPT-3 disappoints on this SQL query generation technique because it raises the marketing bar without solid underlying technology. Let alone an enterprise, I am not sure how even a third-party integrator can use it for any purpose. This particular feature will continue to remain OpenAI’s most underdeveloped capability.

Grammar corrections

Grammar correction in email content is a commonly found use case that is also relatively easy to solve, given the common patterns in business email communications (personal communications are mostly text messages or phone calls these days!). I tried GPT-3 on enterprise-level grammar correction in business scenarios, and again observed disappointing results.

Grammar corrections can be more complex than commonly understood.

Assume a scenario where an analyst is preparing a briefing note on a campaign that performed well. One would expect the corrections to provide details about the coupon that boosted the performance, the regions in which it was most successful, the sales figures, the growth percentage, and so on. However, the correction talks about the fourth quarter and how it fared negatively in terms of gross profit, which is totally out of context.

[Screenshot: GPT-3 grammar correction output]

I tried another use case, where I used the word “Call-in” expecting it to be corrected or elaborated as the call-in medium of receiving customer orders. However, the suggested correction was exactly the same as my entered input. The value-add of using GPT-3 here was clearly missing.

[Screenshot: GPT-3 correction identical to the input]

What Enterprises Really Need from Tech

The ability to solve specialized problems in a meaningful manner, such that it adds real value to their operations. While OpenAI’s GPT-3 shows a lot of promise, it will take a really long time for it to be of meaningful use to an enterprise. It is at a nascent stage where, even for integrators, it can only offer very pointed solutions in very specific workflows. Enormous pre- and post-processing work will be required to get these features into a useful state.

At MachEye, we have already solved these problems for our enterprise customers with our own advanced models, and have not used GPT-3. Going beyond fancy demos that have very little real-world application, we strive to build enterprise-level solutions. We dive deep to examine every granular aspect and build robust models that can withstand the test of expectations.
