Unibet racing.Enjoy Free 888+200 Daily Legal Bonus

My name is Aamir Mirza? ( ????? ????? ????? ?????? ?????) an AI researcher based in Melbourne Australia. If you are not related by profession to the field of computer programming or ML (Machine learning) you might be one among millions who would have missed an important milestone achieved by NLU ( Natural Language understanding ) AI. I am talking about the ability of large ML-based NLU modes to write fully functional computer code based on the declaration of tasks devised by the developer.

To get the ball rolling please have a look at the example.

In the quotes are the directions written by the developer rest of the function code is generated by NLU-based AI. The most obvious question here is how does this even possible, to make the code work the underlying AI has to understand the directions provided by the developer and based on the understanding it generates functional computer code. This is very much similar to how a human would write the code based on some instructions provided.

Since this article is directed toward a general audience, I shall desist from making it too technical. Instead, I shall focus on the overview and possibilities it opens up.

More of the same.

For those not familiar with the intricacies of programming, a typical developer's professional life can be described in just one line “ More of the same ”. For example, they spend most of their precious time writing the same repetitive code over and over again.

Connecting to a Database to retrieve or store information.
Connecting to external service.
Interacting with various hardware devices, printers, scanners etc.
Process control and error handling.
Quality checks on the information they process.
For loops, while loops for iterating over the vast set of data.?
Finally, drawing pretty graphics on the screen for the end-user to interact with.

Of course, this is an oversimplification of what they do. However, in the ballpark, this is pretty much it.

A typical developer can take anywhere from 1 Year to a couple of years to get on top of their game; it is a sizeable investment of time and money. If they ever leave the organisation that invested time and effort training them, this is definitely a huge loss. The other aspect of this is that when talented developers leave the organisation, they take all the knowledge with them.

Let me take care of the mundane.

What if we trained an AI to do this, or better, what if an AI just learned to do this by itself. Let’s dive into the age of Declarative Programming.

As shown from the example above, all a developer has to do is declare what they want to be done, and the AI will generate the code for them in their favourite programming language; it does so without any human intervention and using an ML ( machine learning ) based technology call self-supervised learning.

What is Self-supervised learning?

According to Wikipedia, self-supervised learning is “a method of machine learning. It learns from unlabeled sample data”. Immensely larger language models ( Transformers ) are fed text scraped from the internet and other sources, books, etc., and the most basic task they are trained on is to predict the next word, given some previous word tokens. For example, “ I am going to the —-- “ the possible next word could be “ Shopping-centre” or “park” etc.? In doing so, they understand how to write clear text. The first generation of such models did precisely this.

Some of the applications of first-generation language models were.

Summarisation of customer feedback.
Semantic search.
Classification of text, sentiment analysis.
News feed and blog writing. Etc.

Bigger is better.

Companies and researchers involved in developing such models quickly found out that the bigger these models got in terms of numbers of parameters, the better they were in understanding the subtle nuances of written language. For example, earlier models would have a few billion parameters; the successive few iterations would have a few hundred billion parameters.

The contrast between them could not be more transparent in terms of language generation and the ability to be customised for downstream tasks such as domain-specific language generation, summarisation, search etc. The current stage of NLU model development is as follows.

Another quality of these extreme large language models is that they can be fined tunes for a specific task with very few examples; for example, if a company wishes to summerise legal or medical text, which is very domain-specific then all they have to do is come up with few hundred examples of the same and train their modes. This is a great leap forward because their predecessors would have required tens of thousands of examples. All of this comes down to their usability in the domain where labelled data can only be generated by experts and is hard to come by. In a previous engagement, I trained a language classifier with very few examples using these same techniques.

The subtle realisation.

As I mentioned earlier, the training data for these language models mainly came from the internet, including coding examples, from sites like Github.

In the testing phase of these models, QA and researcher realised that these language models were, at least at a rudimentary level, good at generating computer code when given a suitable prompt. Here is a code snippet generated for Python ( Programming Language ).

We are asking ( OpenAI Co-pilot ) to develop a tic-tac-toe game.?

The resultant code is a fully functional text-based game. Applications are not limited to developing the game; it is entirely domain neutral.

Form research to Application.

Companies at the forefront of NLU research realised the importance of this accidental discovery and started developing language models specifically targeting this application. For example, the OpenAI co-pilot is exclusively trained on tons of Github code repositories.? Some of the products operating in this space are.

Copilot
Tabnine
Airtable.
SurveySparrow.
Appy Pie.
Nintex Process Platform.
Kissflow.
Salesforce Platform.
AppSheet.
Ninox.

Not all offer code generation or completion options; I mention them here because of their ability to provide automation in different domains.

To infinity and beyond.

As a researcher in this field and intimately familiar with how these models are developed and operate, I was utterly blown away by the code generation capabilities of these first-generation language models.

Given the amount of money riding this space in terms of investment and potential revenue, things will only improve. I am confident that the next generation of these models would further extend their automation capabilities. The icing on the cake is that most of these models and related research are available in the public domain and, as such, get a lot of scrutiny.?

Not just text, videos, audio and images too.

In the near future, these automation models will not be limited to text; they will also utilise sound, images and videos. They are roaming the vast reaches of the internet to ingest human-generated data and even becoming more productive and powerful. One example of such integration is OpenAI Dall-e; this particular language model generates images from a text description of the same.

Solving high school Math problems.

Trained on vast internet data and then fine-tuned on specific downstream tasks, these transformer models are incredibly versatile. They solve math problems. For this to happen, this NLU model has to understand advanced math concepts and the relationship between variables. Full details can be found here . Minerva: Solving Quantitative Reasoning Problems .

Let me state that while maths reasoning models solve problems, they can make mistakes; as such, they are included here purely to demonstrate NLU capabilities.

Companies and individuals.

From the invention of the wheel to the latest language models, we as a species strived for speed and time efficiencies; in these days of tight timelines and even stricter development budgets, products like Co-pilot will get us where we are going faster. However, I do wish to sound a word of caution; these models will make mistakes and might generate code that might contain security vulnerabilities or hard-to-detect bugs; it is imperative that all code generated be peer-reviewed by humans or security scanning software before we can call it production-ready.?

In the end, what is possible is only limited by our imagination.

Till next time, another day, another topic. Writing this blog has been a pleasure; I hope you have found it fascinating.

The Age of Declarative Programming.

Aamir Mirza

More of the same.

Let me take care of the mundane.

What is Self-supervised learning?

Bigger is better.

领英推荐

The subtle realisation.

Form research to Application.

To infinity and beyond.

Not just text, videos, audio and images too.

Solving high school Math problems.

Companies and individuals.

更多精彩文章

社区洞察

其他会员也浏览了

18 Top AI Coding Assistants For Programmers

The End of Coding as We Know It? A Look at the Future of Programming

Top 10 AI Tools for Programmers in 2023

Generative AI Tools Landscape - Coding Applications – Part2

Langchain Expression Language—Simplifying Complex Workflows

MetaGPT: Important Conceptual Advance in Multi-Agent Systems

We need to invent new programming languages to interact with LLMs

OpenAI API Calls - Is that the future of programming ?

Devin is Replacing Human Programmers?

Software Testing and Data Science in 2024

More of the same.

Let me take care of the mundane.

What is Self-supervised learning?

Bigger is better.

领英推荐

The subtle realisation.

Form research to Application.

To infinity and beyond.

Not just text, videos, audio and images too.

Solving high school Math problems.

Companies and individuals.

EV: To be or not to be.

2024年7月26日

GPT the next frontier. Multi-Modality a path to AGI.

2024年2月11日

Age of truly immersive gaming, GPT meets NPC.

2023年5月21日

The Age of Prompt Engineering.

2023年4月23日

GPT - Hallucination, toxicity, and other electric dreams

2023年3月22日

Chat GPT unpacked.

2023年1月27日

Security, What Security?

2022年11月20日

Self Driving Cars, what they are and what they are not

2022年5月13日

Using Machine learning for Data anonymization.

2020年9月25日

From Strings to Things.

2018年8月13日

社区洞察

其他会员也浏览了

18 Top AI Coding Assistants For Programmers

The End of Coding as We Know It? A Look at the Future of Programming

Top 10 AI Tools for Programmers in 2023

Generative AI Tools Landscape - Coding Applications – Part2

Langchain Expression Language—Simplifying Complex Workflows

MetaGPT: Important Conceptual Advance in Multi-Agent Systems

We need to invent new programming languages to interact with LLMs

OpenAI API Calls - Is that the future of programming ?

Devin is Replacing Human Programmers?

Software Testing and Data Science in 2024