The Copilot Era: My Speech at Semantic Kernel DevDay at the Microsoft Reactor


This is the speech I presented on the founding day of Semantic Kernel DevDay at the Microsoft Reactor in Shanghai. Special thanks to John Maeda for joining us. I would also like to extend my sincere appreciation to Cornelia Kingsley, Jianhui Lu, and Christina Liang, as well as the Microsoft MVP community, for coordinating this event.

Significant transformations have occurred in technology over the past four years, and one of the most notable is the recent emergence and widespread adoption of Copilot.


Why?

Copilot fundamentally embodies the evolution of software engineering; grasping its essence takes a blend of historical insight and imagination.

The evolution of software engineering

During the machine language and assembly language era of the 1940s and 1950s, programming relied on physical media such as punched cards and paper tape, with instructions expressed directly in the machine's own native form.

Moving into the high-level programming language period of the 1950s and 1960s, there was a gradual shift toward languages resembling natural and mathematical notation, making programs far more readable.

The subsequent structured program design phase of the 1960s and 1970s saw a significant increase in processor speed, and programs were needed to manage all of the computer's resources effectively, which led to the birth of operating systems.

Additionally, the database management system (DBMS) was introduced, and in 1968 the term "software engineering" was coined. The structured programming era of the 1970s and 1980s introduced the C language and witnessed the birth of the Macintosh's visual graphics, revolutionizing human-computer interaction. This period also marked the emergence of much of the early software industry.

As Internet development unfolded from the 1990s to the present, significant milestones included the emergence of the World Wide Web, the adoption of object-oriented programming, and the rise of commercial software companies, all of which shaped the landscape of software engineering.


Looking across this seventy-year span of human and computing history, the breakthrough becomes evident: it is the culmination of a long effort to build the most intuitive technology possible. In other words, the great advance of the past seven decades is progress toward a natural user interface, in which computers comprehend humans rather than the other way around.


And Copilot is the legacy of that thinking.

Let's delve into the structure of a copilot. Copilots are organized into three tiers, analogous to the typical layers of an application: front end, mid-tier, and back end.

In the front end, engineers commence by conceptualizing their product idea. The user experience design for copilots deviates slightly from traditional practices, which have remained largely consistent for over 180 years since Ada Lovelace's pioneering program. Historically, engineers have focused on comprehending machine capabilities and establishing explicit human-machine interactions.

For engineers, this typically involves manipulating user interface elements, menus, binding code to actions, and attempting to accurately anticipate user needs. The goal is to architect applications in a familiar manner, ensuring users can easily access all the functionality and capabilities embedded within the code.

In contrast, copilots entail less emphasis on designing user interface widgets and preempting user preferences, as users can naturally express their intentions through language. Therefore, the design focus shifts towards determining the copilot's intended capabilities and addressing any limitations of the underlying model. This may involve augmenting functionalities through the orchestration layer, plugins, model finetuning, or employing portfolio models. Overall, copilot design requires less intricate mapping of user interface elements to code segments compared to conventional practices.

Additionally, it's essential to consider what tasks you do not want the copilot to perform. This aspect is crucial not only for safety considerations but also because the foundational models underlying the copilot architecture possess extensive capabilities. It's often necessary to confine these capabilities within specific domains to align with the intended use.

For example, in the case of GitHub Copilot, the model should remain focused on its primary task of aiding developers in solving development problems. This means the copilot is not designed to assist with unrelated tasks.
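To make this concrete, here is a minimal sketch of confining scope through a system instruction. The instruction text and helper function are illustrative assumptions of mine, not GitHub Copilot's actual (non-public) configuration:

```python
# A hypothetical scoping instruction; real products tune this text
# extensively and keep it private.
SCOPE_INSTRUCTION = (
    "You are a coding assistant. Only help with software development "
    "tasks such as writing, explaining, and debugging code. If asked "
    "about anything else, politely decline and return to coding topics."
)

def build_messages(user_input: str) -> list[dict]:
    """Prepend the scoping instruction on every conversation turn."""
    return [
        {"role": "system", "content": SCOPE_INSTRUCTION},
        {"role": "user", "content": user_input},
    ]
```

Because the boundary is re-asserted on every turn, a long conversation cannot gradually drift out of the intended domain.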

Having covered the user interface aspect broadly, let's now shift our focus to orchestration.

Orchestration serves as the operational logic governing your copilot. As Microsoft embarked on developing copilots, each team within the company independently constructed its own orchestration layer, encompassing the logic needed to sequence through models, perform filtering, and carry out the prompt augmentation essential for app development.

Recognizing commonalities across these efforts, Microsoft opted to unify its approach by establishing a single orchestration mechanism known as Semantic Kernel, which has been made available as open-source software.
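As a taste of what this layer looks like in code, here is a minimal sketch using the open-source Semantic Kernel Python SDK. The endpoint, key, and deployment name are placeholders, and since the SDK's API has evolved quickly across releases, treat this as illustrative of the shape rather than a definitive usage:

```python
import asyncio

from semantic_kernel import Kernel
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion

async def main() -> None:
    # The kernel is the orchestrator: it owns AI services, plugins,
    # and the templating/invocation pipeline.
    kernel = Kernel()
    kernel.add_service(AzureChatCompletion(
        deployment_name="gpt-4",                          # placeholder
        endpoint="https://<resource>.openai.azure.com",   # placeholder
        api_key="<key>",                                  # placeholder
    ))
    # The kernel renders the prompt and routes it to the registered model.
    result = await kernel.invoke_prompt(
        "Explain orchestration in one sentence."
    )
    print(result)

asyncio.run(main())
```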


However, developers should always explore open-source tools according to their own preferences and requirements.

The speaker is me :)

Within the orchestration layer, the primary focus is on manipulating prompts. A prompt essentially consists of tokens generated by the user experience layer of the application. It can take various forms, such as a question in Bing Chat or ChatGPT, or a command conveyed from the application to the model.

Prompt handling in the initial stages of orchestration involves prompt and response filtering, aimed at ensuring that prompts align with the application's requirements and do not lead to unsafe or undesirable model responses. Filtering occurs both when the prompt is generated and after the model produces a response, allowing content to be selectively filtered at either end.
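A minimal sketch of this two-sided filtering might look like the following; the blocklist and helper names are hypothetical stand-ins for a real content-safety service:

```python
from typing import Callable

# Hypothetical blocklist standing in for a real content-safety service.
BLOCKED_TERMS = {"credit card number", "system password"}

def is_safe(text: str) -> bool:
    """Naive check: reject text containing any blocked term."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def filtered_completion(prompt: str,
                        call_model: Callable[[str], str]) -> str:
    """Filter on the way in (before the model sees the prompt) and
    on the way out (before the user sees the response)."""
    if not is_safe(prompt):
        return "Sorry, this request can't be processed."
    response = call_model(prompt)
    if not is_safe(response):
        return "The generated answer was withheld by a safety filter."
    return response
```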

Additionally, there exists a unit of prompt code known as the meta prompt. This serves as a set of instructions passed down to the copilot model in every conversation turn, guiding its behavior and adaptation to the specific copilot being developed. The meta prompt plays a crucial role in safety tuning and personality customization, enabling developers to shape the copilot's behavior according to desired characteristics. For instance, Microsoft utilizes the meta prompt to adjust Bing Chat's tone to be either more balanced or more precise, depending on the context.

It is also a means of introducing new capabilities to the model. Meta prompt design can be likened to a form of fine-tuning, offering a more straightforward approach compared to delving into the lower layers of the infrastructure to develop customized solutions.
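For illustration, here is a hypothetical sketch of how a meta prompt could switch a single model between a "balanced" and a "precise" personality, in the spirit of Bing Chat's tone settings (the actual Bing meta prompts are not public):

```python
# Hypothetical meta prompts; real products tune these texts
# extensively and keep them private.
META_PROMPTS = {
    "balanced": ("You are a friendly assistant. Give well-rounded "
                 "answers that weigh multiple perspectives."),
    "precise": ("You are a concise assistant. Give short, factual "
                "answers and cite sources where possible."),
}

def build_turn(mode: str, user_input: str) -> list[dict]:
    """The meta prompt is passed down on every conversation turn,
    steering the same underlying model toward different behavior."""
    return [
        {"role": "system", "content": META_PROMPTS[mode]},
        {"role": "user", "content": user_input},
    ]
```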

Following the meta prompt and prompt filtering stages, attention turns to grounding. Grounding involves supplementing the prompt with additional context that may aid the model in generating a response. In the case of Bing Chat, for instance, which pioneered retrieval-augmented generation before it had a formal name, Microsoft analyzes the user query and queries the search index to retrieve relevant documents, which are then added to the prompt to provide the model with extra context for generating a suitable answer.

Increasingly, vector databases are being utilized for retrieval-augmented generation. This involves computing embeddings for the prompt and conducting a lookup in a vector database indexed by these embeddings to retrieve relevant documents, thus furnishing the model with additional context to enhance its response. Grounding can also involve augmenting prompts using arbitrary web APIs or utilizing plugins.
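Here is a toy sketch of that retrieval flow. The embed function is a hypothetical stand-in for a real embedding model, and the in-memory list stands in for a vector database:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical embedding: a deterministic random unit vector.
    Real systems call an embedding model API here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

# Toy "vector database": documents stored with precomputed embeddings.
DOCS = [
    "Semantic Kernel is an open-source orchestration SDK.",
    "Bing Chat grounds prompts with web search results.",
]
INDEX = [(doc, embed(doc)) for doc in DOCS]

def ground(prompt: str, k: int = 1) -> str:
    """Retrieve the k most similar documents by cosine similarity
    and prepend them to the prompt as extra context."""
    q = embed(prompt)
    ranked = sorted(INDEX, key=lambda pair: -float(q @ pair[1]))
    context = "\n".join(doc for doc, _ in ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {prompt}"
```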

Subsequently, plugin execution takes place, where plugins may add extra context to the prompt before it reaches the model or execute actions on the system upon returning from the model.
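A hypothetical plugin contract might look like this, with hooks on both sides of the model call; the names and the example plugin are my illustration, not any real plugin API:

```python
from datetime import datetime, timezone
from typing import Callable, Protocol

class Plugin(Protocol):
    """Hypothetical plugin contract: enrich the prompt on the way in,
    act on the model's response on the way back."""
    def before_model(self, prompt: str) -> str: ...
    def after_model(self, response: str) -> None: ...

class ClockPlugin:
    """Adds the current time as context for time-sensitive questions."""
    def before_model(self, prompt: str) -> str:
        now = datetime.now(timezone.utc)
        return f"Current UTC time: {now:%Y-%m-%d %H:%M}\n{prompt}"

    def after_model(self, response: str) -> None:
        pass  # a real plugin might trigger an action here, e.g. an API call

def run_turn(prompt: str, call_model: Callable[[str], str],
             plugins: list[Plugin]) -> str:
    for plugin in plugins:
        prompt = plugin.before_model(prompt)
    response = call_model(prompt)
    for plugin in plugins:
        plugin.after_model(response)
    return response
```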

Once a prompt has traversed the orchestration layer, which may involve multiple iterations through the pipeline and interactions with multiple models, it reaches the bottom of the stack: the foundation models and infrastructure. Microsoft offers various options for leveraging foundation models within the Copilot platform on Azure and Windows. Users can opt for hosted foundation models such as ChatGPT or GPT-4 available through the Azure OpenAI Service, fine-tune these models using APIs, or deploy custom models.
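For the hosted-model path, a minimal sketch using the official openai Python package against an Azure OpenAI deployment might look like this; the endpoint, key, API version, and deployment name are placeholders:

```python
from openai import AzureOpenAI

# Placeholders: substitute your own Azure OpenAI resource details.
client = AzureOpenAI(
    azure_endpoint="https://<resource>.openai.azure.com",
    api_key="<key>",
    api_version="2024-02-01",
)

# In Azure OpenAI, `model` names your deployment rather than
# the raw model family.
completion = client.chat.completions.create(
    model="<deployment-name>",
    messages=[{"role": "user", "content": "Hello from the bottom of the stack!"}],
)
print(completion.choices[0].message.content)
```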

For those whose needs cannot be met through hosted APIs or fine-tuning, Microsoft encourages exploring open-source community contributions.


I was very impressed with what Kevin Scott said at Microsoft Build 2023.

"Copilot is going to be an amazing part of our productivity story, like GitHub Copilot works great on Windows. But increasingly what you’re going to see is the ability to run these powerful AI models on your Windows PC so that you can develop these true hybrid AI applications that span the edge all the way to the cloud. And it’s just a really, really exciting thing. "



