Mastering AI: How to Become an AI Agent Developer with Microsoft Technologies in 2024
Mario Fontana
Sr. Cloud Solution Architect - Microsoft AI LAB. Linkedin Top Voice Artificial Intelligence. Book Author. International Keynote Speaker. AI Coach for ISV & Startups.
The use of Artificial Intelligence in building and using applications is changing. With applications such as chatbots and recommender systems or processes like computer vision and natural language processing, AI makes application functionalists more usable and valuable than before for several domains and industries’ various areas. Nevertheless, integrating AI into modernizing applications presents new dilemmas and prospects to software developers; they must conform with fresh architectures, tools, and paradigms that require technical expertise enhanced by ethical considerations and user engagement levels while maintaining seamless integration.
AI Agents: What Are They Really?
Using agents is among the most promising patterns emerging in AI development. Through agents, whole systems get built featuring multiple intelligent, autonomous elements referred to as agents. These agents can communicate with each other and the environment in a way that makes it more dynamic and adaptable. Agents in generative AI are, therefore, specialized programs for performing particular kinds of tasks on their own. They can understand user inputs and produce outputs through their programming, depending on the context in which they run, from typing text into a keyboard to producing music from a scores sheet.
Take, for example, the practical implications of having a virtual assistant in your office. This means that an AI agent can perform repetitive chores like writing emails, organizing appointments, putting forward ideas in making reports, etc. These virtual assistants work by receiving instructions from humans through natural speech patterns or keywords and then performing tasks without our intervention. So if you say, “Give me the last update about our current project,” they will find out what you mean by this and provide you with the necessary information. To demonstrate precision in computation, these agents utilize specific code that can execute particular activities. For instance, if you need to prepare a presentation, an AI agent can collect relevant data, draft slides, and even suggest visual elements to enhance your content.
In a more advanced scenario, agents can collaborate to perform complex tasks. Take, for example, where you might have one agent gathering data from different sources while another analyses the same data, and then a third one comes up with a detailed report derived from this analysis. Like any other working group with people with varying skills, backgrounds, and tools for specialized tasks, this collaborative method depends on multiple agents who have unique abilities so that results are delivered swiftly and accurately.
Generative AI agents perform in the same way because they are tailored experts with several different skill sets, just like human beings in different work environments who have varied talents, backgrounds, and tools required for job execution. Every agent is a unique piece of code explicitly crafted for a single task at hand. For instance, in an office, an accountant might use financial software packages, while a graphic designer would opt for designing tools, and there could be another individual who writes using word processing software programs.
By splitting architectures with agents, AI developers can create more modular, flexible, and robust systems that handle complex and dynamic scenarios.
AI Agents vs Humans vs Operating Systems
A recent paper?2304.03442 (arxiv.org)? by Joon Sung Park and colleagues at Stanford University, titled "Generative Agents: Interactive Simulacra of Human Behavior," provides an exemplary study on this topic. This work introduces generative agents—computational software designed to mimic human behavior in interactive applications, offering a profound insight into the future of human-AI interaction.
Generative agents?are portrayed?in a sandbox environment reminiscent of the popular game The Sims. These agents can wake up, cook breakfast, go to work, and interact with one another in a manner that is strikingly similar to human social behaviours. For example, the paper describes how an agent, starting with the simple goal of planning a Valentine's Day party without any further suggestion or information, autonomously invites others, spreads the word, and coordinates the event with a network of 25 agents, each responding based on their unique past experiences and relationships.
While I find it interesting and easy to understand to liken agents to people, I think it is not the most suitable simile because agents today do not have the 'complexity' that human beings have. For example, agents operate based on predefined architectures and memory streams, needing more emotional depth, cognitive flexibility, and a nuanced understanding of the context that characterizes human interactions. Their behaviors, while impressive, are still bounded by their programming and the limitations of current AI technology, which cannot fully replicate the intricacies of human thought and social dynamics. It's important to note that generative agents are not a replacement for human intelligence but rather a tool that can enhance and augment human capabilities.
This is why I prefer the definition of Andrej Karpathy, co-founder of OpenAI, who describes agents closer to the concept of an Operating System.
... a more complete picture is emerging of LLMs not as a chatbot, but the kernel process of a new Operating System. E.g. today it orchestrates Input & Output across modalities (text, audio, vision), Code interpreter, ability to write & run programs, Browser / internet access and Embeddings database for files and internal memory storage & retrieval.
Now, regardless of whether you like more of a matrix scenario with fully autonomous agents as people or, more conservatively, thinking of them as the kernel of an operating system, let's see what the primary skills are for working in this area.
The AI Agent Developer
While the concept of AI agents is gaining a more precise definition (at least at a high level), the roles of AI agent developers and the specific skills required still need to be broadly defined. This?ambiguity stems partly from the novelty of generative AI as a field and?agents as a reference architecture, resulting in a limited pool of experienced professionals in the market.
Based on my experience, AI agent developers are specialised professionals tasked with designing, implementing, and maintaining software that follows the architectural pattern of AI agents. This role demands a comprehensive hard skill set, including an understanding of computer science fundamentals and expertise in programming languages such as Python, C#, and Java and low-code/no code platforms. Proficiency in utilizing libraries such as PyTorch, TensorFlow and similar libraries is also very valuable. Additionally, a deep knowledge of machine learning algorithms, natural language processing, and cloud computing is required. Cloud computing expertise is?very important?because it enables the deployment and scaling of AI agents across distributed systems. Mastery of generative AI, particularly in prompt engineering, is also a crucial component of the skill set necessary for success in this role.
An AI agent developer must have not only technical skills but also soft skills. The role involves more than just coding; it requires creative thinking, problem-solving abilities, and a deep understanding of the practical applications for which the agents are designed. As AI continues to evolve, it is crucial to clearly define the qualifications and skills required for AI agent developers, but for now, let's delve into how to become an AI agent developer because the emerging nature of generative AI and the architectural pattern of AI agents clearly highlights a significant gap in the availability of seasoned professionals.
Why Now?
Today marks a pivotal moment for organizations and professionals to invest in AI development. The 2024 Work Trend Index Annual Report from Microsoft and LinkedIn underscores that AI is no longer a futuristic concept but a present-day imperative transforming workplaces globally. The momentum is undeniable, with 75% of knowledge workers already using AI tools and 46% adopting these technologies in just the past six months. The surge in AI usage is driven by the tangible benefits it offers.?These advantages?not only elevate individual productivity but also?contribute to?broader organizational efficiency and innovation.
Investing in AI development now is essential for staying competitive in an evolving job market. Leaders across industries recognize the importance of AI skills, with?a significant?66% stating they would not hire candidates lacking these competencies. Moreover, 71% of leaders?would?prefer hiring less experienced candidates with AI skills over more experienced ones without them. This shift highlights the growing demand for AI proficiency and the opportunities it creates for career advancement.?
66% of leaders say they would not hire someone without AI skills
Early experts?are likely to?gain a significant edge, securing better roles and responsibilities within their organizations. Furthermore, as AI continues to drive business transformation, those equipped with AI skills will be at the forefront of redesigning workflows, leading innovative projects, and setting new standards in their fields. This makes now the ideal time to invest in AI development, ensuring professionals and organizations are well-prepared to harness the full potential of this transformative technology.
71% of leaders say they'd?rather?hire a less experienced candidate with AI skills than a more experienced candidate without them.
Investing, specifically in AI Agent developer roles, is particularly crucial at this juncture. AI Agents, which automate complex tasks and interact intelligently with users, are becoming integral to various applications across industries. As businesses increasingly seek to enhance customer experiences, streamline operations, and innovate products, the demand for skilled AI Agent developers will only grow and be central to the company's strategy.?
By focusing on this niche, organizations can propel significant advancements in AI capabilities, positioning themselves at the cutting edge of technology ensuring a sustained competitive advantage in a rapidly evolving market, and?you will be the leading actor driving this transformation!
Top 5 Stages for Aspiring AI Agent Developers
At this stage, we need to examine the resources offered by Microsoft and its partners in 2024 for individuals who want to pursue a career as an AI agent developer. Each stage builds upon the previous one, creating a solid foundation and enabling continuous learning. However, if you are already a senior professional in the IT field, you can navigate these stages more flexibly.
Now, let's delve into the details:
Stage 1: Coding
Python
The first step in becoming an AI agent developer is mastering the essential coding skills. Python is the preferred language for building AI software due to its simplicity, extensive libraries, and robust community support. Python's syntax is easy to learn and read, making it an ideal choice for beginners and experts. Additionally, the language boasts a wide range of libraries and frameworks, such as TensorFlow and PyTorch, which are crucial for AI development.
GitHub Copilot
Tools like GitHub Copilot can be incredibly helpful in streamlining your coding experience. This AI-powered assistant integrates directly into your coding environment, offering real-time code suggestions and helping you write more efficient and error-free code. It is a valuable partner in your learning journey, enabling you to focus on solving problems and building projects rather than getting bogged down by syntax errors and code structure.
Visual Studio Code
Using an Integrated Development Environment (IDE) like Visual Studio Code (VS Code) further enhances your development process. VS Code provides a versatile and user-friendly platform with features like debugging, version control, and extensions tailored for AI development. Its seamless integration with GitHub Copilot and support for Python make it a powerful tool for aspiring AI agent developers.
Low-Code/No Code Platforms
Modern developers must also be proficient in low-code and no-code platforms, such as Microsoft’s Power Platform. This suite of tools enables developers to create sophisticated AI-driven applications with minimal hand-coding, streamlining the development process and making it accessible to a broader range of users. The Power Platform includes several tools like Power Apps for building custom applications, Power Automate for automating workflows, and Copilot Studio for creating chatbots and agents.
Learning these tools significantly reduce development time and prototyping other than empower non-technical users to contribute to creating AI solutions. These tools foster innovation and agility, so your projects can rapidly respond to business needs and market changes. As AI continues to integrate into various facets of business operations, the ability to leverage low-code and no-code platforms becomes an indispensable skill for you.
Stage 2: Cloud Computing
Next, becoming proficient in Microsoft Azure, a cloud platform designed for developers, is important. It offers various services for developing, deploying, managing, and scaling AI applications. Azure also provides specialized tools and resources for Python developers to improve the AI development process.
Azure for Python developers
-?Azure for Python developers | Microsoft Learn?where to find essential resources, including getting started guides, tutorials, and quickstart materials.
Azure AI Services
Azure AI services help developers and organizations rapidly create intelligent and responsible applications. These services offer out-of-the-box and prebuilt APIs and models that cover various capabilities. For example, you can use Azure AI services for natural language processing (NLP) in conversations, search, monitoring, translation, speech, vision, and decision-making. Most of these services are accessible through REST APIs and client library SDKs in popular development languages. Other services include Vision, Content Filtering, and Azure OpenAI, where you can find services like GPT4o, Dall-E 3, and all other OpenAI's advanced models.
By leveraging Azure AI Services, you can integrate advanced AI functionalities into your applications and agents, automate processes, and gain deeper insights from your data.
Azure Certifications:
In parallel with gaining practical experience, it is crucial for you, as an aspiring AI agent developer, to pursue the latest Microsoft certifications to validate your skills. Start with foundational certifications such as AZ-900 (Microsoft Azure Fundamentals) and AI-900 (Microsoft Azure AI Fundamentals) to build a solid understanding of core concepts. Progressing to more advanced certifications, such as AI-102 (Azure AI Engineer Associate), will provide you with deeper insights and expertise. These certifications formally recognize your skills in using Microsoft Azure AI Services and other relevant technologies. Obtaining these credentials enhances your credibility, ensures you are up-to-date with the latest advancements and best practices, and significantly boosts your career prospects by showcasing your commitment to professional growth and a high level of competency in leveraging Microsoft's cutting-edge AI tools and services.
Currently, there is no specific certification for AI agent development, but who knows what the future holds? Stay ahead by continuously learning and adapting to new advancements in the field.
AZ-900 Exam
- Understanding the exam:?Microsoft Certified: Azure Fundamentals - Certifications | Microsoft Learn
- Studying for the exam:?Study guide for Exam AZ-900: Microsoft Azure Fundamentals | Microsoft Learn
AI-900 Exam
- Understanding the exam:?Microsoft Certified: Azure AI Fundamentals - Certifications | Microsoft Learn
- Studying for the exam:?Study guide for Exam AI-900: Microsoft Azure AI Fundamentals | Microsoft Learn
AI-102 exam
- Understanding the exam:?Microsoft Certified: Azure AI Engineer Associate - Certifications | Microsoft Learn
- Studying for the exam:?Course AI-102T00-A: Designing and Implementing a Microsoft Azure AI Solution - Training | Microsoft Learn
Stage 3: Generative AI Development
领英推荐
The next crucial step to becoming an AI agent developer is mastering generative AI development. This includes understanding and working with large language models (LLMs), harnessing Azure's powerful AI tools, and effectively utilising a wide range of open-source models.
How to Work with Azure OpenAI: Azure OpenAI provides access to some of the most advanced language models available today. Learning to work with Azure OpenAI allows you to integrate these models into your applications seamlessly. This involves understanding the API structure, managing authentication, and effectively calling the API to generate content, answer questions, and perform other language-based tasks.
- Chat with your data, sometimes referred to as Custom Copilot (see below):?Azure-Samples/azure-search-openai-demo: A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. (github.com).
Then, you can move to some architectural considerations:
Mastering Prompt Engineering: Prompt engineering is another crucial skill for any AI agent developer. It involves crafting effective and precise prompts to elicit the desired responses from AI models working with large language models. This skill is essential for optimizing the performance of generative AI systems, ensuring they produce accurate, relevant, and contextually appropriate outputs.?Effective?prompt engineering requires understanding the model's capabilities, careful selection of keywords, and iterative testing to refine prompts.?
Bonus: I’ve created a series on prompt engineering, starting from the ground up:
Assistant API in Azure OpenAI:?The Assistant API in Azure OpenAI is a powerful tool that enables developers to create sophisticated conversational agents. By mastering this API, you can build AI systems that engage in natural, context-aware conversations with users. The Assistant API simplifies setting up dialogue systems, managing context, and creating interactive user experiences. This capability is crucial for developing customer service bots, virtual assistants, and more applications. ?
-?Azure OpenAI Assistants API vs Chat Completions API: Which One is Right for Your AI Project? (microsoft.com)
How to Work with the Azure Model Catalog to Access Thousands of Open Source Models: Azure's extensive catalog offers access to thousands of open-source models, which can significantly enhance your development process. Knowing how to navigate this catalog, select appropriate models, and integrate them into your projects is important. This capability allows you to leverage prebuilt models, saving time and resources while enhancing the functionality of your AI applications.?
The Azure Model Catalog is also integrated to Hugging Face Spaces with Visual Studio Code, a resource that you need to know.
With APIs, you can start building your AI-based solutions by assigning specific tasks to different parts of your code, creating a modular and efficient system. These APIs provide the tools and frameworks necessary to integrate advanced AI functionalities seamlessly into your applications. For example, you can use one API to handle natural language processing tasks, another for image recognition, and yet another for data analysis. This approach allows you to leverage specialized capabilities within each module, ensuring that your "AI agents" can perform their designated tasks with precision and efficiency.
To build AI agents, it’s not necessary to use dedicated AI agent frameworks; you can achieve this using standard development tools and APIs without any specialised support. This flexibility allows you to tailor your solutions to meet specific requirements from the ground up. However, in the next stage, we will introduce AI agent frameworks because they provide higher levels of abstraction, simplifying the planning and development of your agents and enhancing your ability to manage complex interactions and functionalities more effectively.
Copilots
The Copilot brand by Microsoft represents a transformative approach to integrating advanced technology, specifically large language models (LLMs) and natural language interfaces, across a diverse range of products. This innovative brand is designed to enhance user experience, streamline workflows, and empower users with intelligent assistance tailored to their?specific?needs. Today, there are several examples of Copilots:?GitHub Copilot?is at the forefront of this transformation (which we discussed in the previous stage). In?the realm of?productivity,?Microsoft 365 Copilot?integrates seamlessly with popular Office applications like Word, Excel, PowerPoint, and Outlook. It automates routine tasks, generates content, and provides intelligent suggestions, significantly boosting productivity and ensuring high-quality outputs with minimal effort. Similarly,?Dynamics 365 Copilot?leverages AI to automate business processes, analyse data, and provide actionable insights, improving customer relationships and optimising operations. Power BI Copilot?takes data analysis and visualization to new heights by enabling users to interact with their data using natural language. At the same time,?Security Copilot?enhances cybersecurity by using AI to detect threats, analyze security data, and recommend actions, thereby ensuring robust protection for organisational assets.
The Copilot brand is not just a tool, but a vision for the future of AI integration. It is set to expand further with future integrations across all major Microsoft products and services. By understanding the Copilot brand, you can position yourself at the forefront of this evolution, ready to leverage these advanced technologies in their AI development projects. The term “Custom Copilot” is commonly used in the market to indicate solutions based on ChatGPT that serve as addons to existing applications, adding AI capabilities, natural language interaction and incorporating specific knowledge of the application. This concept expands the usability of Copilots, making them more versatile and tailored to unique business needs.
As an aspiring AI agent developer, it is essential to know how to create custom Copilots (Chat with your data pattern describer earlier) and integrate Microsoft Copilots with customer data. This involves leveraging the Copilot Stack to build solutions that enhance user interaction and provide intelligent, context-aware assistance. Mastery in this area enables developers to deliver highly customized and effective AI-driven tools that meet specific business requirements.
One essential tool to learn is?Microsoft Copilot Studio, a Microsoft Power Platform component that empowers you to create custom copilots with advanced agent capabilities using a low-code approach. Copilot Studio bridges the gap between low-code development and AI integration, enabling you to build intelligent agents that enhance user experiences and drive organizational efficiency.
From Build 2024:
Stage 4: Agent Frameworks and Friends
As we continue our journey toward becoming AI agent developers, we transition from segmenting our code to deliver different tasks to utilising AI Agent frameworks and formalising the development of AI agents.
This phase requires a deep understanding and effective utilisation of advanced tools and frameworks specially tailored for crafting intelligent, autonomous systems. To provide further insight, let's delve into two crucial components in this step: Autogen and CrewAI.
Autogen: Autogen is a powerful tool designed to automate the creation and training of AI models. It simplifies the development process by providing a streamlined workflow for generating AI agents. Autogen leverages sophisticated algorithms to automatically create model architectures and optimize them for specific tasks. This automation significantly reduces the time and effort required to develop high-performing AI models, allowing developers to focus on fine-tuning and application-specific customization. Using Autogen, you can efficiently create robust AI agents capable of handling complex tasks with minimal manual intervention.
CrewAI :?CrewAI is an innovative platform designed to facilitate collaborative AI development. It emphasizes teamwork and the collective effort of multiple developers working together on AI projects. CrewAI provides seamless collaboration tools, including shared workspaces, version control, and integrated communication channels. This platform allows agents to sync, leveraging each member's expertise to create more sophisticated and well-rounded AI agents.
-?crewAI
-??joaomdmoura/crewAI: Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks. (github.com)
By integrating Autogen and CrewAI into your development workflow, you can streamline the creation of AI agents and enhance your collaborative efforts. These tools?are designed?to?simplify and accelerate the agent development process, enabling you to build more advanced and reliable AI systems.?
Friends of Agents
Other tools (friends) that I suggest exploring are?LangChain?and?Semantic Kernel. LangChain and Semantic Kernel are two powerful frameworks designed to simplify the creation of AI agents for aspiring developers. These frameworks enable the development of sophisticated AI systems that support multi-agent interaction, allowing agents to communicate and collaborate effectively. Both LangChain and Semantic Kernel offer seamless integration with various large language models (LLMs), providing flexibility in choosing the right model for different tasks. They also incorporate a "human-in-the-loop" approach, ensuring that human oversight and input can refine and guide AI behavior. Additionally, these frameworks support diverse communication patterns among agents, such as sequential and parallel interactions, facilitating the development of complex, coordinated AI solutions. Both frameworks are built with modularity and extensibility in mind, allowing you to customize and extend components to meet specific needs. If you work in Python, you can choose between LangChain and Semantic Kernel based on your specific project needs and development preferences. For developers working in C#, Semantic Kernel is the clear choice. It is designed with strong support for C# and .NET environments, making it more compatible and easier to integrate into existing C# projects. LangChain, on the other hand, is primarily focused on Python, which might limit its utility for developers in a C# ecosystem.
LangChain:?LangChain is a robust framework that enables the development of applications powered by language models. It allows developers to chain together multiple language model calls, integrate them with external APIs, and manage complex workflows. Learning LangChain simplifies building sophisticated, dynamic, and responsive AI agents that can handle various tasks seamlessly.
Semantic Kernel:?The Semantic Kernel (both in C# and Python) is a tool that focuses on understanding and generating highly accurate natural language. It provides advanced semantic parsing and generation techniques, allowing your AI agents to understand context, intent, and nuances in human language. While learning the Semantic Kernel can significantly enhance your AI agents' ability to interact naturally and effectively with users, it is considered optional in this journey.? This tool is particularly important for developing applications that require deep language understanding, such as virtual assistants, chatbots, and automated customer service solutions.
Stage 5: Deployments
And here we are! We’ve finally reached the last stage of our journey. This doesn’t mean we’ve mastered everything or our learning is over. Instead, it means we have the foundational knowledge needed to stay continuously updated on the latest advancements in generative AI. This is simply the beginning of our ongoing commitment to learning and growing in this exciting field.?
Deploy and Manage Agent-Based Solutions: Deploying your AI Agent-based solutions necessitates a profound understanding of Azure’s AI infrastructure. Initially, you must provision the required resources in your Azure portal, which includes the setup of Azure OpenAI services or Open Source Models in the Model Catalog. The management of your solutions entails performance monitoring, resource usage optimization, and security compliance assurance. Azure offers a plethora of tools and dashboards for real-time monitoring and management, ensuring that you have all the necessary resources at your disposal to uphold the health and efficiency of your AI solutions.
Azure Landing Zones: Azure Landing Zones are pre-configured environments that streamline the setup of a scalable, secure, and compliant cloud environment. By leveraging Azure Landing Zones, you can ensure that your deployments align with best practices and governance standards. These zones encompass everything from network configurations to identity management, serving as a robust foundation for deploying and managing your AI solutions. The use of Azure Landing Zones simplifies the deployment process, mitigates risks, and accelerates your time-to-market, making them an essential tool in your AI deployment arsenal.
Conclusion
There’s never been a better time to become an AI agent developer. The field of artificial intelligence is evolving rapidly, with new tools and technologies emerging at an unprecedented pace. Organizations are increasingly adopting AI to streamline operations, enhance customer experiences, and drive innovation, leading to a high demand for skilled AI developers.
Following the structured stages outlined in this guide, you will acquire essential technical skills and knowledge. However, it’s crucial to remember that everything you’ve studied so far must be tested with real projects. It is only by working on real solutions that you can truly learn and gain valuable experience. Practical application not only solidifies your understanding but also prepares you for the complexities of real-world scenarios.
Equally important are strong communication skills and an eagerness to learn. Effective communication is vital for collaborating with cross-functional teams in AI projects, while a continuous learning mindset keeps you updated with the latest advancements in a constantly changing field. Integrating these skills makes you a valuable asset in the job market and empowers you to contribute to groundbreaking AI projects.
Job opportunities for AI agent developers are abundant, reflecting the growing reliance on AI across various industries. Seize this opportunity. With dedication, effective communication, and a passion for learning, you can embark on a rewarding career at the forefront of technological innovation.
May you have a fantastic career in one of the most exciting and vibrant fields in the market.
#llms #artificialintelligence #machinelearning #technology
PS: Credits for people in the stages (pictures): https://github.com/moochin
In a world flooded with information about AI, it's refreshing to find content that brings clarity. Thank you for shedding light on such an important topic!
Articolo molto interessante.