AWS re:Invent 2024: Developer-focused event announced new AWS LLMs and latest accelerator chips
Summary
AWS re:Invent 2024, held in Las Vegas from December 2 to 6, was a feast for Amazon Web Services (AWS) cloud developers, with a host of new feature announcements. One of the challenges of working with serverless computing is the lack of transparency when testing and debugging applications, because observability agents must be installed on servers that serverless developers cannot access. Amazon is helping serverless developers by adding observability features. It also announced new features for working with Kubernetes, including Amazon Elastic Kubernetes Service (EKS) Hybrid Nodes for the unified management of clusters across separate environments, and AWS-managed cluster operations with Amazon EKS Auto Mode.
Amazon entered the large language model (LLM) arena with its new Nova series of LLMs, an output of the company’s research group, which focuses on keeping pace with the latest progress in generative artificial intelligence (AI). Support for partner LLMs in Amazon Bedrock has also expanded. On the hardware side, Amazon continues to iterate its in-house Trainium and Inferentia AI accelerator chips. Developers can use Amazon Q Developer as a code assistant, integrated with Amazon Bedrock for building LLM applications, and they can use Amazon SageMaker to build analytics and machine learning applications.
Notably, Amazon conducts significant in-house research (see www.amazon.science). In Omdia’s opinion, analysts and Amazon customers would welcome hearing more about the company’s research efforts.
Analyst view
Amazon Lambda now supports application performance management
Serverless services are so named because they abstract away the chore of infrastructure management, shielding developers from having to administer and maintain servers. The downside is that developers building applications on AWS Lambda have lacked visibility into performance bottlenecks and the information vital to troubleshooting, testing, and debugging. AWS has addressed this with observability features that provide logs, allow custom metrics to be defined, and can examine edge cases in detail. Distributed tracing is possible, with instrumentation placed close to where events are generated. These features also make self-healing possible: metrics are monitored and, when issues arise, remediation and failback are triggered.
Additionally, AWS Lambda now supports Amazon CloudWatch Application Signals, an application performance monitoring (APM) solution. It enables developers and operators to easily monitor the health and performance of their serverless applications built on Lambda. Application Signals provides prebuilt, standardized dashboards for critical application metrics (such as throughput, availability, latency, faults, and errors), correlated traces, and interactions between the Lambda function and its dependencies (such as other AWS services), all without requiring manual instrumentation or code changes from developers. AWS Lambda also supports CloudWatch Logs Insights.
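By way of illustration, the hedged sketch below polls the same standard Lambda health signals (throughput, errors, latency) through the CloudWatch API using boto3; the function name my-function is a hypothetical placeholder. Application Signals surfaces these metrics in prebuilt dashboards without any such hand-rolled code, which is precisely its appeal:

```python
# Minimal sketch: pulling standard Lambda health metrics from CloudWatch.
# "my-function" is a hypothetical function name for illustration only.
import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")
now = datetime.datetime.now(datetime.timezone.utc)

for metric, stat in [("Invocations", "Sum"), ("Errors", "Sum"), ("Duration", "Average")]:
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName=metric,
        Dimensions=[{"Name": "FunctionName", "Value": "my-function"}],
        StartTime=now - datetime.timedelta(hours=1),
        EndTime=now,
        Period=300,  # 5-minute buckets
        Statistics=[stat],
    )
    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(metric, point["Timestamp"], point[stat])
```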
In Omdia’s opinion, these AWS tools fill a gap in serverless services that could previously be only partially filled by third-party observability tools. Those third-party tools will also benefit from access to the log and metric data now exposed by the “serverless” servers. The AWS tools are therefore a welcome addition that will give enterprises more confidence in moving to serverless computing.
The future of Kubernetes on AWS
At re:Invent 2024, Nathan Taber, head of product for Kubernetes and registry at AWS, spoke about enhanced image scanning on Amazon Elastic Container Registry (ECR). It is powered by Amazon Inspector and covers over 50 vulnerability databases and more than 12 operating systems. ECR is popular, serving over 2 billion image pulls per day. Amazon is providing extended version support on EKS as new Kubernetes releases appear; customers typically need some 14 months to move to a new version, so extended support is vital for enterprises. The company is also providing upgrade insights, a report on how API calls would be affected by different Kubernetes versions, helping users assess a move to a new version. It has added the Application Recovery Controller (ARC) to EKS so that when failures occur, traffic can be automatically routed to alternative availability zones (AZs). This failover can be triggered automatically by AWS or manually by the user, and ARC also manages the transfer back to the originally troubled AZ once it is operational again.
Amazon EKS is now available in every global region and across cloud-to-edge deployments. Also announced at re:Invent 2024 was the availability of EKS Hybrid Nodes, which lets on-premises and edge infrastructure join Amazon EKS clusters as nodes, offering unified Kubernetes management across all these separate environments. Another new capability is Amazon EKS Auto Mode, which automates Kubernetes cluster operations, letting AWS make the operational decisions; costs are optimized through automatic capacity planning and dynamic scaling.
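As a rough illustration, the hedged sketch below shows what creating an Auto Mode cluster might look like through boto3. The cluster name, role ARNs, and subnet IDs are hypothetical placeholders, and the Auto Mode fields (computeConfig, storageConfig, and the elasticLoadBalancing setting) reflect the API shape documented at launch, which may evolve:

```python
# Hedged sketch: creating an EKS cluster with Auto Mode enabled via boto3.
# All names, ARNs, and subnet IDs below are hypothetical placeholders.
import boto3

eks = boto3.client("eks")

response = eks.create_cluster(
    name="demo-auto-mode",
    version="1.31",
    roleArn="arn:aws:iam::111122223333:role/eks-cluster-role",
    resourcesVpcConfig={"subnetIds": ["subnet-aaa", "subnet-bbb"]},
    accessConfig={"authenticationMode": "API"},  # Auto Mode requires API auth
    # Auto Mode: AWS manages compute, block storage, and load balancing.
    computeConfig={
        "enabled": True,
        "nodePools": ["general-purpose", "system"],
        "nodeRoleArn": "arn:aws:iam::111122223333:role/eks-node-role",
    },
    storageConfig={"blockStorage": {"enabled": True}},
    kubernetesNetworkConfig={"elasticLoadBalancing": {"enabled": True}},
)
print(response["cluster"]["status"])  # typically "CREATING"
```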
In the longer term, Amazon intends to reduce the friction for non-tech companies to benefit from the technology available on AWS without having to become tech experts. This entails more focus on ease of management, meeting the workloads where they exist, and simplifying building solution platforms on AWS.
In Omdia’s opinion, one of the biggest challenges an AWS user faces is the sheer vastness of the services available—navigating these options is a daunting task. It would help users, whether newbies or experienced, to have a role-based navigation path through AWS that exposes only the services relevant to the level of technology skills available in the user organization.
Amazon enters LLM market with Nova series models
While Amazon competes with Microsoft and Google in the cloud service provider (CSP) space, the company’s ambitions regarding AI have to date been about “selling shovels to the miners” (i.e., providing the tools to build AI applications while remaining agnostic about any particular LLM). This approach contrasts with those rivals, which run deep AI research programs: they create their own LLMs and partner with assorted independent LLM providers, but they are also engaged in building more intelligent machines (a.k.a. artificial general intelligence [AGI], or human-level AI). However, this position is changing. At re:Invent 2024, Amazon announced Nova, a series of its own in-house developed LLMs.
This announcement indicates Amazon is building competency in LLM creation. The company says little about its wider research goals but runs both pure and applied research programs (see www.amazon.science). In contrast, Microsoft and Google are more vocal about their AI research aspirations, with stated aims that may provide a stepping stone to AGI. Whether they succeed remains to be seen, and there is considerable skepticism that LLMs, as currently architected, can achieve AGI without additional capabilities (see the article on predicting when AGI will arrive in Further reading).
Amazon Bedrock features expand
Amazon Bedrock is the AWS service for building generative AI applications. Its Marketplace offers access to a host of foundation models, with options that keep expanding; model creators include Anthropic, Cohere, Meta, Mistral AI, and Stability AI. At re:Invent 2024, Amazon announced a host of new Bedrock features and alliances:
- poolside: A partnership with poolside, whose coding assistant models, malibu and point, will be available in Bedrock.
- Amazon Bedrock Model Distillation: Distills the knowledge of a large model into a smaller, more cost-effective model targeted at specific use cases; the distilled model is also faster and only slightly less accurate than the full model.
- Amazon Bedrock Intelligent Prompt Routing: Makes intelligent decisions about which size of model is appropriate for each query, helping reduce costs and latency (see the sketch after this list).
- Amazon Bedrock Knowledge Bases: Designed to make retrieval-augmented generation (RAG) easier to use; natural language queries can also access structured data sources.
- GraphRAG in Amazon Bedrock Knowledge Bases: Brings the power of knowledge graphs, using GraphRAG backed by Amazon Neptune (the serverless graph database).
- Amazon Bedrock Data Automation: Transforms unstructured data into structured data.
- Amazon Bedrock Guardrails: Now includes multimodal toxicity detection.
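As a minimal sketch of how this looks in practice, the snippet below calls Bedrock through the Converse API using boto3. Passing a prompt router ARN as the modelId (the ARN here is a hypothetical placeholder) lets Intelligent Prompt Routing choose an appropriately sized model per request; a plain foundation model ID works in the same slot:

```python
# Minimal sketch: invoking Bedrock via the Converse API with a prompt router.
# The router ARN below is a hypothetical placeholder for illustration only.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    # A prompt router ARN routes each request to an appropriately sized model;
    # a regular foundation model ID can be supplied here instead.
    modelId="arn:aws:bedrock:us-east-1:111122223333:default-prompt-router/anthropic.claude:1",
    messages=[
        {"role": "user", "content": [{"text": "Summarize AWS re:Invent 2024 in one sentence."}]}
    ],
    inferenceConfig={"maxTokens": 256, "temperature": 0.2},
)
print(response["output"]["message"]["content"][0]["text"])
```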
In Omdia’s opinion, Amazon Bedrock is an excellent starting point for exploring application development with LLMs. Used with Bedrock, tools such as Amazon Q Developer will reduce the time to production when working with the latest generative AI advances.
Amazon Q Developer gives developers advanced code assistance
Amazon Q, which became generally available in April 2024, featured prominently at re:Invent 2024. It is available in three versions: Developer for software development, Business for business analytics, and Apps for building AI-powered apps. Amazon Q Developer (which is built on top of, and supersedes, Amazon CodeWhisperer) is a chatbot integrated into the developer’s IDE, providing a host of features across the software development lifecycle. Enhancements announced at re:Invent include agents that automate unit testing, documentation, and code reviews, and that help users address operational issues in a fraction of the time. Amazon Q Developer can automate the upgrading of legacy Java code to newer versions and help transform applications in IBM z/OS mainframe migrations by automating analysis, refactoring code, and generating code documentation.
Amazon Q Developer is also available in Amazon SageMaker Canvas, a no-code tool for building machine learning applications with SageMaker. Amazon Q Business is available in Amazon QuickSight, the business analytics tool, allowing business analysts to use natural language prompts to perform business analytics.
Analysis of Amazon Q Developer will appear in Omdia’s forthcoming Omdia Universe assessment of no-code, low-code, and professional code assistants, to be published in 2Q25.
Amazon chips anticipate trillion parameter LLMs
At the re:Invent 2024 keynote, Peter DeSantis, senior vice president of AWS Utility Computing Products, anticipated the need to support LLMs with trillions of parameters. Alongside offering partner chips, such as Nvidia GPUs, on AWS instances, Amazon is investing in in-house developed accelerators. DeSantis described how the latest Trainium2 chip uses a data flow architecture built on systolic arrays (a custom neural network array design) and delivers 1.3 PFLOPS of performance in dense FP8 numeric format.
The Trainium2 package includes eight third-generation NeuronCore compute chips and four high bandwidth memory (HBM) modules. Trainium2 is available in the Trn2 Amazon EC2 instance, which combines 16 Trainium2 chips interconnected with Amazon’s latest NeuronLink high bandwidth, low latency chip interconnect (2TB/s bandwidth and 1 microsecond latency). The instance offers 20.8 PFLOPS of peak dense FP8 compute and 83.2 PFLOPS of sparse FP8 compute. The new network in the AWS cloud is called 10p10u because it provides 10 petabits of network capacity at under 10 microseconds of latency.
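The quoted instance-level figures follow directly from the per-chip numbers; a quick arithmetic check:

```python
# Sanity check on the published figures: a Trn2 instance aggregates 16
# Trainium2 chips at roughly 1.3 dense-FP8 PFLOPS each.
chips_per_instance = 16
pflops_per_chip_dense_fp8 = 1.3

instance_dense = chips_per_instance * pflops_per_chip_dense_fp8
print(instance_dense)      # 20.8 PFLOPS dense FP8, as quoted
print(instance_dense * 4)  # 83.2 PFLOPS sparse FP8 (a 4x sparsity speedup)
```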
Amazon is running “Build on Trainium,” a research and education program backed by $110m in funding, to support next-generation AI research in academia using Trainium accelerators. Institutions on board include UC Berkeley, the University of Texas, Carnegie Mellon University, and the University of Oxford.
In Omdia’s opinion, Amazon’s strategy of hedging its position to ensure that it meets cloud users’ demands for running AI applications is sound. Amazon builds its own state-of-the-art AI accelerator chips while also offering chips from companies in the open market, such as Nvidia.
Appendix
Further reading
Michael Azoff, “Prediction for when artificial general intelligence will happen,” LinkedIn article (November 28, 2024)