Course: Advanced AI Analytics on AWS: Amazon Bedrock, Q, SageMaker Data Wrangler, and QuickSight

Introduction to analytics with AI on AWS

- [Instructor] AWS Advanced AI Analytics. We're going to talk about many services, including Bedrock, which we see here, Amazon Q, SageMaker Data Wrangler, and QuickSight. All of these work together to produce actionable business insights. We'll also get into some performance optimizations with Lambda, which is also critical for cost optimization, and we'll talk about some emerging research. So beyond Bedrock, we have Q here, which is for natural language processing, SageMaker Data Wrangler for transformations, and QuickSight for business intelligence.

In terms of the Amazon Bedrock architecture, the key takeaway is that it's a unified, serverless API for multiple foundation models, although you can also provision dedicated throughput. In terms of security, it follows the best practices that many AWS services already use, like VPCs and IAM controls. We also have the ability to do model customization, fine-tuning, and even distillation. Serverless is a key component here because it means we don't have to worry about some of the finer details: we can just start calling the API and deal with token-level scaling instead.

Amazon Q is a natural language service with deep integration across many different components, and that multi-service integration is a big part of what Amazon Q does. For example, if you open up CloudShell, Q has chat integration, can help you build command line tools, and can serve as a development environment, amongst other things like security. Context-aware processing is another key component: it knows the problem you're working on while you're working on it, and it can do response synthesis, which means it can take multiple inputs and come up with a conclusion.

And in terms of the Data Wrangler pipeline, this is an automated data transformation service.
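Before moving on, the serverless Bedrock call described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: it builds the request for boto3's `bedrock-runtime` `invoke_model` call using the Anthropic-on-Bedrock message format, and the model ID shown is just an example — swap in whichever model your account has access to.

```python
import json

def build_invoke_request(prompt: str,
                         model_id: str = "anthropic.claude-3-haiku-20240307-v1:0",
                         max_tokens: int = 256) -> dict:
    """Build the keyword arguments for bedrock-runtime's invoke_model call."""
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return {
        "modelId": model_id,
        "contentType": "application/json",
        "accept": "application/json",
        "body": json.dumps(body),
    }

# With AWS credentials configured, the serverless call is just:
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   response = client.invoke_model(**build_invoke_request("Summarize Q3 sales."))
#   print(json.loads(response["body"].read())["content"][0]["text"])
```

Notice there's no cluster or endpoint to manage — you pay per token, which is the token-level scaling mentioned above.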
Some of the things Data Wrangler does include automated data transformation, intelligent feature engineering, and quality validation gates. So you can say, for example, these are the specific metrics we care about, and data has to pass those metrics before it can move to the next stage of the pipeline. The ML-ready output streams let you validate that what has been processed through the pipeline is finally ready to be trained on.

Now, QuickSight visual answers allow you to ask business questions and get back results like charts or anomaly detection. The connection with live data is also a very important feature of the platform. It can reveal hidden patterns through things like clustering and anomaly detection, and it becomes a driver for decision-making, because you can have a dashboard fed by a software engineering pipeline, a business intelligence pipeline, or product management KPIs, and that's how you drive the actions.

Now, in terms of performance, one hot topic in analytics and data engineering is which language to use. Some of the emerging research shows that Rust can be a huge performance gain, especially in the era of generative AI, because generative AI makes the code easier to write, and memory efficiency definitely plays a role. Cold start times also play a huge role: with Python, you're often using a container runtime because you need a bunch of packages, whereas with Rust you ship a small binary. Request latency with a very efficient language like Rust will also be low. And in terms of cost optimization, we have benchmarking that shows how you can reduce cost by, let's say, 90%, depending on certain workloads that are computationally expensive.

The Unified Analytics Flow is about doing things in a way that allows you to make decisions across the full context of the data you have.
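The quality validation gates described for Data Wrangler above can be sketched as a simple check: the pipeline only advances to the ML-ready stage when every metric meets its threshold. The metric names here are hypothetical, purely for illustration.

```python
def passes_quality_gate(metrics: dict, thresholds: dict) -> tuple[bool, list]:
    """Compare each metric to its minimum threshold; return (passed, failures)."""
    failures = [name for name, minimum in thresholds.items()
                if metrics.get(name, 0.0) < minimum]
    return (not failures, failures)

# Hypothetical gate: 95% of rows complete, 99% matching the expected schema.
thresholds = {"completeness": 0.95, "valid_schema_rate": 0.99}

batch = {"completeness": 0.97, "valid_schema_rate": 0.92}
ok, failed = passes_quality_gate(batch, thresholds)
# ok is False: valid_schema_rate (0.92) is below the 0.99 threshold,
# so this batch would be held back rather than passed to training.
```

The point of a gate like this is that bad data fails loudly at a defined stage instead of silently degrading the model downstream.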
So what the Unified Analytics Flow means is cross-service data flow, and what that really comes down to is being able to look at a dashboard that shows you what's happening across more than one service. This is automated processing as well: if the results are coming in automatically because the data is clean, then you can build these dashboards. Real-time integration is another key component of modern analytics pipelines, so you can act on insights without having to wait a day or a week. And scalable architecture is an important component too, especially using serverless technologies.

So the next emerging horizons for analytics pipelines around AI could include things like neural search, which is an important component to consider, as well as zero-shot learning, quantum-ready API development, and automatic optimizations, maybe automatic fine-tuning or automatic dashboard creation. These are some of the emerging horizons we're seeing with analytics and data engineering.
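To make the real-time, anomaly-driven insights discussed above a bit more concrete, here's a toy version of the kind of detection QuickSight surfaces on live data: flag points far from the series mean in standard-deviation terms. Real services use far more sophisticated models; this z-score check just illustrates why a spike in a live metric can trigger an alert instead of waiting for a weekly report.

```python
import statistics

def zscore_anomalies(values: list[float], threshold: float = 2.0) -> list[int]:
    """Return indices of values whose z-score magnitude exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # a flat series has no outliers
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > threshold]

# A hypothetical daily-orders metric with one obvious spike on the last day.
daily_orders = [102, 98, 101, 99, 100, 103, 400]
print(zscore_anomalies(daily_orders))  # → [6]
```

Fed from a streaming source instead of a fixed list, the same check becomes the trigger for a dashboard alert rather than a number someone notices a week later.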
