MemVerge reposted this
What are the primitives for Generative AI? If you said NVIDIA GPUs and PyTorch, you'd be mostly right today, but what about 5 years from now? Andy Jassy's 2023 shareholder letter offers some clues about what the future looks like.

1. Bottom Layer: the hardware and infrastructure-level developer tools. Cerebras, SambaNova, AWS's Inferentia and Trainium, AMD's MI300: the competition to take share from NVIDIA on cost per token, inference latency, throughput, and training performance is fierce and will stay that way.

2. Middle Layer: foundation models, orchestration, and productivity tools for developing, managing, and automating GenAI workflows. Ray, PyTorch, Flyte, Kueue, IBM's watsonx, and AWS's Bedrock and SageMaker compete to make ML and AI developer and ops teams more productive and efficient.

3. Top Layer: the application layer. Think AI agents and other end-user-facing solutions like ChatGPT, along with others built on open models like Llama 2 and Mistral or on proprietary models like Anthropic's Claude 3. This layer is where I think the next Netflix, Uber, Snowflake, and Databricks will be born. The possibilities are endless.

AWS released Amazon Q last year, which can help write, debug, test, and implement code. I would love to hear comments and reviews on this service from the user community. We have been looking into ways to help HPC code run better on AWS (better VM recommendations, better workflows and scripts, better use of available EC2 Spot instance inventory).

Many companies seem to shy away from using GPUs on the public cloud due to sticker shock, but I'm not so sure that is the real problem or the biggest risk. The biggest risk is not executing ROI-positive GenAI projects quickly enough to keep up with the competition, regardless of industry vertical.

What are you doing at each layer of the GenAI stack to innovate and keep up with the competition?

#AWS #HPC #AI #ML #NVIDIA #ChatGPT