Course: Large Language Models on AWS: Building and Deploying Open-Source LLMs

Llama.cpp on AWS G5 demo

- [Instructor] All right, so we're on an AWS machine that's a monster machine. It's a g5.12xlarge, with GPUs attached to it, and this would be a perfect type of machine for doing inference with an open-source large language model. You don't need to use any managed service: you can just SCP over a model binary that you compiled somewhere, or compile it on the instance yourself. More likely, though, you would compile it on a build server, optimize it for this particular architecture, copy it up to S3, and then pull it in. But in our case, we're going to take more of a kick-the-tires approach. So the first thing I'll mention is, let's go ahead and take a look at what this machine actually has. If I type htop, you can see a lot of cores available right here, and you can see that it's got 187 gigs of RAM as well. Now, in terms of the GPUs, we have plenty of those too: four NVIDIA A10Gs. These are also pretty powerful cards, each with about 23…
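The build-server-to-S3 workflow mentioned above can be sketched as a dry-run shell script. This is a minimal sketch, not the course's actual commands: the bucket name, model file, and llama.cpp flags are assumptions (the `GGML_CUDA=ON` CMake option and `-ngl` flag are from llama.cpp's own build and CLI conventions; compute capability 8.6 corresponds to the A10G). The script prints the plan rather than executing it, since the real commands need AWS credentials and a GPU.

```shell
#!/bin/sh
# Hypothetical sketch of the compile-stage-pull-run workflow described above.
MODEL="llama-2-13b.Q4_K_M.gguf"   # hypothetical quantized GGUF model file
BUCKET="s3://my-model-artifacts"  # hypothetical S3 bucket

# 1. On a build server matching the target GPU (A10G = CUDA compute 8.6),
#    compile llama.cpp with CUDA enabled:
BUILD_CMD="cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=86 && cmake --build build -j"

# 2. Stage the model in S3, then pull it down on the g5.12xlarge:
UPLOAD_CMD="aws s3 cp $MODEL $BUCKET/models/"
PULL_CMD="aws s3 cp $BUCKET/models/$MODEL ."

# 3. Run inference, offloading all layers across the GPUs (-ngl 99):
RUN_CMD="./build/bin/llama-cli -m $MODEL -ngl 99 -p 'Hello from the G5'"

# Dry run: print the plan instead of executing it.
printf '%s\n' "$BUILD_CMD" "$UPLOAD_CMD" "$PULL_CMD" "$RUN_CMD"
```

On a multi-GPU box like this one, llama.cpp splits the offloaded layers across the available A10Gs by default, which is why a single `-ngl` value is enough here.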
