Beginners: Building your own GPU cloud on consumer hardware & software

Some heavy compute workloads don't scale well across multiple GPUs, so virtualization is often the best way to get full use out of the hardware by splitting it across several isolated environments.

The good news is that setting up virtual machines to make use of GPUs is straightforward and only takes maybe half an hour to configure - even if you only have a single GPU you want to spread across several VMs!

Prepping Windows

You'll need to enable Hyper-V, Microsoft's native hypervisor, in Windows (a PowerShell equivalent is sketched after this list):

  1. Press Start, type "turn windows features on or off"
  2. Enable the entire Hyper-V section and Virtual Machine Platform
  3. Reboot your PC
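
If you prefer the command line, the same features can usually be enabled from an elevated PowerShell prompt. A minimal sketch, assuming Windows 11/10 Pro:

# Enable Hyper-V (and its subfeatures) plus Virtual Machine Platform, then reboot
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All -NoRestart
Enable-WindowsOptionalFeature -Online -FeatureName VirtualMachinePlatform -NoRestart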

Verifying prerequisites

You'll need a motherboard, GPU and OS that support virtualization and paravirtualization of hardware. Fortunately, a lot of consumer-level hardware does.

To check whether your motherboard supports virtualization in its BIOS, the process varies depending on whether your system is AMD or Intel based (a PowerShell check of the end result is sketched after this list):

  • If AMD, you need to have the option to enable SVM (Secure Virtual Machine).
  • If Intel, you'll need to have the option to enable VMX (Intel Virtualization Technology).
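
Once the firmware option is enabled, you can sanity-check it from PowerShell before going further. A quick sketch - note the HyperVRequirement fields may come back blank if Hyper-V itself is already running:

# Shows HyperVisorPresent and the HyperVRequirement* checks (SLAT, VM monitor mode, firmware virtualization)
Get-ComputerInfo -Property "HyperV*"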

In terms of OS, we'll be using Windows 11 Pro in this guide, but it should also work on Windows 10 Pro or any recent edition of Windows Server. As always, you could use Linux instead, but the process differs from what's covered here.

Lastly, to see if your GPU is compatible in Windows 11 Pro, open an elevated (run as admin) PowerShell prompt and enter the following:

Get-VMHostPartitionableGpu        

If you are running Windows 10 Pro, the command is:

Get-VMPartitionableGpu        

This will return all GPUs available for paravirtualization on the host.
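
The Name value each entry reports is the GPU's instance path, which some scripted setups feed straight to Add-VMGpuPartitionAdapter later on. To list just those paths:

# Print only the instance path(s) of the partitionable GPU(s)
(Get-VMHostPartitionableGpu).Name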

If all this looks good, then go download a Windows 11 Pro ISO from Microsoft. If you are using Windows 10 Pro, go download an ISO that matches your OS. We'll need it later when we set up the virtual machine.

Finally, before we proceed, update your graphics drivers to the newest version available.

Configuring the hypervisor

To support GPU paravirtualization, you'll need to disable Enhanced Session Mode:

  1. In Hyper-V Manager, right-click your computer's name in the left panel and select Hyper-V Settings
  2. Under the User section (not Server), select Enhanced Session Mode and uncheck Use enhanced session mode

Next, we'll set up a virtual network switch so your virtual machines can access the internet (a PowerShell equivalent is sketched after this list):

  1. Open Hyper-V Manager, and in the first panel right-click your computer's name, then select Virtual Switch Manager
  2. Select New Virtual Network Switch, select External, then press Create Virtual Switch
  3. Enter whatever name you want then select your network adapter from the dropdown menu
  4. You'll also want to ensure the checkbox for allowing the management OS to share the adapter is enabled
  5. When you hit OK/Apply, you'll be warned that your machine will briefly disconnect from the network as the settings are applied
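
If you'd rather script the switch, a rough PowerShell equivalent is below - the switch name and adapter name are placeholders, so check Get-NetAdapter for yours:

# List physical adapters to find the right adapter name
Get-NetAdapter
# Create an external switch bound to that adapter and keep sharing it with the host OS
New-VMSwitch -Name "External Switch" -NetAdapterName "Ethernet" -AllowManagementOS $true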

Virtual machine setup

  1. In Hyper-V, right click your computer's name again and under New, select Virtual Machine
  2. Enter whatever name you want, hit next
  3. Select Generation 2, hit next
  4. Assign whatever amount of RAM your task will need. You can adjust this amount later. If you plan on running multiple VMs, consider how much you can allocate to each machine
  5. Before proceeding to the next step, you must uncheck Use Dynamic Memory to support GPU paravirtualization
  6. On networking, in the drop-down for connection, select the virtual network switch you set up earlier, hit next
  7. On virtual hard disk, specify the maximum amount of storage you want this VM to be able to use. As with RAM, consider whether you plan on running multiple VMs. It's important to note that the virtual hard disk will not start at the size specified; you are just setting a limit. You can adjust this later if needed.
  8. On Installation options, select Install an operating system from a bootable image file, then locate the Windows ISO you downloaded earlier (a PowerShell equivalent of this wizard is sketched below)
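
If you'd rather skip the wizard entirely, the same VM can be created from PowerShell. A hedged sketch - the VM name, memory size, disk path/size, ISO path and switch name are all placeholders:

# Create a Generation 2 VM with a new virtual hard disk, attached to the external switch
New-VM -Name "GPU-VM" -Generation 2 -MemoryStartupBytes 16GB -NewVHDPath "C:\VMs\GPU-VM.vhdx" -NewVHDSizeBytes 256GB -SwitchName "External Switch"
# GPU paravirtualization needs static memory, so turn Dynamic Memory off
Set-VM -Name "GPU-VM" -StaticMemory
# Attach the Windows ISO and boot from it first
Add-VMDvdDrive -VMName "GPU-VM" -Path "C:\ISOs\Win11.iso"
Set-VMFirmware -VMName "GPU-VM" -FirstBootDevice (Get-VMDvdDrive -VMName "GPU-VM")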

You'll now see a new virtual machine in Hyper-V, but before we fire it up, there are a few settings we have to change to support Windows 11 installation as well as GPU paravirtualization.

Configuring the virtual machine

  1. In Hyper-V, right click on the name of your new virtual machine, and select settings
  2. Under Security, select Enable Trusted Platform Module
  3. Under Processor, assign the VM enough virtual processors for your use case. The number of virtual processors available matches the number of logical processors the host machine has. To see how many your machine has, open Task Manager (Ctrl+Shift+Esc), click Performance, and under CPU, below the utilization graph, you will see the number of Logical Processors.
  4. Under Checkpoints, unselect Enable Checkpoints (a PowerShell version of these settings is sketched after this list)
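
These settings can also be applied from PowerShell once the VM exists. A sketch assuming the VM is named GPU-VM and you want 8 virtual processors:

# Add a local key protector, then enable the virtual TPM Windows 11 requires
Set-VMKeyProtector -VMName "GPU-VM" -NewLocalKeyProtector
Enable-VMTPM -VMName "GPU-VM"
# Assign virtual processors (stay within the host's logical processor count)
Set-VMProcessor -VMName "GPU-VM" -Count 8
# Disable checkpoints, matching step 4 above
Set-VM -Name "GPU-VM" -CheckpointType Disabled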

With all of these changes in place, you should now be able to run the virtual machine and install Windows as you normally would. If the machine fails right away while booting into the installation media, try tapping the spacebar as the VM starts so it boots from the ISO when prompted ("Press any key to boot from CD or DVD").

Driver setup

GPU paravirtualization requires some manual copying of drivers from your parent OS to the virtual OS. Below is the process for Nvidia GPUs - a similar process can be followed for AMD:

  1. Create a new temporary folder on your computer, anywhere easy to access
  2. Navigate to C:\Windows\System32\ and sort the list by file name
  3. Copy every file that begins with nv and paste them into the new temporary folder you made
  4. Navigate to C:\Windows\System32\DriverStore\FileRepository\ and look for a folder whose name starts with nv_dispi.inf_, followed by the ID of the device it is for. If you have multiple GPUs in the machine, you will have several folders that start with this, and you need the one for the exact GPU you intend to give to the virtual machine. You can find that ID by going to Device Manager, right clicking the GPU, opening Properties, going to the Details tab, and selecting Hardware Ids. (A PowerShell shortcut for listing the candidate folders follows this list.)
  5. Once you have located the right folder, copy the entire folder into the temporary folder that you have made as a new directory
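
To help locate the right folder, you can list the candidates from PowerShell and compare them against the hardware ID from Device Manager. A small sketch for Nvidia cards:

# List driver-store folders matching the Nvidia display driver INF, newest first
Get-ChildItem "C:\Windows\System32\DriverStore\FileRepository" -Directory -Filter "nv_dispi.inf_*" |
    Sort-Object LastWriteTime -Descending |
    Select-Object Name, LastWriteTime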

Next, we need to get these copied files into the VM's virtual hard disk. The VM must be shut down first so its disk can be mounted (a scripted version of these steps follows the list):

  1. In Hyper-V manager, right click on the VM and click Settings
  2. Under SCSI Controller, click on Hard Drive, then note the path shown for your VM's .vhdx file (or hit Browse to jump to it)
  3. Right click on the .vhdx file, and click Mount. You can now navigate the VM's hard drive as you would any other media attached to your machine
  4. On the VM's hard disk, navigate to its C:\Windows\System32\ folder
  5. From the temporary folder you created on your PC, copy the files that start with nv into the root of the VM's System32 folder
  6. Create a new folder in System32 called HostDriverStore, and inside of that folder create another titled FileRepository
  7. Inside of FileRepository, place the nv_dispi.inf_ folder from your temporary directory
  8. Right click the mounted virtual hard drive and eject (unmount) it so the VM can use its disk again
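
The copy steps above can also be scripted. A hedged sketch, assuming the temporary folder is C:\GpuDrivers, the VM's disk lives at C:\VMs\GPU-VM.vhdx, and the VM is shut down:

# Mount the VM's disk and grab the largest volume (the Windows volume)
$vol = Mount-VHD -Path "C:\VMs\GPU-VM.vhdx" -Passthru | Get-Disk | Get-Partition | Get-Volume |
    Sort-Object Size -Descending | Select-Object -First 1
$dest = "$($vol.DriveLetter):\Windows\System32"
# Copy the loose nv* files into the guest's System32
Get-ChildItem "C:\GpuDrivers" -File -Filter "nv*" | Copy-Item -Destination $dest
# Recreate the driver-store layout the guest expects and copy the INF folder into it
New-Item -ItemType Directory -Path "$dest\HostDriverStore\FileRepository" -Force | Out-Null
Copy-Item "C:\GpuDrivers\nv_dispi.inf_*" -Destination "$dest\HostDriverStore\FileRepository" -Recurse
# Detach the disk so the VM can boot from it again
Dismount-VHD -Path "C:\VMs\GPU-VM.vhdx"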

Please note that if you update the drivers on the parent OS running the hypervisor, you will need to redo this process to update the drivers on your virtual machine(s).

Final VM configuration

Open an elevated PowerShell prompt and enter the following commands one by one, replacing YourVMNameHere with the name you gave the VM in Hyper-V. They configure the memory-mapped I/O space and cache behavior for the virtual machine, and set what Hyper-V should do with the machine when the host shuts down (a requirement for paravirtualization):

Set-VM -GuestControlledCacheTypes $true -VMName YourVMNameHere        
Set-VM -LowMemoryMappedIoSpace 3Gb -VMName YourVMNameHere        
Set-VM -HighMemoryMappedIoSpace 33280Mb -VMName YourVMNameHere        
Set-VM -Name YourVMNameHere -AutomaticStopAction TurnOff        

Assigning the GPU

This is the moment it all comes together.

First, we need to get the location path of your GPU:

  1. Open Device Manager, right click on the GPU you are assigning, click on Properties, then go to the Details tab
  2. Select Location paths, and copy the shorter of the two results. It should look something like: PCIROOT(0)#PCI(0103)#PCI(0000)

In an elevated PowerShell prompt, run the following command to tell the hypervisor to give the VM up to 100% of the GPU:

Add-VMGpuPartitionAdapter -VMName "YourVMNameHere" -InstancePath "YourDevicePathHere"        
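
If that form of the path is rejected, a commonly used alternative is to pass the instance path that Get-VMHostPartitionableGpu reports instead (this sketch assumes a single partitionable GPU):

# Use the first partitionable GPU's instance path as the adapter's InstancePath
$gpu = Get-VMHostPartitionableGpu | Select-Object -First 1
Add-VMGpuPartitionAdapter -VMName "YourVMNameHere" -InstancePath $gpu.Name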

That's it! Fire up your VM and go look in Device Manager to see your paravirtualized GPU.

Appendix: Splitting up your GPU(s)

You don't have to assign up to 100% of the GPU to a single VM; you can divide each GPU into up to 32 partitions and assign them to different VMs. This is useful when a VM will be tasked with work that won't utilize an entire GPU. The sketch below is a starting point for how to do this.
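
A hedged sketch of what that can look like in PowerShell - the partition count and VM name are placeholders, and Set-VMGpuPartitionAdapter also exposes per-partition VRAM/encode/decode/compute limits if you need finer control:

# Ask the host to expose the GPU as 4 partitions instead of one
Get-VMHostPartitionableGpu | Set-VMHostPartitionableGpu -PartitionCount 4
# Each VM still gets its own partition adapter, as in the main guide
Add-VMGpuPartitionAdapter -VMName "YourVMNameHere"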

Appendix: DDA vs paravirtualization

Since the GPU is still being managed by the hypervisor OS when paravirtualizing, you are losing a small amount of performance and adding a small amount of latency. For most use cases, this is negligible.

However, professional cards and the latest consumer Nvidia cards support removing the GPU entirely from the host OS and letting the VM drive the card directly, which removes this overhead. This is called DDA (Discrete Device Assignment).

You need to be running Windows Server 2016 or later for this.

The card must support SR-IOV (Single-root input/output virtualization). On the Nvidia side, Ada Lovelace consumer GPUs seem to support this as well as newer Quadro cards.

The setup process for this looks a little different at the end:

  1. Open Device Manager, right click the GPU you are assigning, and select Disable device
  2. Right click the GPU again, click Properties, then go to the Details tab
  3. Select Location paths, and copy the shorter of the two results. It should look something like: PCIROOT(0)#PCI(0103)#PCI(0000)

In an elevated PowerShell prompt, enter the following command to remove the GPU from showing up to the parent OS:

Dismount-VMHostAssignableDevice -Force -LocationPath "YourDevicePathHere"        

You should see the GPU vanish from your device manager.

Next, assign it to the VM with the following command:

Add-VMAssignableDevice -LocationPath "YourDevicePathHere" -VMName YourVMNameHere        

Now you should be able to boot up your VM and see the GPU there as you would in the parent OS. If Hyper-V errors out when launching the VM with "A hypervisor feature is not available to the user," it is likely you aren't running a Windows SKU with access to DDA, or your GPU doesn't support being passed directly to the virtual OS.
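
To give the card back to the host later, the assignment can be reversed with the same placeholders, after which you re-enable the GPU in Device Manager:

# Detach the GPU from the VM, then hand it back to the host OS
Remove-VMAssignableDevice -LocationPath "YourDevicePathHere" -VMName YourVMNameHere
Mount-VMHostAssignableDevice -LocationPath "YourDevicePathHere"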

Appendix: Troubleshooting

GPU-enabled VM setups often run into issues that surface when you try to boot the VM, or as an Error 43 on the GPU in the VM's Device Manager.

This often seems to be related to IOMMU (Input-Output Memory Management Unit) settings, which are tied to your computer's Core Isolation / Memory Integrity feature, and can be remedied by forcing the hypervisor's IOMMU policy:

  1. Outside of the VM, press start, type cmd.exe and open it with elevated privileges
  2. Enter: bcdedit /set hypervisoriommupolicy enable
  3. Restart your PC
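
If forcing the IOMMU policy causes other problems, the change can be reverted from the same elevated prompt (followed by another reboot):

bcdedit /set hypervisoriommupolicy default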

