Using GPUs in KVM Virtual Machines

Pavan Yalamanchili

Introduction

A couple of months ago, I began investigating GPU passthrough on my workstation to test ArrayFire on different operating systems. Around the same time, we at ArrayFire found ourselves with a few surplus GPUs.

Having had great success with my virtualization efforts, we decided to build a virtualized GPU server to put these GPUs to use. The build alleviated one of the pain points at our company: we no longer need to swap GPUs or hard disks to test a new environment.

To maximize the number of GPUs we can put in a machine, we ended up getting a Quantum TXR430-0768R from Exxact Computing, which comes in a 4U form factor and supports up to eight double-width GPUs.

[Images: front view of the server; top view with seven GPUs installed]

The GPUs used for this build include:

Special thanks to AMD and NVIDIA for providing the GPUs used in this build.

Note: If you are planning to purchase hardware for PCI passthrough, ensure both your motherboard and the processor support IOMMU. A short (but by no means comprehensive) list can be found here.

Setting up PCI passthrough

This section describes how to set up PCI passthrough on Ubuntu 16.04. It is based on the Arch Wiki article on the same topic.

Setting up the machine for virtualization
  • Add intel_iommu=on (or amd_iommu=on on an AMD system) to the kernel parameters in GRUB: edit /etc/default/grub and append the option to the GRUB_CMDLINE_LINUX_DEFAULT line. A sample line is shown after this list.
  • Regenerate grub.cfg by running the following command
sudo grub-mkconfig -o /boot/grub/grub.cfg
  • Blacklist the nouveau, radeon, and amdgpu drivers on the host: create a file called /etc/modprobe.d/blacklist-gpu.conf and add the following lines
blacklist nouveau
blacklist radeon
blacklist amdgpu
  • This step may break GUI login on the host machine, so make sure you have SSH or VNC access to the server before rebooting.
  • Reboot
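For reference, the edited line in /etc/default/grub might look like the sketch below; quiet splash stands in for whatever options your distribution already sets there.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on"

After rebooting, you can confirm that the kernel actually enabled the IOMMU by checking the boot log:

dmesg | grep -e DMAR -e IOMMU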
Adding Devices for PCI passthrough
  • Run the following command to get the list of IOMMU groups
for iommu_group in $(find /sys/kernel/iommu_groups/ -maxdepth 1 -mindepth 1 -type d); do echo "IOMMU group $(basename "$iommu_group")"; for device in $(ls -1 "$iommu_group"/devices/); do echo -n $'\t'; lspci -nns "$device"; done; done
  • The PCI IDs are of the form [abcd:xyzw]
  • Add the PCI IDs you want to pass through to /etc/modprobe.d/vfio.conf, creating the file if it does not exist. After rebooting, you can verify that the binding took effect (see the check after this list).
  • A sample configuration is given below.
options vfio-pci ids=10de:1024,10de:102d
  • Add the following vfio modules to /etc/initramfs-tools/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
  • Run sudo update-initramfs -u to generate the new initrd image.
  • Reboot
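After the reboot, one way to confirm that vfio-pci actually claimed the devices (using the sample IDs from the configuration above; substitute your own) is to ask lspci which kernel driver is in use:

lspci -nnk -d 10de:1024

The output should list vfio-pci as the kernel driver in use for each device you configured.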
Setting up QEMU + KVM + libvirt for PCI passthrough
  • Install the required virtualization packages
sudo apt-get install qemu-kvm ovmf bridge-utils uml-utilities libvirt-bin virt-manager
  • Add the user to the kvm and libvirtd groups
sudo usermod -a -G kvm userid
sudo usermod -a -G libvirtd userid
  • Edit /etc/libvirt/qemu.conf and add (or uncomment) the following nvram setting if it is not already present
nvram = [
    "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd"
]
  • Restart the libvirt-bin service
sudo systemctl restart libvirt-bin
  • You are now ready to pass through the GPU!
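As an optional sanity check (not part of the original steps), virt-host-validate, which ships with libvirt, reports whether KVM and the IOMMU are usable before you start creating VMs:

sudo virt-host-validate qemu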

Setting up the virtual machines

This section describes how to set up and maintain VMs using libvirt as the frontend and qemu+kvm as the backend. This can be done either from the command line (virsh) or from a GUI (virt-manager).

Creating a storage pool

If you plan on using the default pool, skip this section.
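If you do want a dedicated pool, a minimal sketch of creating a directory-backed pool with virsh is shown below; the pool name and target path are placeholders that match the volume example in the next section.

virsh pool-define-as my-storage-pool dir --target /my/storage/location
virsh pool-build my-storage-pool
virsh pool-start my-storage-pool
virsh pool-autostart my-storage-pool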

Creating a volume
  • To create a volume from the command line with virsh, use the following command
virsh vol-create-as --pool my-storage-pool --name disk-01.qcow2 --capacity 80G --format qcow2
  • To create a volume using virt-manager, double-click on your connection, navigate to Storage, choose your pool, press +, and follow the instructions.
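Either way, you can list the volumes in the pool to confirm the disk image exists (use default as the pool name if you kept the default pool):

virsh vol-list --pool my-storage-pool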

Networking

libvirt defaults to creating a NAT interface for networking. If you want to use networking through a NAT interface, you can skip this section.

The rest of the section deals with creating a bridge interface which can later be used by the VMs.

  • Edit /etc/network/interfaces
  • Add the following lines towards the end. Use the appropriate network interface instead of eth1
auto br0
iface br0 inet dhcp
    bridge_ports eth1
  • Restart the network
sudo systemctl restart networking network-manager
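Once networking is back up, the bridge should show the physical interface attached to it; brctl comes from the bridge-utils package installed earlier:

brctl show br0
ip addr show br0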
Installing an OS

Installing an OS from the command line

  • Run a command similar to the following
virt-install --name test-vm-01 --memory 8192 \
  --vcpus 4 --disk /my/storage/location/disk-01.qcow2 \
  --graphics vnc,password=foobar,listen=0.0.0.0 \
  --cdrom /path/to/os-install.iso \
  --network bridge=br0 \
  --host-device=pci_abcd_vw_xy_z
  • Note that the PCI ID passed to --host-device is in a slightly different format than the output of lspci.
  • The vw_xy_z portion is derived from the lspci output, which lists the address as vw:xy.z; the separators simply become underscores. A worked example follows this list.
  • You can run the following command to find the matching device name for virsh
virsh nodedev-list | grep vw_xy_z
  • Note that listen=0.0.0.0 makes the VNC server visible to everyone; restrict it to the required subnets as appropriate.
  • Change the password to something more appropriate.
  • Once the command has run, you can query the VNC display number with the following command:
virsh vncdisplay test-vm-01
  • You can then use this information to connect to the server with a VNC client of your choice
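As a worked example of the ID mapping described above (the 01:00.0 bus address here is purely illustrative), a GPU that lspci reports at 01:00.0 in domain 0000 would be named pci_0000_01_00_0 by libvirt:

lspci -nn | grep -i vga
virsh nodedev-list | grep 01_00

The second command should print pci_0000_01_00_0 (and pci_0000_01_00_1 if the card has an audio function), which is the value to pass to --host-device.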

Installing an OS using virt-manager

  • Click on create new virtual machine (top left icon)
  • Choose Local install media, then browse to and select the install media
  • Optional: Set OS type and version appropriately
  • Choose the desired amount of RAM and CPU cores
  • Expand Network selection and make sure the Bridge br0 is selected
  • Select or create custom storage. Browse to the storage disk created earlier
  • Check the box that says Customize configuration before install
  • Click on Add Hardware
  • Choose PCI Host Device and select the device you want to pass through
  • Click Begin Installation in the top-left corner.

If the GPU being passed through has an onboard audio device (for example, for HDMI audio), you need to add it to the VM as well, as sketched below.
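The audio device typically shows up as the second function at the same PCI address as the GPU; with the illustrative 01:00.0 address from above, it would be 01:00.1 (pci_0000_01_00_1), and both functions can be listed together:

lspci -nn -s 01:00

Pass it through the same way as the GPU, either as an additional --host-device argument to virt-install or as a second PCI Host Device in virt-manager.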

Accessing the VMs
  • If you set up the VM to use bridge networking, it can be accessed directly via its IP address within the LAN.
  • If you set up the VM to use VNC, you can access it using any VNC client software.
  • If you are on Linux, you can install virt-manager and connect to the remote server and all of its VMs; an example connection command is shown below.
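For example, virt-manager on a Linux workstation can open a remote connection over SSH; the user name and host name below are placeholders:

virt-manager -c qemu+ssh://user@gpu-server/system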
Cloning the VMs
  • From the command line, run the following command
virt-clone --original original-vm-name --name new-vm-name --file /path/to/new/disk.qcow2
  • Alternatively, from virt-manager, simply right-click on the VM and choose Clone.
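Snapshots (mentioned in the conclusion) follow the same pattern; a minimal sketch with virsh, using the qcow2-backed VM from earlier and a made-up snapshot name, looks like this:

virsh snapshot-create-as test-vm-01 --name clean-install --description "fresh OS install"
virsh snapshot-revert test-vm-01 clean-install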

Conclusion

Although time-consuming, this build has been very fruitful for us. It allows us to seamlessly snapshot, clone, restore, and switch between various configurations. We plan on doing interesting things with this setup, including integrating it with our continuous integration system.

We also have other interesting ideas brewing internally to leverage this infrastructure. Follow this blog for updates in this regard!

