Remote Off-Screen Rendering with OpenGL

Shehzan Mohammed | ArrayFire, OpenGL

At ArrayFire, we constantly encounter projects that require OpenGL and run on a remote server that does not have a display.
In this blog, we have compiled the steps needed to run full-profile OpenGL applications over SSH on remote systems without a display.

A few notes before we get started.

  • This blog is limited to computers running distributions of Linux.
  • The first part of the blog that shows the configuration of the xorg.conf file is limited to NVIDIA cards (with display).
  • AMD cards support this capability without modifying the xorg.conf file. However, we have not been able to get a comprehensive list of supported devices.

Requirements

You will need access to the remote system over SSH.
To run the tools below, you will need libGL.so and libX11.so. Another tool I strongly recommend is glewinfo; most Linux distributions ship it in the glew-utils package. An alternative to glewinfo is glxinfo, which is present on all systems with X. You can substitute glxinfo for glewinfo in all the commands below if needed.

Configuring X (NVIDIA Only)

To get X running on NVIDIA cards, we need to make changes to the xorg.conf file. Before making the change, make sure you create a backup of the current version on your system (name it xorg.conf.stable).
You can find a sample xorg.conf file here. The sample file is for an NVIDIA GTX 690. The specific things to notice are the use of the “UseDisplayDevice” option under “Screen” and the “Virtual” option under “SubSection Display”. You can use parts of this file to configure your own config file; make sure the options listed in the sample file appear in your config file as well, as in the excerpt below.
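For reference, the key section looks roughly like this (the identifiers and resolution will differ per system; this excerpt mirrors what the nvidia-xconfig command below generates):

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    Option         "UseDisplayDevice" "None"
    SubSection     "Display"
        Virtual     1280 1024
        Depth       24
    EndSubSection
EndSection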
Save and close the file.

Now run this command: # nvidia-xconfig -a --use-display-device=None --virtual=1280x1024
Restart the system.

Note: To abort trying this, just copy xorg.conf.stable back to xorg.conf and restart.

The following applies only to GeForce cards; Quadro and Tesla users can skip to the Initial Diagnosis section.

On restart, run the command # /usr/bin/X :0 &. Ideally, this command should give an output similar to:

X.Org X Server 1.13.0
Release Date: 2012-09-05
X Protocol Version 11, Revision 0
...
Loading extension GLX
Loading extension NV-GLX
Loading extension NV-CONTROL

This means X has started successfully on the virtual display.
If it fails, restart the system and try again. You can also check /var/log/Xorg.0.log for the failure log.

Update: Known issue with Starting X

If the output is not similar to the one shown above, try the Initial Diagnosis section below. If that produces an error like the following:

$ env DISPLAY=:0 glewinfo
X Error of failed request:  BadWindow (invalid Window parameter)
  Major opcode of failed request:  137 (NV-GLX)
  Minor opcode of failed request:  4 ()
  Resource id in failed request:  0x200004
  Serial number of failed request:  39
  Current serial number in output stream:  39

run the following commands:

sudo mv /usr/lib/xorg/modules/extensions/libglx.so /usr/lib/xorg/modules/extensions/libglx.so.orig
sudo ln -s /usr/lib/xorg/modules/extensions/libglx.so.XXX.YY /usr/lib/xorg/modules/extensions/libglx.so

Where XXX.YY is the NVIDIA driver version.

Now try starting X again.

Initial Diagnosis

Once X has started successfully, run echo $DISPLAY. It is very likely that the output of this will be empty.
If you have glewinfo installed, run the following command: env DISPLAY=:0 glewinfo | less.
The goal of this command is to run glewinfo with DISPLAY temporarily set to :0 (the virtual display on the remote system). If it runs successfully, you should see the remote system's graphics card along with the full OpenGL profile, and you are ready to deploy applications using X.

If you want to set DISPLAY for the entire session, run export DISPLAY=:0. To set it permanently, add the same line to your .bashrc file.
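If you prefer a programmatic sanity check, the sketch below (illustrative code, not part of glContext.hpp) opens the X display the same way context-creation code does; compile it with g++ -o xcheck xcheck.cpp -lX11:

#include <cstdio>
#include <cstdlib>
#include <X11/Xlib.h>

int main()
{
    // XOpenDisplay(NULL) connects to whatever DISPLAY points at, e.g. :0
    const char* env = getenv("DISPLAY");
    printf("DISPLAY = %s\n", env ? env : "(unset)");

    Display* dpy = XOpenDisplay(NULL);
    if (!dpy)
    {
        fprintf(stderr, "Failed to open display\n");
        return 1;
    }
    printf("Connected to %s with %d screen(s)\n",
           DisplayString(dpy), ScreenCount(dpy));
    XCloseDisplay(dpy);
    return 0;
}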

Deploying Off-Screen Rendering Applications on Remote System

This is where X becomes crucial. Tools like GLFW may or may not work on remote systems because of their dependence on Xrandr and other software. The trick is to use X to create an OpenGL context, and then run everything off-screen using framebuffers and renderbuffers. I took the source for creating an OpenGL context with X from the OpenGL.org context creation tutorial and modified it slightly. Thanks to the folks at OpenGL.org for providing this. The source code with the changes I made can be found here: glContext.hpp.

Include this header in your source code. To create and delete the context, call createGLContext() and deleteGLContext() respectively.

I have specified a minimum OpenGL version of 4.4 with forward compatibility enabled. You can change this version by editing the values at lines 26 and 27 of glContext.hpp.
Forward compatibility can be disabled by changing line 241 to None.
If the requested version of the context cannot be created, the application will exit.
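Under the hood, these settings correspond to a GLX_ARB_create_context attribute list passed to glXCreateContextAttribsARB(). The list has roughly this shape (attribute names are from the GLX extension; this is a sketch, not a verbatim copy of glContext.hpp):

static int context_attribs[] = {
    GLX_CONTEXT_MAJOR_VERSION_ARB, 4,       // minimum major version (line 26)
    GLX_CONTEXT_MINOR_VERSION_ARB, 4,       // minimum minor version (line 27)
    GLX_CONTEXT_FLAGS_ARB,                  // changing the flag below to None
    GLX_CONTEXT_FORWARD_COMPATIBLE_BIT_ARB, // disables forward compatibility
    None                                    // attribute list terminator
};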

At line 189, we create the window. The API description can be found here. “0, 0, 10, 10” specifies the top-left corner (0, 0) and the width and height of the window (10, 10). Since the goal is off-screen rendering, the size of the window has no effect on the rendering.
Line 202 specifies the window title.
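In Xlib terms, the window creation is a call of roughly this shape (adapted from the OpenGL.org tutorial the header is based on; the variable names and title string here are illustrative):

// vi is the XVisualInfo chosen from the GLX framebuffer config,
// swa is the XSetWindowAttributes structure filled in earlier
Window win = XCreateWindow(display, RootWindow(display, vi->screen),
                           0, 0, 10, 10,   // x, y, width, height
                           0, vi->depth, InputOutput, vi->visual,
                           CWBorderPixel | CWColormap | CWEventMask, &swa);
XStoreName(display, win, "GL Window");     // the title set at line 202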

If everything goes successfully, you should see an output like this:

glContext.hpp:297: GL Version  = 4.4.0 NVIDIA 331.79
glContext.hpp:298: GL Vendor   = NVIDIA Corporation
glContext.hpp:299: GL Renderer = GeForce GTX 690/PCIe/SSE2
glContext.hpp:300: GL Shader   = 4.40 NVIDIA via Cg compiler

If you specify a lower minimum version, say 3.0, the output will report that version instead, since GL reports the version of the created context.

To test the context creation as a stand alone, create a cpp file with the following contents (glContext.cpp):

#include "glContext.hpp"

int main(int argc, char* argv[])
{
    createGLContext();   // create the X-backed OpenGL context
    deleteGLContext();   // destroy the context
    return 0;
}

Compile this with g++ -o gl glContext.cpp -lGL -lX11 and run it with ./gl (make sure DISPLAY is set to :0). If this runs successfully, any off-screen rendering code should work on top of this context.
Note: If you use GLEW, make sure you include glew.h before including gl.h.
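As a sketch of what the off-screen rendering itself can look like once the context exists (illustrative code, not part of glContext.hpp: it renders into a framebuffer object backed by a renderbuffer and reads the pixels back; it calls glewInit() itself, which is harmless even if createGLContext() already does so):

#include <GL/glew.h>       // glew.h must come before gl.h
#include "glContext.hpp"
#include <cstdio>
#include <vector>

int main()
{
    createGLContext();
    glewExperimental = GL_TRUE; // some GLEW versions need this for core contexts
    glewInit();

    const int w = 640, h = 480;

    // Create a framebuffer object with a color renderbuffer attached
    GLuint fbo, rbo;
    glGenFramebuffers(1, &fbo);
    glGenRenderbuffers(1, &rbo);
    glBindRenderbuffer(GL_RENDERBUFFER, rbo);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_RGBA8, w, h);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                              GL_RENDERBUFFER, rbo);

    // Render (here just a clear) and read the result back
    glViewport(0, 0, w, h);
    glClearColor(1.0f, 0.0f, 0.0f, 1.0f);
    glClear(GL_COLOR_BUFFER_BIT);
    std::vector<unsigned char> pixels(w * h * 4);
    glReadPixels(0, 0, w, h, GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());
    printf("First pixel: %d %d %d %d\n",
           pixels[0], pixels[1], pixels[2], pixels[3]);

    glDeleteRenderbuffers(1, &rbo);
    glDeleteFramebuffers(1, &fbo);
    deleteGLContext();
    return 0;
}

Compile it the same way with GLEW added, for example g++ -o fbo fbo.cpp -lGLEW -lGL -lX11.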

If you wish to know about remote OpenGL further or work with us on remote rendering, contact us at technical@arrayfire.com.

Links:
OpenGL Context Creation Tutorial
ArrayFire glContext repo on Github

Comments (18)

  1. Hi,
    Thanks for the awesome post. I had some issues using ubuntu 12.04 LTS and GTX 580. I run into issues with the initial diagnostics. Namely, “$env DISPLAY=:0 glewinfo | less” reports:
    No protocol specified
    No protocol specified
    Error: glewCreateContext failed

    The xorg log does not show any errors and “$/usr/bin/X :0 &” tells me that X is already running.

    Any ideas on how to troubleshoot some more?
    Thanks in advance.

    1. Try running glxinfo with and without env DISPLAY=:0. I would assume that the one without the display would give OpenGL version 2.1. With the display, I think it would error out like glewinfo.

      Did you make the required changes to the xorg.conf and also run the nvidia-xconfig command?

      Lastly, did you see the “Loading extension GLX…” output in the log?
      Note: all the commands with # need to be run as root or using sudo.

      Try this: restart your system. Run “sudo ls”. This will ask for your password and give you a grace period to run more sudo commands without re-entering it. After sudo ls, run the “sudo /usr/bin/X :0 &” command.

      1. Hi,
        glewinfo (and glxinfo) without DISPLAY=:0 both work, but they give me the info of my local machine: the graphics card is the local one, not the remote one; the OpenGL version is 2.1.2 and the display is localhost:10.0. glxinfo produces the same error as glewinfo when using DISPLAY=:0.

        My xorg.conf has the same format as the sample one, with the exceptions that I have a single monitor/screen/device. I’ll try with the double configuration.

        Yeah, I see the “[ 41.772] (II) Loading extension GLX” log entry

        X is running when the machine restarts. running “sudo /usr/bin/X :0 &” yields:
        “Fatal server error:
        Server is already active for display 0
        If this server is no longer running, remove /tmp/.X0-lock
        and start again.”

        I also see /usr/bin/X in the list of current processes.

        Any other idea of what to try?
        Thanks

        1. You can keep the single configuration. The sample one has two of everything because a 690 has 2 GPUs.

          It looks like X is not starting properly. I’m not completely sure why this is.

          Can you revert to your old xorg.conf file, restart the computer, and confirm that X is not running when it restarts?

          It may also be an architecture issue. All the GPUs we tested were Kepler. I’ll try testing it on a Fermi and get back to you.

          1. Hi,
            I kept the single configuration as you suggested.
            I reverted to the old xorg.conf file and indeed X did not start when the machine restarted.

            I also tried on a different machine with a GTX 780 (kepler) and had the same issue.
            I checked the log entries, it was fine, no errors and GLX loaded.

            Any ideas?
            Thanks

          2. Hello,
            After some debugging and troubleshooting with Shehzan, we figured out that my system had X running at boot for some reason. The solution is to kill it; then the procedure works as outlined above.

            1. Check if X is running
            $ps aux | grep X

            my system produced:
            “root 1945 0.1 0.1 212380 51960 tty7 Ss+ 14:23 0:04 /usr/bin/X :0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch”

            2. If you get a similar output, kill X:
            $sudo service lightdm stop

            3. Check if X is still running
            $ps aux | grep X

            4. start local X
            $/usr/bin/X :0 &

            5. check the configuration
            $env DISPLAY=:0 glewinfo | less

            Note the first time step 5 is run from any SSH session it is instant; subsequent calls will be much slower (~1.5 minutes for the info to show up). Not sure why…

  2. Are you sure that GL context init code will actually work on remote headless servers? From my previous attempts, I always had to also create a dummy PBuffer to get the actual context. In your case you rely upon a Window object, which is not accessible (at least in my attempts) on remote servers like those of Amazon EC2.

    1. I just got this working on a GPU-enabled Ubuntu AWS instance. I didn't have to modify the code, just some conf files and drivers. The sample .cpp program he provided works for me. I hope that helps narrow down any issues you might encounter.

  3. Thanks for posting this. I had to change a few things around to get this to work on the system I’m supporting (and I need to integrate it into the whole codebase). That said, this guide was a way better starting point than anything else I found.

  4. Pingback: Unix:How to efficiently use 3D via a remote connection? – Unix Questions

    1. Thanks for that. I tried testing this out but was unsuccessful. I got Xvfb to run, but the OpenGL context fails. Here are the steps I used:

      sudo Xvfb :12 -screen 0 800x600x24 &
      [1] 24161
      Initializing built-in extension Generic Event Extension
      Initializing built-in extension SHAPE
      Initializing built-in extension MIT-SHM
      Initializing built-in extension XInputExtension
      Initializing built-in extension XTEST
      Initializing built-in extension BIG-REQUESTS
      Initializing built-in extension SYNC
      Initializing built-in extension XKEYBOARD
      Initializing built-in extension XC-MISC
      Initializing built-in extension SECURITY
      Initializing built-in extension XINERAMA
      Initializing built-in extension XFIXES
      Initializing built-in extension RENDER
      Initializing built-in extension RANDR
      Initializing built-in extension COMPOSITE
      Initializing built-in extension DAMAGE
      Initializing built-in extension MIT-SCREEN-SAVER
      Initializing built-in extension DOUBLE-BUFFER
      Initializing built-in extension RECORD
      Initializing built-in extension DPMS
      Initializing built-in extension Present
      Initializing built-in extension DRI3
      Initializing built-in extension X-Resource
      Initializing built-in extension XVideo
      Initializing built-in extension XVideo-MotionCompensation
      Initializing built-in extension SELinux
      Initializing built-in extension GLX

      ~$ export DISPLAY=:12
      ~$ glewinfo
      Xlib: extension “GLX” missing on display “:12”.
      Error: glewCreateContext failed
      Xlib: extension “GLX” missing on display “:12”.
      6 XSELINUXs still allocated at reset
      SCREEN: 0 objects of 256 bytes = 0 total bytes 0 private allocs
      DEVICE: 0 objects of 96 bytes = 0 total bytes 0 private allocs
      CLIENT: 0 objects of 144 bytes = 0 total bytes 0 private allocs
      WINDOW: 0 objects of 48 bytes = 0 total bytes 0 private allocs
      PIXMAP: 1 objects of 16 bytes = 16 total bytes 0 private allocs
      GC: 4 objects of 16 bytes = 64 total bytes 0 private allocs
      CURSOR: 1 objects of 8 bytes = 8 total bytes 0 private allocs
      TOTAL: 6 objects, 88 bytes, 0 allocs
      1 PIXMAPs still allocated at reset
      PIXMAP: 1 objects of 16 bytes = 16 total bytes 0 private allocs
      GC: 4 objects of 16 bytes = 64 total bytes 0 private allocs
      CURSOR: 1 objects of 8 bytes = 8 total bytes 0 private allocs
      TOTAL: 6 objects, 88 bytes, 0 allocs
      4 GCs still allocated at reset
      GC: 4 objects of 16 bytes = 64 total bytes 0 private allocs
      CURSOR: 1 objects of 8 bytes = 8 total bytes 0 private allocs
      TOTAL: 5 objects, 72 bytes, 0 allocs
      1 CURSORs still allocated at reset
      CURSOR: 1 objects of 8 bytes = 8 total bytes 0 private allocs
      TOTAL: 1 objects, 8 bytes, 0 allocs

      As you can see, the OpenGL part failed to initialize. My hunch is that, even if it did get initialized, it would be software-based rendering, which may limit it to OpenGL 2.1. I don't think you will be able to get 4.x.

      If you have any other information, I will be happy to test it.

  5. Hi,

    this was great, thanks heaps for the blog post. FYI I was trying to get this to work on a Centos 7 x64 box with a Tesla K40m (if it makes a difference – probably not) and for the “Known issue starting X” the link I had to make was:

    ln -s /usr/lib64/nvidia/xorg/libglx.so /usr/lib64/xorg/modules/extensions/libglx.so

    /usr/lib64/nvidia/xorg/libglx.so is itself a symlink, as per:

    [zgraphics@hamlmg01 xorg]$ pwd
    /usr/lib64/nvidia/xorg
    [zgraphics@hamlmg01 xorg]$ ls -l
    total 13684
    lrwxrwxrwx. 1 root root 16 Apr 17 16:46 libglx.so -> libglx.so.390.30
    lrwxrwxrwx. 1 root root 16 Apr 17 16:46 libglx.so.1 -> libglx.so.390.30
    -rwxr-xr-x. 1 root root 14008936 Feb 1 18:31 libglx.so.390.30

    I do have a problem that although user root seems to be able to start our graphical job, a “normal” user can’t, and e.g. gets

    [zgraphics@hamlmg01 ~]$ glewinfo
    No protocol specified
    Error: glewCreateContext failed

    Any idea what rights the normal user account needs?

    Thanks!
    Dylan
