Almost all of the machines in our classroom and labs (though not the machines in the two ``computational clusters'') have graphics cards that support GPGPU (General-Purpose computing on Graphics Processing Units). On these machines we have installed proprietary drivers (from AMD or NVIDIA) and toolkits (AMD's Accelerated Parallel Processing and/or NVIDIA's CUDA) that support OpenCL. Where possible, we have also installed a driver from Intel that allows OpenCL programs to run on the CPU rather than a GPU. This note describes how to develop OpenCL programs in C or C++ using one of these toolkits. Notice that you may have other options for GPGPU programming (e.g., the AMD toolkit includes a Java library called Aparapi meant to support this kind of programming); in that case the sections on compiling and running OpenCL programs are not relevant, but the other sections still are.
The first thing to do is to confirm that the machine you want to use has the right combination of hardware and software. Probably the easiest way to do this is with the command opencl-info, which lists available OpenCL ``platforms'' and ``devices''. (On machines with AMD cards, if you are logged in remotely you will get the error message ``No protocol specified''. Unfortunately, at present this message occurs both when there is no suitable GPU and when there is one but you do not have access to it. A fix for this is in the works, but for now you can probably assume that it means ``no suitable GPU'' only on the machines running ``headless'' in a server room.)
Next you will need to add the installation path for the appropriate toolkit to your search path and set related environment variables. For AMD cards, you can do that with the command
module load amdapp
For NVIDIA cards, you use the command
module load cuda-latest
(See my notes on the ``Modules package'' for more about the module command.)
You should be able to compile with either a C compiler (e.g., gcc) or a C++ compiler (e.g., g++). You will need a number of flags; details depend on which toolkit you're using.
For the AMD toolkit, the flags are as follows:
-DAMD_OS_LINUX -I$AMDAPPSDKROOT/include -L$AMDAPPSDKROOT/lib/x86_64 -lOpenCL
($AMDAPPSDKROOT is an environment variable set by module load amdapp.)
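For example, a full compile line might look like the following (prog.c and the output name prog are placeholder filenames, not files we provide):

```shell
# Compile an OpenCL program with gcc using the AMD toolkit.
# Assumes "module load amdapp" has already been run, so that
# $AMDAPPSDKROOT is set; prog.c is a placeholder source file.
gcc -DAMD_OS_LINUX -I$AMDAPPSDKROOT/include \
    -L$AMDAPPSDKROOT/lib/x86_64 \
    -o prog prog.c -lOpenCL
```

(Putting -lOpenCL after the source file keeps the linker happy on systems where library order matters.)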
For the NVIDIA toolkit you will need to use a different compiler, namely nvcc, with the following flags:
-L$CUDA_HOME/lib64 -L/usr/lib64/nvidia/ -lOpenCL
($CUDA_HOME is an environment variable set by module load cuda-latest, which also adds the directory containing nvcc to your search path.) nvcc seems to work better with C++, but if what you have is plain C, simply renaming the source file from .c to .cpp seems to work.
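For example, a full nvcc compile line might look like the following (prog.cpp and prog are placeholder names):

```shell
# Compile an OpenCL program with nvcc using the NVIDIA toolkit.
# Assumes "module load cuda-latest" has already been run, so that
# $CUDA_HOME is set and nvcc is on the search path.
nvcc -L$CUDA_HOME/lib64 -L/usr/lib64/nvidia/ \
     -o prog prog.cpp -lOpenCL
```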
With either toolkit your program will probably need a #include directive for either CL/cl.h (plain C) or CL/cl.hpp (C++).
To execute the compiled program, just type the name of the executable. If you've set things up correctly, and you have access to the GPU (as discussed below), all should be well (assuming no bugs in the program itself!). Notice, however, that many published examples make assumptions about the number of OpenCL ``platforms'' and ``devices'' that are not true in our environment, so they may need modification to work.
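As a concrete illustration, the following sketch queries every available platform and device rather than assuming there is exactly one of each, using only standard OpenCL 1.x calls (clGetPlatformIDs, clGetPlatformInfo, clGetDeviceIDs, clGetDeviceInfo). Compile it with the flags for whichever toolkit you are using:

```c
/* List all OpenCL platforms and devices, rather than assuming
 * there is exactly one of each as many published examples do. */
#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_uint nplat = 0;
    if (clGetPlatformIDs(0, NULL, &nplat) != CL_SUCCESS || nplat == 0) {
        fprintf(stderr, "no OpenCL platforms found\n");
        return 1;
    }
    cl_platform_id plats[16];
    if (nplat > 16) nplat = 16;
    clGetPlatformIDs(nplat, plats, NULL);
    for (cl_uint p = 0; p < nplat; ++p) {
        char name[256];
        clGetPlatformInfo(plats[p], CL_PLATFORM_NAME,
                          sizeof name, name, NULL);
        printf("platform %u: %s\n", p, name);
        cl_uint ndev = 0;
        if (clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_ALL,
                           0, NULL, &ndev) != CL_SUCCESS || ndev == 0)
            continue;  /* platform with no usable devices */
        cl_device_id devs[16];
        if (ndev > 16) ndev = 16;
        clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_ALL, ndev, devs, NULL);
        for (cl_uint d = 0; d < ndev; ++d) {
            clGetDeviceInfo(devs[d], CL_DEVICE_NAME,
                            sizeof name, name, NULL);
            printf("  device %u: %s\n", d, name);
        }
    }
    return 0;
}
```

On our machines this should print at least one platform; which devices appear depends on the drivers installed and on whether you have GPU access (as discussed below).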
(Notice that the following applies only to machines with AMD graphics cards. On machines with NVIDIA graphics cards, everything should ``just work''.)
AMD's implementation for its own hardware apparently accesses the GPU by way of the X server (the graphical subsystem in most UNIX/Linux systems). This works fine for regular users logged in via a graphical login (even in terminal windows) and for the root user. For non-root users logged in remotely, e.g., via an SSH connection, it doesn't work so well. We have installed a workaround that allows regular users logged in remotely to access the GPU. To use it you need to do two things:
First, you need to be a member of the ``GPU users'' group; one of the sysadmins can add you to this group, so ask your instructor or research supervisor, or get in touch with one of the admins directly.
Next, when you log in remotely, you need to run the following command:
(Notice that the word in the middle of the filename is amd (as in AMD) and not and.) You can check that all is well by executing the clinfo command; it should report information for a device of type CL_DEVICE_TYPE_GPU. If it doesn't, something has gone wrong, and we (the admins) would like to hear about it.