USING CUDA.ACAD FOR GPU PROGRAMMING
========================================================================

LOGGING INTO CUDA.ACAD
------------------------------------------------------------------------

Machine name: cuda.acad.ece.udel.edu
Username:     your EECIS username
Password:     your EECIS password

If you do not have an EECIS account, request one!

ENVIRONMENT SETUP
------------------------------------------------------------------------

NOTE: The following instructions assume that you are using the BASH shell
(Bourne Again Shell). To switch to bash, invoke the command "exec /bin/bash".

1. Log into cuda.acad.ece.udel.edu using your EECIS username/password.

2. Set the following environment variables. (A quick way to verify these
   settings is shown after the SDK installation steps below.)

   FOR BASH SHELL
   ++++++++++++++++++++++++++++++++++++++++
   You can place this in ~/.bashrc so it is always performed when you log in.

   # FOR SLURM (run/job submissions)
   export PATH=$PATH:/software/slurm/bin

   # FOR CUDA
   export PATH=$PATH:/software/cuda/bin
   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/software/cuda/lib64:/software/cuda/lib:/software/cuda-sdk/shared/lib

   # FOR OpenCL
   export OCL_INCLUDE_PATH=/software/cuda-sdk/OpenCL/common/inc
   export OPENCL_INC_PATH=/software/cuda-sdk/OpenCL/common/inc
   export C_INCLUDE_PATH=$OCL_INCLUDE_PATH:$C_INCLUDE_PATH
   export CPLUS_INCLUDE_PATH=$OCL_INCLUDE_PATH:$CPLUS_INCLUDE_PATH
   export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/software/cuda-sdk/OpenCL/common/lib/

   # FOR OpenACC / HMPP Workbench
   source /software/HMPPWorkbench-3.2.1/bin/hmpp-env.sh

   FOR C SHELL
   ++++++++++++++++++++++++++++++++++++++++
   You can place this in ~/.login so it is always performed when you log in.

   # FOR SLURM (run/job submissions)
   setenv PATH $PATH:/software/slurm/bin

   # FOR CUDA
   setenv PATH $PATH:/software/cuda/bin
   setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/software/cuda/lib64:/software/cuda/lib:/software/cuda-sdk/shared/lib

   # FOR OpenCL
   setenv OCL_INCLUDE_PATH /software/cuda-sdk/OpenCL/common/inc
   setenv OPENCL_INC_PATH /software/cuda-sdk/OpenCL/common/inc
   setenv C_INCLUDE_PATH $OCL_INCLUDE_PATH:$C_INCLUDE_PATH
   setenv CPLUS_INCLUDE_PATH $OCL_INCLUDE_PATH:$CPLUS_INCLUDE_PATH
   setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/software/cuda-sdk/OpenCL/common/lib/

   # FOR OpenACC / HMPP Workbench
   source /software/HMPPWorkbench-3.2.1/bin/hmpp-env.csh

INSTALLING THE NVIDIA GPU COMPUTING SDK (OPTIONAL)
------------------------------------------------------------------------

1. Download the NVIDIA GPU Computing SDK with the following command:

   wget http://developer.download.nvidia.com/compute/cuda/4_0/sdk/gpucomputingsdk_4.0.17_linux.run

   Alternatively, to reduce bandwidth usage, consider copying the SDK from
   ~wkillian/Software.

2. Install the SDK:

   a. Make the installer executable:
      chmod +x gpucomputingsdk_4.0.17_linux.run
   b. Install the NVIDIA GPU Computing SDK:
      ./gpucomputingsdk_4.0.17_linux.run
   c. Use the default "~/NVIDIA_GPU_Computing_SDK" directory for "install path".
   d. When prompted for the CUDA install path, give "/software/cuda".

3. Configure the SDK to run the example codes:

   NOTE: NVIDIA's examples require the compilation of three libraries (one for
   shared utilities, one for OpenCL, and the final one for CUDA). The steps
   below build these libraries used by the SDK.

   a. Compile the first set of (shared) libraries:
      make -C ~/NVIDIA_GPU_Computing_SDK/shared
   b. Compile the second set of (OpenCL) libraries:
      make -C ~/NVIDIA_GPU_Computing_SDK/OpenCL/common
   c. Compile the third set of (CUDA) libraries:
      make -C ~/NVIDIA_GPU_Computing_SDK/C/common
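Whether or not you install the SDK, it is worth confirming that the
environment variables from ENVIRONMENT SETUP took effect before you try to
compile anything. Assuming the exports above are in your ~/.bashrc (or
~/.login for csh) and you have logged out and back in, the following commands
should succeed:

   which nvcc          (should report /software/cuda/bin/nvcc)
   nvcc --version      (prints the CUDA compiler release)
   which srun          (should report /software/slurm/bin/srun)

If nvcc or srun is not found, re-check the PATH lines above.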
REFERENCING EXAMPLE CODE (CUDA, OPENCL, HMPP, OPENACC)
------------------------------------------------------------------------

You can always refer to the samples provided for each platform. CUDA and
OpenCL examples are found under the GPU Computing SDK:

   CUDA examples:   ~/NVIDIA_GPU_Computing_SDK/C/src/
   OpenCL examples: ~/NVIDIA_GPU_Computing_SDK/OpenCL/src/

For OpenACC and HMPP programs, examples may be found under
~wkillian/TutorialLabs/. Copy the tarballs to your own directory and extract
them with "tar xf <tarball>".

COMPILING CODE
------------------------------------------------------------------------

Each platform's examples come with Makefiles; however, they may not always
work as intended, especially if you do not have the SDK/compiler installed in
your home directory. If you find it easier to leverage pre-existing Makefiles
to compile your code, install the GPU Computing SDK in your home directory as
well as HMPP Workbench 3.2.1. You can obtain HMPP Workbench 3.2.1 from
~wkillian/Software.

NOTE: you can only compile and run OpenACC code on cuda.acad.

To compile an example program/lab, invoke "make" from the sample lab folder
you are trying to compile:

   CUDA:   make -C ~/NVIDIA_GPU_Computing_SDK/C/src/deviceQuery/
   OpenCL: make -C ~/NVIDIA_GPU_Computing_SDK/OpenCL/src/oclDeviceQuery/

NOTE: Binaries are placed under
~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release for CUDA code and under
~/NVIDIA_GPU_Computing_SDK/OpenCL/bin/linux/release for OpenCL code.

If you would rather invoke the compilers directly, a brief guide to compiling
each kind of code follows.

CUDA:

   nvcc [nvcc options] <file.cu>

   Example: nvcc -O2 -o matmul matmul.cu

   If you need to pass anything specific to GCC (host code generation), use
   -Xcompiler:

   Example: nvcc -O2 -Xcompiler "-O3 -msse -msse2 -ffast-math -Wall" -o matmul matmul.cu

OpenCL:

   Because of how we set up our development environment, we only need to tell
   the compiler to link with the OpenCL runtime library:

   gcc [options] <file.c> -lOpenCL

   OpenCL kernel code is compiled at runtime; only the host code is generated
   at this time. You will only receive compilation errors in your kernels when
   you run your code.

OpenACC:

   hmpp [hmpp-options] gcc [gcc-options]

   A typical compilation looks like:

   hmpp --openacc-target=CUDA --codelet-required gcc -O2 -o mvmult mvmult.c

   NOTE: --openacc-target can be either CUDA or OPENCL, but currently CUDA is
   the only target option working on cuda.acad.

RUNNING CODE
------------------------------------------------------------------------

cuda.acad uses the Simple Linux Utility for Resource Management (SLURM) to
manage the GPUs. In order to run code on a GPU, we need to request access to
one by launching the program through SLURM:

   srun -N1 --gres=gpu:1 <program>

Notes:
   * Do not change -N1 (we only have one node available).
   * If running on multiple GPUs (not likely), change the number in
     --gres=gpu:1 to the desired number <= 4.

Example execution:

   srun -N1 --gres=gpu:1 ./mvmult
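To tie the compiling and running steps together, here is a minimal,
self-contained CUDA program you can try. It is only an illustrative sketch,
not one of the SDK samples or course labs: the file name vecadd.cu, the array
size, and the block size are arbitrary choices, and error checking is omitted
for brevity. It adds two vectors on the GPU and spot-checks one element of
the result.

   /* vecadd.cu -- minimal CUDA example: element-wise vector addition. */
   #include <stdio.h>
   #include <stdlib.h>

   /* Kernel: each thread adds one element of a and b into c. */
   __global__ void vecAdd(const float *a, const float *b, float *c, int n)
   {
       int i = blockIdx.x * blockDim.x + threadIdx.x;
       if (i < n)
           c[i] = a[i] + b[i];
   }

   int main(void)
   {
       const int n = 1 << 20;              /* 1M elements */
       size_t bytes = n * sizeof(float);

       /* Allocate and initialize host arrays */
       float *h_a = (float *)malloc(bytes);
       float *h_b = (float *)malloc(bytes);
       float *h_c = (float *)malloc(bytes);
       for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

       /* Allocate device arrays and copy the inputs to the GPU */
       float *d_a, *d_b, *d_c;
       cudaMalloc((void **)&d_a, bytes);
       cudaMalloc((void **)&d_b, bytes);
       cudaMalloc((void **)&d_c, bytes);
       cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
       cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);

       /* Launch enough 256-thread blocks to cover all n elements */
       int threads = 256;
       int blocks = (n + threads - 1) / threads;
       vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);

       /* Copy the result back and spot-check it */
       cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
       printf("c[0] = %f (expected 3.0)\n", h_c[0]);

       cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
       free(h_a); free(h_b); free(h_c);
       return 0;
   }

Compile and run it as described above:

   nvcc -O2 -o vecadd vecadd.cu
   srun -N1 --gres=gpu:1 ./vecadd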
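For the OpenACC path, the sketch below shows roughly what an annotated loop
looks like. The #pragma acc kernels directive and its data clauses are
standard OpenACC; the file name saxpy.c and the program itself are made up
for illustration and are not one of the provided labs, so treat it as a
starting point rather than a known-working example on cuda.acad.

   /* saxpy.c -- hypothetical OpenACC example: y = 2*x + y on the GPU. */
   #include <stdio.h>

   int main(void)
   {
       const int n = 1 << 20;
       static float x[1 << 20], y[1 << 20];

       for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

       /* Offload the loop; copyin/copy handle data movement to/from the GPU */
       #pragma acc kernels copyin(x[0:n]) copy(y[0:n])
       for (int i = 0; i < n; ++i)
           y[i] = 2.0f * x[i] + y[i];

       printf("y[0] = %f (expected 4.0)\n", y[0]);
       return 0;
   }

It would be compiled the same way as mvmult.c above and run through srun:

   hmpp --openacc-target=CUDA --codelet-required gcc -O2 -o saxpy saxpy.c
   srun -N1 --gres=gpu:1 ./saxpy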
ADDITIONAL INFORMATION
------------------------------------------------------------------------

EECIS CUDA system documentation:
   https://www.eecis.udel.edu/wiki/ececis-docs/index.php/FAQ/Applications#toc22

NVIDIA GPU Computing SDK info from NVIDIA:
   http://developer.nvidia.com/gpu-computing-sdk

If you would like a reference .bashrc file with everything already configured,
you can copy one from ~wkillian/Resources/dot_bashrc. Remember to store it in
your home directory as ~/.bashrc.