Costs, License, Privacy
Campy is free and open source software under the MIT license. No information is collected, except for comments posted on this website. For further information on privacy, see this page.
Campy in under a minute
To try out Campy, you will need to be running Windows 10 or Ubuntu 16.04 on an x64 processor. Further, I assume you have an NVIDIA GPU of the Kepler (sm_30) or a newer architecture (Maxwell/sm_50, Pascal/sm_60, Volta/sm_70) installed, as well as the CUDA Toolkit 9.1.85.
I recommend you use .NET Core 2.1 (install the .NET SDK, ~5 minutes to install) and a Bash shell with the script given below. On Windows, you can use, for example, Cygwin, or the MinGW-based Bash that is installed when you install Git (~5 minutes to install). If you prefer, you could use PowerShell or even Cmd to perform the equivalent of the commands below. Note, I haven’t tried Campy in the Windows Subsystem for Linux, but I suspect it won’t work because of the issues in sharing the GPU with a Windows host. Campy also works under Mono.
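Before running the script, it may help to confirm that the tools named above are reachable from your shell. The following is a minimal sketch, not part of Campy: the helper name `check_tool` is hypothetical, and the tool list (`dotnet` from the .NET SDK, `nvcc` from the CUDA Toolkit, `nvidia-smi` from the NVIDIA driver) is my assumption about what the prerequisites put on the PATH.

```shell
#!/bin/bash
# Preflight sketch: report whether each tool the quick start relies on is on PATH.
# The tool list is an assumption drawn from the prerequisites above.

# check_tool NAME: print "NAME: found" or "NAME: MISSING".
check_tool() {
    if command -v "$1" > /dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
    fi
}

for tool in dotnet nvcc nvidia-smi; do
    check_tool "$tool"
done
```

A MISSING line may only mean the tool is installed but not on the PATH of the current shell, which is common on Windows.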
Finally, within Bash, copy and paste the following code.
#!/bin/bash
# Create a new console project in ./test.
mkdir test
cd test
dotnet new console
# Overwrite the generated Program.cs with a minimal Campy program.
cat - << HERE > Program.cs
namespace test
{
    class Program
    {
        static void Main(string[] args)
        {
            int n = 4;
            int[] x = new int[n];
            Campy.Parallel.For(n, i => x[i] = i);
            for (int i = 0; i < n; ++i)
                System.Console.WriteLine(x[i]);
        }
    }
}
HERE
dotnet add package Campy
dotnet build
# Publish for the current platform, then run the published app.
unameOut="$(uname -s)"
case "${unameOut}" in
    Linux*)
        dotnet publish -r ubuntu.16.04-x64
        cd bin/Debug/netcoreapp2.1/ubuntu.16.04-x64/publish/
        ./test
        ;;
    Darwin*)
        echo Cannot target Mac yet.
        exit 1
        ;;
    CYGWIN*|MINGW*)
        dotnet publish -r win-x64
        cd bin/Debug/netcoreapp2.1/win-x64/publish/
        ./test.exe
        ;;
    *)
        echo Unknown machine.
        exit 1
        ;;
esac
echo Output should be four lines of integers, 0 to 3.
Once an app is “published” as a self-contained deployment, it is completely self-sufficient. Apps that are not self-contained run the risk that Campy cannot resolve the assemblies used by the program. I currently do not implement the resolution rules outlined by Microsoft, but I will at some point.
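The publish step in the script above picks a runtime identifier from the output of `uname -s`. As a sketch of the same selection in reusable form, the two helper functions below (`rid_for_uname` and `publish_command`, both hypothetical names) only print the command rather than run it. In .NET Core 2.x, passing `-r` already implies a self-contained deployment; the `--self-contained true` flag is added here only to make that intent explicit.

```shell
#!/bin/bash
# rid_for_uname KERNEL: map `uname -s` output to the runtime identifier
# used by the quick-start script; prints nothing and fails if unsupported.
rid_for_uname() {
    case "$1" in
        Linux*)          echo "ubuntu.16.04-x64" ;;
        CYGWIN*|MINGW*)  echo "win-x64" ;;
        *)               return 1 ;;
    esac
}

# publish_command KERNEL: print the publish command for that platform.
publish_command() {
    local rid
    rid="$(rid_for_uname "$1")" || return 1
    echo "dotnet publish -r ${rid} --self-contained true"
}

publish_command "$(uname -s)" || echo "No supported runtime identifier for this platform."
```

Printing the command first makes it easy to check which runtime identifier would be used before committing to a publish.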
As an alternative to a .NET Core app, you can install Visual Studio 2017 for development and Nsight for debugging, and create a .NET Framework 4.7.1 app. Note: Nsight does not work with .NET Core apps.
Examples
Examples of Campy are in the Git repository at https://github.com/kaby76/Campy/tree/master/Tests, including reduction, various sorting algorithms, FFT, etc.
The API
Philosophy of the API
- The API must be very small. An API with more than a handful of methods may help in optimizing an implementation for the GPU, but it is impossible to remember.
- GPU independent. There should not be any CUDA-specific code in the API. There should be a simple, idealized model of a GPU.
- Memory management should be determined by the compiler. The user should not be burdened with knowing what to transfer to GPU global memory, or CPU pinned memory. The C# operator new should work on the GPU.
- All of C# should work within kernel code, with the exception of methods that are clearly CPU-bound, such as:
  - Thread: a thread is a CPU artifact. I haven’t decided how to support dynamic parallelism, but it probably won’t be through the System.Threading API.
  - Marshal.AllocHGlobal: this call is CPU-oriented and specifies memory pool characteristics. Use the C# new operator instead.
Namespace: Campy
Classes
Parallel | Provides support for parallel loops. |
Sequential | Provides support for sequential loops in the same syntax as with Parallel. |
Delegates
KernelType | Encapsulates basic GPU kernel code that takes one parameter (an integer index) and does not return a value. |
Syntax
public delegate void KernelType(int idx)
You can use this delegate to pass a method as a parameter without explicitly declaring a custom delegate. The encapsulated method must correspond to the method signature that is defined by this delegate. This means that the encapsulated method must have one parameter and no return value. For information on delegates, see the Microsoft documentation.
public class Parallel
static void For(Int32, KernelType) | Executes a for loop in which iterations may run in parallel on a GPU. |
static void Readonly(object) | Indicates that the object is never copied from the GPU back to the CPU. |
static void Sticky(object) | Indicates that the object is kept on the GPU until Sync() is called. |
static void Sync() | Requests that objects on the GPU be copied back to the CPU. |
public class Sequential
static void For(Int32, KernelType) | Executes a for loop in which iterations run sequentially on a GPU. |
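To show how the methods listed above might fit together, the sketch below follows the quick-start convention of writing a Program.cs from Bash. It is an illustration under assumptions, not confirmed usage: the call order (Sticky before the loops, Sync before reading the array on the CPU) is my reading of the method descriptions, and I have not verified that Campy accepts this exact program. The file goes into a scratch directory so nothing in an existing project is overwritten.

```shell
#!/bin/bash
# Write a sketch of a Campy program into a scratch directory.
# The call order (Sticky before the loops, Sync after) is an assumption
# based on the method descriptions above, not confirmed usage.
workdir="$(mktemp -d)"
cat - << 'HERE' > "${workdir}/Program.cs"
namespace test
{
    class Program
    {
        static void Main(string[] args)
        {
            int n = 4;
            int[] x = new int[n];

            // Ask Campy to keep x on the GPU between kernels.
            Campy.Parallel.Sticky(x);

            // A kernel held in an explicitly typed delegate.
            Campy.KernelType fill = i => x[i] = i;
            Campy.Parallel.For(n, fill);

            // A second kernel that reuses x.
            Campy.Parallel.For(n, i => x[i] = x[i] * 2);

            // Request that objects on the GPU be copied back to the CPU.
            Campy.Parallel.Sync();

            for (int i = 0; i < n; ++i)
                System.Console.WriteLine(x[i]);
        }
    }
}
HERE
echo "Wrote ${workdir}/Program.cs"
```

To try it, copy the generated file over the Program.cs of a project created as in the quick start, then build and publish as before.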
NB: Sorry, cooperative threads, the great power of GPUs, are currently not supported but will be when Campy is far enough along and stable. Expect them to be added by July 2018.
Architecture
Please see this page for some documentation on the organization of Campy.
Comparison with other GPU/C# systems
Please see this page for a comparison of various GPU/C# software.
GPU/C# in the Wild
Latest issues in CoreCLR for GPGPU
Latest issues in CoreFx for GPGPU
Latest issues in Roslyn for GPGPU
CIL Instructions Implemented in Campy
Please see this page for a table on the instructions implemented.