First install PyCUDA. You can fetch the latest package from http://pypi.python.org/pypi/pycuda.
Before you can use CUDA, you must initialize the device, the same way as you would in a C program:
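import pycuda.driver as cuda
cuda.init()                          # initialize the driver before any other call
assert cuda.Device.count() >= 1      # make sure at least one CUDA device is present
cudaDev = cuda.Device(0)             # use the first device
cudaCTX = cudaDev.make_context()     # create a context on that device and make it current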
For a CUDA program, the basic methodology is to copy data from system memory to device memory, perform the processing, and then copy the results back from the device to the system. PyCUDA provides facilities to do this.
First let's create a numpy array of data that we wish to transfer:
import numpy
a = numpy.random.randn(4,4)
a = a.astype(numpy.float32)
# Allocate device memory for the array (size in bytes), then copy host to device.
a_gpu = cuda.mem_alloc(a.size * a.dtype.itemsize)
cuda.memcpy_htod(a_gpu, a)
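The size passed to mem_alloc is a byte count. As a quick check that needs only NumPy and no GPU, the sketch below confirms that a 4x4 float32 array occupies 64 bytes, and that size * itemsize gives the same number as NumPy's nbytes attribute. The helper name buffer_bytes is made up for this illustration and is not part of PyCUDA.

```python
import numpy

def buffer_bytes(arr):
    """Bytes needed to hold arr: element count times bytes per element."""
    return arr.size * arr.dtype.itemsize

a = numpy.random.randn(4, 4)       # randn returns float64 values
a32 = a.astype(numpy.float32)      # single precision: 4 bytes per element

print(buffer_bytes(a32))                 # 16 elements * 4 bytes each
print(buffer_bytes(a32) == a32.nbytes)   # same value NumPy reports itself
```

This is also why the astype(numpy.float32) step matters: the kernel takes a float pointer, so the host array has to hold 4-byte elements rather than the 8-byte float64 values that randn produces.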
Now that our data is on the device, we need to instruct the GPU to execute our kernel. In CUDA, a kernel is the code that is executed on the GPU. PyCUDA requires that you write the kernel in C and pass it to the device.
For example, here is a kernel that adds one to the value of each element:
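from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void addOne(float *a)
{
    // One thread per element: flatten the 2D thread index into the 4x4 array.
    int idx = threadIdx.x + threadIdx.y * 4;
    a[idx] += 1.0f;
}
""")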
Now tell the device to execute our kernel.
func = mod.get_function("addOne")
func(a_gpu, block=(4,4,1))
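To see why block=(4,4,1) matches the index arithmetic in the kernel, here is a CPU-only sketch in plain Python that walks the same 4x4 grid of thread coordinates. It only illustrates the indexing and is not PyCUDA code; the names thread_index and add_one_on_cpu are hypothetical.

```python
BLOCK_X, BLOCK_Y = 4, 4  # mirrors block=(4,4,1) from the launch above

def thread_index(tx, ty, width=BLOCK_X):
    """Flat array index for thread (tx, ty), as in the kernel: x + y*width."""
    return tx + ty * width

def add_one_on_cpu(values):
    """Apply the kernel body once per simulated thread, one after another."""
    result = list(values)
    for ty in range(BLOCK_Y):
        for tx in range(BLOCK_X):
            result[thread_index(tx, ty)] += 1.0
    return result

# Collect the index computed by each of the 16 simulated threads.
indices = sorted(thread_index(tx, ty)
                 for ty in range(BLOCK_Y)
                 for tx in range(BLOCK_X))

print(indices == list(range(16)))    # each element is visited exactly once
print(add_one_on_cpu([0.0] * 16))
```

On the GPU the 16 threads run in parallel rather than in a loop. Since each thread touches a distinct element, they do not need to synchronize with each other.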
Lastly we copy the contents from the device back to system memory and print the results.
a_addOne = numpy.empty_like(a)
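cuda.memcpy_dtoh(a_addOne, a_gpu)    # copy the result from device to host
print(a_addOne)                      # every element of a, plus one
print(a)                             # the original data, for comparison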