An introduction to many-core parallel computing with OpenCL
Simon McIntosh-Smith
Twitter: @simonmcs
UPMARC summer school
Uppsala 28-29th 2014
Recap
5 simple steps in a basic OpenCL program:
1. Define the platform = devices + context + queues
2. Create and build the program (dynamic library of kernels)
3. Set up memory objects
4. Define the kernels
5. Submit commands: transfer memory objects and execute kernels
We have now covered the basic platform runtime APIs in OpenCL
[Figure: OpenCL runtime objects held in a Context]
Context: CPU, GPU
Programs: dp_mul source, CPU program binary, GPU program binary
    __kernel void
    dp_mul(global const float *a,
           global const float *b,
           global float *c)
    {
        int id = get_global_id(0);
        c[id] = a[id] * b[id];
    }
Kernels: dp_mul with arg[0] value, arg[1] value, arg[2] value
Memory Objects: Buffers, Images
Command Queues: In-Order Queue, Out-of-Order Queue
Compute Device: GPU
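To make the dp_mul kernel in the figure above concrete, here is a minimal host-only sketch in plain C that emulates what each work-item does. It is not OpenCL code and calls no OpenCL API: the function names dp_mul_work_item and dp_mul_emulated are hypothetical, and the loop index id is assumed to stand in for the value get_global_id(0) would return in a 1-D launch.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side stand-in for one work-item of the dp_mul kernel.
 * Assumption: `id` plays the role of get_global_id(0). */
void dp_mul_work_item(size_t id, const float *a, const float *b, float *c)
{
    c[id] = a[id] * b[id];
}

/* Hypothetical helper: emulates a 1-D launch of `global_size` work-items
 * by running them one after another on the host. A real device may run
 * them in parallel and in any order. */
void dp_mul_emulated(size_t global_size, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t id = 0; id < global_size; ++id) {
        dp_mul_work_item(id, a, b, c);
    }
}
```

Calling dp_mul_emulated(4, a, b, c) with a = {1, 2, 3, 4} and b = {0.5, 2, -1, 0} fills c with {0.5, 4, -3, 0}. Because each work-item writes only its own element c[id], the order of execution does not change the result, which is why this kernel parallelises so easily.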
OPENCL KERNEL PROGRAMMING