Image Component Library (ICL)
Main class for OpenCL-based acceleration.
#include <CLProgram.h>
Public Member Functions
CLProgram ()
Default constructor (creates dummy instance)
CLProgram (const string deviceType, const string &sourceCode) throw (CLInitException, CLBuildException)
create CLProgram with given device type (either "gpu" or "cpu") and source code
CLProgram (const string deviceType, ifstream &fileStream) throw (CLInitException, CLBuildException)
create CLProgram with given device type (either "gpu" or "cpu") and source-code file
CLProgram (const CLProgram &other)
copy constructor (creating shallow copy)
CLProgram const & operator= (CLProgram const &other)
assignment operator (performing shallow copy)
~CLProgram ()
Destructor.
CLBuffer createBuffer (const string &accessMode, size_t size, const void *src=0) throw (CLBufferException)
creates a buffer object for memory exchange with graphics card memory
CLImage2D createImage2D (const string &accessMode, const size_t width, const size_t height, int depth, const void *src=0) throw (CLBufferException)
creates an image2D object for memory exchange with graphics card memory
CLKernel createKernel (const string &id) throw (CLKernelException)
extract a kernel from the program
void listSelectedPlatform ()
lists various properties of the selected platform
void listSelectedDevice ()
lists various properties of the selected device
Static Public Member Functions
static void listAllPlatformsAndDevices ()
lists various properties of all platforms and their devices
static void listAllPlatforms ()
lists various properties of all platforms
Private Attributes
Impl * impl
Main class for OpenCL-based acceleration.
CLProgram is the base class of ICL's OpenCL support framework. A CLProgram instance can be used to create all other necessary OpenCL support types. In contrast to OpenCL's C++ framework, ICL's framework sits at a higher level and provides easier access to the relevant functionality.
A CLProgram is, as presented above, the main class for OpenCL-accelerated implementations. It allows selecting a particular device type, i.e. "cpu" or "gpu", and it automatically compiles the given OpenCL source code (passed either as a string or as an input stream).
A CLBuffer is a memory segment located in the graphics card's memory. Buffers are associated with a certain CLProgram and can only be created by that program. Buffers are either read-only, write-only or read-write. The access mode always refers to how the buffer can be accessed by the OpenCL source code, i.e. a "read-only" buffer becomes a "const" data pointer in the corresponding OpenCL kernel interface, and a "write-only" buffer can presumably be written but not read by the OpenCL source code (this has not been verified).
TODO: What are the practical differences? Why can "r" and "w" buffers be used interchangeably without producing error messages? (This seems to be an OpenCL bug or feature.)
A CLImage2D is similar to a CLBuffer but additionally offers interpolation functionality, which can be accessed in the kernel through the sampler_t type. Built-in 2D access to the image pixels is also supported.
A CLKernel is a callable OpenCL function. CLKernel instances are also created by a program (see Example), and each kernel refers to a single function in the OpenCL source code that is declared as "__kernel".
Before a kernel is called, its arguments are set using the overloaded CLKernel::setArgs method.
#include <ICLUtils/CLProgram.h>
#include <ICLQt/Common.h>

struct Conv{
  CLProgram program;           // main class
  CLBuffer input,output,mask;  // buffers for image input/output and the 3x3-convolution mask
  CLKernel kernel;             // the OpenCL function
  Size size;                   // ensure correct buffer sizes

  Conv(const float m[9], const Size &s):size(s){
    // source code
    static const char *k = ("__kernel void convolve(constant float *m,                   \n"
                            "                       const __global unsigned char *in,    \n"
                            "                       __global unsigned char *out,         \n"
                            "                       __local float *localMem){ // unused  \n"
                            "  const int w = get_global_size(0);                         \n"
                            "  const int h = get_global_size(1);                         \n"
                            "  const int x = get_global_id(0);                           \n"
                            "  const int y = get_global_id(1);                           \n"
                            "  if(x && y && x<w-1 && y<h-1){                             \n"
                            "    const int idx = x+w*y;                                  \n"
                            "    out[idx] = m[0]*in[idx-1-w] + m[1]*in[idx-w] + m[2]*in[idx-w+1] \n"
                            "             + m[3]*in[idx-1] + m[4]*in[idx] + m[5]*in[idx+1]       \n"
                            "             + m[6]*in[idx-1+w] + m[7]*in[idx+w] + m[8]*in[idx+w+1]; \n"
                            "  }                                                         \n"
                            "}                                                           \n");
    program = CLProgram("gpu",k);                        // create program running on the GPU device
    program.listSelectedDevice();                        // show the selected device
    const int dim = s.getDim();                          // get image dimension
    input = program.createBuffer("r",dim);               // create input image buffer
    output = program.createBuffer("w",dim);              // create output image buffer
    mask = program.createBuffer("r",9*sizeof(float),m);  // create buffer for the 3x3 conv. mask
    kernel = program.createKernel("convolve");           // create the OpenCL kernel
  }

  void apply(const Img8u &src, Img8u &dst){
    ICLASSERT_THROW(src.getSize() == size && dst.getSize() == size, ICLException("wrong size"));
    input.write(src.begin(0),src.getDim());              // write input image to graphics memory
    CLKernel::LocalMemory lMem(9*sizeof(float));         // create local memory (unused example)
    kernel.setArgs(mask, input, output, lMem);           // set kernel arguments
    kernel.apply(src.getWidth(), src.getHeight(), 0);    // apply the kernel (using WxH threads max.,
                                                         // i.e. one per pixel)
    output.read(dst.begin(0),dst.getDim());              // read output buffer to destination image
  }
} *conv = 0;

GUI gui;
GenericGrabber grabber;

void init(){
  grabber.init(pa("-i"));
  gui << Image().handle("image") << Show();
  const ImgBase &image = *grabber.grab();
  const float mask[] = { 0.25, 0, -0.25,
                         0.50, 0, -0.50,
                         0.25, 0, -0.25 };
  grabber.useDesired(image.getSize());
  grabber.useDesired(depth8u);
  grabber.useDesired(formatGray);
  conv = new Conv(mask,image.getSize());
}

void run(){
  const Img8u &image = *grabber.grab()->as8u();
  static Img8u res(image.getParams());
  conv->apply(image,res);
  gui["image"] = res;
}

int main(int n, char **args){
  return ICLApp(n,args,"-input|-i(2)",init,run).exec();
}
The same example but this time CLImage2D is used for the memory access instead of CLBuffer.
#include <ICLUtils/CLProgram.h>
#include <ICLQt/Common.h>

struct Conv{
  CLProgram program;             // main class
  CLImage2D input,output,mask;   // images for input/output and the 3x3-convolution mask
  CLKernel kernel;               // the OpenCL function
  Size size;                     // ensure correct buffer sizes

  Conv(const float m[9], const Size &s):size(s){
    // source code
    static const char *k = ("__kernel void convolve(__read_only image2d_t m,             \n"
                            "                       __read_only image2d_t in,            \n"
                            "                       __write_only image2d_t out){         \n"
                            "  const int x = get_global_id(0);                           \n"
                            "  const int y = get_global_id(1);                           \n"
                            "  const int w = get_global_size(0);                         \n"
                            "  const int h = get_global_size(1);                         \n"
                            "  if(x && y && x<w-1 && y<h-1){                             \n"
                            "    const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | \n"
                            "                              CLK_ADDRESS_CLAMP |           \n"
                            "                              CLK_FILTER_NEAREST;           \n"
                            "    uint4 outPixel = 0;                                     \n"
                            "    for (int mx = 0; mx < 3; mx++) {                        \n"
                            "      for (int my = 0; my < 3; my++) {                      \n"
                            "        uint4 inPixel = read_imageui(in, sampler, (int2)(x+mx-1,y+my-1));\n"
                            "        float4 mValue = read_imagef(m, sampler, (int2)(mx,my)); \n"
                            "        outPixel.s0 += mValue.s0 * inPixel.s0;              \n"
                            "      }                                                     \n"
                            "    }                                                       \n"
                            "    write_imageui(out, (int2)(x,y), outPixel);              \n"
                            "  }                                                         \n"
                            "}                                                           \n");
    program = CLProgram("gpu",k);                               // create program running on the GPU device
    program.listSelectedDevice();                               // show the selected device
    input = program.createImage2D("r", s.width, s.height, 0);   // create input image (depth8u)
    output = program.createImage2D("w", s.width, s.height, 0);  // create output image (depth8u)
    mask = program.createImage2D("r", 3, 3, 3, m);              // create image for the 3x3 conv. mask (depth32f)
    kernel = program.createKernel("convolve");                  // create the OpenCL kernel
  }

  void apply(const Img8u &src, Img8u &dst){
    ICLASSERT_THROW(src.getSize() == size && dst.getSize() == size, ICLException("wrong size"));
    input.write(src.begin(0));                                  // write input image to graphics memory
    kernel.setArgs(mask, input, output);                        // set kernel arguments
    kernel.apply(src.getWidth(), src.getHeight(), 0);           // apply the kernel (using WxH threads max.,
                                                                // i.e. one per pixel)
    output.read(dst.begin(0));                                  // read output image to destination image
  }
} *conv = 0;

GUI gui;
GenericGrabber grabber;

void init(){
  grabber.init(pa("-i"));
  gui << Image().handle("image") << Show();
  const ImgBase &image = *grabber.grab();
  const float mask[] = { 0.25, 0, -0.25,
                         0.5,  0, -0.5,
                         0.25, 0, -0.25 };
  grabber.useDesired(image.getSize());
  grabber.useDesired(depth8u);
  grabber.useDesired(formatGray);
  conv = new Conv(mask,image.getSize());
}

void run(){
  const Img8u &image = *grabber.grab()->as8u();
  static Img8u res(image.getParams());
  conv->apply(image,res);
  gui["image"] = res;
}

int main(int n, char **args){
  return ICLApp(n,args,"-input|-i(2)",init,run).exec();
}
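When checking the result of the buffer-based convolve kernel shown earlier, a plain CPU reference can be useful. The following is a minimal sketch only: convolveReference is a hypothetical helper, not part of ICL. It mirrors the kernel's indexing (one output per inner pixel, row-major layout, mask entry m[0] paired with the upper-left neighbour). Two details are assumptions: border pixels are set to 0 here, while the kernel leaves them untouched, and results are clamped to [0,255], since the kernel source does not say what happens to out-of-range values.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical CPU reference for the buffer-based "convolve" kernel.
// 'in' is a row-major 8-bit image with w*h entries (the caller must ensure this).
inline void convolveReference(const float m[9],
                              const std::vector<unsigned char> &in,
                              std::vector<unsigned char> &out,
                              int w, int h){
  out.assign(in.size(), 0);               // assumption: border pixels become 0
  for(int y = 1; y < h-1; ++y){           // same inner-pixel range as the kernel's if(...)
    for(int x = 1; x < w-1; ++x){
      const int idx = x + w*y;
      float sum = 0.0f;
      for(int dy = -1; dy <= 1; ++dy){
        for(int dx = -1; dx <= 1; ++dx){
          // m[0] pairs with in[idx-1-w], m[8] with in[idx+w+1], as in the kernel
          sum += m[(dy+1)*3 + (dx+1)] * in[idx + dx + w*dy];
        }
      }
      // assumption: clamp to the 8-bit range before truncating
      out[idx] = static_cast<unsigned char>(std::min(255.0f, std::max(0.0f, sum)));
    }
  }
}
```

The bytes read back from the output buffer can then be compared against out for the inner pixels. Small differences are possible where the device rounds differently or where the filter response is negative.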
icl::utils::CLProgram::CLProgram ()
Default constructor (creates dummy instance)
icl::utils::CLProgram::CLProgram (const string deviceType, const string &sourceCode) throw (CLInitException, CLBuildException)
create CLProgram with given device type (either "gpu" or "cpu") and source code
icl::utils::CLProgram::CLProgram (const string deviceType, ifstream &fileStream) throw (CLInitException, CLBuildException)
create CLProgram with given device type (either "gpu" or "cpu") and source-code file
icl::utils::CLProgram::CLProgram (const CLProgram &other)
copy constructor (creating shallow copy)
icl::utils::CLProgram::~CLProgram ()
Destructor.
CLBuffer icl::utils::CLProgram::createBuffer (const string &accessMode, size_t size, const void *src = 0) throw (CLBufferException)
creates a buffer object for memory exchange with graphics card memory
accessMode can be either "r", "w" or "rw", which refers to how the data can be accessed by the OpenCL source code (in practice this does not seem to be relevant, since all buffers can apparently be read and written).
Each buffer has a fixed size (given in bytes). Optionally, an initial source pointer can be passed; its data is then automatically uploaded to the buffer in graphics memory.
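How ICL translates accessMode internally is not shown on this page. As a sketch only, and assuming the three strings map onto the standard OpenCL memory flags, the translation could look like the following. memFlagsFor and the k... constants are hypothetical names. The numeric values are those of CL_MEM_READ_WRITE, CL_MEM_WRITE_ONLY and CL_MEM_READ_ONLY from CL/cl.h, repeated locally so that the sketch compiles without the OpenCL headers.

```cpp
#include <stdexcept>
#include <string>

// Local copies of the standard OpenCL flag values (see CL/cl.h).
const unsigned long long kMemReadWrite = 1ULL << 0; // CL_MEM_READ_WRITE
const unsigned long long kMemWriteOnly = 1ULL << 1; // CL_MEM_WRITE_ONLY
const unsigned long long kMemReadOnly  = 1ULL << 2; // CL_MEM_READ_ONLY

// Hypothetical helper: translate an access-mode string into a memory flag.
// The mapping itself is an assumption about ICL, not taken from its source.
inline unsigned long long memFlagsFor(const std::string &accessMode){
  if(accessMode == "r")  return kMemReadOnly;
  if(accessMode == "w")  return kMemWriteOnly;
  if(accessMode == "rw") return kMemReadWrite;
  throw std::invalid_argument("accessMode must be \"r\", \"w\" or \"rw\"");
}
```

If the modes do map to these flags, the OpenCL specification leaves a kernel access that violates them undefined rather than requiring an error. That would be consistent with the observation that "r" and "w" buffers can be read and written without error messages.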
CLImage2D icl::utils::CLProgram::createImage2D (const string &accessMode, const size_t width, const size_t height, int depth, const void *src = 0) throw (CLBufferException)
creates an image2D object for memory exchange with graphics card memory
accessMode can be either "r", "w" or "rw", which refers to how the data can be accessed by the OpenCL source code (in practice this does not seem to be relevant, since all images can apparently be read and written).
Optionally, an initial source pointer can be passed; its data is then automatically uploaded to the image in graphics memory.
Various image depths can be used:
depth8u = 0: 8-bit unsigned integer values, range {0, 1, ..., 255}
depth16s = 1: 16-bit signed integer values
depth32s = 2: 32-bit signed integer values
depth32f = 3: 32-bit floating point values
depth64f = 4: 64-bit floating point values
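Because src is an untyped pointer, the host data has to match the chosen depth. The sketch below computes how many bytes a host array would need for a given width, height and depth value. It assumes a single-channel, tightly packed layout, which this page does not state explicitly. The Depth enum is a local mirror of the values listed above, and bytesPerPixel and hostBytes are hypothetical helpers, not ICL functions.

```cpp
#include <cstddef>
#include <stdexcept>

// Local mirror of the depth values listed above (illustrative, not ICL's header).
enum Depth { depth8u = 0, depth16s = 1, depth32s = 2, depth32f = 3, depth64f = 4 };

// Size of one pixel value in bytes for a given depth value.
inline std::size_t bytesPerPixel(int depth){
  switch(depth){
    case depth8u:  return 1;
    case depth16s: return 2;
    case depth32s: return 4;
    case depth32f: return 4;
    case depth64f: return 8;
    default: throw std::invalid_argument("unknown depth value");
  }
}

// Bytes needed on the host side, assuming one tightly packed channel.
inline std::size_t hostBytes(std::size_t width, std::size_t height, int depth){
  return width * height * bytesPerPixel(depth);
}
```

For the 3x3 mask image in the example (depth 3, i.e. depth32f), this gives 36 bytes, which matches the nine floats passed as src.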
CLKernel icl::utils::CLProgram::createKernel (const string &id) throw (CLKernelException)
extract a kernel from the program
Kernels in the CLProgram's source code have to be qualified with the __kernel qualifier. The kernel (aka function) name in the OpenCL source code is used as id.
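To see which ids a given source string offers, one can scan it for functions declared with __kernel. The helper below is a rough sketch: kernelIds is a hypothetical name, not an ICL function. It uses a simple regular expression that only recognises the form "__kernel void name(", so it ignores attributes placed between __kernel and void and does not skip commented-out code.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: list the names of all functions declared as
// "__kernel void <name>(" in an OpenCL source string.
inline std::vector<std::string> kernelIds(const std::string &source){
  static const std::regex re(R"(__kernel\s+void\s+([A-Za-z_]\w*)\s*\()");
  std::vector<std::string> ids;
  for(std::sregex_iterator it(source.begin(), source.end(), re), end; it != end; ++it){
    ids.push_back((*it)[1].str());   // capture group 1 is the function name
  }
  return ids;
}
```

For the example sources on this page, the only name found would be "convolve", which is the id passed to createKernel.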
static void icl::utils::CLProgram::listAllPlatforms () [static]
lists various properties of all platforms
static void icl::utils::CLProgram::listAllPlatformsAndDevices () [static]
lists various properties of all platforms and their devices
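void icl::utils::CLProgram::listSelectedDevice ()
lists various properties of the selected device
void icl::utils::CLProgram::listSelectedPlatform ()
lists various properties of the selected platform
CLProgram const & icl::utils::CLProgram::operator= (CLProgram const &other)
assignment operator (performing shallow copy)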
Impl * icl::utils::CLProgram::impl [private]