Getting Started with OpenCL
Probably the most amazing thing about OpenCL is its heterogeneous nature. An OpenCL kernel can run on just about any compute device in your computer: the CPU, the GPU or even an FPGA, and it can all be orchestrated from the host with ease.
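Just to give a flavour of what a kernel actually looks like, here is a minimal, purely illustrative OpenCL C kernel (not part of the program later in this post) that adds two arrays element-wise; the same source can be compiled at run-time for whichever device you pick:

// Illustrative OpenCL C kernel: each work-item adds one element of 'a' to
// the corresponding element of 'b' and stores the result.
__kernel void vector_add(__global const float* a,
                         __global const float* b,
                         __global float* result)
{
    size_t i = get_global_id(0);   // unique index of this work-item
    result[i] = a[i] + b[i];
}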
As you may be aware, 3rd generation Intel Core (and later) processors include an integrated graphics component. From the HD 4000 onwards this compute power is not to be sniffed at and is certainly worth exploiting; however, it's not entirely clear how you access it. If, like me, you have a discrete graphics card, you may be wondering, as I did, why the Intel GPU is not accessible.
Here’s what to do.
Boot your computer into the BIOS settings and look for a section probably titled something like “System Agent”. Under this menu set:
- “Initiate Graphic Adapter” – set this to PCIe/PCI
- “iGPU Multi-Monitor” – set this to Enabled
Save your settings and re-boot.
Now visit the Intel website and download the appropriate graphics driver for your CPU, install it and re-boot once more. When you open Device Manager you should now see the integrated Intel graphics device listed alongside your discrete card.
We’re ready to start programming.
Next you are going to need an OpenCL SDK so that you have the headers to build an OpenCL program (the drivers already include a run-time). It doesn’t really matter whose you use; in my case I downloaded the Nvidia tools, which are part of the CUDA SDK. Currently the download is here but it may move at a later date.
Once installed you will need to set up your project to access the SDK. In Visual Studio 2013 (2012 is the same) open the Property Manager tab and select your build target; in my case I select “Debug | x64”. Then double-click “Microsoft.Cpp.x64.user” to open its property pages. With the property dialog open, select “VC++ Directories” and enter:
- Include Directories – $(CUDA_PATH)\include;$(IncludePath)
- Library Directories – $(CUDA_PATH)\lib\x64;$(LibraryPath)
The CUDA installer has conveniently created an environment variable called CUDA_PATH to make this nice and clean.
Now go to the “Linker”, “General” section and update:
- Additional Library Directories – $(CUDA_LIB_PATH);%(AdditionalLibraryDirectories)
Then “Linker”, “Input” and update:
- Additional Dependencies – OpenCL.lib;%(AdditionalDependencies)
Hit OK and we’re ready to go.
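If you want to be sure the header and library paths are being picked up before going any further, a few lines are enough. This is just a sketch (it assumes the set-up above and the stdafx.h precompiled header of a default console project); it simply asks the OpenCL runtime how many platforms are installed:

#include "stdafx.h"
#include <CL/cl.h>
#include <cstdio>

// Quick sanity check that <CL/cl.h> is found and OpenCL.lib links:
// ask the runtime how many OpenCL platforms are installed.
int _tmain(int argc, _TCHAR* argv[])
{
    cl_uint numOfPlatforms = 0;
    cl_int error = clGetPlatformIDs(0, NULL, &numOfPlatforms);
    if (error != CL_SUCCESS)
    {
        printf("clGetPlatformIDs failed with error %d\n", error);
        return 1;
    }
    printf("OpenCL platforms installed: %u\n", numOfPlatforms);
    return 0;
}

If this builds, links and prints a non-zero count, everything is wired up correctly.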
Here is a little program that looks for the compute devices on your system and prints out their capabilities:
#include "stdafx.h" #include <CL/cl.h> #include <memory> #include <vector> #include <iostream> void displayPlatformInfo(cl_platform_id id, cl_platform_info param_name, const char* paramNameAsStr) { cl_int error = 0; size_t paramSize = 0; error = clGetPlatformInfo(id, param_name, 0, NULL, ¶mSize); std::unique_ptr<char> moreInfo(new char[paramSize]); error = clGetPlatformInfo(id, param_name, paramSize, moreInfo.get(), NULL); if (error == CL_SUCCESS) { std::cout << paramNameAsStr << " : " << moreInfo.get() << std::endl; } } void displayDeviceDetails(cl_device_id id, cl_device_info param_name, const char* paramNameAsStr) { cl_int error = 0; size_t paramSize = 0; error = clGetDeviceInfo(id, param_name, 0, NULL, ¶mSize); if (error != CL_SUCCESS) { perror("Unable to obtain device info for param\n"); return; } /* the cl_device_info are preprocessor directives defined in cl.h */ switch (param_name) { case CL_DEVICE_TYPE: { std::unique_ptr<cl_device_type> devType(new cl_device_type[paramSize]); error = clGetDeviceInfo(id, param_name, paramSize, devType.get(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device info for param\n"); return; } switch (*devType) { case CL_DEVICE_TYPE_CPU: printf("CPU detected\n"); break; case CL_DEVICE_TYPE_GPU: printf("GPU detected\n"); break; case CL_DEVICE_TYPE_ACCELERATOR: printf("Accelerator detected\n"); break; case CL_DEVICE_TYPE_DEFAULT: printf("default detected\n"); break; } } break; case CL_DEVICE_VENDOR_ID: case CL_DEVICE_MAX_COMPUTE_UNITS: case CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: { std::unique_ptr<cl_uint> ret(new cl_uint[paramSize]); error = clGetDeviceInfo(id, param_name, paramSize, ret.get(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device info for param\n"); return; } switch (param_name) { case CL_DEVICE_VENDOR_ID: printf("\tVENDOR ID: 0x%x\n", *ret); break; case CL_DEVICE_MAX_COMPUTE_UNITS: printf("\tMaximum number of parallel compute units: %d\n", *ret); break; case CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: printf("\tMaximum dimensions for global/local work-item IDs: %d\n", *ret); break; } } break; case CL_DEVICE_MAX_WORK_ITEM_SIZES: { cl_uint maxWIDimensions; std::unique_ptr<size_t> ret(new size_t[paramSize]); error = clGetDeviceInfo(id, param_name, paramSize, ret.get(), NULL); error = clGetDeviceInfo(id, CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS, sizeof(cl_uint), &maxWIDimensions, NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device info for param\n"); return; } printf("\tMaximum number of work-items in each dimension: ( "); for (cl_uint i = 0; i < maxWIDimensions; ++i) { printf("%d ", ret.get()[i]); } printf(" )\n"); } break; case CL_DEVICE_MAX_WORK_GROUP_SIZE: { std::unique_ptr<size_t> ret(new size_t[paramSize]); error = clGetDeviceInfo(id, param_name, paramSize, ret.get(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device info for param\n"); return; } printf("\tMaximum number of work-items in a work-group: %d\n", *ret); } break; case CL_DEVICE_NAME: case CL_DEVICE_VENDOR: { std::unique_ptr<char> data(new char[48]); error = clGetDeviceInfo(id, param_name, paramSize, data.get(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device name/vendor info for param\n"); return; } switch (param_name) { case CL_DEVICE_NAME: printf("\tDevice name is %s\n", data.get()); break; case CL_DEVICE_VENDOR: printf("\tDevice vendor is %s\n", data.get()); break; } } break; case CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: { std::unique_ptr<cl_uint> size(new cl_uint[paramSize]); error = clGetDeviceInfo(id, param_name, 
paramSize, size.get(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device name/vendor info for param\n"); return; } printf("\tDevice global cacheline size: %d bytes\n", (*size)); break; } break; case CL_DEVICE_GLOBAL_MEM_SIZE: case CL_DEVICE_MAX_MEM_ALLOC_SIZE: { std::unique_ptr<cl_ulong> size(new cl_ulong[paramSize]); error = clGetDeviceInfo(id, param_name, paramSize, size.get(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain device name/vendor info for param\n"); return; } switch (param_name) { case CL_DEVICE_GLOBAL_MEM_SIZE: printf("\tDevice global mem: %ld mega-bytes\n", (*size) >> 20); break; case CL_DEVICE_MAX_MEM_ALLOC_SIZE: printf("\tDevice max memory allocation: %ld mega-bytes\n", (*size) >> 20); break; } } break; } //end of switch } void displayDeviceInfo(cl_platform_id id, cl_device_type dev_type) { /* OpenCL 1.1 device types */ cl_int error = 0; cl_uint numOfDevices = 0; /* Determine how many devices are connected to your platform */ error = clGetDeviceIDs(id, dev_type, 0, NULL, &numOfDevices); if (error != CL_SUCCESS) { perror("Unable to obtain any OpenCL compliant device info"); exit(1); } std::vector<cl_device_id> devices(numOfDevices, nullptr); /* Load the information about your devices into the variable 'devices' */ error = clGetDeviceIDs(id, dev_type, numOfDevices, devices.data(), NULL); if (error != CL_SUCCESS) { perror("Unable to obtain any OpenCL compliant device info"); exit(1); } printf("Number of detected OpenCL devices: %d\n", numOfDevices); /* We attempt to retrieve some information about the devices. */ for (auto device : devices) { displayDeviceDetails(device, CL_DEVICE_TYPE, "CL_DEVICE_TYPE"); displayDeviceDetails(device, CL_DEVICE_NAME, "CL_DEVICE_NAME"); displayDeviceDetails(device, CL_DEVICE_VENDOR, "CL_DEVICE_VENDOR"); displayDeviceDetails(device, CL_DEVICE_VENDOR_ID, "CL_DEVICE_VENDOR_ID"); displayDeviceDetails(device, CL_DEVICE_MAX_MEM_ALLOC_SIZE, "CL_DEVICE_MAX_MEM_ALLOC_SIZE"); displayDeviceDetails(device, CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE, "CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE"); displayDeviceDetails(device, CL_DEVICE_GLOBAL_MEM_SIZE, "CL_DEVICE_GLOBAL_MEM_SIZE"); displayDeviceDetails(device, CL_DEVICE_MAX_COMPUTE_UNITS, "CL_DEVICE_MAX_COMPUTE_UNITS"); displayDeviceDetails(device, CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS, "CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS"); displayDeviceDetails(device, CL_DEVICE_MAX_WORK_ITEM_SIZES, "CL_DEVICE_MAX_WORK_ITEM_SIZES"); displayDeviceDetails(device, CL_DEVICE_MAX_WORK_GROUP_SIZE, "CL_DEVICE_MAX_WORK_GROUP_SIZE"); } } int _tmain(int argc, _TCHAR* argv[]) { /* OpenCL 1.1 scalar data types */ cl_uint numOfPlatforms; cl_int error; /* Get the number of platforms */ error = clGetPlatformIDs(0, NULL, &numOfPlatforms); if (error != CL_SUCCESS) { perror("Unable to find any OpenCL platforms"); return(1); } // Allocate memory for the number of installed platforms. 
std::vector<cl_platform_id> platforms(numOfPlatforms, nullptr); printf("Number of OpenCL platforms found: %d\n", numOfPlatforms); error = clGetPlatformIDs(numOfPlatforms, platforms.data(), NULL); if (error != CL_SUCCESS) { perror("Unable to find any OpenCL platforms"); return(1); } for (auto platform : platforms) { displayPlatformInfo(platform, CL_PLATFORM_PROFILE, "CL_PLATFORM_PROFILE"); displayPlatformInfo(platform, CL_PLATFORM_VERSION, "CL_PLATFORM_VERSION"); displayPlatformInfo(platform, CL_PLATFORM_NAME, "CL_PLATFORM_NAME"); displayPlatformInfo(platform, CL_PLATFORM_VENDOR, "CL_PLATFORM_VENDOR"); displayPlatformInfo(platform, CL_PLATFORM_EXTENSIONS, "CL_PLATFORM_EXTENSIONS"); // Assume that we don't know how many devices are OpenCL compliant, we locate everything ! displayDeviceInfo(platform, CL_DEVICE_TYPE_ALL); } return 0; } |
This gives us output like this:
Number of OpenCL platforms found: 2
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.1 CUDA 6.0.1
CL_PLATFORM_NAME : NVIDIA CUDA
CL_PLATFORM_VENDOR : NVIDIA Corporation
CL_PLATFORM_EXTENSIONS : cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll
Number of detected OpenCL devices: 2
GPU detected
    Device name is GeForce GTX 680
    Device vendor is NVIDIA Corporation
    VENDOR ID: 0x10de
    Device max memory allocation: 512 mega-bytes
    Device global cacheline size: 128 bytes
    Device global mem: 2048 mega-bytes
    Maximum number of parallel compute units: 8
    Maximum dimensions for global/local work-item IDs: 3
    Maximum number of work-items in each dimension: ( 1024 1024 64 )
    Maximum number of work-items in a work-group: 1024
GPU detected
    Device name is GeForce GTX 680
    Device vendor is NVIDIA Corporation
    VENDOR ID: 0x10de
    Device max memory allocation: 512 mega-bytes
    Device global cacheline size: 128 bytes
    Device global mem: 2048 mega-bytes
    Maximum number of parallel compute units: 8
    Maximum dimensions for global/local work-item IDs: 3
    Maximum number of work-items in each dimension: ( 1024 1024 64 )
    Maximum number of work-items in a work-group: 1024
CL_PLATFORM_PROFILE : FULL_PROFILE
CL_PLATFORM_VERSION : OpenCL 1.2
CL_PLATFORM_NAME : Intel(R) OpenCL
CL_PLATFORM_VENDOR : Intel(R) Corporation
CL_PLATFORM_EXTENSIONS : cl_khr_fp64 cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_intel_printf cl_ext_device_fission cl_intel_exec_by_local_thread cl_khr_gl_sharing cl_intel_dx9_media_sharing cl_khr_dx9_media_sharing cl_khr_d3d11_sharing
Number of detected OpenCL devices: 1
CPU detected
    Device name is Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
    Device vendor is Intel(R) Corporation
    VENDOR ID: 0x8086
    Device max memory allocation: 8159 mega-bytes
    Device global cacheline size: 64 bytes
    Device global mem: 32639 mega-bytes
    Maximum number of parallel compute units: 8
    Maximum dimensions for global/local work-item IDs: 3
    Maximum number of work-items in each dimension: ( 1024 1024 1024 )
    Maximum number of work-items in a work-group: 1024
Lovely.
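From here the obvious next step is to pick one of those devices and build a context on it. As a rough sketch (assuming platform is one of the cl_platform_id values enumerated above), something like this is all it takes before you can start compiling kernels and creating command queues:

#include <CL/cl.h>
#include <cstdio>

// Sketch: create an OpenCL context on the first GPU device of 'platform',
// where 'platform' would be one of the platforms enumerated above.
cl_context createGpuContext(cl_platform_id platform)
{
    cl_device_id device = nullptr;
    cl_int error = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    if (error != CL_SUCCESS)
    {
        printf("No GPU device on this platform (error %d)\n", error);
        return nullptr;
    }

    // Tie the context to the platform explicitly; with both the Nvidia and
    // Intel run-times installed there is more than one to choose from.
    cl_context_properties props[] = {
        CL_CONTEXT_PLATFORM, (cl_context_properties)platform, 0
    };
    cl_context context = clCreateContext(props, 1, &device, NULL, NULL, &error);
    if (error != CL_SUCCESS)
    {
        printf("clCreateContext failed (error %d)\n", error);
        return nullptr;
    }
    return context; // caller releases with clReleaseContext(context)
}

Passing CL_CONTEXT_PLATFORM matters once more than one platform is installed, which is exactly the situation above with both the Nvidia and Intel run-times present.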