chickadee » opencl

OpenCL bindings for Chicken Scheme

These bindings target OpenCL version 1.1 for wide support. It should be fairly easy to add support to newer OpenCL APIs.

Source Code

Hosted here.

General conventions

Most Scheme procedures map 1-1 to OpenCL C procedures, with some naming conventions changed. All procedures start with the type of object they work on. For example, clCreateBuffer becomes buffer-create.

All OpenCL objects are wrapped in a record structure. All finalizer keyword arguments default to using set-finalizer! to call the appropriate *-release! procedure when the object is freed by the CHICKEN GC. This automatic memory management should be sufficient for many use cases.

For a large number of OpenCL objects it may be a good idea to call the *-release! procedures explicitly. All the *-release! procedures are idempotent, so it is even safe to use them on objects that have finalizers attached. Also, (gc #t) can sometimes be used to free ObjecCL objects no longer needed by the application, freeing OpenCL resources sooner.

If any of the OpenCL C procedures return an error code other than CL_SUCCESS, an error is signaled, hopefully with a meaningful error message.

OpenCL installation

You must have a working OpenCL installed on your system for this egg to work. clinfo is a useful tool to diagnose and verify this. Environment variables like OCL_ICD_VENDORS may be useful to out-source the device-picking process:

 
 ~> csi -R opencl -P '(platforms)'
(#<cl_platform "Intel(R) OpenCL HD Graphics">
 #<cl_platform "Clover">
 #<cl_platform "Portable Computing Language">)
 ~> env OCL_ICD_VENDORS=intel.icd csi -R opencl -P '(platforms)'
(#<cl_platform "Intel(R) OpenCL HD Graphics">)

Example

There is unfortunately a bit boilerplate to get started, but luckily it's less than the equivalent C code. This code snippet might be useful:

 
(import srfi-4 opencl)

(define device (car (flatten (map platform-devices (platforms)))))
(define context (context-create device))
(define cq (command-queue-create context device))
(define buffer (buffer-create (f32vector -1 -1 -1 -1 -1) cq))
(define kernel (kernel-create (program-build (program-create context "
__kernel void foo(__global float *out) {
  out[get_global_id(0)] = get_global_id(0) * 10;
}
") device) "foo"))
(kernel-arg-set! kernel 0 buffer)
(kernel-enqueue kernel cq (list 4))
(print (buffer-read buffer cq))

When run, it should output #f32(0.0 10.0 20.0 30.0 -1.0). The last -1 is leftovers from buffer creation. More examples can be found in the examples folder.

API

platformsprocedure

List all the OpenCL platforms on the system.

platform-extensions platformprocedure
platform-name platformprocedure
platform-profile platformprocedure
platform-vendor platformprocedure
platform-version platformprocedure

Get various information for a given platform. See clGetPlatformIDs

platform-devices platform #!optional device-typeprocedure

List the available devices for platform. device-type can optionally be supplied as a filter, and symbol among cpu gpu all accelerator default.

device-info deviceprocedure
device-address-bits deviceprocedure
device-name deviceprocedure
device-vendor deviceprocedure
device-version deviceprocedure

Get various information of device. See this page.

context-create devices #!key (finalizer #t)procedure
context-release!procedure

Create or release an OpenCL context. See this page. Unfurtunately, currently devices must be a single cl_device. In the future, a list of cl_devices might be implemented.

command-queue-create context device #!key out-of-order profile (finalizer #t)procedure
command-queue-release! command_queueprocedure

Create or release a cl_command-queue. out-of-order is a boolean flag and is #f by default. See this page.

command-queue-context contextprocedure

Get back the cl_context passed into command-queue-create.

buffer-create s c #!key (type #f) (flags 0) (finalizer #t)procedure
buffer-release! bufferprocedure

Create or release an OpenCL buffer object, a memory chunk that is usually stored on the target device. See this page.

If c is a cl_context, s (s for source) must be an integer representing the buffer size in bytes.

If c is a cl_command-queue, s must be a srfi-4 vector or a blob. The content of s is copied to the newly created buffer buffer using buffer-write. The size in bytes of this buffer and the size in bytes of s will be equal.

type defaults to the type of s for srfi-4 vectors, or blob if s is an integer. If s is an integer, type can be explicitly supplied. See buffer-type for valid values.

Support for flags is currently limited, and must be a number corresponding to the C API. The default is CL_MEM_READ_WRITE.

buffer-write buffer cq src #!key (offset 0)procedure

Copy the content of src to buffer. See this page. Only blocking writes are supported.

cq must be a valid cl_command-queue. src must be a srfi-4 vector or a blob. offset is the offset in bytes in the buffer object to write to.

buffer-read buffer cq #!key type (dst #f) (byte-offset 0) (bytes #f)procedure

Copy the content of buffer into host memory. The returned object is dst if supplied, otherwise a newly allocated srfi-4 vector or blob. Which vector type depends on type, which defaults to (buffer-type buffer). If supplied, dst must be a srfi-4 vector or blob and its content will be overwritten with data from buffer.

If supplied, bytes must be the number of bytes to copy. If supplied, byte-offset must be the number of bytes to skip in buffer.

buffer-size bufferprocedure

Return the size of buffer in bytes.

buffer-type bufferprocedure

Return the type tag symbol. This is stored in the cl_mem Scheme record, not in the OpenCL cl_mem object. It is used by buffer-read to construct a suitable srfi-4 vector. Valid types are u8 s8 u16 s16 u32 s32 u64 s64 f32 f64.

program-create context source #!key (finalizer #t)procedure
program-release! programprocedure

Create or release a OpenCL with clCreateProgramWithSource.

context must be a valid cl_context. source must be OpenCL C source code as a string. Note that the returned program must be built before it can be used to create kernels.

program-build program devices #!key optionsprocedure

devices must be a valid cl_device. Unfortunately, only a single device is currently supported. If the build process fails, an error is signaled with the content of (program-build-log program) which will hopefully contain useful compiler errors.

If supplied, options must be a string as specified in clBuildProgram.

program-build-log programprocedure
program-build-options programprocedure

Retrieve various information about program, see this page.

kernel-create program name #!key (finalizer #t)procedure
kernel-release! kernelprocedure

Create or release a cl_kernel. See this page.

name must be a string, the name of a __kernel function inside program.

kernel-arg-set! kernel idx valueprocedure

Calls clSetKernelArg. idx must be an integer, where 0 is the first kernel argument. value must be a srfi-4 vector for by-value argument types, or cl_buffer for arguments pointers.

The type of the kernel argument (ie. float2) must match the srfi-4 vector type (ie. (f32vector 1 1)). It is unfortunately not possible to enforce checks for this. The srfi-4 vector's length must match the kernel argument. For example, the arguments to __kernel foo(long4 a, float *result) could be initialized like this:

 
(kernel-arg-set! kernel 0 (s64vector 1 2 3 4))
(kernel-arg-set! kernel 1 (buffer-create (* 4 1000) ctx type: 'f32))

I don't know if you must call this for all arguments each time.

kernel-enqueue kernel cq global-work-sizes #!key event wait global-work-offsets local-work-sizesprocedure

Enqueues kernel for execution on cq as described here.

global-work-sizes must be a list of integers, each representing a work size for each dimension. On most platforms, the maximum number of dimensions (and thus the length of global-work-sizes) is 3. The minimum is always 1.

The numbers supplied here will be reflected inside kernel as get_global_size(n). Note that total number of worker-items executed should be (foldl * 1 global-work-sizes).

Unlike most of the other procedures, kernel-enqueue is non-blocking and may return before kernel is finished executing. If event is supplied and is #t or a cl_event, the returned event can be used to query the kernel execution status. If event is not supplied, kernel-enqueue returns #f.

Events

The event API is a bit experimental and difficult to test.

event-create contextprocedure
event-release!procedure

Create or release a user-event cl_event, see this page.

event-complete! event #!optional (status 'complete)procedure

Set the execution status of event, see clSetUserEventStatus.

event-status eventprocedure

Retrieve event status, one of: queued submitted running complete.

Record predicates

cl_command-queue? xprocedure
cl_context? xprocedure
cl_device? xprocedure
cl_event? xprocedure
cl_kernel? xprocedure
cl_mem? xprocedure
cl_platform? xprocedure
cl_program? xprocedure

Predicates for record types. buffer object are cl_mem? records.

Development status

So far everything is blocking. Implementing the non-blocking API would complicate things as the Chicken GC might get in the way.

Contents »