SmallptCPU vs SmallptGPU

Written by David Bucciarelli

SmallptGPU is a small and simple demo written in OpenCL to test the performance of this new standard. It is based on Kevin Beason's Smallpt, available at http://www.kevinbeason.com/smallpt/. SmallptGPU was written using the ATI OpenCL SDK 2.0 on Linux, but it should work on any platform/implementation (e.g. NVIDIA). Some discussion about this little toy can be found at Luxrender's forum.

A video of SmallptGPU is available here: http://vimeo.com/8013005 (the old low quality version is available here: http://vimeo.com/8013005)

History

  • V1.6 - Thanks to Jens and all the discussion at Luxrender's forum, SmallptGPU now works fine with MacOS and NVIDIA cards. A bug in Apple's OpenCL compiler has been found (see Khronos's OpenCL forum) and a workaround has been applied to SmallptGPU. Added a new kernel with a direct lighting surface integrator (very fast indeed).

  • V1.5 - Thanks to the discussion at Beyond3D (http://forum.beyond3d.com/showthread.php?t=55913), the performance on NVIDIA GPUs has been improved. It is not yet where it should be, but it is a lot better now.

  • V1.4 - Updated for ATI SDK 2.0, fixed a problem in object selection

  • V1.3 - Jens's patch for MacOS, added on-screen help, fixed performance estimation, removed movie recording, added Windows binaries

  • V1.2 - Indirect diffuse path can now be disabled/enabled (available only in the CPU version because of a bug in ATI's compiler), optimized buffer reallocation, added keys to select/move objects

  • V1.1 - Fixed a few portability problems, added support for saving a movie, fixed a problem in the window resize code

  • V1.0 - First release

The following test has been done at 1024x768 with scenes/cornell.scn.

SmallptCPU

This is just a simple single-threaded CPU implementation (no OpenCL involved). Result:

Sample/sec 446836

SmallptGPU on CPU device

This is the OpenCL implementation using only the CPU device. Result:

Reading scene: scenes/cornell.scn
Scene size: 9
For test only: Expires on Sun Feb 28 00:00:00 2010
OpenCL Device 0: Type = TYPE_CPU
OpenCL Device 0: Name = Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
OpenCL Device 0: Compute units = 4
OpenCL Device 0: Max. work group size = 1024
Reading file 'rendering_kernel.cl' (size 2634 bytes)
[...]
Rendering time 1.870000 sec (pass 7) Sample/sec 420552

It uses all 4 cores but has about the same performance as SmallptCPU (which uses only one core). I guess CPU devices are useful only for development purposes (i.e. when you don't have a fast GPU available).

SmallptGPU on GPU device

This is the OpenCL implementation using only the GPU device. Result:

Reading scene: scenes/cornell.scn
Scene size: 9
For test only: Expires on Sun Feb 28 00:00:00 2010
OpenCL Device 0: Type = TYPE_GPU
OpenCL Device 0: Name = ATI RV770
OpenCL Device 0: Compute units = 10
OpenCL Device 0: Max. work group size = 256
Reading file 'rendering_kernel.cl' (size 2634 bytes)
[...]
Rendering time 1.040000 sec (pass 236) Sample/sec 4537108

It is about 10 times faster than the single-threaded CPU implementation.
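
The comparison can be checked directly from the Sample/sec figures reported above. The following is only a minimal sketch of that arithmetic: the helper names `speedup` and `print_speedups` and the macro names are made up for illustration and are not part of the SmallptGPU sources.

```c
#include <assert.h>
#include <stdio.h>

/* Sample/sec values copied from the three results reported above. */
#define SMALLPT_CPU_RATE 446836.0   /* single-threaded SmallptCPU */
#define OPENCL_CPU_RATE  420552.0   /* SmallptGPU on the CPU device */
#define OPENCL_GPU_RATE  4537108.0  /* SmallptGPU on the GPU device */

/* Hypothetical helper: ratio of a measured rate to a baseline rate. */
double speedup(double rate, double baseline)
{
    assert(baseline > 0.0); /* a zero baseline would make the ratio meaningless */
    return rate / baseline;
}

/* Hypothetical helper: print both OpenCL results relative to SmallptCPU. */
void print_speedups(void)
{
    printf("OpenCL CPU device vs SmallptCPU: %.2fx\n",
           speedup(OPENCL_CPU_RATE, SMALLPT_CPU_RATE));
    printf("OpenCL GPU device vs SmallptCPU: %.2fx\n",
           speedup(OPENCL_GPU_RATE, SMALLPT_CPU_RATE));
}
```

Calling `print_speedups()` from a `main` function gives roughly 0.94x for the OpenCL CPU device and roughly 10.15x for the GPU device, which matches the "same performance" and "about 10 times faster" remarks above.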

How to compile

Just edit the Makefile and use an appropriate value for ATISTREAMSDKROOT.

Key bindings

  • 'p' - save image.ppm

  • ESC - exit

  • Arrow keys - rotate camera left/right/up/down

  • 'a' and 'd' - move camera left and right

  • 'w' and 's' - move camera forward and backward

  • 'r' and 'f' - move camera up and down

  • PageUp and PageDown - move camera target up and down

  • ' ' - refresh the window

  • '+' and '-' - select next/previous object

  • '2', '3', '4', '5', '6', '8', '9' - move selected object

Download: smallptgpu-v1.6.tgz (includes sources, Linux 64bit binaries and Windows 32bit binaries)