kazuyukitanimura edited this page Sep 14, 2010 · 11 revisions

What is this?

This is a Ruby Summer of Code project: NArray on OpenCL.

NArray on OpenCL makes use of hardware resources such as GPUs. The goal of the project is to port NArray from C to OpenCL. NArray has characteristics that make it a good fit for OpenCL (GPU) applications. In addition, OpenCL is hardware agnostic, so programs that use it should run on a variety of CPUs and GPUs. This means that porting NArray to OpenCL does not detract from Ruby's portability. The project therefore makes it easier for Ruby programmers to implement and run computation-heavy applications.

What platforms does it support?

This project has been tested with the Foxc compiler (available at http://www.fixstars.com/foxc/) and on Apple hardware (Mac mini mid 2010, Mac OS X v10.6, Xcode 3.2.2, Nvidia GeForce 320M). However, we “hope” the code works with the ATI Stream and Nvidia compilers as well.
Currently, this program runs only on CPUs, but we are working hard to make it run on GPUs.

How do I install it?

Installation:

  • If you are using the Foxc, ATI Stream, or Nvidia compiler, LD_LIBRARY_PATH must include the path to your OpenCL library install directory.
    $ ruby extconf.rb
    $ make
    $ make install
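
If the build cannot find OpenCL, it can help to check what LD_LIBRARY_PATH actually contains. The following is a minimal sketch, not part of this project: `opencl_library_dirs` is a hypothetical helper, and it assumes the library file name starts with `libOpenCL.so`, which is the usual name on Linux but may differ for your OpenCL implementation.

```ruby
# Hypothetical helper (not part of NArray): return the directories in a
# search-path string that contain a file whose name starts with libOpenCL.so.
# Assumption: the OpenCL library uses the common Linux name libOpenCL.so*.
def opencl_library_dirs(search_path = ENV.fetch("LD_LIBRARY_PATH", ""))
  search_path.split(File::PATH_SEPARATOR).reject(&:empty?).select do |dir|
    # Skip entries that are not directories instead of raising.
    File.directory?(dir) &&
      Dir.entries(dir).any? { |name| name.start_with?("libOpenCL.so") }
  end
end

dirs = opencl_library_dirs
if dirs.empty?
  puts "No libOpenCL.so* found in LD_LIBRARY_PATH"
else
  puts "OpenCL library found in: #{dirs.join(', ')}"
end
```

An empty result only means the library was not found through LD_LIBRARY_PATH; on Mac OS X, OpenCL ships as a system framework, so this check does not apply there.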

The actual kernel file (na_kernel.cl) is compiled when you run ‘require “narray”’, which may take several seconds. This is because some OpenCL implementations do not support offline compilation.
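
To see how long the load-time kernel compilation takes on your machine, you can time the `require` call. This is a rough sketch: `timed_require` is a hypothetical helper introduced only for illustration, and the elapsed time covers everything `require` does, not just the OpenCL compilation.

```ruby
# Hypothetical helper: time a `require` call.
# Returns [loaded, seconds]; loaded is false when the library is not installed.
def timed_require(name)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  loaded = begin
    require name
    true
  rescue LoadError
    false
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [loaded, elapsed]
end

loaded, seconds = timed_require("narray")
if loaded
  puts format('require "narray" took %.2f s', seconds)
else
  puts "narray is not installed"
end
```

A second `require` in the same process returns immediately, so only the first measurement is meaningful.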

Is it useful?

Benchmark results (Mac mini mid 2010, CPU:Intel Core 2 Duo 2.4GHz, Memory:DDR3 1067MHz 2GB)

NArray on OpenCL is slower than the original NArray for simple operations such as addition

(This result should differ if you have four or more cores. NArray on OpenCL becomes faster as the number of “cores” increases.)

Original
$ ruby bench/bench.rb sfloat add 1000000 1000
Ruby NArray type=sfloat size=1000000 op=add repeat=1000  Time: 3.98 sec

OpenCL
$ ruby bench/bench.rb sfloat add 1000000 1000
Ruby NArray type=sfloat size=1000000 op=add repeat=1000  Time: 6.65 sec

NArray on OpenCL is a bit faster than the original NArray for fairly complicated operations such as modulo

Original
$ ruby bench/bench.rb sfloat mod 1000000 1000
Ruby NArray type=sfloat size=1000000 op=mod repeat=1000  Time: 21.99 sec

OpenCL
$ ruby bench/bench.rb sfloat mod 1000000 1000
Ruby NArray type=sfloat size=1000000 op=mod repeat=1000  Time: 19.63 sec

NArray on OpenCL is faster than the original NArray for complicated operations such as atanh

Original
$ ruby bench/bench.rb scomplex atanh 1000000 1000
Ruby NArray type=scomplex size=1000000 op=atanh repeat=1000  Time: 98.95 sec

OpenCL
$ ruby bench/bench.rb scomplex atanh 1000000 1000
Ruby NArray type=scomplex size=1000000 op=atanh repeat=1000  Time: 42.46 sec
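
The three comparisons above can be summarized as speedup ratios (original time divided by OpenCL time, so values above 1 mean OpenCL is faster). The sketch below only does arithmetic on the timings reported on this page; the `TIMINGS` table and the `speedup` helper are names made up for this illustration.

```ruby
# Timings in seconds, copied from the benchmark output above.
TIMINGS = {
  "sfloat add"     => { original: 3.98,  opencl: 6.65 },
  "sfloat mod"     => { original: 21.99, opencl: 19.63 },
  "scomplex atanh" => { original: 98.95, opencl: 42.46 }
}.freeze

# Speedup of the OpenCL build relative to the original, rounded to 2 places.
def speedup(original, opencl)
  (original / opencl).round(2)
end

TIMINGS.each do |op, t|
  puts format("%-15s %.2fx", op, speedup(t[:original], t[:opencl]))
end
# prints:
# sfloat add      0.60x
# sfloat mod      1.12x
# scomplex atanh  2.33x
```

On this dual-core machine, addition runs at about 0.6 times the original speed, modulo at about 1.1 times, and atanh at about 2.3 times.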

Notes

  1. Currently NArray on OpenCL uses only CPUs. A GPU version will be available soon!
  2. Only byte, sint, int, sfloat, and scomplex operations run on OpenCL. float, complex, and object operations use the same code as the original NArray. This limitation exists because some OpenCL implementations do not support double-precision operations.
  3. On Mac, the random!(max), srand(), abs(), and ** operations cannot be executed on OpenCL. They are handled by the original NArray implementations instead.
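
Notes 2 and 3 amount to a simple rule for which operations take the OpenCL path. The sketch below restates that rule as a lookup. It is not the library's dispatch code; `OPENCL_TYPES`, `MAC_FALLBACK_OPS`, and `runs_on_opencl?` are hypothetical names used only to make the rule explicit.

```ruby
# Element types whose operations run on OpenCL (note 2).
OPENCL_TYPES = %w[byte sint int sfloat scomplex].freeze

# Operations that fall back to the original NArray code on Mac (note 3).
MAC_FALLBACK_OPS = %w[random! srand abs **].freeze

# Hypothetical predicate restating the notes above; it does not query NArray.
def runs_on_opencl?(type, op, mac: false)
  return false unless OPENCL_TYPES.include?(type.to_s)
  return false if mac && MAC_FALLBACK_OPS.include?(op.to_s)
  true
end

puts runs_on_opencl?(:sfloat, :add)            # single precision: OpenCL
puts runs_on_opencl?(:float, :add)             # double precision: original code
puts runs_on_opencl?(:sfloat, :abs, mac: true) # Mac fallback: original code
```

In practice this means double-precision work (float, complex) sees no change from this fork, so benchmarks should use sfloat or scomplex arrays to exercise the OpenCL code.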

Last but not least

This is a forked repository (dev branch). Have a look at masa16’s narray wiki for more information.

Reference
Fixstars Corporation, “OpenCL入門 マルチコアCPU・GPUのための並列プログラミング” [Introduction to OpenCL: Parallel Programming for Multicore CPUs and GPUs] (in Japanese), Feb. 2010
http://www.fixstars.com/company/books/opencl/