Home
This is a Ruby Summer of Code project: NArray on OpenCL.
NArray on OpenCL takes advantage of hardware resources such as GPUs. The goal of the project is to port NArray from C to OpenCL. NArray's characteristics are a good match for OpenCL (GPU) applications. In addition, OpenCL is hardware agnostic, so programs that use it should run on a variety of CPUs and GPUs. This means that porting NArray to OpenCL does not compromise Ruby's portability. The project therefore makes it easier for Ruby programmers to implement and run computation-heavy applications.
This project has been tested with the Foxc compiler (available at http://www.fixstars.com/foxc/) and on Apple hardware (Mac mini Mid 2010 running Mac OS X v10.6, with Xcode 3.2.2 and an Nvidia GeForce 320M). However, we “hope” the code also works with the ATI Stream and Nvidia compilers.
Currently, this program runs only on CPUs, but we are working hard to make it run on GPUs.
- If you are using Foxc, ATI Stream, or the Nvidia compiler, LD_LIBRARY_PATH must include the path to your OpenCL library install directory.
$ ruby extconf.rb
$ make
$ make install
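Before running the build commands above, it can help to confirm that LD_LIBRARY_PATH really points at an OpenCL library. The following is a minimal sketch, not part of NArray: the helper name `opencl_library_dirs` is hypothetical, and it assumes the library file name starts with `libOpenCL` (as is common on Linux), which may not hold for every vendor.

```ruby
# Hypothetical helper (not part of NArray): return the LD_LIBRARY_PATH
# entries that contain a file whose name starts with "libOpenCL".
# Assumption: the OpenCL library follows the usual libOpenCL* naming.
def opencl_library_dirs(path_value = ENV["LD_LIBRARY_PATH"])
  path_value.to_s.split(File::PATH_SEPARATOR).reject(&:empty?).select do |dir|
    Dir.exist?(dir) && !Dir.glob(File.join(dir, "libOpenCL*")).empty?
  end
end

dirs = opencl_library_dirs
if dirs.empty?
  puts "No OpenCL library found via LD_LIBRARY_PATH"
else
  puts "OpenCL library found in: #{dirs.join(', ')}"
end
```

An empty result only means that nothing matching the assumed name was found on LD_LIBRARY_PATH; the library may still be installed in a system default location.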
The actual kernel file (na_kernel.cl) is compiled when you run ‘require “narray”’, which may take several seconds. This is because some OpenCL implementations do not support offline compilation.
(These results will differ if you have four or more cores. NArray on OpenCL becomes faster as the number of “cores” increases.)
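To see how long the kernel compilation adds to start-up on a given machine, the `require` call can be timed. This is a small sketch using Ruby's standard `benchmark` library; the helper name `time_require` is hypothetical and is not provided by NArray.

```ruby
require "benchmark"

# Hypothetical helper: measure how long a `require` takes.
# Returns the elapsed time in seconds, or nil if the library cannot be loaded.
def time_require(library)
  Benchmark.realtime { require library }
rescue LoadError
  nil
end

elapsed = time_require("narray")
if elapsed
  puts format('require "narray" took %.2f sec', elapsed)
else
  puts "narray is not installed"
end
```

Only the first `require` in a process is meaningful here, since Ruby does not load an already loaded library again.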
Original
$ ruby bench/bench.rb sfloat add 1000000 1000
Ruby NArray type=sfloat size=1000000 op=add repeat=1000 Time: 3.98 sec
OpenCL
$ ruby bench/bench.rb sfloat add 1000000 1000
Ruby NArray type=sfloat size=1000000 op=add repeat=1000 Time: 6.65 sec
NArray on OpenCL is a bit faster than the original NArray for fairly complicated operations such as modulo.
Original
$ ruby bench/bench.rb sfloat mod 1000000 1000
Ruby NArray type=sfloat size=1000000 op=mod repeat=1000 Time: 21.99 sec
OpenCL
$ ruby bench/bench.rb sfloat mod 1000000 1000
Ruby NArray type=sfloat size=1000000 op=mod repeat=1000 Time: 19.63 sec
Original
$ ruby bench/bench.rb scomplex atanh 1000000 1000
Ruby NArray type=scomplex size=1000000 op=atanh repeat=1000 Time: 98.95 sec
OpenCL
$ ruby bench/bench.rb scomplex atanh 1000000 1000
Ruby NArray type=scomplex size=1000000 op=atanh repeat=1000 Time: 42.46 sec
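To make the three comparisons above easier to read, the ratio of the original time to the OpenCL time can be computed directly. The sketch below only uses the times reported in the benchmark output; the `speedup` helper is an illustrative name, not part of bench/bench.rb.

```ruby
# Times in seconds, copied from the benchmark output above.
RESULTS = {
  "sfloat add"     => { original: 3.98,  opencl: 6.65 },
  "sfloat mod"     => { original: 21.99, opencl: 19.63 },
  "scomplex atanh" => { original: 98.95, opencl: 42.46 }
}.freeze

# Ratio of original time to OpenCL time, rounded to two decimals.
# A value above 1.0 means the OpenCL version was faster.
def speedup(original, opencl)
  (original / opencl).round(2)
end

RESULTS.each do |name, t|
  puts format("%-15s %.2fx", name, speedup(t[:original], t[:opencl]))
end
# prints:
# sfloat add      0.60x
# sfloat mod      1.12x
# scomplex atanh  2.33x
```

On this machine the simple add operation is slower on OpenCL, while the gain grows with the cost of the operation.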
- Currently, NArray on OpenCL uses only CPUs. A GPU version will be available soon!
- Only byte, sint, int, sfloat, and scomplex operations run in the OpenCL environment. float, complex, and object operations behave the same as in the original NArray. This limitation exists because some OpenCL implementations do not support double-precision operations.
- On Mac, the random!(max), srand(), abs(), and ** operations cannot be executed on OpenCL. They are handled as original NArray operations.
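The rules in the list above can be summarized as a small lookup. This is only an illustrative sketch of the described behavior: the constants and the `backend_for` method are hypothetical names and do not come from the NArray source, which implements the dispatch in C.

```ruby
# Hypothetical sketch of the limitations listed above; these names are
# illustrative and are not taken from the NArray source code.
OPENCL_TYPES = %w[byte sint int sfloat scomplex].freeze
MAC_FALLBACK_OPS = %w[random! srand abs **].freeze

# Returns :opencl if the operation would run on OpenCL according to the
# list above, or :original if it falls back to the original NArray code.
def backend_for(type, op, mac: false)
  return :original unless OPENCL_TYPES.include?(type.to_s)
  return :original if mac && MAC_FALLBACK_OPS.include?(op.to_s)
  :opencl
end

puts backend_for("sfloat", "add")           # opencl
puts backend_for("float", "add")            # original
puts backend_for("sfloat", "abs", mac: true) # original
```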
This is a forked repository (dev branch). Have a look at masa16’s narray wiki for more information.
Reference
Fixstars Corporation, “OpenCL入門 マルチコアCPU・GPUのための並列プログラミング” (Introduction to OpenCL: Parallel Programming for Multicore CPUs and GPUs; in Japanese), Feb. 2010
http://www.fixstars.com/company/books/opencl/