Accelerating Java using ArrayFire, CUDA and OpenCL

Pavan ArrayFire, JAVA

We have previously mentioned the ability to use ArrayFire through Java.

In this post, we are going to show how you can get the best performance inside Java using ArrayFire for CUDA and OpenCL.


Here is sample code that performs a Monte Carlo estimation of Pi.
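As a minimal sketch of such a native Java version (the class name `MonteCarloPi`, the method name `hostCalcPi`, and the explicit seed are assumptions made for illustration, not necessarily the code that was benchmarked), the estimate draws random points in the unit square and counts how many land inside the quarter circle of radius 1. That fraction approximates Pi/4.

```java
import java.util.Random;

public class MonteCarloPi {
    // Estimate Pi by sampling `samples` points uniformly in the unit square
    // and counting those that fall inside the quarter circle x^2 + y^2 < 1.
    public static double hostCalcPi(int samples, long seed) {
        Random rng = new Random(seed);
        int inside = 0;
        for (int i = 0; i < samples; i++) {
            float x = rng.nextFloat();
            float y = rng.nextFloat();
            if (x * x + y * y < 1.0f) {
                inside++;
            }
        }
        // inside / samples approximates Pi / 4
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println("Pi is roughly " + hostCalcPi(5_000_000, 42L));
    }
}
```

The loop is inherently serial as written: each iteration generates one point and tests it. Expressing the same work as whole-array operations is what makes it a good fit for a GPU.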

The same code can be written using ArrayFire in the following manner.

  • Array.randu(dims, Array.FloatType) creates a uniform random Array.
    • Array.FloatType is passed in to create a uniform random array of 32-bit floating point numbers.
    • Other types include Array.FloatComplexType, Array.DoubleType and so on.

  • Array.mul and Array.add perform element-wise operations on their two operands to produce an output array.

  • Array.sumAll adds up all the elements in the array to produce a scalar output.

  • x.close(), y.close() and res.close() must be called in the finally block.
    • This ensures that memory that is no longer needed is released when the function exits, whether it returns normally or throws.
    • This is necessary because the Java garbage collector may not control memory on the device being used by ArrayFire.
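The steps in the list above can be sketched in plain Java without a GPU. `HostArray` below is a hypothetical stand-in written for this sketch, not the ArrayFire wrapper's `Array` class. Its `randu`, `mul`, `add`, `sumAll` and `close` methods only mirror the names mentioned above, `lessThan` is an invented helper for the comparison step, and the real signatures may differ. The point is the shape of the code: whole-array operations, followed by explicit cleanup in `finally`.

```java
import java.util.Random;

public class ArrayStylePi {
    private static int live = 0; // stand-in arrays that have not been closed yet

    public static int liveArrays() { return live; }

    // Hypothetical host-side stand-in for a device array (NOT the ArrayFire API).
    static final class HostArray implements AutoCloseable {
        private float[] data;

        private HostArray(float[] data) { this.data = data; live++; }

        static HostArray randu(int n, Random rng) {
            float[] d = new float[n];
            for (int i = 0; i < n; i++) d[i] = rng.nextFloat();
            return new HostArray(d);
        }

        static HostArray mul(HostArray a, HostArray b) {
            float[] d = new float[a.data.length];
            for (int i = 0; i < d.length; i++) d[i] = a.data[i] * b.data[i];
            return new HostArray(d);
        }

        static HostArray add(HostArray a, HostArray b) {
            float[] d = new float[a.data.length];
            for (int i = 0; i < d.length; i++) d[i] = a.data[i] + b.data[i];
            return new HostArray(d);
        }

        // Invented helper: 1.0f where the element is below the limit, else 0.0f.
        static HostArray lessThan(HostArray a, float limit) {
            float[] d = new float[a.data.length];
            for (int i = 0; i < d.length; i++) d[i] = a.data[i] < limit ? 1.0f : 0.0f;
            return new HostArray(d);
        }

        static double sumAll(HostArray a) {
            double s = 0.0;
            for (float v : a.data) s += v;
            return s;
        }

        // Release the buffer explicitly; closing twice is harmless.
        @Override
        public void close() {
            if (data != null) { data = null; live--; }
        }
    }

    public static double calcPi(int size, long seed) {
        Random rng = new Random(seed);
        HostArray x = null, y = null, xx = null, yy = null, sum = null, res = null;
        try {
            x = HostArray.randu(size, rng);
            y = HostArray.randu(size, rng);
            xx = HostArray.mul(x, x);           // x^2, element-wise
            yy = HostArray.mul(y, y);           // y^2, element-wise
            sum = HostArray.add(xx, yy);        // x^2 + y^2
            res = HostArray.lessThan(sum, 1.0f); // 1 for points inside the circle
            return 4.0 * HostArray.sumAll(res) / size;
        } finally {
            // Runs on normal return and on exceptions alike.
            for (HostArray a : new HostArray[] {x, y, xx, yy, sum, res}) {
                if (a != null) a.close();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Pi is roughly " + calcPi(1_000_000, 42L));
    }
}
```

Each intermediate result is kept in its own variable so that every buffer created in the `try` block is closed exactly once. Reassigning a variable to a new array without closing the old one would leave the old buffer unreleased.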


Using ArrayFire CUDA in Java, the NVIDIA K5000 is 13x faster than the native Java code on an Intel Core i7-3770K CPU.

Using ArrayFire OpenCL in Java, the same CPU is 7x faster than the native Java implementation.

The AMD HD 7970 is 14x faster than native Java using ArrayFire OpenCL.


ArrayFire for Java is a work in progress. You'll need Java 7 or higher to use ArrayFire through Java. We are trying to add more functionality and documentation in the coming weeks. You can find our Java Wrapper for ArrayFire over here.

If you need help accelerating your Java code using ArrayFire, please contact us at