ArrayFire v3.7.x Release

Stefan Announcements, ArrayFire

With the 3.7.2 patch release now out, we wanted to discuss some of the major features added to ArrayFire in the 3.7 series. The binaries have been available for a few weeks, but we wanted to walk through the changes here. The release can be downloaded from these locations:

This version of ArrayFire is better than ever! We have added many new features that expand the capabilities of ArrayFire while improving its performance and flexibility. Some of the new features include:

  • 16-bit floating point support
  • Neural network compatible convolution and gradient functions
  • Reduce-by-key
  • Confidence Connected Components
  • Array padding functions
  • Support for sparse-sparse arithmetic operations
  • Pseudo-inverse, meanvar(), rsqrt() and much more!

We have also spent a significant amount of effort exposing the memory manager of the library. Thanks to the contributions of Jacob Kahn, it is now possible to write a custom memory manager for more tailored performance on specific devices or backends.

16-bit floating point support

A major addition to ArrayFire 3.7 is the inclusion of fp16-compatible functions. The af::array class now supports the f16 data type, and many functions have been ported to take advantage of it.

The lower-level library calls can take advantage of f16 hardware. The f16 data type takes half as much memory as a 32-bit float, and on certain hardware you can see dramatic improvements in compute-heavy operations such as matmul().

Not all hardware or platforms support f16 operations, so you may need to check your device's capabilities before converting your data to f16.
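To make the memory saving concrete, the arithmetic can be sketched in a few lines. This is plain standard C++ rather than ArrayFire code: the helper name bufferBytes is hypothetical, and a 16-bit integer is used only as a stand-in for the storage width of an f16 value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Storage width per element. f16 (IEEE 754 binary16) occupies 16 bits, so a
// 16-bit integer is used here purely as a stand-in for its size.
constexpr std::size_t kFloatBytes = sizeof(float);
constexpr std::size_t kHalfBytes  = sizeof(std::uint16_t);

static_assert(kFloatBytes == 4, "this sketch assumes a 32-bit float");

// Bytes needed to hold `elements` values of `bytesPerElement` bytes each.
// Hypothetical helper for illustration, not part of ArrayFire.
constexpr std::size_t bufferBytes(std::size_t elements,
                                  std::size_t bytesPerElement) {
    return elements * bytesPerElement;
}
```

For a 4096 x 4096 matrix, bufferBytes(4096 * 4096, kFloatBytes) comes to 64 MiB, while the same matrix stored with kHalfBytes per element needs 32 MiB.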

Neural network compatible convolution and gradient functions

The building blocks of image-based deep learning are convolutions. ArrayFire has long provided optimized convolution functions, but these have been somewhat incompatible with existing deep learning conventions. The new convolve2NN and convolve2GradientNN functions perform 2D, multi-channel image convolutions and their backward gradients. In the CUDA backend, these functions wrap their equivalents in the cuDNN library.
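As a rough illustration of what a deep-learning-style convolution computes, here is a naive reference sketch in plain standard C++. It is not ArrayFire's implementation and does not use its API: the function names, the [C][H][W] memory layout, the zero padding, and the choice of the cross-correlation form (no filter flip) are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output length along one axis under the usual deep learning convention.
// Hypothetical helper; parameter names are illustrative.
int convOutSize(int in, int kernel, int stride, int pad, int dilation = 1) {
    assert(in > 0 && kernel > 0 && stride > 0 && pad >= 0 && dilation > 0);
    const int span = dilation * (kernel - 1) + 1;  // extent of the dilated filter
    assert(in + 2 * pad >= span);
    return (in + 2 * pad - span) / stride + 1;
}

// Naive multi-channel 2D convolution (cross-correlation form, zero padding).
// img is laid out [C][H][W], flt is [K][C][kh][kw], the result is [K][oh][ow].
std::vector<float> conv2dNaive(const std::vector<float>& img, int C, int H, int W,
                               const std::vector<float>& flt, int K, int kh, int kw,
                               int stride, int pad) {
    assert(img.size() == static_cast<std::size_t>(C) * H * W);
    assert(flt.size() == static_cast<std::size_t>(K) * C * kh * kw);
    const int oh = convOutSize(H, kh, stride, pad);
    const int ow = convOutSize(W, kw, stride, pad);
    std::vector<float> out(static_cast<std::size_t>(K) * oh * ow, 0.0f);
    for (int k = 0; k < K; ++k)
        for (int oy = 0; oy < oh; ++oy)
            for (int ox = 0; ox < ow; ++ox) {
                float acc = 0.0f;
                for (int c = 0; c < C; ++c)
                    for (int fy = 0; fy < kh; ++fy)
                        for (int fx = 0; fx < kw; ++fx) {
                            const int iy = oy * stride - pad + fy;
                            const int ix = ox * stride - pad + fx;
                            // Samples outside the image count as zero.
                            if (iy < 0 || iy >= H || ix < 0 || ix >= W) continue;
                            acc += img[(c * H + iy) * W + ix] *
                                   flt[((k * C + c) * kh + fy) * kw + fx];
                        }
                out[(k * oh + oy) * ow + ox] = acc;
            }
    return out;
}
```

With a 3x3 image of ones, a 3x3 filter of ones, stride 1 and padding 1, the output keeps its 3x3 shape: the center value is 9, the edges are 6, and the corners are 4, because the padded samples contribute nothing.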

Array padding functions

Padding an array previously required several indexing operations; it can now be done in a single call to the pad function. Several schemes are supported, including padding with zeros, padding symmetrically across the border, clamping the border values, and periodic tiling across the padded border. The pad function accepts arbitrary border sizes and is significantly faster than the indexing approach.
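The four border schemes are easiest to compare on a small 1D example. The sketch below is plain standard C++, not the ArrayFire pad function; the Border enum and pad1d are hypothetical names. The symmetric case here mirrors with the edge value repeated, which is an assumption: check the ArrayFire documentation for the exact mirroring convention the library uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative border modes (hypothetical names, not ArrayFire's enum).
enum class Border { Zero, Symmetric, ClampToEdge, Periodic };

// Pads `in` with `before` leading and `after` trailing elements.
std::vector<int> pad1d(const std::vector<int>& in, int before, int after,
                       Border mode) {
    const int n = static_cast<int>(in.size());
    assert(n > 0 && before >= 0 && after >= 0);
    const int total = before + n + after;
    std::vector<int> out(static_cast<std::size_t>(total), 0);
    for (int i = 0; i < total; ++i) {
        const int src = i - before;  // position relative to the input
        int idx = src;               // input index to read, or -1 for zero
        if (src < 0 || src >= n) {
            switch (mode) {
                case Border::Zero:
                    idx = -1;
                    break;
                case Border::ClampToEdge:
                    idx = src < 0 ? 0 : n - 1;
                    break;
                case Border::Periodic:
                    idx = ((src % n) + n) % n;  // wrap around, safe for negatives
                    break;
                case Border::Symmetric: {
                    // Mirror with period 2n, repeating the edge value.
                    const int m = ((src % (2 * n)) + 2 * n) % (2 * n);
                    idx = m < n ? m : 2 * n - 1 - m;
                    break;
                }
            }
        }
        out[static_cast<std::size_t>(i)] =
            idx < 0 ? 0 : in[static_cast<std::size_t>(idx)];
    }
    return out;
}
```

Padding {1, 2, 3} by two elements on each side gives {0, 0, 1, 2, 3, 0, 0} with zeros, {1, 1, 1, 2, 3, 3, 3} with clamping, {2, 3, 1, 2, 3, 1, 2} with periodic tiling, and {2, 1, 1, 2, 3, 3, 2} with the symmetric variant used in this sketch.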

Confidence Connected Components

One of the new functions is confidenceCC(), which groups neighboring pixels whose values lie within a confidence interval of each other, essentially extracting a single segment from the input image. The implementation is described in more detail in the ArrayFire library documentation. In short, segmentation starts at the seed pixels specified by the user. The segment grows by adding neighboring pixels whose values fall within the confidence interval computed around the seed. This repeats iteratively, each time updating the confidence interval's mean and variance from the collective values of the segment so far. An example is given below, where the input image has three segments that are obvious to a human viewer and that the connected components algorithm detects.

Input brain scan image acquired from a post on the NIH website, "Brain Scan May Predict Best Depression Treatment".

af::confidenceCC has been manually given the seed points for each of these segments, and as a result, it automatically masked the segment it found for the given seed point (the white portion of each image).
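The region-growing loop described above can be sketched in plain standard C++. This is a simplified reading of the description, not ArrayFire's implementation: it handles a single seed, uses 4-connected neighbors, takes the initial statistics from a square window of the given radius around the seed, and defines the interval as mean ± multiplier × standard deviation. All names (confidenceConnected, floodFill, maskedStats) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

using Mask = std::vector<char>;  // 1 = pixel belongs to the segment

struct Stats { double mean; double stddev; };

// Mean and standard deviation of the pixels selected by `mask`.
Stats maskedStats(const std::vector<float>& img, const Mask& mask) {
    double sum = 0.0, sumSq = 0.0;
    std::size_t n = 0;
    for (std::size_t i = 0; i < img.size(); ++i) {
        if (!mask[i]) continue;
        const double v = img[i];
        sum += v; sumSq += v * v; ++n;
    }
    assert(n > 0);
    const double mean = sum / n;
    const double var = std::max(0.0, sumSq / n - mean * mean);
    return {mean, std::sqrt(var)};
}

// Grows a 4-connected region from the seed over pixels within [lo, hi].
Mask floodFill(const std::vector<float>& img, int W, int H,
               int sx, int sy, double lo, double hi) {
    Mask mask(img.size(), 0);
    std::queue<std::pair<int, int>> frontier;
    mask[sy * W + sx] = 1;  // the seed is always part of the segment
    frontier.push({sx, sy});
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const std::pair<int, int> p = frontier.front();
        frontier.pop();
        for (int d = 0; d < 4; ++d) {
            const int x = p.first + dx[d], y = p.second + dy[d];
            if (x < 0 || x >= W || y < 0 || y >= H) continue;
            const int i = y * W + x;
            if (mask[i] || img[i] < lo || img[i] > hi) continue;
            mask[i] = 1;
            frontier.push({x, y});
        }
    }
    return mask;
}

// Repeats: estimate the interval from the current segment, then regrow.
Mask confidenceConnected(const std::vector<float>& img, int W, int H,
                         int sx, int sy, int radius, double multiplier,
                         int iterations) {
    assert(img.size() == static_cast<std::size_t>(W) * H);
    assert(sx >= 0 && sx < W && sy >= 0 && sy < H && iterations >= 1);
    Mask mask(img.size(), 0);
    for (int y = std::max(0, sy - radius); y <= std::min(H - 1, sy + radius); ++y)
        for (int x = std::max(0, sx - radius); x <= std::min(W - 1, sx + radius); ++x)
            mask[y * W + x] = 1;  // initial window around the seed
    for (int it = 0; it < iterations; ++it) {
        const Stats s = maskedStats(img, mask);
        mask = floodFill(img, W, H, sx, sy, s.mean - multiplier * s.stddev,
                         s.mean + multiplier * s.stddev);
    }
    return mask;
}
```

On a 5x5 test image whose three left columns hold values 10, 11 and 12 and whose two right columns hold 100, a seed in the left part with a multiplier of 2 selects exactly the 15 left-hand pixels and leaves the bright columns unmasked.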

Reduce By Key

The regular reductions we have used and loved have been extended to allow reductions based on a key array. Values are reduced only across contiguous runs of equal integer keys, so a key that reappears later in the array starts a new group.

The new reduction functions are sumByKey, productByKey, minByKey, maxByKey, allTrueByKey, anyTrueByKey, countByKey. Detailed usage can be found in our documentation.
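The contiguous-key behavior is easiest to see with a small reference version of a sum reduction. The sketch below is plain standard C++ that models the semantics only; sumByContiguousKey is a hypothetical name and is not the ArrayFire sumByKey call.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sums values over each run of equal, adjacent keys.
// Returns the reduced keys and the per-run sums, in input order.
std::pair<std::vector<int>, std::vector<float>>
sumByContiguousKey(const std::vector<int>& keys,
                   const std::vector<float>& vals) {
    assert(keys.size() == vals.size());
    std::vector<int> keysOut;
    std::vector<float> valsOut;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (keysOut.empty() || keysOut.back() != keys[i]) {
            // Key changed (or first element): start a new group.
            keysOut.push_back(keys[i]);
            valsOut.push_back(vals[i]);
        } else {
            // Same key as the previous element: accumulate into the group.
            valsOut.back() += vals[i];
        }
    }
    return std::make_pair(keysOut, valsOut);
}
```

With keys {0, 0, 1, 1, 1, 0} and values {1, 2, 3, 4, 5, 6}, this yields keys {0, 1, 0} and sums {3, 12, 6}. The trailing 0 is not merged with the leading run of zeros because the two runs are not adjacent; sorting by key first would be needed to combine them.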

Other Improvements

This release of ArrayFire also includes several lower level improvements and performance optimizations. Among the optimizations are:

  • Optimized af::array assignment
  • Optimized unified backend function calls
  • Optimized anisotropic smoothing
  • Optimized canny filter for CUDA and OpenCL

As always, ArrayFire strives to be as simple to use as possible. We're continually making efforts to improve our documentation. This release has expanded on a number of pages in our documentation. We've also added more descriptive error messages and logging capabilities with respect to device and driver incompatibilities.

More detail and a complete list of these and other changes can be found in our release notes.

We are excited to finally release ArrayFire v3.7. We would like to thank our community for supporting us and helping us improve ArrayFire. Through the efforts of the community this release has continually grown and improved into the feat it is today. We are eagerly looking forward to community participation in future releases!

Dedicated Support and Coding Services

ArrayFire is open source and always will be. For those who want dedicated support or custom function development, we offer a variety of support packages.

ArrayFire also serves many clients through consulting and coding services, algorithm development, porting code, and training courses for developers. Contact us or schedule a free technical consultation to learn more about our consulting and coding services.