ArrayFire v3.8.1 Release

Stefan Yurkevitch, ArrayFire

We are excited to share the v3.8.1 bugfix release of ArrayFire!

In this post, we share an overview of the changes to ArrayFire in its 3.8.1 bugfix release. The binaries and source code can be downloaded from these locations:

This release consists mainly of bug fixes and general improvements to the ArrayFire 3.8 codebase.


As always, a number of improvements have been made to all backends. We continue to clean up the codebase and update the library to support newer frameworks. In addition to general maintenance and bookkeeping, the following improvements have been added:

  • moddims now uses a JIT approach for certain special cases
  • JIT Performance Optimization
  • Improved readability of log traces
  • Use short function name in non-debug build error messages
  • Short-circuit zero elements case in detail::copyArray backend function
  • Speedup of kernel caching mechanism
  • Add short-circuit check for empty Arrays in JIT evalNodes
  • Performance optimization of indexing using dynamic thread block sizes
  • Speedup join by eliminating temp buffers for cascading joins
  • Added batch support for solve
  • Use pinned memory to copy device pointers in CUDA solve
  • General (including JIT) performance improvements across backends
  • Update CLBlast to latest version
  • Improved Otsu threshold computation helper in canny algorithm


The following bugs have been fixed:

  • Fixed a bug in JIT kernel disk caching
  • Fixed the stream used by Thrust (CUDA backend) functions
  • Added a workaround for the new cuSPARSE API that CUDA introduced in a fix release
  • Fixed const array indexing inside gfor
  • Handle zero elements in copyData to host
  • Fixed double free regression in OpenCL backend
  • Fixed an infinite recursion bug in NaryNode JIT Node
  • Added missing input validation check in sparse-dense arithmetic operations
  • Fixed bug in getMappedPtr in OpenCL due to invalid lambda capture
  • Fixed bug in getMappedPtr on Arrays that are not ready
  • Fixed edgeTraceKernel for CPU devices on OpenCL backend
  • Fixed Windows build issues with VS2019
  • API documentation fixes
  • CMake Build Fixes
  • Fixed a couple of bugs in the CPU backend canny implementation
  • Fixed the reference count of arrays used in JIT operations. This relates to ArrayFire's internal memory bookkeeping: the behavior and accuracy of ArrayFire code were not broken before, but the reference count is now optimal in these scenarios, which may reduce memory usage in some narrow cases
  • Added an assert that checks whether topk is called with a negative value for k
  • Fixed an issue where countByKey would give incorrect results for any n > 128
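
If you want to check the countByKey fix on your own data, one option is to compare the library's output against a simple host-side reference on an input longer than 128 elements. The sketch below uses plain standard C++ and does not call ArrayFire. The helper name `countByKeyReference` is hypothetical, and it assumes count-by-key means "for each run of consecutive equal keys, count the non-zero values", which is our reading of the function rather than a statement of its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of ArrayFire): host-side reference for a
// count-by-key reduction. For each run of consecutive equal keys, it counts
// how many of the corresponding values are non-zero.
std::vector<std::pair<int, unsigned>> countByKeyReference(
    const std::vector<int>& keys, const std::vector<float>& vals) {
    assert(keys.size() == vals.size());
    std::vector<std::pair<int, unsigned>> out;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        // Start a new output entry whenever the key differs from the
        // key of the current run.
        if (out.empty() || out.back().first != keys[i]) {
            out.emplace_back(keys[i], 0u);
        }
        // Only non-zero values contribute to the count.
        if (vals[i] != 0.0f) {
            ++out.back().second;
        }
    }
    return out;
}
```

To use it, copy the keys and values you pass to countByKey back to host vectors, run the reference on them, and compare the resulting (key, count) pairs with what the library returns.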

More detail and a complete list of these and other changes can be found in our release notes.

We are excited to finally release ArrayFire v3.8.1. We would like to thank our community for supporting us and helping us improve ArrayFire, with special thanks to contributors HO-COOH, Willy Born, Gilad Avidov, and Pavan Yalamanchili for this bugfix release. Through the efforts of the community, ArrayFire has continually grown and improved into what it is today. We eagerly look forward to community participation in future releases!

Dedicated Support and Coding Services

ArrayFire is open source and always will be. For those who want dedicated support or custom function development, we offer a variety of support packages.

ArrayFire also serves many clients through consulting and coding services, algorithm development, code porting, and training courses for developers. Contact us or schedule a free technical consultation to learn more about our consulting and coding services.
