Cycling through SYCL

Umar Arshad | C/C++, Computing Trends, Open Source, OpenCL | 1 Comment

We recently published an overview of recent history in the technical computing hardware market, in which we mentioned the energy at Intel right now. Intel has put its weight behind the SYCL standard through its new software approach, oneAPI. SYCL is a cross-platform API that targets heterogeneous hardware, similar to OpenCL and CUDA. The SYCL standard was first introduced by Codeplay and is now managed by the Khronos Group. It allows single-source compilation in C++ to target multiple devices on a system, rather than using C++ for the host and domain-specific kernel languages for the device. Furthermore, SYCL is fully C++17 standards compliant. You don’t have any extensions to the language that would prevent any standards compliant …

ArrayFire v3.8 Release

John Melonakos | Announcements, ArrayFire

We are excited to share the v3.8 release of ArrayFire! ArrayFire is used in commercial, academic, and government projects around the world, solving some of the toughest computing problems in the most innovative projects. It is well-tested and amazingly fast! In this post, we share some of the major features added to ArrayFire in its 3.8 feature release. The binaries and source code can be downloaded from these locations: the official installers, the GitHub repository, and the official APT repository. Starting with this release, we will provide Ubuntu packages from our APT repository. To install our packages, add our APT repository with the commands below. At this moment we only support bionic (18.04) and focal (20.04). apt-key adv --fetch-key https://repo.arrayfire.com/GPG-PUB-KEY-ARRAYFIRE-2020.PUB echo "deb [arch=amd64] https://repo.arrayfire.com/ubuntu $(lsb_release …

The Roaring 20s in AI & Technical Computing

John Melonakos | ArrayFire, Computing Trends, Open Source

Since ArrayFire was founded in 2007, there has been an explosion in software and its importance to our lives. Computers, connected to sensors and real-world outcomes, do really cool things that touch nearly every aspect of our lives. I believe these are exciting times for technical computing and for HPC, as evidenced by the things showcased this week at SC 2020. While ArrayFire focuses purely on software, our hardware partners turn our imaginative lines of code into real-world applications. AMD, NVIDIA, and Intel have each evolved tremendously since we started ArrayFire. Over a decade ago, NVIDIA and its CEO-founder Jensen saw the opportunity to teach the world a new heterogeneous model of computing that overwhelmingly convinces scientists, engineers, and analysts …

ArrayFire v3.7.x Release

Stefan Yurkevitch | Announcements, ArrayFire

With the 3.7.2 patch release out, we wanted to discuss some of the major features added to ArrayFire in the v3.7.x series. The binaries have been available for a few weeks, but we wanted to discuss the changes here. The release can be downloaded from these locations: the official installers and the GitHub repository. This version of ArrayFire is better than ever! We have added many new features that expand the capabilities of ArrayFire while improving its performance and flexibility. Some of the new features include: 16-bit floating point support; neural network compatible convolution and gradient functions; reduce-by-key; Confidence Connected Components; array padding functions; support for sparse-sparse arithmetic operations; pseudo-inverse, meanvar(), rsqrt(); and much more! We have also spent a significant amount of effort exposing the …

ArrayFire v3.6 Release

Umar Arshad | Announcements, ArrayFire | 3 Comments

Today we are pleased to announce the release of ArrayFire v3.6. It can be downloaded from these locations: the official installers and the GitHub repository. This latest version of ArrayFire is better than ever! We added several new features that improve the performance and usability of the ArrayFire library. The main features are: support for batched matrix multiply, the new topk function, and the new anisotropic diffusion filter. We have also spent a significant amount of effort improving the internals of the library. The build system is significantly improved and better organized. Batched Matrix Multiplication: The new batched matmul allows you to perform several matrix multiplication operations in one call of matmul. You might want to call this function if you are performing multiple smaller matrix multiplication operations. Here …
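To make the idea of a batched multiply concrete, here is a minimal plain C++ sketch of what "several matrix multiplications in one call" means. It does not use ArrayFire: the function name `batched_matmul`, the row-major layout, and the flat-vector packing are all assumptions made for illustration, and a real library would run the batch on the device rather than in a CPU loop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of the ArrayFire API.
// Multiplies `batch` independent pairs of row-major matrices in one call.
// Matrices are packed back to back: each A_p is m x k, each B_p is k x n,
// and each result C_p is m x n.
std::vector<float> batched_matmul(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  std::size_t batch, std::size_t m,
                                  std::size_t k, std::size_t n) {
    std::vector<float> c(batch * m * n, 0.0f);
    for (std::size_t p = 0; p < batch; ++p) {
        // Pointers to the p-th matrix of each operand.
        const float* ap = a.data() + p * m * k;
        const float* bp = b.data() + p * k * n;
        float* cp = c.data() + p * m * n;
        for (std::size_t i = 0; i < m; ++i) {
            for (std::size_t l = 0; l < k; ++l) {
                for (std::size_t j = 0; j < n; ++j) {
                    cp[i * n + j] += ap[i * k + l] * bp[l * n + j];
                }
            }
        }
    }
    return c;
}
```

The caller makes one call for the whole batch instead of looping over small multiplications, which is where a device-side implementation can save launch overhead.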

ArrayFire v3.5.1 Release

Miguel Lloreda | Announcements, ArrayFire | 1 Comment

We are excited to announce ArrayFire v3.5.1! This release focuses on fixing bugs and improving performance. Here are the improvements we think are most important. Performance improvements: We’ve improved element-wise operation performance for the CPU backend. The af::regions() function has been modified to leverage texture memory, improving its performance. Our JIT engine has been further optimized to boost performance. Bug fixes: We’ve squashed a long-standing bug in the CUDA backend that caused failures whenever the second, third, or fourth dimensions were large enough to exceed limits imposed by the CUDA runtime. The previous implementation of af::mean() suffered from overflows when the sum of the values lay outside the range of the backing data type. New kernels for each of …
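The overflow described for the mean can be reproduced in a few lines of plain C++. The sketch below is not ArrayFire's implementation: both function names are hypothetical, and the running-mean update is just one common way to avoid forming a large sum, chosen here for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive mean: the sum is accumulated in the same narrow type as the data,
// so it wraps around once it exceeds 255. Assumes a non-empty input.
double naive_mean_u8(const std::vector<std::uint8_t>& v) {
    std::uint8_t sum = 0;
    for (std::uint8_t x : v) {
        sum = static_cast<std::uint8_t>(sum + x);  // wraps modulo 256
    }
    return static_cast<double>(sum) / static_cast<double>(v.size());
}

// Running mean: mean_i = mean_{i-1} + (x_i - mean_{i-1}) / i.
// No large intermediate sum is ever formed. Returns 0 for an empty input.
double running_mean_u8(const std::vector<std::uint8_t>& v) {
    double mean = 0.0;
    std::size_t count = 0;
    for (std::uint8_t x : v) {
        ++count;
        mean += (static_cast<double>(x) - mean) / static_cast<double>(count);
    }
    return mean;
}
```

For four values of 200, the naive version wraps the sum from 800 to 32 and reports 8, while the running version reports 200.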

ArrayFire v3.5 Official Release

Umar Arshad | Announcements, ArrayFire, CUDA, Open Source, OpenCL | 1 Comment

Today we are pleased to announce the release of ArrayFire v3.5, our open source library of parallel computing functions supporting CUDA, OpenCL, and CPU devices. This new version of ArrayFire improves features and performance for applications in machine learning, computer vision, signal processing, statistics, finance, and more. This release focuses on thread safety, support for simple sparse-dense arithmetic operations, a Canny edge detector function, and a genetic algorithm example. A complete list of ArrayFire v3.5 updates and new features can be found in the product Release Notes. Thread Safety: ArrayFire now supports multithreaded programming models. This is not intended to improve performance, since most of the parallelism happens on the device, but it does allow you to use multiple devices in …

ArrayFire v3.4 Official Release

John Melonakos | ArrayFire

Today we are pleased to announce the release of ArrayFire v3.4, our open source library of parallel computing functions supporting CUDA, OpenCL, and CPU devices. This new version of ArrayFire improves features and performance for applications in machine learning, computer vision, signal processing, statistics, finance, and more. This release focuses on five major components of the library that are common to many areas of mathematical, scientific, and financial computing: sparse matrix operations, random number generation, image processing, just-in-time (JIT) compilation, and visualizations. Sparse Matrix and BLAS (see blog post): support for CSR and COO storage types; sparse-dense matrix multiplication and matrix-vector multiplication; conversion between dense matrices and the CSR and COO storage types. Support for Random Number Generator Engines (see blog post): Philox, Threefry, Mersenne Twister. Image Processing (see blog post) …
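For readers unfamiliar with the CSR storage type mentioned above, the following plain C++ sketch shows a dense-to-CSR conversion and a sparse matrix times dense vector product. It is a sketch under stated assumptions, not ArrayFire code: the struct `CsrMatrix` and the functions `dense_to_csr` and `csr_matvec` are hypothetical names, and the dense input is assumed to be row-major.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not an ArrayFire type.
struct CsrMatrix {
    std::size_t rows = 0;
    std::size_t cols = 0;
    std::vector<float> values;         // nonzero values, row by row
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into values
};

// Convert a row-major dense matrix to CSR, dropping exact zeros.
CsrMatrix dense_to_csr(const std::vector<float>& dense,
                       std::size_t rows, std::size_t cols) {
    CsrMatrix m;
    m.rows = rows;
    m.cols = cols;
    m.row_ptr.push_back(0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            const float v = dense[i * cols + j];
            if (v != 0.0f) {
                m.values.push_back(v);
                m.col_idx.push_back(j);
            }
        }
        m.row_ptr.push_back(m.values.size());  // end of row i
    }
    return m;
}

// y = A * x, touching only the stored nonzeros of A.
std::vector<float> csr_matvec(const CsrMatrix& a, const std::vector<float>& x) {
    std::vector<float> y(a.rows, 0.0f);
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t p = a.row_ptr[i]; p < a.row_ptr[i + 1]; ++p) {
            y[i] += a.values[p] * x[a.col_idx[p]];
        }
    }
    return y;
}
```

COO differs only in storing an explicit row index per nonzero instead of the compressed `row_ptr` offsets.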

Performance Improvements to JIT in ArrayFire v3.4

Pavan Yalamanchili | Announcements, ArrayFire, Benchmarks

ArrayFire uses just-in-time (JIT) compilation to combine many lightweight functions into a single kernel launch. This, along with our easy-to-use API, allows users not only to prototype their algorithms quickly, but also to get the best out of the underlying hardware. This feature has been a favorite among our users in the domains of finance and scientific simulation. That said, ArrayFire v3.3 and earlier had a few limitations. Namely: multiple outputs with inter-dependent variables generated multiple kernels, and the number of operations per kernel was fairly limited by default. In the latest release of ArrayFire, we addressed these issues to get some pretty impressive numbers. In the rest of the post, we demonstrate the performance improvements using our BlackScholes …
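The benefit of combining lightweight functions into one kernel can be illustrated on the CPU. The sketch below is only an analogy, not how ArrayFire's JIT works internally: the function names are hypothetical, the fusion is written by hand rather than generated at runtime, and all inputs are assumed to have the same length.

```cpp
#include <cstddef>
#include <vector>

// Unfused building blocks: each call makes a full pass over memory
// and allocates a result array. Inputs are assumed to be equal in length.
std::vector<float> mul(const std::vector<float>& a, const std::vector<float>& b) {
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i];
    return out;
}

std::vector<float> add(const std::vector<float>& a, const std::vector<float>& b) {
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

// Two passes and one temporary array for a * b + c.
std::vector<float> unfused_mul_add(const std::vector<float>& a,
                                   const std::vector<float>& b,
                                   const std::vector<float>& c) {
    return add(mul(a, b), c);
}

// Hand-fused equivalent: one pass, no temporary. A JIT engine aims to
// produce this kind of combined loop (as a device kernel) automatically.
std::vector<float> fused_mul_add(const std::vector<float>& a,
                                 const std::vector<float>& b,
                                 const std::vector<float>& c) {
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i] + c[i];
    return out;
}
```

Both versions compute the same values; the fused one reads each input once and never writes the intermediate product back to memory.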

Random Number Generators in ArrayFire v3.4

Kumar Aatish | ArrayFire

Pseudorandom number generators (PRNGs) are an integral part of many applications in statistics, modeling, and simulations. In ArrayFire v3.4, we introduce random number generation enhancements that improve speed, accuracy, storage, and unity among the ArrayFire backends. Previously in ArrayFire v3.3, each ArrayFire backend used a different PRNG. In ArrayFire v3.4, each ArrayFire backend is able to select from among 3 different random number generators.

ArrayFire v3.3 (platform specific): CUDA: XORWOW. OpenCL: Threefry (CBRNG). CPU: Mersenne Twister.
ArrayFire v3.4 (all generators on all platforms): CUDA, OpenCL, CPU: Philox (CBRNG), Threefry (CBRNG), Mersenne Twister.

As seen above, the XORWOW generator (which was only available for CUDA devices previously) has been replaced by the Philox generator which is available along with Threefry and Mersenne …
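The difference between a counter-based generator (CBRNG) and a stateful one such as Mersenne Twister can be sketched in plain C++. This is an illustration under loud assumptions: `toy_cbrng` is a hypothetical toy that borrows the SplitMix64 mixing steps and is not Philox or Threefry, and `mt_nth` simply wraps the standard library's `std::mt19937`.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Toy counter-based generator (NOT Philox or Threefry): the output is a
// pure function of (key, counter), so any element of the stream can be
// computed independently, which suits parallel devices.
std::uint64_t toy_cbrng(std::uint64_t key, std::uint64_t counter) {
    std::uint64_t z = key + counter * 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Stateful generator for contrast: reaching the n-th output (0-based)
// means advancing the internal state past the n outputs before it.
std::uint32_t mt_nth(std::uint32_t seed, std::size_t n) {
    std::mt19937 gen(seed);
    gen.discard(n);
    return static_cast<std::uint32_t>(gen());
}
```

With the counter-based form, a thread can compute element `i` of the stream directly from `(key, i)`, whereas the stateful form has to step through the sequence or carry per-thread state.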