ArrayFire v3.9.0 Release

Umar ArshadAnnouncements, ArrayFire, C/C++, CUDA, oneAPI, OpenCL 1 Comment

We are pleased to announce a new release of the ArrayFire library, v3.9.0. This release makes it easier than ever to target new devices without sacrificing performance. This post describes four of these new features, including: oneAPI Backend This release is the first time since v3.0 that introduces a new backend. The new backend is built with the oneAPI specification on top of the SYCL language. oneAPI is an open specification providing a full framework for high-performance computing applications without vendor lock-in. While this has been possible with OpenCL, the oneAPI specification includes libraries like BLAS and FFT significantly reducing the burden on the developers to maintain math functions and increasing the performance of these common operations. Here is a …

Synthetic Aperture Radar on the Jetson TX1

John MelonakosArrayFire, Case Studies, Computer Vision, CUDA, Image Processing 1 Comment

Researchers at Peter the Great St. Petersburg Polytechnic University have implemented a synthetic aperture radar processing on the Jetson TX1 Platform using ArrayFire as described in this paper. The paper introduces SAR as “a remote sensing technique producing high-resolution radar images of the Earth’s surface. SAR technology allows obtaining wide swath radar images of objects at a considerable distance regardless of the weather and lighting conditions. It can be used by unmanned aerial vehicles and space satellites. Thus, SAR technology allows solving various problems, such as: detecting small objects (vehicles, airplanes, ships), assessing the state of railways, airfields, seaports, mapping an area, assisting in geological exploration, mapping vegetation, detecting oil spills and pollution as well as many other tasks.” The …

CUDA Computing on Google Colab with ArrayFire

John MelonakosArrayFire, Benchmarks, Computing Trends, CUDA, Open Source, Python Leave a Comment

For the first-time in our 14 year existence, we are now able to provide our community with the ability to run ArrayFire programs for free within minutes. Before today, users would have to download and install the library on their own systems, which can be a hassle if you just want to play around with some code and benchmarks. Today, we’re excited to announce that ArrayFire is available on Google Colab, the free GPU computing cloud service from Google. Colaboratory, or “Colab” for short, allows you to write and execute Python in your browser, with Zero configuration required, free access to GPUs, and easy sharing. You can jump right in and start playing with this new tool: Click Here to …

Finger Vein Identity Recognition in “Negligible Time” using ArrayFire

John MelonakosArrayFire, Case Studies, Computer Vision, CUDA, Open Source Leave a Comment

In this blog post, we summarize work by researchers in Slovakia using ArrayFire to develop OpenFinger, a finger vein identity recognition library. Finger prints and finger veins can be used as a biometric for identity recognition. The physical setup of their sensor system is the following collection of CMOS sensors scattering light to a near infrared LED that projects the image to a CCD camera for capture to a computer. The computing infrastructure used in this work consists of the following components. Several great open source libraries are used, including OpenCV, Caffe, Qt, and ArrayFire. ArrayFire is specifically used in pre-processing to accelerate Gabor filtering on the GPU. Gabor filter has proven itself as one of the most suitable techniques …

Bringing Together the GPU Computing Ecosystem for Python

John MelonakosAnnouncements, ArrayFire, Computing Trends, CUDA, Open Source, Python Leave a Comment

To date, we have not done a lot for the Python ecosystem. A few months ago, we decided it was time to change that. Like NVIDIA said in this post, the current slate of GPU tools available to Python developers is scattered. With some attention to community building, perhaps we can build something better — together. NVIDIA spoke some about its plans to help cleanup the ecosystem. We’re onboard with that mentality and have two ways we propose to contribute: We’re working on a survey paper that assesses the state of the ecosystem. What technical computing things can you do with each package? What benchmarks result from the packages on real Python user code? What plans does each group have …

Cycling through SYCL

Umar ArshadC/C++, Computing Trends, Open Source, OpenCL 1 Comment

We recently gave an overview of recent history in the technical computing hardware market. In it, we mention the energy at Intel right now. The weight of Intel is behind the SYCL standard through its new software approach, oneAPI. SYCL is a cross-platform API that targets heterogeneous hardware, similar to OpenCL and CUDA. The SYCL standard was first introduced by Codeplay and is now being managed by the Khronos group. It allows single-source compilation in C++ to target multiple devices on a system, rather than using C++ for the host and domain specific kernel languages for the device. Furthermore, SYCL is fully C++ 17 standards compliant. You don’t have any extensions to the language that would prevent any standards compliant …

ArrayFire v3.5 Official Release

Umar ArshadAnnouncements, ArrayFire, CUDA, Open Source, OpenCL 1 Comment

Today we are pleased to announce the release of ArrayFire v3.5, our open source library of parallel computing functions supporting CUDA, OpenCL, and CPU devices. This new version of ArrayFire improves features and performance for applications in machine learning, computer vision, signal processing, statistics, finance, and more. This release focuses on thread-safety, support for simple sparse-dense arithmetic operations, canny edge detector function, and a genetic algorithm example. A complete list of ArrayFire v3.5 updates and new features are found in the product Release Notes. Thread Safety ArrayFire now supports threading programming models. This is not intended to improve the performance since most of the parallelism is happening on the device, but it does allow you to use multiple devices in …

Graphics Updates in ArrayFire v3.4

Pradeep GarigipatiAnnouncements, ArrayFire, OpenGL Leave a Comment

This post outlines the new graphics features available in ArrayFire v3.4:  Vector Fields, Overlays We have added visualization support to render ArrayFire array objects as vector fields. An example of how to visualize vector fields is included in ArrayFire v3.4. A screenshot of this example’s output in multi-view mode is shown below, showcasing both static and dynamic vector field rendering. Previously, each graph (such as plot, hist, scatter, etc) was rendered in its own window (or view). Overlaying graphs was not supported. ArrayFire v3.4 now support graph overlays. Each draw call in ArrayFire is either rendered to a whole window (single view) or to a view, which is a portion of the screen obtained in multiview mode. The following image is an example of a …

ArrayFire – CUDA Interoperability

Brian KloppenborgArrayFire, CUDA 2 Comments

Although ArrayFire is quite extensive, there remain many cases in which you may want to write custom kernels in CUDA or OpenCL. For example, you may wish to add ArrayFire to an existing code base to increase your productivity, or you may need to supplement ArrayFire’s functionality with your own custom implementation of specific algorithms. Today’s “Learning ArrayFire from scratch“, blog post discusses how you can interface ArrayFire and CUDA.

Benchmarking parallel vector libraries

Pavan YalamanchiliArrayFire, Benchmarks, C/C++, CUDA Leave a Comment

There are many open source libraries that implement parallel versions of the algorithms in the C++ standard template libraries. Inevitably we get asked questions about how ArrayFire compares to the other libraries out in the open. In this post we are going to compare the performance of ArrayFire to that of BoostCompute, HSA-Bolt, Intel TBB and Thrust. The benchmarks include the following commonly used vector algorithms across 3 different architectures. Reductions Scan Transform The following setup has been used for the benchmarking purposes. The code to reproduce the benchmarks is linked at the bottom of the post. The hardware used for the benchmarks is listed below: NVIDIA Tesla K20 AMD FirePro S10000 Intel Xeon E5-2560v2 Background ArrayFire ArrayFire provides high …