The Thrombotherm project by Catalysts is developing a method to analyze blood platelets by means of cell microscopy in real time and to classify them according to their activation state. ArrayFire enabled faster overall research project times and real-time analysis on video data. This project represents an enormous extension of thrombocyte diagnostics, especially through significantly accelerated analysis times. Faster analyses enabled university research collaborators from the University of Applied Sciences OÖ and the Ludwig Boltzmann Institute to shorten research project times. The project has three main parts. The first is detecting cell morphology in real time: Thrombotherm makes it possible to mathematically determine and categorize the cell boundaries by means of transmitted light microscopy. The software distinguishes between "fried-egg"-shaped cells and "spider"-shaped cells. This is used ...

## ArrayFire at SC16

SC16 is almost here! We're excited to be heading to Salt Lake City, Utah, to be a part of this excellent conference. It's a great place for soaking up HPC knowledge, getting inspired, and connecting with the brightest minds in the industry. Here's a quick run-down of where we'll be. Visit our booth. We're at booth #717 in the exhibit hall during exhibit hours, November 15 - 17. We'll be showing off our latest demos, and our engineers will be available for questions. Ask your questions, meet the team, or just bounce some ideas around. Try our in-booth tutorials. Want to learn how to use ArrayFire to accelerate your code? Stop by and receive an in-booth tutorial from one of our ArrayFire experts. We’ll show ...

## ArrayFire v3.4 Official Release

Today we are pleased to announce the release of ArrayFire v3.4, our open source library of parallel computing functions supporting CUDA, OpenCL, and CPU devices. This new version of ArrayFire improves features and performance for applications in machine learning, computer vision, signal processing, statistics, finance, and more. This release focuses on 5 major components of the library that are common to many areas of mathematical, scientific, and financial computing: sparse matrix operations, random number generation, image processing, just-in-time (JIT) compilation, and visualizations.

- Sparse Matrix and BLAS (see blog post)
  - Support for CSR and COO storage types
  - Sparse-Dense Matrix Multiplication and Matrix-Vector Multiplication
  - Conversion between dense matrices and CSR and COO storage types
- Support for Random Number Generator Engines (see blog post)
  - Philox
  - Threefry
  - Mersenne Twister
- Image Processing (see blog post) ...

## Graphics Updates in ArrayFire v3.4

This post outlines the new graphics features available in ArrayFire v3.4: vector fields and overlays.

We have added visualization support to render ArrayFire array objects as vector fields. An example of how to visualize vector fields is included in ArrayFire v3.4. A screenshot of this example's output in multi-view mode is shown below, showcasing both static and dynamic vector field rendering.

Previously, each graph (such as plot, hist, scatter, etc.) was rendered in its own window (or view), and overlaying graphs was not supported. ArrayFire v3.4 now supports graph overlays. Each draw call in ArrayFire is rendered either to a whole window (single view) or to a view, which is a portion of the screen obtained in multiview mode. The following image is an example of a ...

## Performance Improvements to JIT in ArrayFire v3.4

ArrayFire uses just-in-time (JIT) compilation to combine many lightweight functions into a single kernel launch. This, along with our easy-to-use API, allows users not only to prototype their algorithms quickly, but also to get the best out of the underlying hardware. This feature has been a favorite among our users in the domains of finance and scientific simulation. That said, ArrayFire v3.3 and earlier had a few limitations, namely:

- Multiple outputs with inter-dependent variables generated multiple kernels.
- The number of operations per kernel was fairly limited by default.

In the latest release of ArrayFire, we addressed these issues to get some pretty impressive numbers. In the rest of the post, we demonstrate the performance improvements using our BlackScholes ...
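To make the idea of combining lightweight functions into one kernel more concrete, here is a minimal CPU-only sketch in standard C++. It is not ArrayFire code and does not reflect how ArrayFire's JIT is implemented (which builds the kernel at run time); the function names `scale_add_unfused` and `scale_add_fused` are hypothetical and chosen only for illustration. The unfused version makes one pass per operation and stores a temporary, while the fused version evaluates the whole expression in a single pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Unfused: every element-wise operation is its own pass over the data and
// writes a temporary buffer, analogous to launching one kernel per operation.
std::vector<float> scale_add_unfused(const std::vector<float>& a,
                                     const std::vector<float>& b,
                                     float s) {
    assert(a.size() == b.size());
    std::vector<float> tmp(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) tmp[i] = a[i] * s;       // pass 1
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = tmp[i] + b[i];  // pass 2
    return out;
}

// Fused: the same expression evaluated in a single pass with no temporary,
// analogous to one combined kernel launch.
std::vector<float> scale_add_fused(const std::vector<float>& a,
                                   const std::vector<float>& b,
                                   float s) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * s + b[i];
    return out;
}
```

Both functions compute `a * s + b`; the fused form reads each input once and avoids the intermediate buffer, which is the kind of saving that fusing operations is meant to provide.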

## Image Processing Functions in ArrayFire v3.4

There are a number of additions and updates to image-based features in the new v3.4 release of ArrayFire. Among the updates are:

- New interpolation methods for several existing functions
  - approx1, approx2
  - transform
  - resize
- Functions for image moments

This blog post will display some typical use cases for these new features. ArrayFire v3.4 implements several new interpolation methods for 1-d and 2-d domains. The new interpolation methods for 1-d functions are:

- AF_INTERP_LINEAR_COSINE
- AF_INTERP_CUBIC

and for 2-d functions are:

- AF_INTERP_BILINEAR_COSINE
- AF_INTERP_BICUBIC

The behavior of the interpolation methods can be seen in the following pictures. A common use for interpolation is image filtering. Given a coarse image, we can resample it to be smoother.

    af::array img = af::randu(7, 7); // create a random image

    // define sample points for interpolation
    af::array Xs = af::seq(0, 6, 0.1f);
    af::array Ys = af::seq(0, 6, 0.1f);
    Xs = af::tile(Xs, 1, Ys.dims(0));
    Ys = af::tile(Ys.T(), Xs.dims(0));

    // interpolate based on specific method
    af::array img_bilinear    = af::approx2(img, Xs, Ys, AF_INTERP_BILINEAR);
    af::array img_bilinearcos = af::approx2(img, Xs, Ys, AF_INTERP_BILINEAR_COSINE);
    af::array img_bicubic     = af::approx2(img, Xs, Ys, AF_INTERP_BICUBIC_SPLINE);
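To illustrate how a cosine-weighted method differs from a plain linear one, the following standard C++ sketch interpolates a 1-d signal both ways. It uses the textbook cosine interpolation formula and is only assumed to be similar in spirit to the ArrayFire cosine methods; it is not taken from the ArrayFire source. The names `lerp_linear`, `lerp_cosine`, and `sample_1d` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Linear interpolation between y0 and y1 at fractional offset t in [0, 1].
inline double lerp_linear(double y0, double y1, double t) {
    return y0 + (y1 - y0) * t;
}

// Cosine interpolation: remap t through half a cosine wave so the curve
// flattens near each sample point instead of forming a sharp corner.
inline double lerp_cosine(double y0, double y1, double t) {
    const double pi = std::acos(-1.0);
    const double w = (1.0 - std::cos(t * pi)) / 2.0;
    return y0 + (y1 - y0) * w;
}

// Sample a 1-d signal at fractional position x in [0, y.size() - 1].
double sample_1d(const std::vector<double>& y, double x, bool cosine) {
    assert(!y.empty());
    assert(x >= 0.0 && x <= static_cast<double>(y.size() - 1));
    const std::size_t i = static_cast<std::size_t>(x);  // left neighbour
    if (i + 1 >= y.size()) return y.back();              // exactly on the last sample
    const double t = x - static_cast<double>(i);         // fractional offset
    return cosine ? lerp_cosine(y[i], y[i + 1], t)
                  : lerp_linear(y[i], y[i + 1], t);
}
```

Both variants agree at the sample points and at the midpoint between them; they differ in between, where the cosine weighting stays closer to the nearest sample.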

## Random Number Generators in ArrayFire v3.4

Pseudorandom number generators (PRNGs) are an integral part of many applications in statistics, modeling, and simulations. In ArrayFire v3.4, we introduce random number generation enhancements that improve speed, accuracy, storage, and unity among the ArrayFire backends. Previously in ArrayFire v3.3, each ArrayFire backend used a different PRNG. In ArrayFire v3.4, each ArrayFire backend is able to select from among 3 different random number generators.

| Backend | ArrayFire v3.3 (platform specific) | ArrayFire v3.4 (all generators on all platforms) |
| --- | --- | --- |
| CUDA | XORWOW | Philox (CBRNG), Threefry (CBRNG), Mersenne Twister |
| OpenCL | Threefry (CBRNG) | Philox (CBRNG), Threefry (CBRNG), Mersenne Twister |
| CPU | Mersenne Twister | Philox (CBRNG), Threefry (CBRNG), Mersenne Twister |

As seen above, the XORWOW generator (which was only available for CUDA devices previously) has been replaced by the Philox generator which is available along with Threefry and Mersenne ...
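Philox and Threefry are counter-based generators (the "CBRNG" label above): each output is a pure function of a key and a counter, so any element of the stream can be computed independently, which suits parallel hardware. The sketch below shows that idea in standard C++ with a toy generator. It is neither Philox nor Threefry, and it is not ArrayFire code; it borrows the SplitMix64 mixing function purely for illustration, and the names `mix64`, `toy_cbrng`, and `to_unit_interval` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// SplitMix64 finalizer: an invertible mixing function on 64-bit integers.
inline std::uint64_t mix64(std::uint64_t z) {
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

// Toy counter-based generator: the output depends only on (key, counter),
// so there is no sequential state to carry from one value to the next.
inline std::uint64_t toy_cbrng(std::uint64_t key, std::uint64_t counter) {
    return mix64(key ^ mix64(counter + 0x9e3779b97f4a7c15ULL));
}

// Map 64 random bits to a double in [0, 1) using the top 53 bits.
inline double to_unit_interval(std::uint64_t bits) {
    const double u = static_cast<double>(bits >> 11) * (1.0 / 9007199254740992.0);  // 2^-53
    assert(u >= 0.0 && u < 1.0);
    return u;
}
```

With this structure, element `i` of a random array can be produced as `to_unit_interval(toy_cbrng(seed, i))` by any thread, in any order, and the result is reproducible for a given seed.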

## Sparse Matrices in ArrayFire v3.4

In ArrayFire v3.4, we have added support for sparse matrices, which greatly reduce the memory footprint on GPUs and accelerated devices for many applications. A sparse data structure is one where only the non-zero elements are stored. Sparse matrices are useful when the number of zero-valued elements is much greater than the number of non-zero elements (i.e. the sparsity of the matrix is high). A sparse data structure is generally stored as 3 arrays:

- A data or values array containing all the non-zero elements
- A vector for row indices (based on storage format)
- A vector for column indices (based on storage format)

There are many ways to store sparse matrices, the most prominent of which are:

- Compressed Sparse Row (CSR)
- Compressed Sparse Column (CSC) ...
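The three-array layout described above can be sketched for the CSR format in standard C++. This is a minimal illustration, not ArrayFire's implementation or API; the type `CsrMatrix` and the functions `dense_to_csr` and `csr_matvec` are hypothetical names. In CSR, the row vector holds one offset per row into the values array rather than one index per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct CsrMatrix {
    std::size_t rows = 0, cols = 0;
    std::vector<double> values;        // non-zero elements, stored row by row
    std::vector<std::size_t> row_ptr;  // row r occupies [row_ptr[r], row_ptr[r + 1])
    std::vector<std::size_t> col_idx;  // column index of each stored value
};

// Build a CSR matrix from a dense row-major matrix, dropping exact zeros.
CsrMatrix dense_to_csr(const std::vector<std::vector<double>>& dense) {
    CsrMatrix m;
    m.rows = dense.size();
    m.cols = dense.empty() ? 0 : dense[0].size();
    m.row_ptr.push_back(0);
    for (const auto& row : dense) {
        assert(row.size() == m.cols);
        for (std::size_t c = 0; c < row.size(); ++c) {
            if (row[c] != 0.0) {
                m.values.push_back(row[c]);
                m.col_idx.push_back(c);
            }
        }
        m.row_ptr.push_back(m.values.size());
    }
    return m;
}

// Sparse matrix times dense vector: only the stored non-zeros are visited.
std::vector<double> csr_matvec(const CsrMatrix& m, const std::vector<double>& x) {
    assert(x.size() == m.cols);
    std::vector<double> y(m.rows, 0.0);
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t k = m.row_ptr[r]; k < m.row_ptr[r + 1]; ++k) {
            y[r] += m.values[k] * x[m.col_idx[k]];
        }
    }
    return y;
}
```

For a 3x3 matrix with five non-zeros, this stores five values, five column indices, and four row offsets instead of nine elements; the savings grow with the sparsity of the matrix.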

## A Simple Particle System with ArrayFire

It's the 4th of July today and we're celebrating at ArrayFire! The 4th of July implies fireworks, and fireworks obviously imply particle systems. Particle systems are collections of many small images or points that can be rendered to represent an object with complex behavior. So before we can launch our fireworks, we will need to create a particle system. The large number of particles in a system lends itself well to GPU computation. Thankfully, ArrayFire's easy-to-use interface will allow us to do this simply and efficiently. First, let's examine the structure of a typical particle system. Individual particles in a system typically have a variety of properties that govern their individual behavior. A non-comprehensive list below summarizes some of ...
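As a rough picture of what a particle update looks like, here is a minimal standard C++ sketch. It is not the code from this post and does not use ArrayFire; the chosen properties (2-d position and velocity), the explicit Euler step, and the names `Particles` and `step` are assumptions made for illustration. Each property is stored as one array over all particles, so the same arithmetic is applied to every element, which is the pattern that maps well to GPU computation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Structure-of-arrays layout: one array per property, one entry per particle.
struct Particles {
    std::vector<float> x, y;    // positions (assumed properties)
    std::vector<float> vx, vy;  // velocities (assumed properties)
};

// Advance every particle by one explicit Euler step under constant gravity.
void step(Particles& p, float dt, float gravity_y) {
    const std::size_t n = p.x.size();
    assert(p.y.size() == n && p.vx.size() == n && p.vy.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        p.x[i] += p.vx[i] * dt;     // move with the current velocity
        p.y[i] += p.vy[i] * dt;
        p.vy[i] += gravity_y * dt;  // then let gravity change the velocity
    }
}
```

Because every iteration of the loop is independent of the others, the loop body could be expressed as whole-array operations on a GPU.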

## Using GPUs in KVM Virtual Machines

### Introduction

A couple of months ago, I began investigating GPU passthrough on my workstation to test ArrayFire on different operating systems. Around the same time, we at ArrayFire found ourselves with a few surplus GPUs. Since my virtualization efforts had gone well, we decided to build a virtualized GPU server to utilize these GPUs. Building a virtualized GPU server alleviated one of the pain points at our company: we no longer need to swap GPUs or hard disks to test a new environment. To maximize the number of GPUs we can put in a machine, we ended up getting a Quantum TXR430-0768R from Exxact Computing, which comes in a 4U form factor and supports up to 8x double-width GPUs. ...