Demystifying PTX Code

Peter Entschev | C/C++, CUDA, OpenCL | 3 Comments

In my recent post, I showed how to generate PTX files from both CUDA and OpenCL kernels. In this post I will address what a PTX file looks like and, more importantly, how to understand all those complicated instructions in a PTX file. I will use the same vector addition kernel from the previous post (the complete code can be found here) and focus on the OpenCL PTX file. In a future post I will discuss the differences between the PTX files of OpenCL and CUDA code. Let’s start by looking at the complete PTX code: // // Generated by NVIDIA NVVM Compiler // Compiler built on Sun May 18 …

The Benefits of Array Element-Wise Operations

Oded Green | ArrayFire | 1 Comment

In this blog we will review the benefits of using element-wise operations for your computations. Element-wise operations are operations that are applied to every element in an array and allow the user to avoid coding loops and nested loops for rudimentary operations. In a simple example of an element-wise operation, we use both the addition (+) and multiplication (*) operations: array A = randu(1024, 1024), B = randu(1024, 1024); array C = A*A + B*B; An element-wise operator that is applied to a single element is a unary operator. An operator that works on two elements is a binary operator. ArrayFire has implemented a large number of element-wise operations that are applied to the elements of an array. These operators can help reduce the programming overhead for an application designer as: Performance – …
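To make the loop that an element-wise expression replaces concrete, here is a minimal sketch in plain C++ with no ArrayFire dependency. The helper name sum_of_squares is hypothetical and used only for illustration; it spells out by hand what an expression such as A*A + B*B computes for each pair of elements, and it says nothing about how ArrayFire implements the operation internally.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of ArrayFire): computes
// c[i] = a[i]*a[i] + b[i]*b[i] for every index i, which is the
// explicit loop that an element-wise expression lets you avoid writing.
std::vector<float> sum_of_squares(const std::vector<float>& a,
                                  const std::vector<float>& b) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("inputs must have the same number of elements");
    }
    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Two binary multiplications and one binary addition per element.
        c[i] = a[i] * a[i] + b[i] * b[i];
    }
    return c;
}
```

Calling sum_of_squares on two equally sized vectors returns a new vector of the same length. A two-dimensional array stored as a flat buffer can be passed the same way, since every element is processed independently of the others.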

ArrayFire: Write once, Run anywhere

Shehzan Mohammed | ArrayFire | 2 Comments

One of ArrayFire’s biggest features is the ability for code to be written just once and run on a plethora of devices. In this post, we show the outputs of af::info() from various devices available to us. Desktop Processors AMD GPU/CPU (OpenCL) ArrayFire v2.1 (OpenCL, 64-bit Linux, build 4b9115c) License: Standalone (/home/pavan/.arrayfire.lic) Addons: MGL4, DLA, SLA Platform: AMD Accelerated Parallel Processing, Driver: 1526.3 (VM) [0]: Tahiti, 2864 MB, OpenCL Version: 1.2 1 : AMD FX(tm)-8350 Eight-Core Processor, 7953 MB, OpenCL Version: 1.2 Compute Device: [0] AMD APU (OpenCL) ArrayFire v2.1 (OpenCL, 64-bit Linux, build 586ef59) License: Standalone (/home/arrayfire/.arrayfire.lic) Addons: MGL4, DLA, SLA Platform: AMD Accelerated Parallel Processing, Driver: 1445.5 (VM) [0]: Spectre, 624 MB, OpenCL Version: 1.2 1 : AMD …

ArrayFire Capability Update – July 2014

Oded Green | Android, ArrayFire, C/C++, CUDA, Fortran, Java, OpenCL, R | 1 Comment

In response to user requests for additional ArrayFire capabilities, we have decided to extend the library with a CPU fallback for when OpenCL drivers for CPUs are not available. This means that ArrayFire code will be portable both to devices that have OpenCL set up and to devices without it. This is done through the creation of additional backends. This will allow ArrayFire users to write their code once and have it run on multiple systems. We currently support the following systems and architectures: NVIDIA GPUs (Tesla, Fermi, and Kepler); AMD’s GPUs, CPUs and APUs; Intel’s CPUs, GPUs and Xeon Phi Co-Processor; Mobile and Embedded devices. As part of this update process we are also looking at extending ArrayFire capabilities to low power systems such …

ArrayFire on Tegra K1

Shehzan Mohammed | ArrayFire | 2 Comments

We’re pleased to announce the arrival of ArrayFire for NVIDIA Tegra K1! This version of ArrayFire comes with all the capabilities and features of our standard version of ArrayFire. It includes all ArrayFire CUDA functionality, with the exception of linear algebra support, as well as fully operational graphics support. ArrayFire for Tegra currently works with Tegra K1 processors running Linux for Tegra. We invite and encourage you to test it out on your boards and give us feedback; any bugs or performance issues will be promptly resolved, as this is a separate branch of ArrayFire. If you’d like to deploy ArrayFire on Android, feel free to contact us for further support. We are open to partnering with anyone wishing to deploy ArrayFire in other …

Q/A Using ArrayFire

Shehzan Mohammed | ArrayFire | Leave a Comment

One of the reasons for ArrayFire’s usefulness is its variety of performance-oriented functions from many domains. What many people don’t realize is that ArrayFire also includes many utilities for image loading and visualization. In many cases, setting up a test harness is a ton of work. This is where ArrayFire can come in handy. Recently we worked on a project for one of our customers that involved image processing. As a part of development we wanted to make sure the quality was not compromised. They did not have a sufficient test framework in place. One option was to do this the old-fashioned way by reading two images and comparing them on the CPU. Given that we needed to compare hundreds of images and …
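For reference, the CPU-side comparison mentioned above could look roughly like the following sketch. It assumes both images have already been loaded and decoded into flat float buffers of equal size. The names max_abs_diff and images_match and the tolerance parameter are hypothetical, chosen for illustration; they are not taken from ArrayFire or from the project described in the post.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the largest absolute per-pixel difference
// between a reference image and a test image, both given as flat buffers.
float max_abs_diff(const std::vector<float>& expected,
                   const std::vector<float>& actual) {
    if (expected.size() != actual.size()) {
        throw std::invalid_argument("images must have the same number of pixels");
    }
    float worst = 0.0f;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        worst = std::max(worst, std::fabs(expected[i] - actual[i]));
    }
    return worst;
}

// Hypothetical helper: two images "match" when no pixel differs by more
// than the given tolerance (an assumed quality criterion).
bool images_match(const std::vector<float>& expected,
                  const std::vector<float>& actual,
                  float tolerance) {
    return max_abs_diff(expected, actual) <= tolerance;
}
```

A maximum-difference check is only one possible criterion; a mean or root-mean-square error would be a reasonable alternative when a few outlier pixels are acceptable.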

Custom Kernels with ArrayFire

Pavan Yalamanchili | ArrayFire, C/C++, CUDA, OpenCL | Leave a Comment

As extensive as ArrayFire is, there are a few cases where you are still working with custom CUDA or OpenCL kernels. For example, you may want to integrate ArrayFire into an existing code base for productivity, or you may want to keep the old implementation around for testing purposes. In this post we are going to talk about how to integrate your custom kernels into ArrayFire in a seamless fashion. In and Out of ArrayFire First let us look at the following code and then break it down bit by bit. int main() { af::array x = af::randu(num, 1); af::array y = af::array(num, 1); float *d_x = x.device(); float *d_y = y.device(); af::sync(); launch_simple_kernel(d_y, d_x, num); x.unlock(); y.unlock(); float err = …

https://www.youtube.com/watch?v=ZQVzXaOWSZ0

In Case you Missed it: ArrayFire Joint Webinar with AMD

Oded Green | Events, OpenCL | Leave a Comment

ArrayFire recently gave two webinar presentations to OpenCL developers as part of a joint webinar series with AMD. Due to popular demand for the first webinar, we ended up presenting a second! In case you missed it, here’s a recording of the webinar complete with the presentation and an informative Q&A session: http://bit.ly/SkzIJs This webinar focused on enhancing productivity by using existing OpenCL libraries while achieving a high level of performance and maximizing system utilization. We demonstrated how our ArrayFire software library offers simple GPU programming with the benefit of awesome performance. In the webinar we showed how to use several image processing and computer vision building blocks in less than 3 lines of code. The immediate takeaway message of …

Open Source Initiatives from ArrayFire

Pavan Yalamanchili | Announcements, ArrayFire, CUDA, Fortran, Java, Open Source, OpenCL, OpenGL, R | Leave a Comment

At ArrayFire we like to use a lot of Free/Open Source software. We use various Linux distributions, Jenkins, GitLab, gcc, emacs, vim and numerous other FOSS tools on a daily basis. We also love the idea of developing software collaboratively and openly. Last year we started working with AMD on CL Math Libraries. Internally we’ve had numerous discussions about contributing to the GPGPU community. However, it’s neither simple nor straightforward to take closed software open source. Earlier this year, we decided to take the first step and open source all of the ArrayFire library’s tertiary projects. This includes all of our ArrayFire library’s language wrappers, examples, and source code used for our blog posts. All of our projects are hosted at our …

How to Make GPU Hardware Decisions

Scott | Computing Trends, CUDA, Hardware & Infrastructure, OpenCL | Leave a Comment

We get questions all the time about how to make GPU hardware decisions. We’ve seen just about every scenario you can imagine, and so we always jump at the chance to help others through this decision process. Here’s a recent question from a customer. “I’ve just found your post on Analytic Bridge and have taken a look at your website … I’m replacing my two Tesla M1060 cards (computing capability too low) and I’m considering used Tesla M2070s or the new GTX 760 cards. Could you offer any insight? I believe the GTX 760 cards may well outperform the older 2070s and are much cheaper.” And here’s our response. “The GTX 760 will probably outperform the M2070 for single precision …