Using GPUs in KVM Virtual Machines

Pavan YalamanchiliHardware & Infrastructure, Open Source 2 Comments

Introduction A couple of months ago, I began investigating GPU passthrough on my workstation to test ArrayFire on different operating systems. Around the same time, we at ArrayFire found ourselves with a few surplus GPUs. Having had great success with my virtualization efforts, we decided to build a Virtualized GPU Server to utilize these GPUs. Building a Virtualized GPU Server alleviated one of the pain points at our company: We no longer need to swap GPUs or Hard Disks to test a new environment. To maximize the number of GPUs we can put in a machine, we ended up getting a Quantum TXR430-0768R from Exxact Computing which comes in a 4U form factor and supports upto 8x double width GPUs. …

Benchmarking parallel vector libraries

Pavan YalamanchiliArrayFire, Benchmarks, C/C++, CUDA Leave a Comment

There are many open source libraries that implement parallel versions of the algorithms in the C++ standard template libraries. Inevitably we get asked questions about how ArrayFire compares to the other libraries out in the open. In this post we are going to compare the performance of ArrayFire to that of BoostCompute, HSA-Bolt, Intel TBB and Thrust. The benchmarks include the following commonly used vector algorithms across 3 different architectures. Reductions Scan Transform The following setup has been used for the benchmarking purposes. The code to reproduce the benchmarks is linked at the bottom of the post. The hardware used for the benchmarks is listed below: NVIDIA Tesla K20 AMD FirePro S10000 Intel Xeon E5-2560v2 Background ArrayFire ArrayFire provides high …

ArrayFire: Write once, Run anywhere

Shehzan MohammedArrayFire 2 Comments

One of ArrayFire’s biggest features is the ability for code to be written just once and run on a plethora of devices. In this post, we show the outputs of af::info() from various devices available to us. Desktop Processors AMD GPU/CPU (OpenCL) ArrayFire v2.1 (OpenCL, 64-bit Linux, build 4b9115c) License: Standalone (/home/pavan/.arrayfire.lic) Addons: MGL4, DLA, SLA Platform: AMD Accelerated Parallel Processing, Driver: 1526.3 (VM) [0]: Tahiti, 2864 MB, OpenCL Version: 1.2 1 : AMD FX(tm)-8350 Eight-Core Processor, 7953 MB, OpenCL Version: 1.2 Compute Device: [0] AMD APU (OpenCL) ArrayFire v2.1 (OpenCL, 64-bit Linux, build 586ef59) License: Standalone (/home/arrayfire/.arrayfire.lic) Addons: MGL4, DLA, SLA Platform: AMD Accelerated Parallel Processing, Driver: 1445.5 (VM) [0]: Spectre, 624 MB, OpenCL Version: 1.2 1 : AMD …

ArrayFire Capability Update – July 2014

Oded GreenAndroid, ArrayFire, C/C++, CUDA, Fortran, Java, OpenCL, R 1 Comment

In response to user requests for additional ArrayFire capabilities, we have decided to extend the library to have CPU fall back when OpenCL drivers for CPUs are not available. This means that ArrayFire code will be portable to both devices that have OpenCL setup and devices without it. This is done through the creation of additional backends. This will allow ArrayFire users to write their code once and have it run on multiple systems. We currently support the following systems and architectures: NVIDIA GPUs (Tesla, Fermi, and Kepler) AMD’s GPUs, CPUs and APUs Intel’s CPUs, GPUs and Xeon Phi Co-Processor Mobile and Embedded devices As part of this update process we are also looking at extending ArrayFire capabilities to low power systems such …

Remote Off-Screen Rendering with OpenGL

Shehzan MohammedArrayFire, OpenGL 18 Comments

At ArrayFire, we constantly encounter projects that require OpenGL and run on a remote server that does not have a display. In this blog, we have compiled a list of steps that users can use to run full profile OpenGL applications over SSH on remote systems without a display. A few notes before we get started. This blog is limited to computers running distributions of Linux. The first part of the blog that shows the configuration of the xorg.conf file is limited to NVIDIA cards (with display). AMD cards support this capability without the modification of xorg.conf file. However, we have not been able to get a comprehensive list of supported devices. Requirements You will need access to the remote …

ArrayFire on Tegra K1

Shehzan MohammedArrayFire 2 Comments

We’re pleased to announce the arrival of ArrayFire for NVIDIA Tegra K1! This version of ArrayFire comes with all the capabilities and features of our standard version of ArrayFire. It includes all ArrayFire CUDA functionality—with the exception of linear algebra support—as well as fully operational graphics support. ArrayFire for Tegra currently works with Tegra K1 processors running Linux for Tegra. We invite and encourage you to test it out on your boards and give us feedback; any bug fixes or performance improvements will be promptly resolved, as this is a separate branch of ArrayFire. If you’d like to deploy ArrayFire on Android, feel free to contact us for further support. We are open to partnering with anyone wishing to deploy ArrayFire in other …

Photos from SC13

John MelonakosArrayFire, CUDA, Events, OpenCL Leave a Comment

SC13 was awesome this week! Tomorrow is the last day of the exhibition. For those of you that did not make it to the show, here are some pictures from our exhibit: The AccelerEyes Booth ——————————————————————————————————– ArrayFire OpenCL Demo on ARM Mali ——————————————————————————————————– ArrayFire CUDA Demo on NVIDIA K40 ——————————————————————————————————– ArrayFire OpenCL Demo on Intel Xeon Phi Coprocessor ——————————————————————————————————– ArrayFire OpenCL Demo on AMD FirePro GPU ——————————————————————————————————– It was a great show and wonderful to see so many ArrayFire users in person. If you could not attend and would like to learn more about our CUDA or OpenCL products or services, let us know! Related articles ArrayFire v2.0 Release Candidate Now Available for Download Two Kinds of Exhibits to Watch …

ArrayFire for Defense and Intelligence Applications – Joint Webinar Recap

Aaron TaylorEvents 1 Comment

In case you missed it, hundreds of attendees recently joined us in a special joint webinar with NVIDIA. The webinar was led by Kyle Spafford, a Senior Developer at AccelerEyes. Kyle detailed how GPU computing can be implemented in the defense and intelligence fields. Kyle specifically addressed enabling unique solutions for applications related to video analysis, recognition, and tracking using the ArrayFire software library for C, C++, and Fortran. At the conclusion of the presentation Kyle fields questions from those in attendance, including “How does ArrayFire Fortran Lib compare to CUDA Fortran?” (see 59:36 mark), “Can you target a specific GPU if you have multiple on the machine?” (56:14), and “How can I combine several kernels to one fat kernel by using …

ArrayFire for Defense and Intelligence Applications

Aaron TaylorAnnouncements, C/C++, Events, Fortran Leave a Comment

AccelerEyes and NVIDIA invite you to participate in a joint webinar designed to help you learn about ArrayFire, a productive, easy-to-use GPU software library for C, C++, and Fortran. Major defense and intelligence institutions are discovering just how effective GPU computing can be in enabling unique solutions for applications related to video analysis, recognition, and tracking. During this informative webinar, Kyle Spafford, a Senior Software Developer at AccelerEyes, will explain how to accelerate common defense and intelligence algorithms using ArrayFire. The webinar will take place on Tuesday, September 17, 2013 at 10:00 AM PDT. Register for this webinar by clicking here. We hope you will join us as we discuss exciting developments in GPU computing software!

ISC 2013 Keynote by Stephen Pawlowski of Intel

John MelonakosComputing Trends, Events Leave a Comment

Stephen Pawlowski of Intel gave an interesting keynote today at ISC 2013. He continued the theme of yesterday’s keynote to address challenges our market faces in getting to exascale computing. Here is a summary of the points he made during his talk: Getting to exascale by 2020 requires performance improvement of 2x every year Innovations anticipated include stacked chips and optical layers DRAM is not scaling with Moore’s Law More power goes into transferring data than in computing Need to operate transistors near threshold New materials for DRAM needed. Resistive memory could replace DRAM. Need to explore both the big die and the small die paths as we approach 2020 Big die path leads to 10 billion transistors on a …