Machine Learning with ArrayFire

ArrayFire | Benchmarks, C/C++, Case Studies, CUDA, Events | Leave a Comment

In case you missed it, we held a webinar on June 15 on the ArrayFire GPU Computing Library and its applications to Machine Learning. This webinar was part of a free series of webinars that help you learn about ArrayFire and Jacket (our MATLAB® product). The webinars are open to anyone who wants to attend and interact with AccelerEyes engineers. Learn more at http://www.accelereyes.com/webinars. Chris, a Software Engineer at AccelerEyes, explained ArrayFire’s position in the GPU computing world and presented benchmarks where ArrayFire beats GPU libraries such as Thrust in many critical applications. He also mentioned that ArrayFire could be used either standalone or in combination with other options for GPU computing such …

Option Pricing

ArrayFire | ArrayFire, Benchmarks, C/C++, Case Studies, CUDA | 2 Comments

Andrew Shin, Market Risk Manager of Koch Supply & Trading, achieves significant performance increases on option pricing algorithms using Jacket to accelerate his MATLAB® code with GPUs. Andrew says, “My buddy and I are, at best, novice programmers and we couldn’t imagine having to figure out how to code all this in CUDA.” But he found Jacket to be straightforward. With these results, he says he can see Jacket and GPUs populating Koch’s mark-to-futures cube, which contains its assets, simulations, and simulated asset prices. Modern option pricing techniques are often considered among the most mathematically complex of all applied areas of finance. Andrew shared some example code to demonstrate how much leverage you can get out of Jacket and GPUs for financial computing in MATLAB® and C/C++. …

Powering Mars Research

John Melonakos | Case Studies, CUDA | Leave a Comment

The Curiosity Mars rover landing reminded us of a recent talk by Brendan Babb of NASA and UAA in Anchorage about Jacket-accelerated Mars research. The talk was given at GTC 2012 in May. The main thrust of this research is improving Mars rover image compression via GPUs and genetic algorithms. With Jacket and GPUs, the researchers were able to achieve 5X speedups on the larger data sizes. The algorithm works by pairing neighboring pixels with a random one and then adjusting the random pixel based on whether it incrementally improves the original image. Babb described the algorithm as an “embarrassingly” parallel process, ideally suited to GPU acceleration. He estimates he has been able to achieve a 20 to 30 percent error …

Parallelized Gene Predictors with Jacket

John Melonakos | Case Studies, CUDA | Leave a Comment

Researchers at the University of Quebec have developed high-performance gene predictors using Jacket to accelerate their MATLAB® code. This work has been published in BMC Research Notes and is freely available here. Computerized approaches to studying the human genome are challenged by the exploding amount of data, which doubles roughly every 6 months. To deal with these burgeoning datasets, demand for faster processing continues to grow. This work focuses on predicting genes using frequency analysis with FFTs and with an equivalent technique known as Goertzel’s algorithm. In these applications, the emphasis of this paper is to propose tools to geneticists and molecular biologists for the prediction or identification of new genes using existing complementary strategies. The criteria for these …
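For readers unfamiliar with Goertzel's algorithm mentioned above: it evaluates a single DFT bin with a short recurrence instead of computing a full FFT. The Python sketch below is only illustrative and is not the paper's Jacket/MATLAB code. The function names, the binary indicator encoding of a DNA string, and the choice of the bin at N/3 are our assumptions for the example.

```python
import cmath
import math


def goertzel_power(samples, k):
    """Return |X[k]|^2 of the length-N DFT of `samples` via Goertzel's recurrence."""
    n = len(samples)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: s[i] = x[i] + coeff * s[i-1] - s[i-2]
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # Squared magnitude of the bin, computed from the last two states
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def dft_power(samples, k):
    """Reference value: the same bin computed directly from the DFT definition."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
              for i, x in enumerate(samples))
    return abs(acc) ** 2


# Hypothetical toy input (an assumption, not data from the paper):
# a sequence with a perfect period of 3, encoded as an indicator for 'A'.
sequence = "ATG" * 30
indicator = [1.0 if base == "A" else 0.0 for base in sequence]
period3_bin = len(sequence) // 3  # bin corresponding to a period of 3 samples

strong = goertzel_power(indicator, period3_bin)  # bin at the period-3 frequency
weak = goertzel_power(indicator, 7)              # an unrelated bin
```

When only a handful of frequency bins are of interest, the recurrence costs O(N) per bin, which is why it can stand in for a full FFT in this kind of analysis.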

Benchmarking the new Kepler (GTX 680)

Pavan Yalamanchili | Benchmarks, CUDA | 13 Comments

NVIDIA has launched their next-generation GPU based on their Kepler architecture. They followed it up with a rather quick update to their CUDA toolkit. Considering that we have access to 3 generations of their GTX cards (480, 580 and 680), we thought we would showcase how the performance has changed over the generations. Matrix multiplication: The GTX 680 comfortably breaches the 1 teraflop mark for single precision, while the GTX 580 barely scratches it. However, the performance seems to peak around 2048 x 2048 and then falls to match the performance of the GTX 580 at larger sizes. The high-end Tesla C2070 finishes last for single precision behind the third-placed …
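As a rough guide to reading throughput numbers like these, here is a minimal Python sketch of how matrix-multiply GFLOPS are commonly derived from a measured run time. It assumes the usual convention of 2·n³ floating-point operations for an n x n product. The function names and the timing value are hypothetical and are not measurements from this benchmark.

```python
def matmul_gflops(n, seconds):
    """Throughput in GFLOPS for one n x n by n x n multiply that took `seconds`."""
    flops = 2.0 * n ** 3  # assumed convention: about n^3 multiplies and n^3 adds
    return flops / seconds / 1e9


def seconds_for_rate(n, gflops):
    """Run time needed for an n x n multiply to sustain `gflops` GFLOPS."""
    return 2.0 * n ** 3 / (gflops * 1e9)


# Hypothetical timing (not a measured result): a 2048 x 2048 multiply in 17.18 ms
example_rate = matmul_gflops(2048, 0.01718)

# Time budget for the same multiply at exactly 1 teraflop (1000 GFLOPS)
budget = seconds_for_rate(2048, 1000.0)
```

Under this convention, a 2048 x 2048 single-precision multiply has to finish in roughly 17 ms to cross the 1 teraflop mark.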

ArrayFire for Defense and Intelligence Applications

ArrayFire | C/C++, Case Studies, CUDA, Events, Fortran | Leave a Comment

In case you missed it, we recently held a webinar on the ArrayFire GPU Computing Library and its applications to Defense and Intelligence functions. Defense projects often have hard deadlines and definite speed targets, and ArrayFire is a fast and easy-to-use choice for these applications. This webinar was part of an ongoing series of webinars that will help you learn more about the many applications of Jacket and ArrayFire, while interacting with AccelerEyes GPU computing experts. John Melonakos, our CEO, introduced ArrayFire and talked about some exciting recent customer successes in the field of defense. He then ran through the mechanics of compiling and running code on a machine with 2 Quadro 6000 GPUs and shared further customer success stories. …

No Free Lunch for GPU Compiler Directives

John Melonakos | ArrayFire, C/C++, CUDA, Fortran | 3 Comments

Last week, Steve Scott at NVIDIA put up a viral post entitled “No Free Lunch for Intel MIC (or GPU’s).” It was a great read and a big hit in technical computing circles. Scott’s central point was that there are no magic compilers. GPUs don’t have them, and neither will MIC. No compiler will be able to automatically recompile existing code and get great performance from MIC or GPUs. Rather, it takes a good amount of elbow grease to write high-performance code. We totally agree. The problem Scott addresses is real. Despite marketing spin to the contrary, developing code for GPUs requires work. However, we don’t agree with Scott’s conclusion that compiler directives are a good solution. You can’t fight …

Jacket v2.1 Now Available

Scott | Announcements, CUDA | 2 Comments

Optimization Library, Sparse Functionality, Graphics Library Improvements, CUDA 4.1 Enhancements, and much more… AccelerEyes announces the release of Jacket v2.1, adding GPU computing capabilities for use with MATLAB®. Jacket v2.1 delivers even more speed through a host of new improvements, maximizing GPU device performance and utilization. Notable new features include an Optimization Library and additional functions in our Graphics Library. With Jacket v2.1, we have also extended support for sparse matrix subscripting and made improvements to host-to-device and device-to-host data transfer speeds for complex data. In addition, we have included various GFOR enhancements. Jacket v2.1 now includes NVIDIA CUDA 4.1 enhancements to provide improved functionality and performance (requires latest drivers). Jacket is the premier GPU software plugin for MATLAB®, better than alternative …

12,288 CUDA Cores in One Computer

John Melonakos | Announcements, CUDA | 3 Comments

Kepler is here.  And it’s fantastic! The news came out today that the first Kepler GPU, the GeForce GTX 680, has been launched.  A single GPU has 1,536 CUDA Cores.  This means that those high-end workstations with 8 PCIe slots will be able to pack 12,288 CUDA cores into a single computer.  That’s some serious computational power. Current high-end Fermi cards have 512 cores, so this new Kepler architecture boasts 3X the number of computation cores. Normally we focus on the higher-end Tesla products because those more aptly fit the needs of our science, engineering, and financial computing readers.  But we are excited nonetheless by this GeForce GPU.  It is a major step forward in GPU technology.  And this GeForce card portends …