Category Archive

Below you'll find a list of all posts categorized as “CUDA”.

12,288 CUDA Cores in One Computer

John Melonakos | March 22, 2012 | Announcements, CUDA | 3 Comments

Kepler is here. And it’s fantastic! The news came out today that the first Kepler GPU, the GeForce GTX 680, has been launched. A single GPU has 1,536 CUDA Cores. This means that those high-end workstations with 8 PCIe slots will be able to pack 12,288 CUDA cores into a single computer. That’s some serious computational power. Current high-end Fermi cards have 512 cores, so this new Kepler architecture boasts 3X the number of computation cores. Normally we focus on the higher-end Tesla products because those more aptly fit the needs of our science, engineering, and financial computing readers. But we are excited nonetheless by this GeForce GPU. It is a major step forward in GPU technology. And this GeForce card portends …
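The core counts quoted in the excerpt above can be checked with a few lines of arithmetic. This is only a sketch: the figures come from the post, while the function names are made up for illustration, and one GPU per PCIe slot is an assumption.

```python
# Figures quoted in the post above.
CORES_PER_GTX_680 = 1536   # CUDA cores on one Kepler GeForce GTX 680
CORES_PER_FERMI = 512      # CUDA cores on a current high-end Fermi card
GPUS_PER_WORKSTATION = 8   # assumption: one GPU in each of the 8 PCIe slots


def total_cores(gpus, cores_per_gpu):
    """Total CUDA cores in a machine with `gpus` identical cards."""
    return gpus * cores_per_gpu


def core_ratio(new_cores, old_cores):
    """How many times more cores the newer card has than the older one."""
    return new_cores / old_cores


print(total_cores(GPUS_PER_WORKSTATION, CORES_PER_GTX_680))  # 12288
print(core_ratio(CORES_PER_GTX_680, CORES_PER_FERMI))        # 3.0
```

Core count alone does not determine application speed, so the 3X figure describes the hardware and is not a predicted speedup.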

ArrayFire Pro : Features and Scalability

ArrayFire | March 14, 2012 | ArrayFire, C/C++, CUDA, Fortran | Leave a Comment

ArrayFire is a fast GPU library that offloads compute-intensive tasks onto many-core GPUs, reducing application runtime and speeding applications up many times over. ArrayFire is built on top of the NVIDIA CUDA software stack, which is currently the best and most stable software development kit available for GPU-based computing. ArrayFire comes with a large set of functions spanning domains such as image processing, signal processing, financial modeling, and applications requiring graphics support. ArrayFire uses an array-based notation (supporting N-dimensional arrays) and allows sub-referencing and assignment into these multi-dimensional arrays. The following code snippet shows how you can index into array objects. // Generate a 3×3 array of random numbers on the GPU array A = randu(3,3); array a1 …

CUDA and OpenCL Benchmarks – Keeneland Workshop Day 1

John Melonakos | February 20, 2012 | Benchmarks, CUDA, Events, OpenCL | 3 Comments

Today was Day 1 of the Keeneland Workshop. Many great talks were given, across a broad range of GPU computing topics. With last week’s ArrayFire Webinar fresh in mind, it was interesting to see similar conclusions drawn in a presentation by Kyle Spafford of Oak Ridge National Laboratory. Kyle independently ran a number of benchmarks over a period of time which show how quickly OpenCL has matured and where it yet has room for improvement. The slide below comes from Kyle’s presentation. For numbers >1, CUDA is faster. For numbers <1, OpenCL is faster. Performance in most cases is close to equivalent. Just as we showed in the ArrayFire webinar, OpenCL performance is quite comparable with CUDA performance. The Achilles heel …
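The ratio convention described in the excerpt above (above 1 favors CUDA, below 1 favors OpenCL) can be sketched as follows. The excerpt does not say how the slide's numbers were computed, so defining the ratio as OpenCL run time divided by CUDA run time is an assumption, as are the function names, the 5% tolerance for "close to equivalent", and the sample timings, none of which come from the presentation.

```python
def cuda_to_opencl_ratio(opencl_seconds, cuda_seconds):
    """Assumed definition: OpenCL run time divided by CUDA run time."""
    return opencl_seconds / cuda_seconds


def interpret(ratio, tolerance=0.05):
    """Classify a ratio; `tolerance` is a hypothetical band around 1.0."""
    if ratio > 1 + tolerance:
        return "CUDA faster"
    if ratio < 1 - tolerance:
        return "OpenCL faster"
    return "close to equivalent"


# Hypothetical timings in seconds, for illustration only.
print(interpret(cuda_to_opencl_ratio(2.0, 1.0)))    # CUDA faster
print(interpret(cuda_to_opencl_ratio(1.0, 2.0)))    # OpenCL faster
print(interpret(cuda_to_opencl_ratio(1.02, 1.0)))   # close to equivalent
```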

OpenCL vs CUDA Comparisons

ArrayFire | February 17, 2012 | CUDA, Events, OpenCL | 4 Comments

In case you missed it, we recently held an ArrayFire Webinar focused on exploring the tradeoffs of OpenCL vs CUDA. This webinar is part of an ongoing series held each month to present new GPU software topics as well as programming techniques with Jacket and ArrayFire. We provide a recap here. Our team fielded lots of questions, so it’s a must-watch. We hope to see you at the next one! Recap: Download the slides. Here is a transcript of the content portion of the webinar: AccelerEyes is pleased to present today’s ArrayFire webinar looking at OpenCL and CUDA Trade-offs and Comparisons. Every day, we interact with many programmers in various stages of GPU …

ArrayFire Support for CUDA 4.1

John Melonakos | February 15, 2012 | Announcements, ArrayFire, C/C++, CUDA, Fortran | Leave a Comment

The question above comes from María (@turbonegra). She follows us @accelereyes. Many of you are wondering when ArrayFire support for the new CUDA version 4.1 will be released. The answer: work is currently under way. CUDA 4.1 includes a new Fermi compiler, and many people in the GPU ecosystem have reported slowdowns after upgrading to the new CUDA version. So we’ve delayed releasing ArrayFire and Jacket support for CUDA 4.1 because we want to verify performance and reliability across all our unit tests, performance regression tests, and customer code samples. Our tests sweep across various driver versions and everything from mobile GeForce cards through server-grade Tesla and Fermi chips. We are still working through the testing and verification at the moment. While …

Jacket over Remote Desktop for Tesla and Quadro GPUs

ArrayFire | January 17, 2012 | CUDA | 1 Comment

We recently reported that Jacket could be used over Windows Remote Desktop connections as long as you had an NVIDIA Tesla device in TCC mode. With the latest NVIDIA driver updates, Tesla and Quadro devices can be put into TCC mode, making it possible to use Jacket over Remote Desktop with both Tesla and Quadro devices. We have tested this out with the NVIDIA Quadro 4000 as well as Quadro 6000 GPUs. The system had a Tesla C2050 connected to the display, and the Quadro in TCC mode. Here’s the ginfo output: >> ginfo Jacket v2.0 (build 80c7ba4) by AccelerEyes (64-bit Windows) License Type: Designated Computer ([JACKET_ROOT]jacketenginejlicense.dat) Addons: MGL4, JMC, SDK, DLA, SLA CUDA toolkit 4.0, driver 285.62 GPU1 Quadro …

AccelerEyes Webinar Series

Scott | January 12, 2012 | Announcements, CUDA, Events, OpenCL | 1 Comment

AccelerEyes invites you to participate in a series of webinars designed to help you learn more about Jacket for MATLAB® and ArrayFire for C/C++/Fortran/Python, a comprehensive library of GPU-accelerated functions. GPU Programming for Medical Image Segmentation: January 18, 2012 at 3:00 p.m. EST. A huge volume of data is generated by acquisition modalities such as computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography, and nuclear medicine. A common need is to manipulate and transmit this data using compression techniques in as little time as possible. During this webinar we will show Jacket’s superior speed in handling volumes, from subscripting to convolutions. Come and learn how to accelerate common medical imaging applications using an easy, powerful programming library with Jacket for MATLAB®. OpenCL and CUDA Trade-Offs and Comparison: February 15, 2012 at …

AccelerEyes Releases ArrayFire GPU Software

Scott | November 21, 2011 | Announcements, ArrayFire, C/C++, CUDA, Fortran, OpenCL | 1 Comment

A free, fast, and simple GPU library for CUDA and OpenCL devices. AccelerEyes announces the launch of ArrayFire, a freely-available GPU software library supporting CUDA and OpenCL devices. ArrayFire supports C, C++, Fortran, and Python languages on AMD, Intel, and NVIDIA hardware. Learn more by visiting the ArrayFire product page. “ArrayFire is our best software yet and anyone considering GPU computing can benefit,” says James Malcolm, VP Engineering at AccelerEyes. “It is fast, simple, GPU-vendor neutral, full of functions, and free for most users.” Thousands of paying customers currently enjoy AccelerEyes’ GPU software products. With ArrayFire, everyone developing software for GPUs has an opportunity to enjoy these benefits without the upfront expense of a developer license. Reasons to use ArrayFire: …

AccelerEyes Webinar Series

Scott | October 27, 2011 | Announcements, CUDA, Events, OpenCL | Leave a Comment

AccelerEyes invites you to participate in a series of webinars designed to help you learn more about Jacket for MATLAB® and LibJacket for C/C++/Fortran/Python, a comprehensive library of GPU-accelerated functions. Joint Webinar With NVIDIA: LibJacket CUDA Library. On October 20th we co-hosted a joint webinar with NVIDIA. During this well-attended event, our GPU computing experts gave a general product overview and showed how to use the LibJacket CUDA library, along with several impressive demos of LibJacket in action. LibJacket supports hundreds of GPU computing functions, and programmers in numerous industries have been able to speed up their applications. Be sure to check out the Q&A session included in the recorded webinar posted on NVIDIA’s Developer Zone. Thanks again to NVIDIA for co-hosting this informative webinar! GPU Programming for …

Filtering Benchmarks – OpenCV GPU vs LibJacket

ArrayFire | September 26, 2011 | Benchmarks, CUDA | Leave a Comment

OpenCV is one of the most popular computer vision toolkits, and over the last year its developers have been integrating more GPU processing into the core. One of the most common image processing tasks is convolution. Since LibJacket and OpenCV both support this, one of my coworkers rolled up his sleeves and benchmarked the latest versions of both libraries: OpenCV/CPU, OpenCV/GPU, LibJacket. Jump over to his personal website for the full benchmark results and source code. From the graphs, the GPU implementations from OpenCV and LibJacket both easily outperform the default CPU version in OpenCV, but notice that LibJacket pushes performance even further and dominates OpenCV’s GPU implementation, especially when using separable filters. We’ve worked really hard over the last few years to …
