Fast Computer Vision with OpenCV and ArrayFire

John Melonakos | ArrayFire, Benchmarks, Case Studies, CUDA

Update: While the post below discusses LibJacket (no longer a product), you can do the same thing with the newer, but different, ArrayFire library. Moving from LibJacket to ArrayFire brought improved performance benchmarks and a simpler API.

Mcclanahoochie just posted some code and instructions for pairing OpenCV with LibJacket to get accelerated computer vision. You can do really fast image processing on video cam feeds too; see picture below. Really cool stuff. Computer vision is really hot, with applications emerging in defense, radiology, games, automotive, and other consumer applications. Computer vision algorithms like these are also going mobile. For instance, we have started to build LibJacket for mobile applications, which runs on Tegra, PowerVR, and other mobile …

Tree cats see your code!

John Melonakos | ArrayFire

From time to time we stumble across funny quirks while using MATLAB®. The latest came when one of our developers accidentally mis-keyed a few characters. With 5 characters on the command line, you too can get a message about tree cats seeing your bad code (followed by a nasty seg fault, so beware). Try this:

>> a()@a
tree_cat sees bad code
* Subsref [4]
* M_ID 0(5) which
* M_LRB 5(1)
* ExprList [1]
* M_ID 6(1) e
* M_RRB 7(1)
tree_cat sees bad code
* Subsref [4]
* M_ID 0(5) which
* M_LRB 5(1)
* ExprList [1]
* M_ID 6(1) e
* M_RRB 7(1)

Top Secret: Part of Jacket’s GPU runtime involves monkeys obtaining bananas for optimal performance. While we can’t …

New Product Updates – Jacket v1.8, LibJacket v1.1

John Melonakos | Announcements, CUDA

Announcements: Jacket v1.8 for MATLAB® now available; LibJacket v1.1 for C/C++/Python/Fortran now available; request a FREE GPU computing consultation.

Introduction: Enhance your code with the fastest, most comprehensive library for GPU computing. Jacket: the best GPU computing in MATLAB® (take a tour and compare!). LibJacket: the best way to kick-start your CUDA development (take a tour!). Both products enable: manipulating vectors, matrices, and ND arrays; support for single- and double-precision, boolean, real, and complex numbers; hundreds of routines for arithmetic, linear algebra, statistics, imaging, signal processing, and more (full list: Jacket, LibJacket); and thousands of lines of optimized code for any CUDA-capable GPU.

New Product Features: Expanded support for the Signal Processing, Image Processing, and Statistics Libraries included with …

Jacket Lectures – Learn and Teach GPU computing

John Melonakos | Announcements, CUDA

We are pleased to share 6 in-depth Jacket lectures, helpful both in learning and teaching Jacket. Download the lectures (PDF format) here: http://www.accelereyes.com/support/lectures

Jacket is used in course instruction at many universities around the world. Professors and course instructors use Jacket to provide engineering students with GPU acceleration of MATLAB® algorithms and to bring HPC to MATLAB courses. The six lectures are entitled “Parallel High Performance Computing with Emphasis on Jacket Based GPU Computing” and cover topics including: parallel computing introduction, Jacket introduction, basic programming with Jacket, advanced programming with Jacket, multiple GPU programming, and benchmarking. If you are looking at accelerating MATLAB code or parallel computing with MATLAB, you definitely will want to add these lectures to your arsenal of …

Getting More out of GPU Computing with LIBJACKET v1.0

John Melonakos | Announcements, CUDA

LIBJACKET v1.0 is here! It is the Matrix Companion to CUDA, providing a high-productivity performance layer for GPU computing. Download now to start a free 15-day trial. It integrates seamlessly with any CUDA code, but can also be used to avoid writing complicated GPU kernels yourself via its matrix interface. Soak up its features here.

We’re celebrating this launch by offering two big promotions, one for existing Jacket programmers and one for the broader GPU computing community: existing Jacket customers get 50% off libJacket; buy a Tesla, get a free libJacket subscription. Learn more about these offers. Here are some other links of interest to this launch: Tour, Documentation, Function benchmarks, Press release.

Over the years, we’ve been thrilled to …

Our Point of View & Twitter Comedy

John Melonakos | CUDA

“Great businesses have a point of view, not just a product or service.” ~37 Signals

At AccelerEyes, our point of view is that GPU software can and should deliver great results on real applications. With this point of view, we’ve kept our heads down, solely focused on delivering a great runtime system for GPUs. All our energy has been devoted to the task of emitting optimized low-level code from high-level matrix notation. These efforts are now paying off in a big way! Jacket is consistently delivering awesome results in real applications; read examples here and here. Alternative choices apparently have a different point of view. Yesterday’s Twitter stream contained a comical, but all-too-common indication of frustration with the recent GPU …

CUDA over Remote Desktop now available for Tesla GPUs

John Melonakos | Announcements, CUDA

Update: Jacket over Remote Desktop is now available for Quadro devices too! Read this post. Jacket over Remote Connections is also documented extensively on the AccelerEyes Wiki.

Over the past several years, many Jacket programmers have requested support for Remote Desktop in Windows. We are pleased to report that recent NVIDIA drivers now enable Jacket to run over Remote Desktop, for some system configurations. Specifically, the requirements to make this work include: Windows Vista, Windows 7, Windows HPC Server 2008, or Windows HPC Server 2008 R2; the latest NVIDIA driver (as required by Jacket); a Tesla GPU; and TCC mode enabled on at least one (Tesla) GPU. To enable TCC, the Tesla cannot be connected to a display. This means you need to …

Unraveling Speedups: Two Important Questions

John Melonakos | Benchmarks, CUDA

One Jacket programmer recently emailed the following to us:

“Our chief scientist asked me a question that I’d like to pass on to you. I think I know the answer, but you guys can be much more definitive than I can. He recently read about people achieving ~10x speedups by converting parts of their code to MEX files. He was wondering how much of the observed speedup is due to that MEX and how much is due to CUDA and the GPU.”

Two Questions You Should Ask Yourself

When contemplating an effort to optimize a piece of code, it is important to unravel the effort into two separate questions. Both need to be addressed to improve performance: How well-written is …
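
The split the email asks about can be illustrated with simple arithmetic. The sketch below assumes you have timed three versions of the same computation: the original code, a CPU-only rewrite (for example, a MEX file), and a GPU version. The function name `decompose_speedup` and the timings are hypothetical and are not measurements from this post.

```python
def decompose_speedup(t_baseline, t_cpu_optimized, t_gpu):
    """Split an overall speedup into a CPU-rewrite factor and a GPU factor.

    All arguments are wall-clock times in seconds for the same workload.
    The two factors multiply to the total speedup.
    """
    cpu_factor = t_baseline / t_cpu_optimized   # gain from better CPU code alone
    gpu_factor = t_cpu_optimized / t_gpu        # further gain from moving to the GPU
    total = t_baseline / t_gpu                  # overall observed speedup
    return cpu_factor, gpu_factor, total


# Hypothetical timings (seconds), chosen only to illustrate the arithmetic.
cpu_factor, gpu_factor, total = decompose_speedup(10.0, 4.0, 1.0)
print(f"CPU rewrite: {cpu_factor:.1f}x, GPU: {gpu_factor:.1f}x, total: {total:.1f}x")
```

Because the factors multiply, a reported overall speedup says little on its own about how much came from the GPU; the intermediate CPU-only timing is what separates the two.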

Stanford GPU Benchmarks: Jacket vs PCT/GPU

John Melonakos | Benchmarks, Case Studies, CUDA

Researchers in the Pervasive Parallelism Laboratory at Stanford University recently published work describing a novel framework for parallel computing in a paper entitled “A Domain-Specific Approach to Heterogeneous Parallelism.” As part of their research, they compared Jacket to the GPU support in the Parallel Computing Toolbox™. The results clearly show that Jacket’s optimizations make a big difference in performance. In this blog post, we highlight 4 algorithms included in their research:

| Name | Description | Input |
| --- | --- | --- |
| Gaussian Discriminant Analysis (GDA) | Generative learning algorithm for modeling the probability distribution of a set of data as a multivariate Gaussian | 1,200×1,024 matrix |
| Restricted Boltzmann Machine (RBM) | Stochastic recurrent neural network, without connections between hidden units | 2,000 hidden units, 2,000 dimensions |
| Support Vector Machine (SVM) | Optimal … | |

GPU accelerated lattice Boltzmann model for shallow water flow and mass transport

John Melonakos | Benchmarks, Case Studies, CUDA

Dr. Kevin Tubbs and Professor Tsai at Louisiana State University recently published an interesting paper using GPUs and Jacket to accelerate lattice Boltzmann models for shallow water flow and mass transport. More details about this work are provided in the full success story page on the website. Jacket makes GPU programming easy. “Very little recoding was needed to promote the LBM code to run on the GPU,” say the authors at one point in their paper. In this blog post, we share the highlights of this work. Using these methods, the authors are able to simulate shallow water flow and mass transport. For instance, check out these videos of a dam break. The authors completed this work with a relatively older …