ISC 2013 Keynote by Bill Dally of NVIDIA

Bill Dally of NVIDIA gave a wonderful keynote today at ISC 2013. He focused on the challenges our market faces in getting to exascale computing. He explained that Moore’s law is alive and well: transistor counts continue to double at an astonishing rate. However, the additional transistors are not translating into the big performance gains they delivered in the 1990s. Whereas performance used to grow 50% per year, it is growing at a much slower pace today. The biggest bottleneck to more performance is energy efficiency. Bill showed slides of chips and talked about the picojoules required to compute versus those required to move data and operands around the chip. The take-home message was that ...

Solution to NVIDIA Toolkit Installation Error for Ubuntu 12.10 [Driver: Installation Failed]

You may see this error message while trying to set up the NVIDIA CUDA Toolkit in Ubuntu. I ran into it while installing the toolkit for ArrayFire.

[1] CUDA Toolkit Installation

  1. Download the CUDA Toolkit in the link.
  2. Extract the .run file in a location
  3. Exit the X server (press Ctrl+Alt+F1) and stop the display manager by the following command.
  4. cd to the location; there are now run files named samples*, devdriver* and cudatoolkit*.
  5. Install devdriver (*only if the NVIDIA driver is not installed)
  6. Install cudatoolkit

In the end, when it asks "Would you like to create a symbolic link /usr/local/cuda/ pointing to /usr/local/cuda-x.x?", ...

Beamforming with ArrayFire

Alessandro Savoia and researchers at Università degli Studi Roma Tre have achieved an order-of-magnitude improvement in the performance of a beamforming application using ArrayFire for GPU acceleration with CUDA-capable NVIDIA GPUs. This application involves conventional beamforming. Steps include applying a time delay to each signal vector, summing across all vectors, and processing the result. Processing includes demodulation, envelope extraction, and logarithmic compression. ArrayFire's functions for shifting, interpolation, and filtering made it possible to accelerate this application on GPUs and significantly reduced development time. Alessandro's benchmarks show that a CPU-only version ran at only 1 frame/sec, while the ArrayFire-accelerated version ran at 10-20 frames/sec, depending on the dataset. Alessandro and his team are looking forward to ...
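To make the steps above concrete, here is a minimal CPU-only sketch of conventional (delay-and-sum) beamforming in NumPy. It is not Alessandro's code and it does not use ArrayFire. The function names, the synthetic pulse data, the sampling rate, and the 60 dB dynamic range are all assumptions chosen for illustration, and demodulation is folded into the envelope step (magnitude of the analytic signal) as a simplification.

```python
import numpy as np


def delay_and_sum(signals, delays_s, fs):
    """Delay each channel by delays_s[ch] seconds, then sum across channels.

    signals: array of shape (n_channels, n_samples); fs: sampling rate in Hz.
    Linear interpolation handles delays that are not whole samples.
    """
    n_channels, n_samples = signals.shape
    t = np.arange(n_samples) / fs
    summed = np.zeros(n_samples)
    for ch in range(n_channels):
        # Sampling the channel at (t - delay) shifts it later in time by delay.
        summed += np.interp(t - delays_s[ch], t, signals[ch], left=0.0, right=0.0)
    return summed


def envelope(x):
    """Envelope as the magnitude of the analytic signal, computed with the FFT."""
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    # Zeroing negative frequencies and doubling positive ones gives the analytic signal.
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))


def log_compress(env, dynamic_range_db=60.0):
    """Normalize to the peak and map to decibels, clipped to the dynamic range."""
    normalized = env / max(env.max(), 1e-12)
    db = 20.0 * np.log10(np.maximum(normalized, 1e-12))
    return np.clip(db, -dynamic_range_db, 0.0)


# Hypothetical test data: the same 5 MHz pulse reaching 4 channels at different times.
fs = 40e6
f0 = 5e6
n = np.arange(1024)
arrivals = np.array([400, 405, 410, 415])  # arrival sample index per channel
signals = np.stack([
    np.exp(-((n - a) ** 2) / (2 * 16.0 ** 2)) * np.cos(2 * np.pi * f0 * (n - a) / fs)
    for a in arrivals
])

# Delay the earlier channels so every pulse lines up with the latest arrival.
delays_s = (arrivals.max() - arrivals) / fs
beamformed = delay_and_sum(signals, delays_s, fs)
env = envelope(beamformed)
line_db = log_compress(env)
```

Once the delays line the pulses up, the channels add coherently, so the envelope peak of the sum is roughly the number of channels times the single-channel peak. A GPU version would replace the per-channel loop with batched shifting and interpolation, which is the kind of work the post credits ArrayFire's functions with handling.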

Are You Getting Left Behind?

HPCwire posted a nice article today with trends from IDC on computer processing. These trends fall in line with and corroborate things we've been saying here on this blog. Accelerators (including GPUs and co-processors) are taking off. Are you getting left behind? If you're reading this blog, you're probably at the bleeding edge, but nonetheless here are some interesting excerpts from HPCwire's market report (go read the whole thing): "While they expected to see a jump in coprocessor and accelerator uptake, they were wholly unprepared for the overwhelming positive response to GPUs and new entrants into the market, most notably Intel’s shiny new Phi." "Conway said that while accelerator and coprocessor adoption growth was anticipated, they had no idea that it would ...

ArrayFire + Scorpii Demo by CreativeC

CreativeC makes awesome compute + visualization systems. We got to see the demo live at the GPU Technology Conference last month. Tim Thomas was kind enough to let us film the demo showing how ArrayFire can be used to drive a multi-node, 9-GPU system in a physics application. Check out the video below. If you are interested in high-throughput compute coupled with high-pixel visualizations, we recommend you talk with the folks at CreativeC. They are always pushing the envelope on what can be done with GPU computing and GPU visualizations. Also, if you have cool demos showing ArrayFire in action, let us know. We'd love to film your work and make it available on this blog! ...

Parallel Software Development Trends for Dummies

Last month, I posted two articles describing computing trends and why heterogeneous computing will be a significant force in computing for the next decade. Today, I continue that series with an article describing the biggest challenge to continued increases in computing performance: parallel software development.

Biggest Challenge

As I described previously, software changes must be made in order to use an accelerator. Regular x86-based compilers cannot compile code to run on accelerators without these changes. The amount of software change required varies depending upon the availability of and reliance upon software tools that increase performance and productivity. There are four possible approaches to take advantage of accelerators in heterogeneous computing environments: do-it-yourself, use compilers, use libraries, or use ...

7 Highlights of GTC 2013 - Day 4 of 4

Day 4 at GTC is always a little less hyped than the first 3 days, but it is when some of the best sessions are found. Here are 7 of the highlights we've collected from our team on the last day of GTC 2013: Paulius Micikevicius of NVIDIA gave a great talk entitled "Performance Optimization: Programming Guidelines and GPU Architecture Details Behind Them." It was so great, we have 2 highlights from this talk. The first Paulius highlight is his explanation of why instruction-level parallelism is essential to take full advantage of Kepler GPUs. Paulius gave a clear presentation on these difficult concepts. The second Paulius highlight is the thorough treatment of the memory hierarchy for Kepler. It is very detailed and ...

7 Highlights of GTC 2013 - Day 3 of 4

Day 3 at GTC was awesome. It was super hard to narrow down our list to just 7 highlights. For instance, the stress ball pyramid in our booth does not count. Neither does the massive ArrayFire poster in front of the keynote hall. Here are 7 of the highlights we've collected from our team on the third day of GTC 2013: Professor Erez Lieberman Aiden of Baylor and Rice Universities gave a great keynote on "Parallel Processing of the Genomes, by the Genomes and for the Genomes." He discussed how the folding of genes and interactions between multiple folded genes can affect gene expression. It's not just about the composition of the gene, but also how the gene folds. It turns ...

7 Highlights of GTC 2013 - Day 1 of 4

AccelerEyes is out in force at GTC. We ended up with 10 of our engineers and sales staff here on site. I collected feedback from the team to learn what people enjoyed the most from today's activities. Here are 7 of the highlights we've collected from our team on the first day of GTC 2013: Will Ramey of NVIDIA kicked off GTC with a tutorial on the CUDA ecosystem. He talked about the three different approaches to getting GPU acceleration: 1) Libraries, 2) Compiler Directives, and 3) Programming Languages. He talked about how libraries, if you can find one for your application (hint, hint), are the best of the 3 options, because you get great performance and you don't have to ...

Heterogeneous Computing Trends for Dummies

Ten days ago, I posted an article on CPU Processing Trends for Dummies. Today, I continue that series with an article describing the latest major trend in computing, namely Heterogeneous Computing.

The Point

The point of these articles is to paint the high-level picture for trends in computer processing. I hope this bigger picture will help summarize things for those who do not breathe computer processors and technical software on a daily basis. Over the last 20 years, big gains in computer processing have been defined by increases in CPU clock speeds, then by increases in the number of CPU cores. The next 10+ years will be defined by heterogeneous computing.

Heterogeneous Computing

So let's start with a definition: Heterogeneous ...