Stanford GPU Benchmarks: Jacket vs PCT/GPU

John Melonakos | Benchmarks, Case Studies, CUDA

Researchers in the Pervasive Parallelism Laboratory at Stanford University recently published a paper, “A Domain-Specific Approach to Heterogeneous Parallelism,” describing a novel framework for parallel computing. As part of their research, they compared Jacket to the GPU support in the Parallel Computing Toolbox (PCT/GPU). The results show that Jacket’s optimizations make a substantial difference in performance.

In this blog post, we highlight four algorithms included in their research:

Gaussian Discriminant Analysis (GDA): Generative learning algorithm for modeling the probability distribution of a set of data as a multivariate Gaussian. Problem size: 1,200×1,024 matrix.
Restricted Boltzmann Machine (RBM): Stochastic recurrent neural network, without connections between hidden units. Problem size: 2,000 hidden units, 2,000 dimensions.
Support Vector Machine (SVM): Optimal margin classifier, implemented using the Sequential Minimal Optimization (SMO) algorithm. Problem size: 800×1,448 matrix.
Naïve Bayes (NB): Fast, low-work supervised learning algorithm for classification. Problem size: 25,000×1,448 matrix.
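
To give a feel for the simplest of these algorithms, here is a minimal sketch of a multinomial Naïve Bayes classifier in plain Python. It is not the code benchmarked by the Stanford researchers, and it uses neither Jacket nor the Parallel Computing Toolbox. The function names (`train_naive_bayes`, `predict`), the Laplace smoothing, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def train_naive_bayes(rows, labels):
    """Fit a multinomial Naive Bayes model with Laplace (add-one) smoothing.

    rows   -- list of token-count vectors, one list of ints per example
    labels -- list of class labels, one per row
    Returns {label: (log_prior, [log_likelihood per feature])}.
    """
    n_features = len(rows[0])
    class_counts = defaultdict(int)
    token_counts = {}
    for row, label in zip(rows, labels):
        class_counts[label] += 1
        counts = token_counts.setdefault(label, [0] * n_features)
        for j, c in enumerate(row):
            counts[j] += c

    model = {}
    for label, counts in token_counts.items():
        total = sum(counts)
        log_prior = math.log(class_counts[label] / len(rows))
        # Add-one smoothing keeps unseen tokens from producing log(0).
        log_likelihood = [
            math.log((c + 1) / (total + n_features)) for c in counts
        ]
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, row):
    """Return the label with the highest posterior log-score for one row."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, log_likelihood) in model.items():
        score = log_prior + sum(c * ll for c, ll in zip(row, log_likelihood))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: 4 examples, 3 token features each.
rows = [[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(rows, labels)
print(predict(model, [1, 0, 0]))
print(predict(model, [0, 2, 0]))
```

Training is a single pass of counting over the data matrix. That fits the "low-work" description above: on a 25,000×1,448 matrix there is relatively little arithmetic per element moved.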

These algorithms were benchmarked using the following system at Stanford:

System Specs:
Computer: Dell Precision T7500n
Processor: Two quad-core Intel Xeon X5550 CPUs at 2.67 GHz. Each core has 2-way hyperthreading, for a total of 16 hardware thread contexts.

The execution times for these algorithms are shown in the charts below:

To learn more about how Jacket compares, visit:

Special thanks to the Stanford researchers for undertaking this effort, and good luck continuing this line of work! We look forward to learning from your insights.

Researchers: Hassan Chafi, Arvind K. Sujeeth, Kevin J. Brown, HyoukJoong Lee, Anand R. Atreya, and Kunle Olukotun
