In this post we look at benchmarks of the following ArrayFire image processing functions on an ARM device: erode, medfilt, resize, histogram, bilateral and convolve.
We pitted the brand new compute 3.2 GPU on the NVIDIA Jetson TK1 against a mobile NVIDIA GPU. The closest match in our mobile card deck to the GPU on the Jetson board (referred to as TK1 from here on) is an NVIDIA GT 650M. The GPU device properties that have a critical effect on function performance are listed below.
Property Name / Device Name | Jetson TK1 GK20A | GT 650M |
---|---|---|
Compute | 3.2 | 3.0 |
Number of multiprocessors | 1 | 2 |
Cores | 192 | 384 |
Base clock rate | 852 MHz | 950 MHz |
Total global memory | 1746 MB | 2048 MB |
Total shared memory per block | 48 KB | 48 KB |
Total constant memory | 64 KB | 64 KB |
Memory clock rate | 924 MHz | 900 MHz |
Memory bus width | 64-bit | 128-bit |
Total registers per block | 32768 | 65536 |
Warp size | 32 | 32 |
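To give a feel for how the memory rows in the table translate into throughput, here is a rough sketch that estimates theoretical peak memory bandwidth from the memory clock and bus width. It assumes double-data-rate memory (two transfers per clock), which the table does not state, and the helper name `peak_bandwidth_gbs` is hypothetical.

```cpp
#include <cassert>

// Theoretical peak memory bandwidth in GB/s.
// transfers_per_clock = 2 assumes double-data-rate memory; this is an
// assumption for the sketch, not a figure taken from the table.
double peak_bandwidth_gbs(double mem_clock_mhz, int bus_width_bits,
                          int transfers_per_clock = 2) {
    assert(bus_width_bits > 0 && bus_width_bits % 8 == 0);
    const double bytes_per_transfer = bus_width_bits / 8.0;
    return mem_clock_mhz * 1e6 * transfers_per_clock * bytes_per_transfer / 1e9;
}
```

Under that assumption, `peak_bandwidth_gbs(924, 64)` gives roughly 14.8 GB/s for the TK1 and `peak_bandwidth_gbs(900, 128)` gives 28.8 GB/s for the GT 650M, so the wider bus roughly doubles the bandwidth available to memory-bound functions.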
Benchmarks
Images with the following resolutions were used for the benchmarks.
- 480p (720×480)
- 720p and 1080p HD
- 4K and 8K UHD
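The amount of work per frame grows quickly across these resolutions. The sketch below tabulates the pixel counts; only the 480p dimensions come from the list above, while the others are the standard dimensions for each format and are assumed here. The `Resolution` type and `benchmark_resolutions` helper are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct Resolution {
    std::string name;
    int width;
    int height;
    std::int64_t pixels() const { return std::int64_t(width) * height; }
};

// 480p is taken from the list above; the remaining entries use the standard
// dimensions for each format, assumed because the post does not list them.
std::vector<Resolution> benchmark_resolutions() {
    return {
        {"480p", 720, 480},
        {"720p HD", 1280, 720},
        {"1080p HD", 1920, 1080},
        {"4K UHD", 3840, 2160},
        {"8K UHD", 7680, 4320},
    };
}
```

With these dimensions an 8K UHD frame holds 33,177,600 pixels, 96 times as many as a 480p frame and four times as many as a 4K UHD frame.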
Note that the vertical axis (frames per second) in all of the plots is on a log scale. Higher values along the vertical axis mean better performance for that function on a given device.
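As a reminder of how to read such an axis, the sketch below converts a per-frame run time into frames per second and into a position on a base-10 log axis. Both helper names are hypothetical, and the base-10 choice is an assumption about the plots.

```cpp
#include <cassert>
#include <cmath>

// Frames per second for a given run time per frame in milliseconds.
double fps_from_ms(double ms_per_frame) {
    assert(ms_per_frame > 0.0);
    return 1000.0 / ms_per_frame;
}

// Position of a frame rate on a base-10 log axis. Equal vertical distances
// correspond to equal ratios of frame rates, not equal differences.
double log_axis_position(double fps) {
    assert(fps > 0.0);
    return std::log10(fps);
}
```

For example, a 40 ms frame time is 25 fps, and the gap between 10 fps and 100 fps is drawn the same height as the gap between 100 fps and 1000 fps, so small vertical differences in the plots can hide large speed ratios.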
Erosion/Dilation
A 3×3 mask was used to benchmark the erode function. Dilate results will be similar to erode because they use the same algorithm with a different local neighborhood operator.
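To illustrate why erosion and dilation share one algorithm, here is a minimal CPU reference sketch for a single-channel 8-bit image with a full 3×3 mask. It is not ArrayFire's implementation; the names `morph3x3`, `erode3x3` and `dilate3x3` are hypothetical, and skipping neighbors that fall outside the image is an assumed border policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;  // row-major, single channel

// Shared 3x3 neighborhood loop; `pick` is the local operator
// (minimum for erosion, maximum for dilation).
template <typename Pick>
Image morph3x3(const Image& src, int w, int h, Pick pick) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    Image dst(src.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            std::uint8_t v = src[y * w + x];
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    // Assumed border policy: ignore out-of-image neighbors.
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    v = pick(v, src[yy * w + xx]);
                }
            }
            dst[y * w + x] = v;
        }
    }
    return dst;
}

Image erode3x3(const Image& src, int w, int h) {
    return morph3x3(src, w, h,
                    [](std::uint8_t a, std::uint8_t b) { return std::min(a, b); });
}

Image dilate3x3(const Image& src, int w, int h) {
    return morph3x3(src, w, h,
                    [](std::uint8_t a, std::uint8_t b) { return std::max(a, b); });
}
```

The two wrappers differ only in the operator passed to the shared loop, so their memory access pattern, and therefore their expected run time, is the same.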
Median Filter
A 3×3 mask was used to benchmark the medfilt function as well.
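For reference, a 3×3 median filter replaces each pixel with the middle value of its nine neighbors. The CPU sketch below is not ArrayFire's implementation; `median3x3` is a hypothetical name, and clamping coordinates to the nearest edge pixel is an assumed border policy.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;  // row-major, single channel

// 3x3 median filter. Out-of-range coordinates are clamped to the nearest
// edge pixel; this border policy is an assumption made for the sketch.
Image median3x3(const Image& src, int w, int h) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    Image dst(src.size());
    std::array<std::uint8_t, 9> window;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            int n = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int yy = std::min(std::max(y + dy, 0), h - 1);
                    const int xx = std::min(std::max(x + dx, 0), w - 1);
                    window[n++] = src[yy * w + xx];
                }
            }
            // Partial sort: the 5th smallest of 9 values is the median.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            dst[y * w + x] = window[4];
        }
    }
    return dst;
}
```

Unlike erosion, which needs only a running minimum, the median requires a selection over all nine values per pixel, which is one plausible reason it is the more expensive of the two.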
Resize
We benchmarked resize for halving the image size.
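One simple way to halve an image is to average each 2×2 block of input pixels into one output pixel. The sketch below shows that approach for a single-channel 8-bit image; the post does not say which interpolation method was benchmarked, so the box average and the name `halve` are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::uint8_t>;  // row-major, single channel

// Halves width and height by averaging 2x2 blocks (rounded to nearest).
// For odd dimensions the last row or column is dropped.
Image halve(const Image& src, int w, int h) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    const int ow = w / 2, oh = h / 2;
    Image dst(static_cast<std::size_t>(ow) * oh);
    for (int oy = 0; oy < oh; ++oy) {
        for (int ox = 0; ox < ow; ++ox) {
            const int x = 2 * ox, y = 2 * oy;
            const int sum = src[y * w + x] + src[y * w + x + 1] +
                            src[(y + 1) * w + x] + src[(y + 1) * w + x + 1];
            dst[oy * ow + ox] = static_cast<std::uint8_t>((sum + 2) / 4);
        }
    }
    return dst;
}
```

Each output pixel reads four inputs and writes one value, so the operation is dominated by memory traffic rather than arithmetic.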
Histogram
We benchmarked a standard 256-bin histogram.
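For an 8-bit image, a 256-bin histogram is a count of how often each intensity value occurs. A minimal sequential sketch follows; `histogram256` is a hypothetical name, and a GPU version would additionally need atomic or per-block partial counts, which are not shown here.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Counts occurrences of each 8-bit intensity value (one bin per value).
std::array<std::uint32_t, 256> histogram256(const std::vector<std::uint8_t>& img) {
    std::array<std::uint32_t, 256> bins{};  // zero-initialized
    for (std::uint8_t v : img) {
        ++bins[v];
    }
    return bins;
}
```

The bin counts always sum to the number of pixels, which makes a convenient sanity check when validating an accelerated implementation against this reference.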
Bilateral filter
We used a spatial variance of 3.5 (7×7 window) and a chromatic variance of 50 to benchmark the bilateral function.
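A bilateral filter weights each neighbor by both its distance from the center pixel and its difference in intensity, which smooths flat regions while preserving edges. The CPU sketch below is not ArrayFire's implementation. It interprets the two parameters as the sigma values of Gaussian weights, which is an assumption, and `bilateral_ref` is a hypothetical name. A radius of 3 gives the 7×7 window.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using ImageF = std::vector<float>;  // row-major, single channel

// Bilateral filter with Gaussian spatial and chromatic weights.
// sigma_s and sigma_c are treated as Gaussian sigmas (an assumption).
// Neighbors outside the image are skipped and the weights renormalized.
ImageF bilateral_ref(const ImageF& src, int w, int h,
                     float sigma_s, float sigma_c, int radius) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    ImageF dst(src.size());
    const float inv2ss = 1.0f / (2.0f * sigma_s * sigma_s);
    const float inv2sc = 1.0f / (2.0f * sigma_c * sigma_c);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            const float center = src[y * w + x];
            float sum = 0.0f, norm = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    const float v = src[yy * w + xx];
                    const float d2 = static_cast<float>(dx * dx + dy * dy);
                    const float c2 = (v - center) * (v - center);
                    const float wgt = std::exp(-d2 * inv2ss - c2 * inv2sc);
                    sum += wgt * v;
                    norm += wgt;
                }
            }
            // norm is at least 1 because the center pixel has weight 1.
            dst[y * w + x] = sum / norm;
        }
    }
    return dst;
}
```

A call such as `bilateral_ref(img, w, h, 3.5f, 50.0f, 3)` mirrors the benchmark settings under that interpretation. Each output pixel needs 49 exponentials, which helps explain why this is the slowest function in the set.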
Convolution
A 5×5 blur kernel was used to benchmark the convolve function.
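As a reference for what is being measured, here is a direct 2D convolution sketch on the CPU together with a 5×5 box blur kernel. The post does not say which blur kernel was used, so the uniform box kernel, the zero padding at the borders, and the names `convolve2d_ref` and `box_kernel` are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using ImageF = std::vector<float>;  // row-major, single channel

// Uniform blur kernel: every weight is 1 / (ksize * ksize).
std::vector<float> box_kernel(int ksize) {
    return std::vector<float>(static_cast<std::size_t>(ksize) * ksize,
                              1.0f / static_cast<float>(ksize * ksize));
}

// Direct 2D convolution with an odd-sized square kernel.
// Pixels outside the image are treated as zero (assumed border policy).
ImageF convolve2d_ref(const ImageF& src, int w, int h,
                      const std::vector<float>& kernel, int ksize) {
    assert(src.size() == static_cast<std::size_t>(w) * h);
    assert(ksize % 2 == 1);
    assert(kernel.size() == static_cast<std::size_t>(ksize) * ksize);
    const int r = ksize / 2;
    ImageF dst(src.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            float acc = 0.0f;
            for (int j = 0; j < ksize; ++j) {
                for (int i = 0; i < ksize; ++i) {
                    // Convolution flips the kernel relative to the image.
                    const int xx = x - (i - r);
                    const int yy = y - (j - r);
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;
                    acc += kernel[j * ksize + i] * src[yy * w + xx];
                }
            }
            dst[y * w + x] = acc;
        }
    }
    return dst;
}
```

With a 5×5 kernel each output pixel costs 25 multiply-adds, compared with 9 comparisons for the 3×3 erode, yet both stay regular and data-independent, which suits GPU execution well.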
Conclusion
- erode, resize and convolve run in real time for all resolutions except 8K UHD.
- medfilt runs in real time up to 1080p HD resolution and falls to interactive rates for 4K UHD.
- bilateral runs in real time up to 1080p HD resolution and falls to 8 fps for 4K UHD.
- histogram run times are good enough for it to be used in any photo editing software without noticeable lag when generating image histograms.
ArrayFire offers a plethora of other image processing functions. You can find the complete list of available functions in our documentation. The main takeaway from this post is that we can easily do image processing in real time on the Jetson TK1 at resolutions up to 4K UHD.
Update
If you want to try out ArrayFire on your Jetson TK1, please contact us at sales@arrayfire.com.
We’ve released ArrayFire for Jetson TK1. You can now get access to the latest version from our download page.
Comments
Very helpful! We just got a Jetson TK1 and want to migrate our code to it. We hope to learn more about ArrayFire functions for face recognition.
We are very glad that the post was helpful. ArrayFire functions for computer vision algorithms are currently in the testing phase. We very recently did a blog post on feature detection, which can be found at http://arrayfire.com/computer-vision-arrayfire-part-1/ . In that post, we shared benchmarks of the Harris corner detector and the FAST feature detector in comparison with OpenCV’s implementation. We shall keep posting new benchmarks as more functions pass through the testing phase.
The graphs are a joke. Make the vertical axis linear or simply don’t do any graphs.