Accelerate Computer Vision Data Access Patterns with Jacket & GPUs

Gallagher Pryor, ArrayFire

For computer vision, we’ve found that efficient implementations require a data access pattern that MATLAB does not currently support. MATLAB and the M language are great for linear algebra, where blocks of matrices are the typical access pattern, but not for computer vision, where algorithms typically operate on patches of imagery. For instance, to pull out a patch of imagery in M, one must write a doubly nested for loop,

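```
A = rand(100,100);
W = 2;                     % patch radius; the patch is (2*W+1)-by-(2*W+1)
x = 49; y = 49;            % patch center, zero-based as the +1 below implies
patch = zeros(2*W+1);      % preallocate the output patch
for xs = -W:W
    for ys = -W:W
        patch(xs+W+1, ys+W+1) = A(xs+1+x, ys+1+y);
    end
end
```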

…with guards for boundary conditions, etc. It gets even more complicated with non-square patches. On top of that, these implementations don’t map onto the GPU’s memory hierarchy at all and are thus slow.
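To make the boundary guards concrete, here is a minimal sketch of extracting one patch that may hang over the image edge. It assumes 1-based patch centers and zero padding outside the image; both are illustrative choices, and the variable names are ours rather than anything from Jacket.

```matlab
% Sketch: extract a (2*W+1)-square patch centered at (x, y), zero-padding
% wherever the patch falls outside the image. All names are illustrative.
A = rand(100, 100);
W = 2;                  % patch radius (assumed)
x = 1; y = 99;          % 1-based patch center near a corner (assumed)

P = zeros(2*W+1);       % output patch, zero everywhere by default
[nr, nc] = size(A);

% Clip the requested index ranges to the image bounds.
r = max(x-W, 1):min(x+W, nr);
c = max(y-W, 1):min(y+W, nc);

% Copy the valid region into the matching position inside the patch.
P(r - (x-W) + 1, c - (y-W) + 1) = A(r, c);
```

Even for a single square patch, the index bookkeeping is easy to get wrong, which is part of why this pattern tends to end up in hand-written MEX files.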

Therefore, we have implemented a command called windows,

`B = windows(A, x, y, z, w, T);`

whose purpose is to pull a large number of sparse (as opposed to sliding) windows out of A at the locations x, y, z, with radii w and affine transforms T, so that they can then be computed over. The command `windows` signals to Jacket that we’re using a patched access pattern, which can then be optimized on the GPU.
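As a rough CPU reference for what sparse window extraction means, the sketch below gathers a few square patches from a 2-D image in plain M. It is not Jacket's implementation: it ignores z and T, assumes a single scalar radius, assumes every window lies fully inside the image, and the patch-per-slice output layout is our own assumption.

```matlab
% CPU reference sketch of sparse window extraction (not Jacket's code).
A = rand(100, 100);
x = [10 50 90];         % window centers along the first index (1-based, assumed)
y = [20 50 80];         % window centers along the second index
w = 2;                  % common window radius (assumed scalar)
n = numel(x);

% Loop version: one (2*w+1)-square patch per slice of B.
B = zeros(2*w+1, 2*w+1, n);
for k = 1:n
    B(:, :, k) = A(x(k)-w : x(k)+w, y(k)-w : y(k)+w);
end

% Gather version: build every index up front, then read A in a single step.
[dx, dy] = ndgrid(-w:w, -w:w);            % offsets within one patch
rows = bsxfun(@plus, dx(:), x);           % (2*w+1)^2-by-n row indices
cols = bsxfun(@plus, dy(:), y);           % (2*w+1)^2-by-n column indices
B2 = reshape(A(sub2ind(size(A), rows, cols)), 2*w+1, 2*w+1, n);
```

The gather version expresses the whole patched access as one indexing operation, which is the kind of pattern a single command can hand to the GPU in bulk.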

This alleviates some of the problems MATLAB has with implementing computer vision algorithms in general; typically, lots of MEX files are written to handle these problems.

In the future, `windows` could also be evaluated lazily. So,

`hist(windows(...));`

would be intelligent: `hist` would use a windowed access pattern to compute many histograms over patches of an image. Implementations of NLFILTER and COLFILTER would likewise imply a particular memory access pattern that Jacket can optimize for.
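For comparison, an eager CPU version of the per-patch histogram idea can be sketched in plain M. It relies on `hist` working down the columns of a matrix, so each patch is flattened into one column first. The window centers, radius, and bin centers are illustrative assumptions, and nothing here reflects how a lazy `windows` would actually be implemented.

```matlab
% Sketch: one histogram per patch, computed eagerly on the CPU.
A = rand(100, 100);
x = [10 50 90]; y = [20 50 80];   % window centers (1-based, assumed)
w = 2;                            % window radius (assumed)

% Gather each patch into one column of P.
[dx, dy] = ndgrid(-w:w, -w:w);
idx = sub2ind(size(A), bsxfun(@plus, dx(:), x), bsxfun(@plus, dy(:), y));
P = A(idx);                       % (2*w+1)^2-by-numel(x)

% hist works down the columns, giving one histogram per patch.
centers = 0.05:0.1:0.95;          % 10 bin centers covering [0, 1]
counts = hist(P, centers);        % numel(centers)-by-numel(x)
```

Here the patches are fully materialized in `P` before `hist` runs; a lazy `windows` would let that intermediate copy be skipped.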

We’re hoping that such developments will lead to efficient MATLAB implementations of common patch-reliant computer vision algorithms such as KLT, SIFT, and Mean-Shift, without having to develop a tightly coupled, hard-to-understand batch of C and CUDA code.