Antenna array design involves repeated simulation to tune the many parameters involved, and waiting around for simulations to finish is no fun. Offloading the optimization problem onto the GPU cuts that time down significantly.
In their recent paper, Capozzoli, Curcio, and Liseno (pdf, citation) of the University of Naples Federico II demonstrated how a simple modification to their echo generator array simulation took advantage of the GPU to bring immediate speedups. Check out this figure from their paper: CPU simulation time grows prohibitively as more data is fed in, while GPU time grows only a little.
Their simulation is designed around optimizing an energy functional. Using fminunc to drive the optimization problem on the CPU, they simply modified their functional evaluation to take place on the GPU with Jacket. The most expensive part of their computation is the singular value decomposition, an algorithm for which Jacket easily beats the CPU.
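To make that division of labor concrete, here is a minimal CPU-only sketch of the pattern: fminunc drives the search and calls a functional that returns a scalar built from singular values. The 2-by-2 matrix and the names toyMatrix, svRatio, and toyFunctional are made up for illustration and are not the authors' model; in the Jacket version, the cast to the GPU and the cast back to double would happen inside the functional. It assumes fminunc is available (Optimization Toolbox in MATLAB).

```matlab
% Hypothetical stand-in for the array model: a 2x2 matrix driven by x.
toyMatrix = @(x) [1, x(1); x(2), 1];

% Same shape of cost as the post's functional: largest singular value
% divided by the sum of all singular values (svd returns them sorted).
svRatio       = @(S) S(1) / sum(S);
toyFunctional = @(x) double(svRatio(svd(toyMatrix(x))));

x0   = [0.9; 0.9];                      % illustrative starting point
opts = optimset('Display', 'off');      % keep the solver quiet

% fminunc runs on the CPU and only ever sees a scalar from the functional.
[xOpt, fOpt] = fminunc(toyFunctional, x0, opts);

fprintf('cost at start: %.4f, cost at end: %.4f\n', toyFunctional(x0), fOpt);
```

Because the optimizer only exchanges a parameter vector and a scalar with the functional, everything inside the functional can be moved to the GPU without touching the driver.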
Below is the core energy functional. It’s slightly modified from the original paper to take advantage of features in the latest release (v1.8) so that only a few lines need modification to GPU-enable the application.
The paper is a quick read with a solid intro to formulating the optimization problem. Thanks to Drs. Capozzoli, Curcio, and Liseno for sharing their work!
function Fun=functional(Coeff_legendre)
global k0 dq LEG_csi LEG_eta LEG_csi_qz LEG_eta_qz a b jk0dab Num_x_prime Num_y_prime Nx_qz Ny_qz max_degree

Coeff_legendre = gdouble(Coeff_legendre); % (1) cast to GPU
t=size(Coeff_legendre,1)/4;
Coeff_legendre_x = Coeff_legendre(1:t,:);
Coeff_legendre_y = Coeff_legendre((t+1):(2*t),:)';
Coeff_leg_x_qz = Coeff_legendre((2*t+1):(3*t),:);
Coeff_leg_y_qz = Coeff_legendre((3*t+1):(4*t),:)';

gfor m=1:Num_x_prime*Num_y_prime % (2) for-loop -> gfor-loop
    X_PRIME(m) = sum(flat(Coeff_legendre_x.*(LEG_csi(2:2:end,m)*LEG_eta(1:2:end-1,m).')));
    Y_PRIME(m) = sum(flat(Coeff_legendre_y.*(LEG_csi(1:2:end-1,m)*LEG_eta(2:2:end,m).')));
gend

gfor m=1:Nx_qz*Ny_qz % (3) for-loop -> gfor-loop
    X(m) = sum(flat(Coeff_leg_x_qz.*(LEG_csi_qz(2:2:end,m)*LEG_eta_qz(1:2:end-1,m).')));
    Y(m) = sum(flat(Coeff_leg_y_qz.*(LEG_csi_qz(1:2:end-1,m)*LEG_eta_qz(2:2:end,m).')));
gend

[XX XX_PRIME]=meshgrid(X,X_PRIME);
[YY YY_PRIME]=meshgrid(Y,Y_PRIME);
Rmn=sqrt(dq+(XX-XX_PRIME).^2+(YY-YY_PRIME).^2);
Kx = k0*(XX-XX_PRIME)./Rmn;
Ky = k0*(YY-YY_PRIME)./Rmn;
A = jk0dab./(Rmn.^2).*sinc(Ky*b/(2*pi)).*cos(Kx*a/2)./(pi^2-(Kx*a).^2).*exp(-1i*k0*Rmn);
S=svd(A);
Fun=double(1/(sum(S)/S(1))); % (4) ensure back on CPU for fminunc
end

function x = flat(x)
x = x(:);
end
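Two details of the listing are easy to check on the CPU with random stand-in data, which may help when reading it. First, each gfor iteration computes a weighted sum over an outer product, which equals the bilinear form u.'*C*v. Second, the returned value 1/(sum(S)/S(1)) is just S(1)/sum(S), which lies between 1/n (all n singular values equal) and 1 (a rank-one matrix). The sizes and the names C, U, V, and cost below are arbitrary placeholders, not values from the paper, and the sketch uses a plain for-loop in place of gfor.

```matlab
% Random stand-in data; sizes are arbitrary.
p = 3; q = 4; M = 5;
C = rand(p, q);   % plays the role of Coeff_legendre_x
U = rand(p, M);   % plays the role of LEG_csi(2:2:end, :)
V = rand(q, M);   % plays the role of LEG_eta(1:2:end-1, :)

% Loop form, mirroring the body of the first gfor loop.
xLoop = zeros(1, M);
for m = 1:M
    W = C .* (U(:, m) * V(:, m).');   % coefficients times outer product
    xLoop(m) = sum(W(:));             % same as sum(flat(...)) in the listing
end

% Closed form: sum_ij C(i,j)*u(i)*v(j) = u.'*C*v, for all m at once.
xVec = sum(U .* (C * V), 1);
maxDiff = max(abs(xLoop - xVec));     % should be at rounding-error level

% The cost returned by the functional, written as a ratio.
svRatio = @(S) S(1) / sum(S);
cost    = @(A) svRatio(svd(A));

costFlat  = cost(eye(4));    % four equal singular values: lower bound 1/4
costRank1 = cost(ones(4));   % one nonzero singular value: upper bound 1

fprintf('max loop/vector difference: %g\n', maxDiff);
fprintf('cost(eye(4)) = %.4f, cost(ones(4)) = %.4f\n', costFlat, costRank1);
```

Read this way, minimizing the functional pushes the singular values of A toward being as even as possible, and the SVD is the only step that is not a simple elementwise operation.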