Researchers from the University of Bordeaux credit ArrayFire in a Master’s thesis by André Almeida, titled “Soliton Excitations in Open Systems using GPGPU Supercomputing.” The thesis uses numerical simulations to investigate the stability of nonlinear excitations in open optical systems, modeled by the Complex Ginzburg-Landau Equation, under effects such as dissipation and gain.

**Summary**

In 1834, the naval engineer John Scott Russell made the first recorded observation of a very uniform accumulation of water in a boat canal that was able to propagate for many kilometers with no loss in amplitude and with constant width. This was a very strange phenomenon at the time, because no known description of hydrodynamics could explain how a single wave could propagate with those properties. Russell described a wave in a steady state that “rolled forward with great velocity, assuming a form of a large solitary elevation which continued its course along the channel apparently without change of form or diminution of speed.” This type of wave became known as a soliton.

There are many types of solitons, depending on the specific nonlinear equation that governs them, and they have been studied intensively to understand how they influence, and can be used to describe, physical effects. Figure 2.3 of the thesis shows an example analysis of two spatial solitons colliding.

In the context of nonlinear optics, solitonic solutions are classified into two main categories, spatial solitons and temporal solitons, depending on the axis along which the light is confined. A temporal soliton is an optical wave packet that maintains its shape as it propagates, whereas a spatial soliton is a continuous-wave beam that is confined in the directions transverse to the direction of propagation.

In this thesis, the problem of interest is the propagation of an optical pulse in a two-dimensional waveguide with a rectangular cross section, filled with a 4-level atomic system in an N-type configuration, as represented schematically in figure 4.2.

**Results**

The thesis analyzes the computational performance of the Complex Ginzburg-Landau Equation (CGLE) solver in order to verify the advantages of GPU computing over the single-thread CPU equivalent. This was done by running the same physical problem on different platforms, with hardware budgets ranging from a low-end laptop to a high-end desktop, running the solver on both CPUs and GPUs, and using implementations based on both CUDA and OpenCL.

The numerical tests simulated physical systems of different sizes and measured the elapsed time of each simulation. The results are shown in figure 3.4 as SpeedUp factors of the simulations on different platforms and for different versions, relative to the CPU version. The single-thread results show that for small physical systems, with fewer grid points, there is no benefit in using GPUs, since the CPU cores are significantly faster. However, as the number of grid points increases, CPU performance degrades considerably and all the GPUs outperform the CPU.
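
As a rough illustration of how such a SpeedUp factor can be computed, here is a minimal sketch in standard C++. It is not the thesis's benchmarking code: the helper names `elapsed_seconds` and `speedup` are hypothetical, and the ratio assumes the usual definition of SpeedUp as baseline time divided by candidate time.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: wall-clock time, in seconds, of a single call to `work`.
template <typename F>
double elapsed_seconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: SpeedUp of a candidate run relative to a baseline run
// (for example, the single-thread CPU version). A value above 1 means the
// candidate finished faster than the baseline.
double speedup(double baseline_seconds, double candidate_seconds) {
    assert(baseline_seconds > 0.0 && candidate_seconds > 0.0);
    return baseline_seconds / candidate_seconds;
}
```

With these helpers, a baseline of 110 s against a candidate of 2 s gives `speedup(110.0, 2.0) == 55.0`. When the timed work runs on a GPU, the device should be synchronized before the clock is stopped, because GPU calls are typically asynchronous; ArrayFire provides `af::sync()` for that purpose.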

**The code was implemented in C++ with ArrayFire, so it can run on almost any graphics card available in any computer, from laptops to workstations.** This is an important achievement because the complex equation can now be simulated on virtually any machine, which provides high portability. The solver was benchmarked in two ways. The first test was against a single CPU core in a workstation: we obtained a SpeedUp of around 55 using a GTX Titan with the CUDA framework, and with mainstream desktop hardware we observed a good improvement, with 30 times less time required to simulate the propagation dynamics of this nonlinear light-matter interaction system. The second test was against a multi-core CPU. Here the results are somewhat different in scale, with a maximum SpeedUp of approximately 3.7. For low resolutions we found a separation between frameworks, with CUDA underperforming OpenCL and even the Intel CPU. Interestingly, and contrary to the single-thread comparison, the laptop GPUs cannot outperform the workstation's i7-4930K CPU.

**Conclusion**

The supercomputing solver was written in C++ using the ArrayFire library and can simulate the propagation of a (1 + 1)-dimensional optical pulse in a nonlinear system described by the CGLE. The main structure is based on an inner kernel that was optimized to perform the calculations, with the remaining non-kernel sections kept separate. This gives great modularity to add, remove, or adjust the parameters that describe the solution, as well as to manipulate the soliton stability condition factors. The benchmarks of the GPGPU supercomputing solver confirmed what was predicted. Against single-core operation, we obtained a SpeedUp of around 55 using a GTX Titan with the CUDA framework by means of ArrayFire.
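
The kernel/driver separation described above can be sketched in a few lines of standard C++. This is only an illustration under stated assumptions, not the thesis's solver: it uses the textbook cubic CGLE rather than whatever form the thesis derives, a simple explicit Euler step with periodic boundaries rather than the thesis's numerical scheme, and plain `std::vector` instead of ArrayFire arrays. The names `CgleParams`, `cgle_step`, and `propagate` are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using Field = std::vector<Complex>;

// Hypothetical parameter set for the textbook cubic CGLE (assumed form):
//   dA/dz = A + (1 + i b) d2A/dx2 - (1 + i c) |A|^2 A
// Keeping the parameters in one struct lets them change without touching the kernel.
struct CgleParams {
    double b;   // imaginary part of the linear (second-derivative) coefficient
    double c;   // imaginary part of the nonlinear coefficient
    double dx;  // grid spacing in x
    double dz;  // step size along the evolution variable z
};

// Inner kernel: one explicit Euler step on a periodic grid.
Field cgle_step(const Field& a, const CgleParams& p) {
    const std::size_t n = a.size();
    Field next(n);
    const Complex lin(1.0, p.b);
    const Complex nonlin(1.0, p.c);
    for (std::size_t j = 0; j < n; ++j) {
        const Complex left = a[(j + n - 1) % n];
        const Complex right = a[(j + 1) % n];
        // Second-order central difference for d2A/dx2.
        const Complex laplacian = (left - 2.0 * a[j] + right) / (p.dx * p.dx);
        // std::norm returns the squared magnitude |A|^2.
        const Complex rhs = a[j] + lin * laplacian - nonlin * std::norm(a[j]) * a[j];
        next[j] = a[j] + p.dz * rhs;
    }
    return next;
}

// Driver (non-kernel section): repeatedly applies the kernel.
Field propagate(Field a, const CgleParams& p, int steps) {
    for (int s = 0; s < steps; ++s) {
        a = cgle_step(a, p);
    }
    return a;
}
```

In this layout only `cgle_step` would need to be ported to a GPU library, while `CgleParams` and `propagate` stay unchanged. Note that an explicit step like this one is only stable for small steps (roughly `dz` well below `dx * dx / 2`), so it is suited to illustration rather than production runs.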

Thanks to these researchers for sharing their great work with us!