We compared two implementations of an application that uses a 3-D adaptive mesh refinement (AMR) algorithm to solve the semilinear wave equation with exponent p = 7, using second-order finite differencing in space and third-order Runge-Kutta integration in time. One implementation uses conventional techniques based on MPI; the other uses HPX. The figures below compare the strong scaling behavior of the two implementations.
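To make the numerical scheme concrete, here is a minimal sketch of the discretization described above. It is not the benchmark code and rests on several assumptions: one periodic spatial dimension instead of 3-D AMR, the sign convention u_tt = u_xx + u^p for the nonlinearity, and the three-stage Shu-Osher form of third-order Runge-Kutta (the text does not say which RK3 variant was used). All function names are hypothetical.

```python
import math


def standing_wave(n, amp):
    """Initial data u = amp * sin(2*pi*x), v = 0 on the periodic grid x_i = i/n."""
    u = [amp * math.sin(2.0 * math.pi * i / n) for i in range(n)]
    return u, [0.0] * n


def rhs(u, v, dx, p=7):
    """Right-hand side of the first-order system u_t = v, v_t = u_xx + u**p.

    The sign of the u**p term is an assumption of this sketch.
    """
    n = len(u)
    inv_dx2 = 1.0 / (dx * dx)
    # Second-order centered difference for u_xx; u[i - 1] and the modulo
    # implement the periodic wrap-around at both ends of the grid.
    dv = [
        (u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) * inv_dx2 + u[i] ** p
        for i in range(n)
    ]
    return list(v), dv


def rk3_step(u, v, dx, dt, p=7):
    """Advance (u, v) by one step of the three-stage Shu-Osher RK3 scheme."""
    # Stage 1: forward Euler step from the old state.
    du, dv = rhs(u, v, dx, p)
    u1 = [a + dt * b for a, b in zip(u, du)]
    v1 = [a + dt * b for a, b in zip(v, dv)]
    # Stage 2: Euler step from stage 1, blended 3/4 : 1/4 with the old state.
    du, dv = rhs(u1, v1, dx, p)
    u2 = [0.75 * a + 0.25 * (b + dt * c) for a, b, c in zip(u, u1, du)]
    v2 = [0.75 * a + 0.25 * (b + dt * c) for a, b, c in zip(v, v1, dv)]
    # Stage 3: Euler step from stage 2, blended 1/3 : 2/3 with the old state.
    du, dv = rhs(u2, v2, dx, p)
    u3 = [a / 3.0 + 2.0 / 3.0 * (b + dt * c) for a, b, c in zip(u, u2, du)]
    v3 = [a / 3.0 + 2.0 / 3.0 * (b + dt * c) for a, b, c in zip(v, v2, dv)]
    return u3, v3


def evolve(u, v, dx, dt, steps, p=7):
    """Apply rk3_step repeatedly and return the final (u, v)."""
    for _ in range(steps):
        u, v = rk3_step(u, v, dx, dt, p)
    return u, v
```

For example, `evolve(*standing_wave(64, 1e-3), dx=1/64, dt=0.25/64, steps=256)` advances a small-amplitude standing wave through one period of the linearized problem. In an AMR code the same update would be applied patch by patch on each refinement level, which is where the scheduling differences between the two implementations come into play.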
In application performance experiments, the HPX runtime system substantially reduced starvation and latency effects, which resulted in better load balancing and better strong scaling than the comparison code written using MPI. As levels of refinement were added to the simulation, strong scaling improved in the HPX version. The MPI comparison code showed the opposite behavior: its strong scaling degraded as levels of refinement were added. The reduction in starvation and the mitigation of latencies when using the HPX runtime system come at the cost of increased overhead and contention. Some of this overhead can be controlled and partially amortized by adjusting the task granularity of a simulation: coarser tasks reduce the number of lightweight threads used, and the user can tune the granularity for a particular simulation configuration.
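The granularity trade-off can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not a description of how HPX schedules work: it assumes every task pays a fixed scheduling overhead and does work proportional to the number of grid points it owns. The function names and all parameter values are hypothetical.

```python
import math


def num_tasks(total_points, granularity):
    """Number of tasks (lightweight threads) needed when each task
    owns at most `granularity` grid points."""
    return math.ceil(total_points / granularity)


def overhead_fraction(granularity, work_per_point, overhead_per_task):
    """Fraction of a task's total cost spent on per-task overhead.

    Assumes a fixed overhead per task plus useful work that grows
    linearly with the number of points in the task.
    """
    useful = granularity * work_per_point
    return overhead_per_task / (useful + overhead_per_task)


def tasks_per_core(total_points, granularity, cores):
    """Average number of tasks available to each core; values near or
    below 1 leave the scheduler little room to balance load."""
    return num_tasks(total_points, granularity) / cores
```

In this model, raising the granularity from 10 to 90 points per task (with one unit of work per point and ten units of overhead per task) lowers the overhead fraction from 0.5 to 0.1, but it also cuts the number of tasks available for load balancing by a factor of nine. Granularity tuning looks for the point where overhead is amortized without starving the cores.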
As levels of refinement are added, the strong scaling of the MPI application becomes worse.
As levels of refinement are added, the strong scaling of the equivalent HPX application improves.
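For readers less familiar with the term, strong scaling measures how the run time of a fixed-size problem changes as cores are added. The sketch below shows the usual speedup and parallel-efficiency definitions; the function names are hypothetical, and the timings in the usage note are placeholders, not measurements from these experiments.

```python
def speedup(t_base, t_n):
    """Speedup of a run taking t_n seconds relative to a baseline run
    of the same problem taking t_base seconds."""
    return t_base / t_n


def parallel_efficiency(t_base, cores_base, t_n, cores_n):
    """Strong-scaling efficiency: achieved speedup divided by the ideal
    speedup cores_n / cores_base. A value of 1.0 is perfect scaling."""
    return (t_base * cores_base) / (t_n * cores_n)
```

With placeholder timings of 100 s on 1 core and 25 s on 8 cores, `speedup(100.0, 25.0)` is 4.0 and `parallel_efficiency(100.0, 1, 25.0, 8)` is 0.5. A strong scaling curve that "improves" is one whose efficiency stays closer to 1.0 as the core count grows.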