GPUSPH version 4.1 is out! Head over to the release tag on GitHub.

We are pleased to announce the official release of GPUSPH version 4.1. This version introduces several fixes and improvements over version 4.0:

  • open boundaries now have multi-GPU/multi-node support (only available for semi-analytical boundaries) (fixes GitHub issue #18);
  • multi-device simulations do not lock unused devices (fixes GitHub issue #17);
  • support for updated Chrono library (version 3.0) (should fix GitHub issue #19);
  • automatic water depth computation at open boundaries is now possible for all types of open boundaries (both prescribed pressure and prescribed velocity);
  • the hotstart (resume) functionality now works for simulations involving moving bodies (but see Known Issues below);
  • improved balanced domain split with SA boundaries.

Requirements:

  • Linux or Mac OS X;
    • full support for Mac OS X may be phased out in the future, due to the difficulty of keeping up with the evolution of CUDA and compiler support on that platform;
  • CUDA 7.0 or later;
  • a host compiler with C++11 support, and supported by CUDA:
    • on Linux, GCC 4.8 or later, or Clang 3.6 or later (CUDA 7.5 or later only);
    • on Mac OS X, Xcode 6 or later;
  • the Chrono library is required for fluid/solid and solid/solid interaction;
    • GPUSPH can be compiled without Chrono support;
    • you need version 3.0 of the Chrono library. The develop branch of Chrono at the time of this announcement (7f3d61f641c41b63f58c79dc0c0355768864f911) is known to work too;
  • an MPI implementation for multi-node support;
    • OpenMPI 1.8.4 or MVAPICH2 1.8 is recommended. Other MPI implementations may work too, but without GPUDirect support;
    • GPUSPH can be compiled without MPI support.
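
The CUDA and host compiler constraints above can be checked mechanically before attempting a build. The following is a minimal Python sketch and is not part of GPUSPH: the function name `check_toolchain`, its parameters, and the strings used to identify platforms and compilers are all hypothetical, and versions are passed in as tuples rather than detected from the system. It also assumes that the "CUDA 7.5 or later only" note applies to Clang on Linux and not to GCC.

```python
def check_toolchain(os_name, cuda, compiler, compiler_version):
    """Return a list of problems; an empty list means the requirements are met.

    Hypothetical helper, not part of GPUSPH.
    Versions are (major, minor) tuples, e.g. (7, 5) for CUDA 7.5.
    """
    problems = []

    # CUDA 7.0 or later is required on every platform.
    if cuda < (7, 0):
        problems.append("CUDA 7.0 or later is required")

    if os_name == "linux":
        if compiler == "gcc":
            if compiler_version < (4, 8):
                problems.append("GCC 4.8 or later is required")
        elif compiler == "clang":
            if compiler_version < (3, 6):
                problems.append("Clang 3.6 or later is required")
            # Assumption: the CUDA 7.5 restriction applies to Clang only.
            if cuda < (7, 5):
                problems.append("Clang requires CUDA 7.5 or later")
        else:
            problems.append("unsupported host compiler: " + compiler)
    elif os_name == "macos":
        if compiler != "xcode":
            problems.append("unsupported host compiler: " + compiler)
        elif compiler_version < (6, 0):
            problems.append("Xcode 6 or later is required")
    else:
        problems.append("unsupported OS: " + os_name)

    return problems


if __name__ == "__main__":
    for problem in check_toolchain("linux", (7, 0), "clang", (3, 6)):
        print(problem)
```

Tuples are used for versions so that, for instance, GCC 4.10 compares as newer than GCC 4.8, which a plain floating-point comparison would get wrong. The Chrono and MPI requirements are left out of the sketch because both are optional.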

Known issues:

  • large simulations may lock up on Maxwell-class GPUs (Compute Capability 5.x, e.g. GTX 9xx, GTX Titan, Titan X) with CUDA version 7.x;
    • this is actually an issue in the Thrust library, which we use in the particle sort;
    • upgrading to the latest version of CUDA fixes the issue (CUDA 8 at the time of writing);
  • hot-starting a simulation with moving objects may produce slightly different results;
  • simulations with moving objects running on multiple nodes may fail to resume correctly on hot-start;
  • if a simulation runs to completion, it is possible to resume from the last checkpoint and continue, but the results may differ from those that would have been obtained by running the original simulation for longer;
  • the Ferrari correction may not work correctly with boundary formulations other than semi-analytical;
  • the time step size may vary when using semi-analytical boundaries, even if a fixed time step is requested.
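
The first known issue depends on a specific combination of GPU generation and CUDA version, which is easy to test for. Below is a minimal Python sketch of that test. It is not part of GPUSPH, and the function name `may_hit_thrust_lockup` is hypothetical. It only encodes the condition stated above (Compute Capability 5.x together with CUDA 7.x) and says nothing about simulation size.

```python
def may_hit_thrust_lockup(compute_capability, cuda_version):
    """Return True for the combination named in the known issue.

    Hypothetical helper, not part of GPUSPH.
    Both arguments are (major, minor) tuples, e.g. (5, 2) and (7, 5).
    """
    cc_major = compute_capability[0]
    cuda_major = cuda_version[0]
    # Maxwell-class devices report Compute Capability 5.x.
    return cc_major == 5 and cuda_major == 7


if __name__ == "__main__":
    if may_hit_thrust_lockup((5, 2), (7, 5)):
        print("Large simulations may lock up; consider upgrading CUDA.")
```

The compute capability of a device is reported by the `deviceQuery` CUDA sample, and the CUDA release by `nvcc --version`.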

For feedback or support requests, please use the official issue tracker.

We also invite you to join the #gpusph IRC channel on FreeNode.