GPUSPH v4 released

GPUSPH version 4.0 is out! Head over to the release tag on GitHub.

We are pleased to announce the official release of GPUSPH version 4.0. After two years of development, the new version introduces significant computational advancements, new SPH modelling features, and a simpler and more robust interface to define simulation scenarios.

New features:

Gaussian smoothing kernel;
Grenier formulation for multi-fluid support;
dynamic particles boundary formulation;
improved support for periodic boundary conditions;
open boundaries (inlet/outlet) with the semi-analytical boundary formulation;
computation of total mechanical energy;
hot-start (possibility to restart a simulation from a previous run);
multi-node support: running a simulation across multiple cluster nodes, with one or more GPUs on each node;
new, simplified interface for the definition of simulation scenarios;
use of the Chrono library instead of ODE for moving objects (support for complex joints);
better, updated documentation.

Requirements:

Linux or Mac OS X;
- full support for Mac OS X may be phased out in the future, due to the difficulty of keeping up with the evolution of CUDA and compiler support on that platform;
CUDA 7.0 or later;
a host compiler with C++11 support, and supported by CUDA:
- on Linux, GCC 4.8 or later, or Clang 3.6 or later (CUDA 7.5 or later only);
- on Mac OS X, Xcode 6 or later;
the develop branch of the Chrono library is required for fluid/solid and solid/solid interaction;
- GPUSPH can be compiled without Chrono support;
- the latest Chrono commit at the moment of this announcement (984f51dbe16903097fe0170b4aabefa851421b63) is known to work;
an MPI implementation for multi-node support;
- OpenMPI 1.8.4 or MVAPICH2 1.8 recommended; other MPI implementations may also work, but without GPUDirect support;
- GPUSPH can be compiled without MPI support.
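As an illustration, the Linux toolchain constraints listed above can be written down as a small check. This is only a sketch, not part of GPUSPH: the function names and the `gcc`/`clang` labels are hypothetical, and it only encodes the minimum versions from this announcement. It does not verify that a particular CUDA release actually supports a particular newer compiler.

```python
def parse_version(text):
    """Turn '7.5' or '4.8.2' into a tuple of ints, padded to two components."""
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_linux_toolchain(cuda, compiler, compiler_version):
    """Return a list of problems; an empty list means the minimums are met.

    compiler is 'gcc' or 'clang' (labels chosen for this sketch only).
    """
    problems = []
    cuda_v = parse_version(cuda)
    comp_v = parse_version(compiler_version)

    if cuda_v < (7, 0):
        problems.append("CUDA 7.0 or later is required")
    if compiler == "gcc":
        if comp_v < (4, 8):
            problems.append("GCC 4.8 or later is required")
    elif compiler == "clang":
        if comp_v < (3, 6):
            problems.append("Clang 3.6 or later is required")
        if cuda_v < (7, 5):
            problems.append("Clang requires CUDA 7.5 or later")
    else:
        problems.append("unknown compiler: " + compiler)
    return problems
```

For example, `check_linux_toolchain("7.0", "clang", "3.6")` reports that Clang needs CUDA 7.5 or later, while the same call with `"gcc"` and `"4.8.2"` returns an empty list.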

Known issues:

large simulations may lock up on Maxwell-class GPUs (Compute Capability 5.x, i.e. GTX 9xx, GTX Titan, Titan X);
- this is actually an issue in the Thrust library, which we use in the particle sort;
open boundaries do not work correctly in multi-GPU simulations;
GPUSPH may fail to run on systems with multiple GPUs when the first GPU is set in exclusive mode;
- this is due to GPUSPH always creating a context on the first GPU when pinning memory to speed up host/device data transfers;
- a fix is under development; as a temporary workaround, change the first GPU to non-exclusive mode;
some CUDA-aware MPI implementations do not support working with multiple GPUs per process;
- a possible workaround to use all GPUs on multi-GPU nodes is to run multiple processes per node;
hot-starting a simulation with moving objects may produce slightly different results;
the Ferrari correction may not work correctly with boundary formulations other than the semi-analytical one;
the time step size may vary when using semi-analytical boundaries, even if a fixed time step is requested.
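To see whether a machine is exposed to the exclusive-mode issue listed above, the compute mode of each GPU can be inspected before launching a simulation. The sketch below is not part of GPUSPH and its helper names are hypothetical. It parses the output of `nvidia-smi --query-gpu=index,compute_mode --format=csv,noheader`, and it assumes that index 0 reported by `nvidia-smi` is the same device that CUDA enumerates first, which is not guaranteed on every system.

```python
import subprocess

# nvidia-smi query used by this sketch; nvidia-smi must be on PATH.
QUERY = ["nvidia-smi", "--query-gpu=index,compute_mode", "--format=csv,noheader"]


def parse_compute_modes(csv_text):
    """Map GPU index -> compute mode string from 'index, mode' CSV lines."""
    modes = {}
    for line in csv_text.splitlines():
        if not line.strip():
            continue
        index, mode = line.split(",", 1)
        modes[int(index)] = mode.strip()
    return modes


def first_gpu_is_exclusive(modes):
    """True when several GPUs are present and GPU 0 is in an exclusive mode."""
    return len(modes) > 1 and "exclusive" in modes.get(0, "").lower()


def query_compute_modes():
    """Run nvidia-smi and parse its output (not called automatically)."""
    out = subprocess.check_output(QUERY, universal_newlines=True)
    return parse_compute_modes(out)
```

If `first_gpu_is_exclusive(query_compute_modes())` returns `True`, the temporary workaround described above applies; on most systems the mode can be changed with `nvidia-smi -i 0 -c DEFAULT`, which typically requires administrator privileges.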

For feedback or support requests, please use the official issue tracker.

We also invite you to join the #gpusph IRC channel on FreeNode.
