INGVIstituto Nazionale di Geofisica e Vulcanologia
First implementation of WCSPH to run entirely on GPU with CUDA.
Version 5 released on June 13, 2019. https://www.gpusph.org/
Aims to be a universal SPH computational engine.
potential simulation framework combinations.
And it keeps growing!
(e.g. not included: thermal and thermo-dynamic problems)
Not all combinations are currently supported. We’re working on it!
Morris | |
---|---|
Monaghan | |
Español & Revenga |
where , and are some average first and second viscosity,
and is the volume-based neighbor weight.
How many specializations do we need?
except for Grenier .
Harmonic mean of constant kinematic viscosity:
When possible, simplify .
We want a specialization for simulations with a single, Newtonian fluid when using kinematic internal viscosity representation with harmonic averaging, except when using Grenier’s formulation or the Español & Revenga viscous model.
Yield strength | Shear rate dependency | ||
---|---|---|---|
Linear | Power-law | Exponential | |
none | Newton | Power-law | |
constant | Bingham | Herschel–Bulkley | DeKee & Turcotte |
regularized | Papanastasiou | Alexandrou | Zhu |
functional | Granular |
Separate contributions and share common code: functions instead of (of up to ).
And no repeated code (DRY)!
void shear_rate_contrib(...) {
if (linear_rheology(...)) { /* linear contrib */ }
else if (power_law_rheology(...)) { /* power-law contrib */ }
else if (exponential_rheology(...)) { /* exponential contrib */ }
}
void yield_strength_contrib(...) {
if (no_yield_strength(...)) return;
if (constant_ys(...)) { /* standard yield strength contrib */ }
else if (regularized_ys(...)) { /* regularized yield strength contrib */}
else if (functional_ys(...)) { /* compute yield strength and apply contrib */ }
}
void calc_visc(...) {
compute_shear_rate(...);
shear_rate_contrib(...);
yield_strength_contrib(...);
}
Compiler may or may not see all optimization opportunities (e.g. dead code elimination).
Compiler may or may not minimize use of variables (more variables = more registers = register spills = slower code).
Trust the compiler? 🙏
Help the compiler! 🏄
(...) shear_rate_contrib(...) { /* linear contrib */ }
(...) shear_rate_contrib(...) { /* power-law contrib */ }
(...) shear_rate_contrib(...) { /* exponential contrib */ }
(...) yield_strength_contrib(...) { return; /* no yield strength */ }
(...) yield_strength_contrib(...) { /* standard yield strength contrib */ }
(...) yield_strength_contrib(...) { /* regularized yield strength contrib */}
(...) yield_strength_contrib(...) { /* compute yield strength and apply contrib */ }
void calc_visc(...) {
compute_shear_rate(...);
shear_rate_contrib(...);
yield_strength_contrib(...);
}
one (version of the) function for each specialization, where the specialization may depend on sophisticated conditionals on the framework parameters;
no unnecessary arguments and variables, but same call syntax for all functions: same name, “same” parameters, “same” return type.
enum KernelType { QUARTIC, CUBIC, WENDLAND, GAUSSIAN };
template<KernelType kerneltype> float W(float r, float slength);
template<> float W<QUARTIC>(float r, float slength)
{ /* compute and return QUARTIC kernel value */ }
template<> float W<CUBIC>(float r, float slength)
{ /* compute and return CUBIC kernel value */ }
/* etc for the other values of KernelType */
/* same mechanism also used for F */
Compile-time equivalent of a switch
/ case
statement. Selections based on a single option value.
What about more complex conditionals?
/* A type that is T if B is true, invalid otherwise.
* Pre-defined in C++14, can be defined in C++11.
* Older versions can define enable_if<B, T>::type
*/
enable_if_t< bool B, typename T = void>
Requirement: B
must be evaluable at compile-time:
constexpr bool f(...) { return ... }
with compile-time constant arguments;template<...> struct f { static const bool value = ... }
Substitution Failure Is Not An Error ⇒ when enable_if
fails, the corresponding specialization gets skipped without errors.
constexpr bool linear_rheology(RheologyType rheologytype)
{ return /* conditions for which a rheology is linear */ }
constexpr bool power_law_rheology(RheologyType rheologytype)
{ return /* conditions for a power-law rheology */ }
template<RheologyType rheologytype>
enable_if_t< linear_rheology(rheologytype) > /* skipped if not linear */
shear_rate_contrib(...)
{ /* linear contrib */ }
template<RheologyType rheologytype>
enable_if_t< power_law_rheology(rheologytype) > /* skipped if not power-law */
shear_rate_contrib(...)
{ /* power-law contrib */ }
Structures with optional members, only present if a condition is satisfied.
struct effvisc_params {
float *effvisc;
effvisc_params(float *effvisc_) : effvisc(effvisc_) {}
};
template<typename Framework,
typename optional_effvisc = cond_struct< needs_effvisc<Framework>(), effvisc_params >
>
struct all_params : optional_effvisc
{
float *forces;
all_params(...) :
forces(...),
optional_effvisc(...)
{}
};
Framework:
// Standard struct that maps type to T if B, else to F
template<bool B, typename T, typename F> struct conditional;
// A struct with no members to “gobble” any other type
template<typename>
struct empty_struct {
empty_struct() {} /* empty constructor */
// Universal constructor
template<typename ...T>
empty_struct(T const&...) {}
};
// type alias template for C++11; for older versions of the standard
// one can use a less robust COND_STRUCT(B, T) macro with heavier syntax
template<bool B, typename T>
using cond_struct = conditional<B, T, empty_struct<T>>::type;
template<KernelType _kerneltype, SPHFormulation _sph_formulation,
DensityDiffusionType _densitydiffusiontype, /* more framework params omitted */
flag_t _simflags,
ParticleType _cptype, ParticleType _nptype,
/* auxiliary defines for some conditionals, omitted */
typename xsph_cond = cond_struct< (_simflags & ENABLE_XSPH) && _cptype == _nptype,
xsph_forces_params>,
typename vol_cond = cond_struct<
_sph_formulation == SPH_GRENIER && _densitydiffusiontype == COLAGROSSI,
volume_forces_params>,
typename grenier_cond =
cond_struct< _sph_formulation == SPH_GRENIER, grenier_forces_params>,
/* more conditional structures definitions, omitted */
>
struct forces_params : // the type template that will be passed to the computational kernels
common_forces_params,
// empty_struct must be template or multiple missing optional substructures would error
xsph_cond, vol_cond, grenier_cond,
/* more conditional structures omitted */
{
/* see next slide */
};
template<...> struct forces_params : ... {
// allow compile-time extraction of template parameters from the struct
static constexpr KernelType kerneltype = _kerneltype;
static constexpr SPHFormulation sph_formulation = _sph_formulation;
/* etc */
forces_params(...) : common_forces_params(...), xpsh_cond(...), vol_cond(...), ... {}
};
template<typename FP /* any instance of forces_params */>
void forcesDevice(FP params) { // local variables are held in conditional structures too!
particle_data<FP> pdata(params); // central particle data
particle_output<FP> pout(params, pdata); // particle output
for_each_neib(FP::nptype, ...) { // iterate over neighbors of type nptype
neib_data<FP> ndata(params, pdata); // neighbor data
neib_output<FP> nout(params, pdata, ndata); // neighbor contribution
/* function that abstracts the logic of the interaction, that calls other function
* that handle specific parts, etc down the final specializations of each contribution
*/
particle_particle_interaction(params, pdata, ndata, pout, nout);
}
write_out(params, pout); // save the results for this particle
}
/// INVISCID viscous model
/*! No actual contribution */
template<typename FP, typename P, typename N, typename OP, typename ON>
enable_if_t< FP::inviscid > compute_laminar_visc_contrib
(FP const& params, P const& pdata, N const& ndata, OP &pout, ON &nout)
{ /* do nothing */ }
/// Standard volumic viscous contribution
/*! This is used when the neighbor is not a boundary particle, or for dynamic boundaries,
* but not with Grenier's formulation or Español & Revenga's viscous model
*/
template<typename FP, typename P, typename N, typename OP, typename ON>
enable_if_t< (not FP::inviscid) && wants_volumic_visc_term<FP>() &&
FP::viscmodel != ESPANOL_REVENGA && FP::sph_formulation != SPH_GRENIER >
compute_laminar_visc_contrib
(FP const& params, P const& pdata, N const& ndata, OP &pout, ON &nout)
{
const float visc = visc_avg<FP::ViscSpec>(
get_visc_coeff(pdata), get_visc_coeff(ndata),
physical_density(pdata), physical_density(ndata), ndata.relPos.w);
nout.DvDt += visc*ndata.f*viscous_vector_component(params, pdata, ndata);
}
/* and several other specializations */
forces_params
(and thus forcesDevice
) have the central and neighbor particle type as template parameters ⇒ individual specializations per pair of particle types ⇒ no runtime conditionals based on neighbor type.
Main structure is (mostly) the same for all pairs ⇒ actual specialization is typically only at or near the “bottom of the stack” (e.g. selected through wants_volumic_visc_term
).
How do we iterate efficiently on neighbors of a single type?
Top left | CPU-preferred neighbors | list layout |
---|---|---|
Top middle | GPU-preferred neighbors | list layout |
Right | GPU-preferred splitneibs | list layout |