Swarm-NG  1.1
swarm::gpu::bppt Namespace Reference

Class of GPU integrators with a thread for each body-pair.

Classes

class  hermite_adap
 GPU implementation of PEC2 Hermite integrator with adaptive time step.
 
class  hermite
 GPU implementation of PEC2 Hermite integrator.
 
struct  FixedTimeStep
 Data structure for fixed time step.
 
struct  AdaptiveTimeStep
 Data structure for adaptive time step.
 
class  rkck
 Runge-Kutta Cash-Karp integrator (fixed or adaptive time step).
 
struct  EulerPropagatorParams
 Parameters for EulerPropagator.
 
struct  EulerPropagator
 GPU implementation of Euler propagator. It is of no practical use.
 
struct  HermitePropagatorParams
 Parameters for HermitePropagator.
 
struct  HermitePropagator
 GPU implementation of Hermite propagator. It is of no practical use, since the hermite integrator implements the same functionality faster.
 
struct  MidpointPropagatorParams
 Parameters for MidpointPropagator.
 
struct  MidpointPropagator
 GPU implementation of modified midpoint method propagator.
 
struct  MVSPropagatorParams
 Parameters for MVSPropagator.
 
struct  MVSPropagator
 GPU implementation of mixed variables symplectic propagator.
 
struct  VerletPropagatorParams
 Parameters for VerletPropagator.
 
struct  VerletPropagator
 GPU implementation of Verlet propagator.
 
class  integrator
 Common functionality and skeleton for body-pair-per-thread integrators.
 
class  generic
 Generic integrator for rapid creation of new integrators.
 
class  GravitationAcc
 Templatized class to calculate acceleration and jerk in parallel.
 
class  GravitationAccJerk
 Templatized class working as a function object to calculate acceleration and jerk in parallel.
 
struct  GravitationAccScalars
 Unit type of the acceleration pairs shared array.
 
struct  GravitationAccJerkScalars
 Unit type of the acceleration and jerk pairs shared array.
 
class  GravitationAcc_GR
 Templatized class to calculate acceleration and jerk in parallel.
 
class  GravitationLargeN
 Gravitation calculation class for large number of bodies in a system.
 
class  GravitationMediumN
 Gravitation calculation for systems with between 10 and 20 bodies. EXPERIMENTAL: this class is not thoroughly tested.
 

Functions

GPUAPI int sysid ()
 Kernel Helper Function: Extract system ID from CUDA thread ID.
 
GPUAPI int sysid_in_block ()
 Kernel Helper Function: Extract system sequence number inside current block.
 
GPUAPI int thread_in_system ()
 Kernel Helper Function: Extract the worker-thread number for current system.
 
GPUAPI int system_per_block_gpu ()
 Kernel Helper Function: Extract the number of systems per block from CUDA thread information.
 
GPUAPI int thread_component_idx (int nbod)
 Kernel Helper Function: Logical coordinate component id [1:x,2:y,3:z] calculated from thread ID info.
 
GPUAPI int thread_body_idx (int nbod)
 Kernel Helper Function: Logical body id [0..nbod-1] calculated from thread ID info.
 
template<class Impl , class T >
GPUAPI void * system_shared_data_pointer (Impl *integ, T compile_time_param)
 Kernel Helper Function: Get the pointer to dynamic shared memory allocated for the system.
 
GENERIC double inner_product (const double a[3], const double b[3])
 Helper function for calculating inner product.
 
template<int nbod>
GENERIC int first (int ij)
 Helper function to convert an integer from 1..n*(n-1)/2 to a pair (first,second). This function returns the first element.
 
template<int nbod>
GENERIC int second (int ij)
 Helper function to convert an integer from 1..n*(n-1)/2 to a pair (first,second). This function returns the second element.
 

Detailed Description

Class of GPU integrators with a thread for each body-pair.

Assigning a thread to each body-pair parallelizes as much of the work as possible when integrating an ensemble. The thread assignment is as follows:

  1. When computing interaction forces (and higher derivatives) between bodies, one thread is assigned to each pair of bodies.
  2. When integrating quantities for bodies individually, one thread is assigned to each coordinate component of each body.
  3. When advancing the time or checking for stop criteria or setting the time step, only one thread is used.
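For case 1, each pair-thread has to translate its own index into the two bodies it handles, which is the job of the first() and second() helpers listed under Functions. The following is a minimal host-side sketch of one such enumeration. It is not the Swarm-NG formula: the names BodyPair, pair_count, pair_from_index and covers_all_pairs are hypothetical, and the sketch assumes a zero-based pair index for simplicity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names throughout; this is not the Swarm-NG implementation of
// first()/second(), only one simple enumeration with the same purpose.
struct BodyPair {
    int first;   // larger body index
    int second;  // smaller body index
};

// Number of unordered pairs among nbod bodies.
inline int pair_count(int nbod) { return nbod * (nbod - 1) / 2; }

// Map a zero-based pair index to (first, second) with first > second.
// Enumeration order: (1,0), (2,0), (2,1), (3,0), (3,1), (3,2), ...
inline BodyPair pair_from_index(int ij) {
    int row = 1;            // row r holds the r pairs (r,0) .. (r,r-1)
    while (ij >= row) {     // skip complete rows
        ij -= row;
        ++row;
    }
    return BodyPair{row, ij};
}

// True if indices 0 .. pair_count(nbod)-1 visit every pair exactly once.
inline bool covers_all_pairs(int nbod) {
    std::vector<int> seen(nbod * nbod, 0);
    for (int ij = 0; ij < pair_count(nbod); ++ij) {
        BodyPair p = pair_from_index(ij);
        if (p.first >= nbod || p.second < 0 || p.second >= p.first) return false;
        ++seen[p.first * nbod + p.second];
    }
    for (int i = 1; i < nbod; ++i)
        for (int j = 0; j < i; ++j)
            if (seen[i * nbod + j] != 1) return false;
    return true;
}
```

A loop is used here for readability; a kernel would more likely use a closed-form expression so that every thread does the same amount of work.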

We use predicate barriers for each case, since the number of threads actually working differs from case to case. For example, with 3 bodies there are 3 pairs and 9 body-components, and 1 thread is needed for the overall tasks.
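The thread counts in this example can be reproduced with a small host-side sketch. All names here (PhaseThreads, phase_threads, threads_needed and the active_in_* predicates) are hypothetical, and sizing the per-system thread count to the widest phase is an assumption made for illustration, not a statement about what the integrators actually return from thread_per_system().

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of how many threads do useful work in each case.
struct PhaseThreads {
    int pairs;       // case 1: one thread per body pair
    int components;  // case 2: one thread per (body, x/y/z) component
    int single;      // case 3: one thread for time step and stop checks
};

inline PhaseThreads phase_threads(int nbod) {
    return PhaseThreads{nbod * (nbod - 1) / 2, 3 * nbod, 1};
}

// Assumption: launch enough threads per system for the widest case.
inline int threads_needed(int nbod) {
    PhaseThreads p = phase_threads(nbod);
    return std::max(p.pairs, std::max(p.components, p.single));
}

// Predicates a thread with zero-based index t inside its system could
// evaluate before doing the work of each case; idle threads skip the work.
inline bool active_in_pair_phase(int t, int nbod) {
    return t < phase_threads(nbod).pairs;
}
inline bool active_in_component_phase(int t, int nbod) {
    return t < phase_threads(nbod).components;
}
inline bool active_in_single_phase(int t) { return t == 0; }
```

With nbod = 3 the component case is the widest (9 threads), while with 8 bodies the 28 pairs exceed the 24 components, so which predicate leaves threads idle depends on the system size.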

To improve coalescing of reads from global and shared memory, the block is structured in a non-traditional way: the innermost (x) component is part of the system id, and the other component is the body-pair.
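Why this layout helps coalescing can be seen from a host-side sketch of the index arithmetic. The Dim2 struct stands in for CUDA's threadIdx and blockDim, and the helper names and the interleaved storage layout are assumptions for illustration. Only the linear thread ordering (x varies fastest) is CUDA's own rule.

```cpp
#include <cassert>

// Host-side stand-in for CUDA's threadIdx / blockDim (2D case only).
struct Dim2 {
    int x;
    int y;
};

// Linear position of a thread inside its block. In CUDA the x component
// varies fastest, so threads with consecutive x share a warp.
inline int linear_thread_id(Dim2 thread_idx, Dim2 block_dim) {
    return thread_idx.y * block_dim.x + thread_idx.x;
}

// Assumed mapping, following the description above: x selects the system
// within the block, y selects the worker (body-pair) thread of that system.
inline int system_in_block(Dim2 thread_idx) { return thread_idx.x; }
inline int worker_in_system(Dim2 thread_idx) { return thread_idx.y; }

// Hypothetical interleaved storage: the values that one worker handles for
// all systems of the block sit next to each other in memory.
inline int interleaved_slot(int worker, int system, int systems_per_block) {
    return worker * systems_per_block + system;
}
```

Under these assumptions, neighbouring threads in a warp act as the same worker for neighbouring systems and touch neighbouring slots, which is the access pattern that coalesces.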

Global functions defined here are used inside kernels for consistent interpretation of thread and shared memory references.

The integrators that derive from this may override three functions: thread_per_system(), shmem_per_system(),

Function Documentation

template<class Impl , class T >
GPUAPI void* swarm::gpu::bppt::system_shared_data_pointer ( Impl *  integ,
T  compile_time_param 
)

Kernel Helper Function: Get the pointer to dynamic shared memory allocated for the system.

This function assumes that the memory is used through CoalescedStructArray with a chunk size of SHMEM_CHUNK_SIZE. This uses overlapping data structures to provide coalescing for shared memory.
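As a rough illustration of a chunked, interleaved layout of this kind, the sketch below computes byte offsets on the host. It does not reproduce the code in bppt.hpp: the function names, the assumption that every element is a double, and the exact offset formula are hypothetical, and the chunk size is passed in as a parameter rather than taken from SHMEM_CHUNK_SIZE.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: byte offset of the first double of a system inside the
// block's dynamic shared memory. Systems are grouped into chunks of
// chunk_size; within a chunk, the k-th doubles of all systems are adjacent.
inline std::size_t system_base_offset(int sysid_in_block, int chunk_size,
                                      std::size_t bytes_per_system) {
    std::size_t chunk = static_cast<std::size_t>(sysid_in_block / chunk_size);
    std::size_t lane = static_cast<std::size_t>(sysid_in_block % chunk_size);
    return chunk * static_cast<std::size_t>(chunk_size) * bytes_per_system
         + lane * sizeof(double);
}

// Hypothetical: byte offset of the k-th double of a system. Consecutive
// elements of one system are chunk_size doubles apart, so the address ranges
// of neighbouring systems interleave (overlap) within a chunk.
inline std::size_t element_offset(int sysid_in_block, int k, int chunk_size,
                                  std::size_t bytes_per_system) {
    return system_base_offset(sysid_in_block, chunk_size, bytes_per_system)
         + static_cast<std::size_t>(k) * static_cast<std::size_t>(chunk_size)
               * sizeof(double);
}
```

In this sketch the pointer returned for a system is only the start of a strided view, so the data must be accessed through a wrapper that applies the same stride, which is the role the text assigns to CoalescedStructArray.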

Definition at line 122 of file bppt.hpp.

References sysid_in_block().

Referenced by swarm::gpu::bppt::hermite< Monitor, Gravitation >::kernel(), swarm::gpu::bppt::rkck< AdaptationStyle, Monitor, Gravitation >::kernel(), TutorialIntegrator< Monitor, Gravitation >::kernel(), swarm::gpu::bppt::hermite_adap< Monitor, Gravitation >::kernel(), and swarm::gpu::bppt::generic< Propagator, Monitor, Gravitation >::kernel().