alpaka
Abstraction Library for Parallel Kernel Acceleration
►Nalpaka | The alpaka accelerator library |
►Nbt | |
CIdxBtLinear | General ND bt index provider based on a linear index |
CIdxBtOmp | The OpenMP accelerator index provider |
CIdxBtRefThreadIdMap | The threads accelerator index provider |
CIdxBtUniformCudaHipBuiltIn | The CUDA/HIP accelerator ND index provider |
CIdxBtZero | A zero block thread index provider |
►Nconcepts | |
RTag | |
►Ncore | |
►Nalign | The alignment specifics |
COptimalAlignment | Calculates the optimal alignment for data of the given size |
►Ndetail | Defines implementation details that should not be used directly by the user |
CAssertGreaterThan | |
CAssertValueUnsigned | |
CRoundUpToPowerOfTwoHelper | Base case for N being a power of two |
CRoundUpToPowerOfTwoHelper< N, false > | Case for N not being a power of two |
CScopeLogStdOut | Scope logger |
CThreadPool | A thread pool yielding when there is not enough work to be done |
►Nthreads | |
Ndetail | |
CBarrierThread | A self-resetting barrier |
CBarrierThreadWithPredicate | A self-resetting barrier with predicate |
►Nvectorization | Suggests vectorization of the directly following loop to the compiler |
CGetVectorizationSizeElems | |
CGetVectorizationSizeElems< double > | |
CGetVectorizationSizeElems< float > | |
CGetVectorizationSizeElems< std::int16_t > | |
CGetVectorizationSizeElems< std::int32_t > | |
CGetVectorizationSizeElems< std::int64_t > | |
CGetVectorizationSizeElems< std::int8_t > | |
CGetVectorizationSizeElems< std::uint16_t > | |
CGetVectorizationSizeElems< std::uint32_t > | |
CGetVectorizationSizeElems< std::uint64_t > | |
CGetVectorizationSizeElems< std::uint8_t > | |
CCallbackThread | |
CRoundUpToPowerOfTwo | Rounds to the next higher power of two (if not already a power of two) |
►Ncpu | |
►Ndetail | The CPU device |
CQueueCpuOmp2CollectiveImpl | The CPU collective device queue implementation |
►Ncuda | |
Ndetail | |
Ntrait | |
►Ndetail | |
CAtomicHierarchyConceptType | |
CAtomicHierarchyConceptType< hierarchy::Blocks > | |
CAtomicHierarchyConceptType< hierarchy::Grids > | |
CAtomicHierarchyConceptType< hierarchy::Threads > | |
CBlockSharedMemDynMemberStatic | "namespace" for static constexpr members that should be in BlockSharedMemDynMember but cannot be because having a static const member breaks GCC 10 OpenMP target: type not mappable |
CBlockSharedMemStMemberImpl | Implementation of static block shared memory provider |
CBufCpuImpl | The CPU memory buffer |
CCheckFnReturnType | Check that the return of TKernelFnObj is void |
CCheckFnReturnType< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | Specialization of the TKernelFnObj return type evaluation |
CDevGlobalImplGeneric | |
CDevGlobalTrait | |
CDevGlobalTrait< TagCpuOmp2Blocks, T > | |
CDevGlobalTrait< TagCpuOmp2Threads, T > | |
CDevGlobalTrait< TagCpuSerial, T > | |
CDevGlobalTrait< TagCpuTbbBlocks, T > | |
CDevGlobalTrait< TagCpuThreads, T > | |
CDevGlobalTrait< TagGpuCudaRt, T > | |
CDevGlobalTrait< TagGpuHipRt, T > | |
►CIndependentGroupsAlong | |
Cconst_iterator | |
CParallelFor | Executor of parallel OpenMP loop |
CParallelFor< TKernel, omp::Schedule > | Executor of parallel OpenMP loop |
CParallelFor< TKernel, TSchedule, UseScheduleKind< TKernel, TSchedule > > | Executor of parallel OpenMP loop |
CParallelForDynamicImpl | Helper executor of parallel OpenMP loop with the dynamic schedule |
CParallelForDynamicImpl< TKernel, TSchedule, HasScheduleChunkSize< TKernel > > | Helper executor of parallel OpenMP loop with the dynamic schedule |
CParallelForGuidedImpl | Helper executor of parallel OpenMP loop with the guided schedule |
CParallelForGuidedImpl< TKernel, TSchedule, HasScheduleChunkSize< TKernel > > | Helper executor of parallel OpenMP loop with the guided schedule |
CParallelForImpl | Executor of parallel OpenMP loop with the given schedule |
CParallelForImpl< TKernel, omp::Schedule, omp::Schedule::Dynamic > | Executor of parallel OpenMP loop with the dynamic schedule |
CParallelForImpl< TKernel, omp::Schedule, omp::Schedule::Guided > | Executor of parallel OpenMP loop with the guided schedule |
CParallelForImpl< TKernel, omp::Schedule, omp::Schedule::Static > | Executor of parallel OpenMP loop with the static schedule |
CParallelForImpl< TKernel, TSchedule, omp::Schedule::Auto > | Executor of parallel OpenMP loop with auto schedule set |
CParallelForImpl< TKernel, TSchedule, omp::Schedule::Dynamic > | Executor of parallel OpenMP loop with the dynamic schedule |
CParallelForImpl< TKernel, TSchedule, omp::Schedule::Guided > | Executor of parallel OpenMP loop with the guided schedule |
CParallelForImpl< TKernel, TSchedule, omp::Schedule::NoSchedule > | Executor of parallel OpenMP loop with no schedule set |
CParallelForImpl< TKernel, TSchedule, omp::Schedule::Runtime > | Executor of parallel OpenMP loop with runtime schedule set |
CParallelForImpl< TKernel, TSchedule, omp::Schedule::Static > | Executor of parallel OpenMP loop with the static schedule |
CParallelForStaticImpl | Helper executor of parallel OpenMP loop with the static schedule |
CParallelForStaticImpl< TKernel, TSchedule, HasScheduleChunkSize< TKernel > > | Helper executor of parallel OpenMP loop with the static schedule |
CPitchHolder | |
CPitchHolder< TDim, std::enable_if_t< TDim::value >= 2 > > | |
CPrint | |
CPrint< DimInt< Dim< TView >::value - 1u >, TView > | |
CQueueRegistry | The CPU/GPU device queue registry implementation |
CTaskCopyCpu | The CPU device ND memory copy task |
CTaskCopyCpu< DimInt< 0u >, TViewDst, TViewSrc, TExtent > | The CPU device scalar memory copy task |
CTaskCopyCpu< DimInt< 1u >, TViewDst, TViewSrc, TExtent > | The CPU device 1D memory copy task |
CTaskCopyCpuBase | The CPU device memory copy task base |
CTaskCopyUniformCudaHip | The CUDA/HIP memory copy trait |
CTaskCopyUniformCudaHip< TApi, DimInt< 0u >, TViewDst, TViewSrc, TExtent > | The scalar CUDA/HIP memory copy trait |
CTaskCopyUniformCudaHip< TApi, DimInt< 1u >, TViewDst, TViewSrc, TExtent > | The 1D CUDA/HIP memory copy trait |
CTaskCopyUniformCudaHip< TApi, DimInt< 2u >, TViewDst, TViewSrc, TExtent > | The 2D CUDA/HIP memory copy trait |
CTaskCopyUniformCudaHip< TApi, DimInt< 3u >, TViewDst, TViewSrc, TExtent > | The 3D CUDA/HIP memory copy trait |
CTaskSetCpu | The CPU device ND memory set task |
CTaskSetCpu< DimInt< 0u >, TView, TExtent > | The CPU device scalar memory set task |
CTaskSetCpu< DimInt< 1u >, TView, TExtent > | The CPU device 1D memory set task |
CTaskSetCpuBase | The CPU device ND memory set task base |
CTaskSetUniformCudaHip | The CUDA/HIP memory set task |
CTaskSetUniformCudaHip< TApi, DimInt< 0u >, TView, TExtent > | The scalar CUDA/HIP memory set task |
CTaskSetUniformCudaHip< TApi, DimInt< 1u >, TView, TExtent > | The 1D CUDA/HIP memory set task |
CTaskSetUniformCudaHip< TApi, DimInt< 2u >, TView, TExtent > | The 2D CUDA/HIP memory set task |
CTaskSetUniformCudaHip< TApi, DimInt< 3u >, TView, TExtent > | The 3D CUDA/HIP memory set task |
CTaskSetUniformCudaHipBase | The CUDA/HIP memory set task base |
►CUniformElementsAlong | |
Cconst_iterator | |
►Ngb | |
CIdxGbLinear | General ND index provider based on a linear index |
CIdxGbRef | An IdxGbRef grid block index |
CIdxGbUniformCudaHipBuiltIn | The CUDA/HIP accelerator ND index provider |
►Ngeneric | |
►Ndetail | |
CEventGenericThreadsImpl | The CPU device event implementation |
CQueueGenericThreadsBlockingImpl | The CPU device queue implementation |
CQueueGenericThreadsNonBlockingImpl | The CPU device queue implementation |
►Nhierarchy | Defines the parallelism hierarchy levels of alpaka |
CBlocks | |
CGrids | |
CThreads | |
►Ninterface | |
►Ndetail | |
CImplementationBaseType | Returns the type that implements the given interface in the inheritance hierarchy |
CImplementationBaseType< TInterface, TDerived, std::enable_if_t< ImplementsInterface< TInterface, TDerived >::value > > | For types that inherit from "Implements<TInterface, ...>" it finds the base class (TBase) which implements the interface |
CImplementationBaseType< TInterface, TDerived, std::enable_if_t<!ImplementsInterface< TInterface, TDerived >::value > > | Base case for types that do not inherit from "Implements<TInterface, ...>" is the type itself |
CImplements | Tag used in class inheritance hierarchies that describes that a specific interface (TInterface) is implemented by the given base class (TBase) |
CImplementsInterface | Checks whether the interface is implemented by the given class |
►Ninternal | |
CComplex | Implementation of a complex number useable on host and device |
CViewAccessOps | |
►Nmath | |
Nconstants | |
►Ntrait | The math traits |
Ndetail | |
CAbs | The abs trait |
CAcos | The acos trait |
CAcosh | The acosh trait |
CArg | The arg trait |
CAsin | The asin trait |
CAsinh | The asinh trait |
CAtan | The atan trait |
CAtan2 | The atan2 trait |
CAtanh | The atanh trait |
CCbrt | The cbrt trait |
CCeil | The ceil trait |
CConj | The conj trait |
CCopysign | The copysign trait |
CCos | The cos trait |
CCosh | The cosh trait |
CErf | |
CExp | The exp trait |
CFloor | The floor trait |
CFma | The fma trait |
CFmod | The fmod trait |
CIsfinite | The isfinite trait |
CIsinf | The isinf trait |
CIsnan | The isnan trait |
CLlround | The llround trait |
CLog | The log trait |
CLog10 | The base 10 log trait |
CLog2 | The base 2 log trait |
CLround | The lround trait |
CMax | The max trait |
CMin | The min trait |
CPow | The pow trait |
CRemainder | The remainder trait |
CRound | The round trait |
CRsqrt | The rsqrt trait |
CSin | The sin trait |
CSinCos | The sincos trait |
CSinh | The sinh trait |
CSqrt | The sqrt trait |
CTan | The tan trait |
CTanh | The tanh trait |
CTrunc | The trunc trait |
CAbsStdLib | The standard library abs, implementation covered by the general template |
CAbsUniformCudaHipBuiltIn | The CUDA built in abs |
CAcoshStdLib | The standard library acosh, implementation covered by the general template |
CAcoshUniformCudaHipBuiltIn | The CUDA built in acosh |
CAcosStdLib | The standard library acos, implementation covered by the general template |
CAcosUniformCudaHipBuiltIn | The CUDA built in acos |
CArgStdLib | The standard library arg, implementation covered by the general template |
CArgUniformCudaHipBuiltIn | The CUDA built in arg |
CAsinhStdLib | The standard library asinh, implementation covered by the general template |
CAsinhUniformCudaHipBuiltIn | The CUDA built in asinh |
CAsinStdLib | The standard library asin, implementation covered by the general template |
CAsinUniformCudaHipBuiltIn | The CUDA built in asin |
CAtan2StdLib | The standard library atan2, implementation covered by the general template |
CAtan2UniformCudaHipBuiltIn | The CUDA built in atan2 |
CAtanhStdLib | The standard library atanh, implementation covered by the general template |
CAtanhUniformCudaHipBuiltIn | The CUDA built in atanh |
CAtanStdLib | The standard library atan, implementation covered by the general template |
CAtanUniformCudaHipBuiltIn | The CUDA built in atan |
CCbrtStdLib | The standard library cbrt, implementation covered by the general template |
CCbrtUniformCudaHipBuiltIn | The CUDA built in cbrt |
CCeilStdLib | The standard library ceil, implementation covered by the general template |
CCeilUniformCudaHipBuiltIn | The CUDA built in ceil |
CConceptMathAbs | |
CConceptMathAcos | |
CConceptMathAcosh | |
CConceptMathArg | |
CConceptMathAsin | |
CConceptMathAsinh | |
CConceptMathAtan | |
CConceptMathAtan2 | |
CConceptMathAtanh | |
CConceptMathCbrt | |
CConceptMathCeil | |
CConceptMathConj | |
CConceptMathCopysign | |
CConceptMathCos | |
CConceptMathCosh | |
CConceptMathErf | |
CConceptMathExp | |
CConceptMathFloor | |
CConceptMathFma | |
CConceptMathFmod | |
CConceptMathIsfinite | |
CConceptMathIsinf | |
CConceptMathIsnan | |
CConceptMathLog | |
CConceptMathLog10 | |
CConceptMathLog2 | |
CConceptMathMax | |
CConceptMathMin | |
CConceptMathPow | |
CConceptMathRemainder | |
CConceptMathRound | |
CConceptMathRsqrt | |
CConceptMathSin | |
CConceptMathSinCos | |
CConceptMathSinh | |
CConceptMathSqrt | |
CConceptMathTan | |
CConceptMathTanh | |
CConceptMathTrunc | |
CConjStdLib | The standard library conj, implementation covered by the general template |
CConjUniformCudaHipBuiltIn | The CUDA built in conj |
CCopysignStdLib | The standard library copysign, implementation covered by the general template |
CCopysignUniformCudaHipBuiltIn | The CUDA built in copysign |
CCoshStdLib | The standard library cosh, implementation covered by the general template |
CCoshUniformCudaHipBuiltIn | The CUDA built in cosh |
CCosStdLib | The standard library cos, implementation covered by the general template |
CCosUniformCudaHipBuiltIn | The CUDA built in cos |
CErfStdLib | The standard library erf, implementation covered by the general template |
CErfUniformCudaHipBuiltIn | The CUDA built in erf |
CExpStdLib | The standard library exp, implementation covered by the general template |
CExpUniformCudaHipBuiltIn | The CUDA built in exp |
CFloorStdLib | The standard library floor, implementation covered by the general template |
CFloorUniformCudaHipBuiltIn | The CUDA built in floor |
CFmaStdLib | The standard library fma, implementation covered by the general template |
CFmaUniformCudaHipBuiltIn | The CUDA built in fma |
CFmodStdLib | The standard library fmod, implementation covered by the general template |
CFmodUniformCudaHipBuiltIn | The CUDA built in fmod |
CIsfiniteStdLib | The standard library isfinite, implementation covered by the general template |
CIsfiniteUniformCudaHipBuiltIn | The CUDA built in isfinite |
CIsinfStdLib | The standard library isinf, implementation covered by the general template |
CIsinfUniformCudaHipBuiltIn | The CUDA built in isinf |
CIsnanStdLib | The standard library isnan, implementation covered by the general template |
CIsnanUniformCudaHipBuiltIn | The CUDA built in isnan |
CLog10StdLib | The standard library log10, implementation covered by the general template |
CLog10UniformCudaHipBuiltIn | |
CLog2StdLib | The standard library log2, implementation covered by the general template |
CLog2UniformCudaHipBuiltIn | |
CLogStdLib | The standard library log, implementation covered by the general template |
CLogUniformCudaHipBuiltIn | |
CMathStdLib | The standard library math trait specializations |
CMathUniformCudaHipBuiltIn | The CUDA/HIP built in math trait specializations |
CMaxStdLib | The standard library max |
CMaxUniformCudaHipBuiltIn | The CUDA built in max |
CMinStdLib | The standard library min |
CMinUniformCudaHipBuiltIn | The CUDA built in min |
CPowStdLib | The standard library pow, implementation covered by the general template |
CPowUniformCudaHipBuiltIn | The CUDA built in pow |
CRemainderStdLib | The standard library remainder, implementation covered by the general template |
CRemainderUniformCudaHipBuiltIn | The CUDA built in remainder |
CRoundStdLib | The standard library round, implementation covered by the general template |
CRoundUniformCudaHipBuiltIn | The CUDA round |
CRsqrtStdLib | The standard library rsqrt, implementation covered by the general template |
CRsqrtUniformCudaHipBuiltIn | The CUDA rsqrt |
CSinCosStdLib | The standard library sincos, implementation covered by the general template |
CSinCosUniformCudaHipBuiltIn | The CUDA sincos |
CSinhStdLib | The standard library sinh, implementation covered by the general template |
CSinhUniformCudaHipBuiltIn | The CUDA sinh |
CSinStdLib | The standard library sin, implementation covered by the general template |
CSinUniformCudaHipBuiltIn | The CUDA sin |
CSqrtStdLib | The standard library sqrt, implementation covered by the general template |
CSqrtUniformCudaHipBuiltIn | The CUDA sqrt |
CTanhStdLib | The standard library tanh, implementation covered by the general template |
CTanhUniformCudaHipBuiltIn | The CUDA tanh |
CTanStdLib | The standard library tan, implementation covered by the general template |
CTanUniformCudaHipBuiltIn | The CUDA tan |
CTruncStdLib | The standard library trunc, implementation covered by the general template |
CTruncUniformCudaHipBuiltIn | The CUDA trunc |
►Nmemory_scope | |
CBlock | Memory fences are observed by all threads in the same block |
CDevice | Memory fences are observed by all threads on the device |
CGrid | Memory fences are observed by all threads in the same grid |
►Nmeta | |
►Ndetail | |
CApplyImpl | |
CApplyImpl< TList< T... >, TApplicant > | |
CCartesianProductImpl | |
CCartesianProductImpl< TList > | |
CCartesianProductImpl< TList, Head< Ts... >, Tail... > | |
CCartesianProductImplHelper | |
CCartesianProductImplHelper< TList< TList<> >, Ts... > | |
CCartesianProductImplHelper< TList< Ts... > > | |
CCartesianProductImplHelper< TList< Ts... >, TList<>, Rests... > | |
CCartesianProductImplHelper< TList< X... >, Head< T, Ts... >, Rests... > | |
CCartesianProductImplHelper< TList< X... >, TList< H >, Rests... > | |
CConcatenateImpl | |
CConcatenateImpl< T > | |
CConcatenateImpl< TList< As... >, TList< Bs... >, TRest... > | |
CConvertIntegerSequence | |
CConvertIntegerSequence< TDstType, std::integer_sequence< T, Tvals... > > | |
CEmpty | Empty dependent type |
CFilterImpl | |
CFilterImpl< TList< Ts... >, TPred > | |
CFilterImplHelper | |
CFilterImplHelper< TList, TPred > | |
CFilterImplHelper< TList, TPred, T, Ts... > | |
CForEachTypeHelper | |
CForEachTypeHelper< TList< T, Ts... > > | |
CForEachTypeHelper< TList<> > | |
CFront | |
CFront< List< Head, Tail... > > | |
CIsParameterPackSetImpl | |
CIsParameterPackSetImpl< T, Ts... > | |
CIsParameterPackSetImpl<> | |
CIsSetImpl | |
CIsSetImpl< TList< Ts... > > | |
CMakeIntegerSequenceHelper | |
CMakeIntegerSequenceHelper< false, false, T, Tbegin, std::integral_constant< T, TIdx >, std::integer_sequence< T, Tvals... > > | |
CMakeIntegerSequenceHelper< false, true, T, Tbegin, std::integral_constant< T, Tbegin >, std::integer_sequence< T, Tvals... > > | |
CNonZeroImpl | |
CNonZeroImpl< std::integral_constant< T, TValue > > | |
CToListImpl | |
CToListImpl< TListType, TList, std::enable_if_t< alpaka::meta::isList< TList > > > | |
CTransformImpl | |
CTransformImpl< TList< Ts... >, TOp > | |
CUniqueHelper | |
CUniqueHelper< TList< Ts... >, U, Us... > | |
CUniqueImpl | |
CUniqueImpl< TList< Ts... > > | |
CContains | |
CContains< List< Head, Tail... >, Value > | |
CDependentFalseType | A false_type that depends on an ignored template parameter. This allows static_assert to be used in uninstantiated template specializations without triggering |
CInheritFromList | |
CInheritFromList< TList< TBases... > > | |
CIntegerSequenceValuesInRange | Checks if the values in the index sequence are within the given range |
CIntegerSequenceValuesInRange< std::integer_sequence< T, Tvals... >, T, Tmin, Tmax > | Checks if the values in the index sequence are within the given range |
CIntegerSequenceValuesUnique | Checks if the values in the index sequence are unique |
CIntegerSequenceValuesUnique< std::integer_sequence< T, Tvals... > > | Checks if the values in the index sequence are unique |
CIntegralValuesInRange | Checks if the integral values are within the given range |
CIntegralValuesInRange< T, Tmin, Tmax > | Checks if the integral values are within the given range |
CIntegralValuesInRange< T, Tmin, Tmax, I, Tvals... > | Checks if the integral values are within the given range |
CIntegralValuesUnique | Checks if the integral values are unique |
CIsArrayOrVector | |
CIsArrayOrVector< alpaka::Vec< N, T > > | |
CIsArrayOrVector< std::array< T, N > > | |
CIsArrayOrVector< std::vector< T, A > > | |
CIsArrayOrVector< T[N]> | |
CIsList | |
CIsList< TList< TTypes... > > | |
Cmax | |
Cmin | |
CToList | Takes an arbitrary number of types (T) and creates a type list of type TListType with the types (T). If T is a single template parameter and it satisfies alpaka::meta::isList, the type of the structure is T (no type change). For example std::tuple can be used as TListType |
CToList< TListType, T > | |
CToList< TListType, T, Ts... > | |
►Nomp | |
CSchedule | Representation of OpenMP schedule information: kind and chunk size. This class can be used regardless of whether OpenMP is enabled |
Norigin | Defines the origins available for getting extent and indices of kernel executions |
Nproperty | Properties to define queue behavior |
►Nrand | |
►Ndistribution | The random number generator distribution specifics |
►Ncpu | |
CNormalReal | The CPU random number normal distribution |
CUniformReal | The CPU random number uniform distribution |
CUniformUint | The CPU random number uniform distribution |
►Ngpu | |
Ndetail | |
►Ntrait | The random number generator distribution trait |
CCreateNormalReal | The random number float normal distribution get trait |
CCreateNormalReal< RandDefault, T, std::enable_if_t< std::is_floating_point_v< T > > > | The GPU device random number float normal distribution get trait specialization |
CCreateNormalReal< RandUniformCudaHipRand< TApi >, T, std::enable_if_t< std::is_floating_point_v< T > > > | The CUDA/HIP random number float normal distribution get trait specialization |
CCreateUniformReal | The random number float uniform distribution get trait |
CCreateUniformReal< RandDefault, T, std::enable_if_t< std::is_floating_point_v< T > > > | The GPU device random number float uniform distribution get trait specialization |
CCreateUniformReal< RandUniformCudaHipRand< TApi >, T, std::enable_if_t< std::is_floating_point_v< T > > > | The CUDA/HIP random number float uniform distribution get trait specialization |
CCreateUniformUint | The random number integer uniform distribution get trait |
CCreateUniformUint< RandDefault, T, std::enable_if_t< std::is_integral_v< T > > > | The GPU device random number integer uniform distribution get trait specialization |
CCreateUniformUint< RandUniformCudaHipRand< TApi >, T, std::enable_if_t< std::is_integral_v< T > > > | The CUDA/HIP random number integer uniform distribution get trait specialization |
►Nuniform_cuda_hip | |
CNormalReal< double > | The CUDA/HIP random number float normal distribution |
CNormalReal< float > | The CUDA/HIP random number float normal distribution |
CUniformReal< double > | The CUDA/HIP random number float uniform distribution |
CUniformReal< float > | The CUDA/HIP random number float uniform distribution |
CUniformUint< unsigned int > | The CUDA/HIP random number unsigned integer uniform distribution |
►Nengine | The random number generator engine specifics |
►Ncpu | |
CTinyMTengine | Implementation of std::UniformRandomBitGenerator for TinyMT32 |
►Ntrait | The random number generator engine trait |
CCreateDefault | The random number default generator engine get trait |
CCreateDefault< MersenneTwister > | |
CCreateDefault< RandDefault > | The GPU device random number default generator get trait specialization |
CCreateDefault< RandomDevice > | |
CCreateDefault< RandUniformCudaHipRand< TApi > > | The CUDA/HIP random number default generator get trait specialization |
CCreateDefault< TinyMersenneTwister > | The CPU device random number default generator get trait specialization |
Nuniform_cuda_hip | |
CPhiloxBaseCommon | |
CPhiloxConstants | |
CPhiloxParams | |
CPhiloxSingle | |
CPhiloxStateless | |
CPhiloxStatelessKeyedBase | |
CPhiloxStateSingle | |
CPhiloxStateVector | |
CPhiloxVector | |
CConceptRand | |
CEngineCallHostAccProxy | |
CMersenneTwister | The standard library mersenne twister implementation |
CPhilox4x32x10 | |
CPhilox4x32x10Vector | |
CPhiloxStateless4x32x10Vector | |
CRandDefault | |
CRandomDevice | The standard library rand device implementation |
CRandUniformCudaHipRand | The CUDA/HIP rand implementation |
CTinyMersenneTwister | "Tiny" state mersenne twister implementation |
CUniformReal | TEMP: Distributions to be decided on later. The generator should be compatible with STL as of now |
►Ntest | The test specifics |
►Ncpu | |
►Ndetail | |
CEventHostManualTriggerCpuImpl | Event that can be enqueued into a queue and can be triggered by the Host |
►Ndetail | The detail namespace is used to separate implementation details from user accessible code |
CStreamOutAccName | The accelerator name write wrapper |
Ninteg | |
►Ntrait | |
CBegin | |
CDefaultQueueType | The default queue type trait for devices |
CDefaultQueueType< DevCpu > | The default queue type trait specialization for the CPU device |
CDefaultQueueType< DevUniformCudaHipRt< TApi > > | The default queue type trait specialization for the CUDA/HIP device |
CEnd | |
CEventHostManualTriggerType | |
CEventHostManualTriggerType< DevCpu > | |
CEventHostManualTriggerType< DevCudaRt > | |
CIsBlockingQueue | The blocking queue trait |
CIsBlockingQueue< QueueCpuOmp2Collective > | The blocking queue trait specialization for an OpenMP2 collective CPU queue |
CIsBlockingQueue< QueueGenericThreadsBlocking< TDev > > | The blocking queue trait specialization for a blocking CPU queue |
CIsBlockingQueue< QueueGenericThreadsNonBlocking< TDev > > | The blocking queue trait specialization for a non-blocking CPU queue |
CIsBlockingQueue< QueueUniformCudaHipRtBlocking< TApi > > | The blocking queue trait specialization for a blocking CUDA/HIP RT queue |
CIsBlockingQueue< QueueUniformCudaHipRtNonBlocking< TApi > > | The blocking queue trait specialization for a non-blocking CUDA/HIP RT queue |
CIsEventHostManualTriggerSupported | |
CIsEventHostManualTriggerSupported< DevCpu > | The CPU event host manual trigger support get trait specialization |
CIsEventHostManualTriggerSupported< DevCudaRt > | The CUDA event host manual trigger support get trait specialization |
CIteratorView | |
►Nuniform_cuda_hip | |
►Ndetail | |
CEventHostManualTriggerCudaImpl | |
CArray | |
CEventHostManualTriggerCpu | Event that can be enqueued into a queue and can be triggered by the Host |
CEventHostManualTriggerCuda | |
CKernelExecutionFixture | The fixture for executing a kernel on a given accelerator |
CQueueTestFixture | |
CVerifyBytesSetKernel | Compares element-wise that all bytes are set to the same value |
CVerifyViewsEqualKernel | Compares iterators element-wise |
►Ntrait | The accelerator traits |
►Ndetail | |
CAtomicOp | |
CAtomicOp< BlockAnd > | |
CAtomicOp< BlockCount > | |
CAtomicOp< BlockOr > | |
CEmulateAtomic | Emulate atomic |
CEmulateAtomic< alpaka::AtomicAnd, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy, std::enable_if_t< std::is_floating_point_v< T > > > | AtomicAnd cannot be implemented for floating point types! |
CEmulateAtomic< alpaka::AtomicCas, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy > | Emulate AtomicCas with equivalent unsigned integral type |
CEmulateAtomic< alpaka::AtomicDec, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy, std::enable_if_t< std::is_floating_point_v< T > > > | AtomicDec cannot be implemented for floating point types! |
CEmulateAtomic< alpaka::AtomicInc, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy, std::enable_if_t< std::is_floating_point_v< T > > > | AtomicInc cannot be implemented for floating point types! |
CEmulateAtomic< alpaka::AtomicOr, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy, std::enable_if_t< std::is_floating_point_v< T > > > | AtomicOr cannot be implemented for floating point types! |
CEmulateAtomic< alpaka::AtomicSub, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy > | Emulate AtomicSub with atomicAdd |
CEmulateAtomic< alpaka::AtomicXor, alpaka::AtomicUniformCudaHipBuiltIn, T, THierarchy, std::enable_if_t< std::is_floating_point_v< T > > > | AtomicXor cannot be implemented for floating point types! |
CEmulationBase | |
Ngeneric | |
CAccToTag | |
CAccToTag< alpaka::AccGpuCudaRt< TDim, TIdx > > | |
CAccType | The accelerator type trait |
CAccType< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA accelerator accelerator type trait specialization |
CAccType< TaskKernelGpuUniformCudaHipRt< TApi, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > > | The GPU CUDA/HIP execution task accelerator type trait specialization |
CAsyncBufAlloc | The stream-ordered memory allocator trait |
CAsyncBufAlloc< TElem, TDim, TIdx, DevCpu > | The BufCpu stream-ordered memory allocation trait specialization |
CAsyncBufAlloc< TElem, TDim, TIdx, DevUniformCudaHipRt< TApi > > | The CUDA/HIP stream-ordered memory allocation trait specialization |
CAtomicOp | The atomic operation trait |
CAtomicOp< AtomicCas, AtomicUniformCudaHipBuiltIn, T, THierarchy > | |
CAtomicOp< TOp, AtomicUniformCudaHipBuiltIn, T, THierarchy > | Generic atomic implementation |
CBlockSharedMemDynSizeBytes | The trait for getting the size of the block shared dynamic memory of a kernel |
CBufAlloc | The memory allocator trait |
CBufAlloc< TElem, Dim, TIdx, DevUniformCudaHipRt< TApi > > | The CUDA/HIP memory allocation trait specialization |
CBufAlloc< TElem, TDim, TIdx, DevCpu > | The BufCpu memory allocation trait specialization |
CBufAllocMapped | The pinned/mapped memory allocator trait |
CBufAllocMapped< PlatformCpu, TElem, TDim, TIdx > | The pinned/mapped memory allocation trait specialization |
CBufType | The memory buffer type trait |
CBufType< DevCpu, TElem, TDim, TIdx > | The CPU device memory buffer type trait specialization |
CCastVec | Trait for casting a vector |
CCastVec< TValNew, Vec< TDim, TVal > > | |
CConcatVec | Trait for concatenating two vectors |
CConcatVec< Vec< TDimL, TVal >, Vec< TDimR, TVal > > | Concatenation specialization for Vec |
CCreateSubView | The sub view creation trait |
CCreateTaskKernel | The kernel execution task creation trait |
CCreateTaskKernel< AccGpuUniformCudaHipRt< TApi, TDim, TIdx >, TWorkDiv, TKernelFnObj, TArgs... > | The GPU CUDA accelerator execution task type trait specialization |
CCreateTaskMemcpy | The memory copy task trait |
CCreateTaskMemcpy< TDim, DevCpu, DevCpu > | The CPU device memory copy trait specialization |
CCreateTaskMemset | The memory set task trait |
CCreateTaskMemset< TDim, DevCpu > | The CPU device memory set trait specialization |
CCreateTaskMemset< TDim, DevUniformCudaHipRt< TApi > > | The CUDA device memory set trait specialization |
CCreateViewPlainPtr | The device memory view creation trait |
CCreateViewPlainPtr< DevCpu > | The CPU device CreateViewPlainPtr trait specialization |
CCreateViewPlainPtr< DevUniformCudaHipRt< TApi > > | The CUDA/HIP RT device CreateViewPlainPtr trait specialization |
CCurrentThreadWaitFor | The thread wait trait |
CCurrentThreadWaitFor< alpaka::generic::detail::EventGenericThreadsImpl< TDev > > | The CPU device event implementation thread wait trait specialization |
CCurrentThreadWaitFor< DevCpu > | The CPU device thread wait specialization |
CCurrentThreadWaitFor< EventGenericThreads< TDev > > | The CPU device event thread wait trait specialization |
CCurrentThreadWaitFor< EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT device event thread wait trait specialization |
CCurrentThreadWaitFor< QueueCpuOmp2Collective > | The CPU blocking device queue thread wait trait specialization |
CCurrentThreadWaitFor< QueueGenericThreadsBlocking< TDev > > | The CPU blocking device queue thread wait trait specialization |
CCurrentThreadWaitFor< QueueGenericThreadsNonBlocking< TDev > > | The CPU non-blocking device queue thread wait trait specialization |
CCurrentThreadWaitFor< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > > | The CUDA/HIP RT queue thread wait trait specialization |
CDeclareSharedVar | The block shared static memory variable allocation operation trait |
CDeclareSharedVar< T, TuniqueId, BlockSharedMemStMember< TDataAlignBytes > > | |
CDeclareSharedVar< T, TuniqueId, BlockSharedMemStMemberMasterSync< TDataAlignBytes > > | |
CDevType | The device type trait |
CDevType< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA accelerator device type trait specialization |
CDevType< BufCpu< TElem, TDim, TIdx > > | The BufCpu device type trait specialization |
CDevType< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt device type trait specialization |
CDevType< EventGenericThreads< TDev > > | The CPU device event device type trait specialization |
CDevType< EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT device event device type trait specialization |
CDevType< PlatformCpu > | The CPU platform device type trait specialization |
CDevType< PlatformUniformCudaHipRt< TApi > > | The CUDA/HIP RT platform device type trait specialization |
CDevType< QueueCpuOmp2Collective > | The CPU blocking device queue device type trait specialization |
CDevType< QueueGenericThreadsBlocking< TDev > > | The CPU blocking device queue device type trait specialization |
CDevType< QueueGenericThreadsNonBlocking< TDev > > | The CPU non-blocking device queue device type trait specialization |
CDevType< std::array< TElem, Tsize > > | The std::array device type trait specialization |
CDevType< std::vector< TElem, TAllocator > > | The std::vector device type trait specialization |
CDevType< TaskKernelGpuUniformCudaHipRt< TApi, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > > | The GPU CUDA/HIP execution task device type trait specialization |
CDevType< TDev, std::enable_if_t< interface::ImplementsInterface< ConceptDev, TDev >::value > > | Get device type |
CDevType< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > > | The CUDA/HIP RT blocking queue device type trait specialization |
CDevType< ViewConst< TView > > | |
CDevType< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr device type trait specialization |
CDevType< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView device type trait specialization |
CDimType | The dimension getter type trait |
CDimType< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA accelerator dimension getter trait specialization |
CDimType< bt::IdxBtLinear< TDim, TIdx > > | The IdxBtLinear index dimension get trait specialization |
CDimType< bt::IdxBtRefThreadIdMap< TDim, TIdx > > | The CPU threads accelerator index dimension get trait specialization |
CDimType< BufCpu< TElem, TDim, TIdx > > | The BufCpu dimension getter trait |
CDimType< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt dimension getter trait specialization |
CDimType< gb::IdxGbLinear< TDim, TIdx > > | The IdxGbLinear index dimension get trait specialization |
CDimType< gb::IdxGbRef< TDim, TIdx > > | The IdxGbRef grid block index dimension get trait specialization |
CDimType< std::array< TElem, Tsize > > | The std::array dimension getter trait specialization |
CDimType< std::vector< TElem, TAllocator > > | The std::vector dimension getter trait specialization |
CDimType< T, std::enable_if_t< meta::Contains< alpaka::detail::CudaHipBuiltinTypes1, T >::value > > | The CUDA/HIP vectors 1D dimension get trait specialization |
CDimType< T, std::enable_if_t< meta::Contains< alpaka::detail::CudaHipBuiltinTypes2, T >::value > > | The CUDA/HIP vectors 2D dimension get trait specialization |
CDimType< T, std::enable_if_t< meta::Contains< alpaka::detail::CudaHipBuiltinTypes3, T >::value > > | The CUDA/HIP vectors 3D dimension get trait specialization |
CDimType< T, std::enable_if_t< meta::Contains< alpaka::detail::CudaHipBuiltinTypes4, T >::value > > | The CUDA/HIP vectors 4D dimension get trait specialization |
CDimType< T, std::enable_if_t< std::is_arithmetic_v< T > > > | The arithmetic type dimension getter trait specialization |
CDimType< TaskKernelGpuUniformCudaHipRt< TApi, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > > | The GPU CUDA/HIP execution task dimension getter trait specialization |
CDimType< Vec< TDim, TVal > > | The Vec dimension get trait specialization |
CDimType< ViewConst< TView > > | |
CDimType< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr dimension getter trait |
CDimType< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView dimension getter trait specialization |
CDimType< WorkDivMembers< TDim, TIdx > > | The WorkDivMembers dimension get trait specialization |
CDimType< WorkDivUniformCudaHipBuiltIn< TDim, TIdx > > | The GPU CUDA/HIP accelerator work division dimension get trait specialization |
CElemType | The element type trait |
CElemType< BufCpu< TElem, TDim, TIdx > > | The BufCpu memory element type get trait specialization |
CElemType< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt memory element type get trait specialization |
CElemType< std::array< TElem, Tsize > > | The std::array memory element type get trait specialization |
CElemType< std::vector< TElem, TAllocator > > | The std::vector memory element type get trait specialization |
CElemType< T, std::enable_if_t< alpaka::detail::isCudaHipBuiltInType< T > > > | The CUDA/HIP vectors elem type trait specialization |
CElemType< T, std::enable_if_t< std::is_fundamental_v< T > > > | The fundamental type elem type trait specialization |
CElemType< ViewConst< TView > > | |
CElemType< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr memory element type get trait specialization |
CElemType< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView memory element type get trait specialization |
CEmpty | The queue empty trait |
CEmpty< QueueCpuOmp2Collective > | The CPU blocking device queue test trait specialization |
CEmpty< QueueGenericThreadsBlocking< TDev > > | The CPU blocking device queue test trait specialization |
CEmpty< QueueGenericThreadsNonBlocking< TDev > > | The CPU non-blocking device queue test trait specialization |
CEmpty< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > > | The CUDA/HIP RT queue test trait specialization |
CEnqueue | The queue enqueue trait |
CEnqueue< alpaka::generic::detail::QueueGenericThreadsBlockingImpl< TDev >, EventGenericThreads< TDev > > | The CPU blocking device queue enqueue trait specialization |
CEnqueue< alpaka::generic::detail::QueueGenericThreadsNonBlockingImpl< TDev >, EventGenericThreads< TDev > > | The CPU non-blocking device queue enqueue trait specialization |
CEnqueue< cpu::detail::QueueCpuOmp2CollectiveImpl, EventCpu > | The CPU OpenMP2 collective device queue enqueue trait specialization |
CEnqueue< QueueCpuOmp2Collective, EventCpu > | The CPU OpenMP2 collective device queue enqueue trait specialization |
CEnqueue< QueueCpuOmp2Collective, TaskKernelCpuOmp2Blocks< TDim, TIdx, TKernelFnObj, TArgs... > > | The CPU blocking device queue enqueue trait specialization. This default implementation for all tasks directly invokes the function call operator of the task |
CEnqueue< QueueCpuOmp2Collective, test::EventHostManualTriggerCpu<> > | |
CEnqueue< QueueCpuOmp2Collective, TTask > | The CPU blocking device queue enqueue trait specialization. This default implementation for all tasks directly invokes the function call operator of the task |
CEnqueue< QueueCudaRtBlocking, test::EventHostManualTriggerCuda > | |
CEnqueue< QueueCudaRtNonBlocking, test::EventHostManualTriggerCuda > | |
CEnqueue< QueueGenericThreadsBlocking< TDev >, EventGenericThreads< TDev > > | The CPU blocking device queue enqueue trait specialization |
CEnqueue< QueueGenericThreadsBlocking< TDev >, test::EventHostManualTriggerCpu< TDev > > | |
CEnqueue< QueueGenericThreadsBlocking< TDev >, TTask > | The CPU blocking device queue enqueue trait specialization. This default implementation for all tasks directly invokes the function call operator of the task |
CEnqueue< QueueGenericThreadsNonBlocking< TDev >, EventGenericThreads< TDev > > | The CPU non-blocking device queue enqueue trait specialization |
CEnqueue< QueueGenericThreadsNonBlocking< TDev >, test::EventHostManualTriggerCpu< TDev > > | |
CEnqueue< QueueGenericThreadsNonBlocking< TDev >, TTask > | The CPU non-blocking device queue enqueue trait specialization. This default implementation for all tasks directly invokes the function call operator of the task |
CEnqueue< QueueUniformCudaHipRtBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 0u >, TView, TExtent > > | The CUDA blocking device queue scalar set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 1u >, TView, TExtent > > | The CUDA blocking device queue 1D set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 2u >, TView, TExtent > > | The CUDA blocking device queue 2D set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 3u >, TView, TExtent > > | The CUDA blocking device queue 3D set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtBlocking< TApi >, EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT queue enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtNonBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 0u >, TView, TExtent > > | The CUDA non-blocking device queue scalar set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtNonBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 1u >, TView, TExtent > > | The CUDA non-blocking device queue 1D set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtNonBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 2u >, TView, TExtent > > | The CUDA non-blocking device queue 2D set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtNonBlocking< TApi >, alpaka::detail::TaskSetUniformCudaHip< TApi, DimInt< 3u >, TView, TExtent > > | The CUDA non-blocking device queue 3D set enqueue trait specialization |
CEnqueue< QueueUniformCudaHipRtNonBlocking< TApi >, EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT queue enqueue trait specialization |
CEnqueue< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking >, TaskKernelGpuUniformCudaHipRt< TApi, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > > | The CUDA/HIP kernel enqueue trait specialization |
►CEnqueue< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking >, TTask > | The CUDA/HIP RT blocking queue enqueue trait specialization |
CHostFuncData | |
CEventType | The event type trait |
CEventType< QueueCpuOmp2Collective > | The CPU blocking device queue event type trait specialization |
CEventType< QueueGenericThreadsBlocking< TDev > > | The CPU blocking device queue event type trait specialization |
CEventType< QueueGenericThreadsNonBlocking< TDev > > | The CPU non-blocking device queue event type trait specialization |
CEventType< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > > | The CUDA/HIP RT blocking queue event type trait specialization |
CFfs | The ffs trait |
CFree | The memory free trait |
CFreeSharedVars | The block shared static memory free operation trait |
CFreeSharedVars< BlockSharedMemStMember< TDataAlignBytes > > | |
CFreeSharedVars< BlockSharedMemStMemberMasterSync< TDataAlignBytes > > | |
CFunctionAttributes | The structure template to access the function attributes of a kernel function object |
CFunctionAttributes< AccGpuUniformCudaHipRt< TApi, TDim, TIdx >, TDev, TKernelFn, TArgs... > | Specialization of the class template FunctionAttributes |
CGetAccDevProps | The device properties get trait |
CGetAccDevProps< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA accelerator device properties get trait specialization |
CGetAccName | The accelerator name trait |
CGetAccName< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA accelerator name trait specialization |
CGetDev | The device get trait |
CGetDev< BufCpu< TElem, TDim, TIdx > > | The BufCpu device get trait specialization |
CGetDev< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt device get trait specialization |
CGetDev< EventGenericThreads< TDev > > | The CPU device event device get trait specialization |
CGetDev< EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT device event device get trait specialization |
CGetDev< QueueCpuOmp2Collective > | The CPU blocking device queue device get trait specialization |
CGetDev< QueueGenericThreadsBlocking< TDev > > | The CPU blocking device queue device get trait specialization |
CGetDev< QueueGenericThreadsNonBlocking< TDev > > | The CPU non-blocking device queue device get trait specialization |
CGetDev< std::array< TElem, Tsize > > | The std::array device get trait specialization |
CGetDev< std::vector< TElem, TAllocator > > | The std::vector device get trait specialization |
CGetDev< test::EventHostManualTriggerCpu< TDev > > | The CPU device event device get trait specialization |
CGetDev< test::EventHostManualTriggerCuda > | The CUDA device event device get trait specialization |
CGetDev< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > > | The CUDA/HIP RT queue device get trait specialization |
CGetDev< ViewConst< TView > > | |
CGetDev< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr device get trait specialization |
CGetDev< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView device get trait specialization |
CGetDevByIdx | The device get trait |
CGetDevByIdx< PlatformCpu > | The CPU platform device get trait specialization |
CGetDevByIdx< PlatformUniformCudaHipRt< TApi > > | The CUDA/HIP RT platform device get trait specialization |
CGetDevCount | The device count get trait |
CGetDevCount< PlatformCpu > | The CPU platform device count get trait specialization |
CGetDevCount< PlatformUniformCudaHipRt< TApi > > | The CUDA/HIP RT platform device count get trait specialization |
CGetDynSharedMem | The block shared dynamic memory get trait |
CGetExtent | The extent get trait |
CGetExtents | The GetExtents trait for getting the extents of an object as an alpaka::Vec |
CGetExtents< BufCpu< TElem, TDim, TIdx > > | The BufCpu extent get trait specialization |
CGetExtents< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt extent get trait specialization |
CGetExtents< Integral, std::enable_if_t< std::is_integral_v< Integral > > > | |
CGetExtents< std::array< TElem, Tsize > > | |
CGetExtents< std::vector< TElem, TAllocator > > | |
CGetExtents< TCudaHipBuiltin, std::enable_if_t< alpaka::detail::isCudaHipBuiltInType< TCudaHipBuiltin > > > | |
CGetExtents< Vec< TDim, TVal > > | The Vec extent get trait specialization |
CGetExtents< ViewConst< TView > > | |
CGetExtents< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | |
CGetExtents< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView extent get trait specialization |
CGetFreeMemBytes | The device free memory size get trait |
CGetIdx | The index get trait |
CGetIdx< bt::IdxBtLinear< DimInt< 1u >, TIdx >, origin::Block, unit::Threads > | |
CGetIdx< bt::IdxBtLinear< TDim, TIdx >, origin::Block, unit::Threads > | The IdxBtLinear block thread index get trait specialization |
CGetIdx< bt::IdxBtRefThreadIdMap< TDim, TIdx >, origin::Block, unit::Threads > | The CPU threads accelerator block thread index get trait specialization |
CGetIdx< gb::IdxGbLinear< DimInt< 1u >, TIdx >, origin::Grid, unit::Blocks > | |
CGetIdx< gb::IdxGbLinear< TDim, TIdx >, origin::Grid, unit::Blocks > | The IdxGbLinear grid block index get trait specialization |
CGetIdx< gb::IdxGbRef< TDim, TIdx >, origin::Grid, unit::Blocks > | The IdxGbRef grid block index get trait specialization |
CGetIdx< TIdx, origin::Grid, unit::Threads > | The grid thread index get trait specialization |
CGetIdx< TIdxBt, origin::Block, unit::Threads > | The block thread index get trait specialization for classes with IdxBtBase member type |
CGetIdx< TIdxGb, origin::Grid, unit::Blocks > | The grid block index get trait specialization for classes with IdxGbBase member type |
CGetMemBytes | The device memory size get trait |
CGetName | The device name get trait |
CGetOffset | The x offset get trait |
CGetOffsets | The GetOffsets trait for getting the offsets of an object as an alpaka::Vec |
CGetOffsets< BufCpu< TElem, TDim, TIdx > > | The BufCpu offset get trait specialization |
CGetOffsets< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt offset get trait specialization |
CGetOffsets< std::array< TElem, Tsize > > | The std::array offset get trait specialization |
CGetOffsets< std::vector< TElem, TAllocator > > | The std::vector offset get trait specialization |
CGetOffsets< TCudaHipBuiltin, std::enable_if_t< alpaka::detail::isCudaHipBuiltInType< TCudaHipBuiltin > > > | |
CGetOffsets< TIntegral, std::enable_if_t< std::is_integral_v< TIntegral > > > | The integral offset get trait specialization |
CGetOffsets< Vec< TDim, TVal > > | The Vec offset get trait specialization |
CGetOffsets< ViewConst< TView > > | |
CGetOffsets< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr offset get trait specialization |
CGetOffsets< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView offset get trait specialization |
CGetPitchBytes | The pitch in bytes. This is the distance in bytes in the linear memory between two consecutive elements in the next higher dimension (TIdx-1) |
CGetPitchesInBytes | Customization point for getPitchesInBytes. The default implementation uses the extent to calculate the pitches |
CGetPitchesInBytes< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | |
CGetPitchesInBytes< ViewConst< TView > > | |
CGetPitchesInBytes< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | |
CGetPitchesInBytes< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView pitch get trait specialization |
CGetPreferredWarpSize | The device preferred warp size get trait |
CGetPtrDev | The pointer on device get trait |
CGetPtrDev< BufCpu< TElem, TDim, TIdx >, DevCpu > | The BufCpu pointer on device get trait specialization |
CGetPtrDev< BufCpu< TElem, TDim, TIdx >, DevUniformCudaHipRt< TApi > > | The BufCpu pointer on CUDA/HIP device get trait specialization |
CGetPtrDev< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx >, DevUniformCudaHipRt< TApi > > | The BufUniformCudaHipRt pointer on device get trait specialization |
CGetPtrNative | The native pointer get trait |
CGetPtrNative< BufCpu< TElem, TDim, TIdx > > | The BufCpu native pointer get trait specialization |
CGetPtrNative< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt native pointer get trait specialization |
CGetPtrNative< std::array< TElem, Tsize > > | The std::array native pointer get trait specialization |
CGetPtrNative< std::vector< TElem, TAllocator > > | The std::vector native pointer get trait specialization |
CGetPtrNative< ViewConst< TView > > | |
CGetPtrNative< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr native pointer get trait specialization |
CGetPtrNative< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView native pointer get trait specialization |
CGetWarpSizes | The device warp size get trait |
CGetWorkDiv | The work div trait |
CGetWorkDiv< TWorkDiv, origin::Block, unit::Elems > | The work div block element extent trait specialization |
CGetWorkDiv< TWorkDiv, origin::Grid, unit::Elems > | The work div grid element extent trait specialization |
CGetWorkDiv< TWorkDiv, origin::Grid, unit::Threads > | The work div grid thread extent trait specialization |
CGetWorkDiv< WorkDivMembers< TDim, TIdx >, origin::Block, unit::Threads > | The WorkDivMembers block thread extent trait specialization |
CGetWorkDiv< WorkDivMembers< TDim, TIdx >, origin::Grid, unit::Blocks > | The WorkDivMembers grid block extent trait specialization |
CGetWorkDiv< WorkDivMembers< TDim, TIdx >, origin::Thread, unit::Elems > | The WorkDivMembers thread element extent trait specialization |
CGetWorkDiv< WorkDivUniformCudaHipBuiltIn< TDim, TIdx >, origin::Block, unit::Threads > | The GPU CUDA/HIP accelerator work division block thread extent trait specialization |
CGetWorkDiv< WorkDivUniformCudaHipBuiltIn< TDim, TIdx >, origin::Grid, unit::Blocks > | The GPU CUDA/HIP accelerator work division grid block extent trait specialization |
CGetWorkDiv< WorkDivUniformCudaHipBuiltIn< TDim, TIdx >, origin::Thread, unit::Elems > | The GPU CUDA/HIP accelerator work division thread element extent trait specialization |
CHasAsyncBufSupport | The stream-ordered memory allocation capability trait |
CHasAsyncBufSupport< TDim, DevCpu > | The BufCpu stream-ordered memory allocation capability trait specialization |
CHasAsyncBufSupport< TDim, DevUniformCudaHipRt< TApi > > | The CUDA/HIP stream-ordered memory allocation capability trait specialization |
CHasMappedBufSupport | The pinned/mapped memory allocation capability trait |
CHasMappedBufSupport< PlatformCpu > | The pinned/mapped memory allocation capability trait specialization |
CHasMappedBufSupport< PlatformUniformCudaHipRt< TApi > > | The pinned/mapped memory allocation capability trait specialization |
CIdxType | The idx type trait |
CIdxType< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA accelerator idx type trait specialization |
CIdxType< bt::IdxBtLinear< TDim, TIdx > > | The IdxBtLinear block thread index idx type trait specialization |
CIdxType< bt::IdxBtRefThreadIdMap< TDim, TIdx > > | The CPU threads accelerator block thread index idx type trait specialization |
CIdxType< BufCpu< TElem, TDim, TIdx > > | The BufCpu idx type trait specialization |
CIdxType< BufUniformCudaHipRt< TApi, TElem, TDim, TIdx > > | The BufUniformCudaHipRt idx type trait specialization |
CIdxType< gb::IdxGbLinear< TDim, TIdx > > | The IdxGbLinear grid block index idx type trait specialization |
CIdxType< gb::IdxGbRef< TDim, TIdx > > | The IdxGbRef grid block index idx type trait specialization |
CIdxType< std::array< TElem, Tsize > > | The std::array idx type trait specialization |
CIdxType< std::vector< TElem, TAllocator > > | The std::vector idx type trait specialization |
CIdxType< T, std::enable_if_t< std::is_arithmetic_v< T > > > | The arithmetic idx type trait specialization |
CIdxType< TaskKernelGpuUniformCudaHipRt< TApi, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > > | The GPU CUDA/HIP execution task idx type trait specialization |
CIdxType< TIdx, std::enable_if_t< alpaka::detail::isCudaHipBuiltInType< TIdx > > > | The CUDA/HIP vectors idx type trait specialization |
CIdxType< Vec< TDim, TVal > > | The Vec idx type trait specialization |
CIdxType< ViewConst< TView > > | |
CIdxType< ViewPlainPtr< TDev, TElem, TDim, TIdx > > | The ViewPlainPtr idx type trait specialization |
CIdxType< ViewSubView< TDev, TElem, TDim, TIdx > > | The ViewSubView idx type trait specialization |
CIdxType< WorkDivMembers< TDim, TIdx > > | The WorkDivMembers idx type trait specialization |
CIdxType< WorkDivUniformCudaHipBuiltIn< TDim, TIdx > > | The GPU CUDA/HIP accelerator work division idx type trait specialization |
CIsComplete | The event tester trait |
CIsComplete< EventGenericThreads< TDev > > | The CPU device event test trait specialization |
CIsComplete< EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT device event test trait specialization |
CIsComplete< test::EventHostManualTriggerCpu< TDev > > | The CPU device event test trait specialization |
CIsComplete< test::EventHostManualTriggerCuda > | The CUDA device event test trait specialization |
CIsMultiThreadAcc | The multi thread accelerator trait |
CIsMultiThreadAcc< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA multi thread accelerator type trait specialization |
CIsSingleThreadAcc | The single thread accelerator trait |
CIsSingleThreadAcc< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA single thread accelerator type trait specialization |
CMalloc | The memory allocation trait |
CMemFence | The mem_fence trait |
CNativeHandle | The native handle trait |
CNativeHandle< EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT event native handle trait specialization |
CNativeHandle< uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > > | The CUDA/HIP RT blocking queue native handle trait specialization |
COmpSchedule | The trait for getting the schedule to use when a kernel is run using the CpuOmp2Blocks accelerator |
CPlatformType | The platform type trait |
CPlatformType< AccGpuUniformCudaHipRt< TApi, TDim, TIdx > > | The GPU CUDA/HIP accelerator platform type trait specialization |
CPlatformType< DevCpu > | The CPU device platform type trait specialization |
CPlatformType< TaskKernelGpuUniformCudaHipRt< TApi, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > > | The GPU CUDA/HIP execution task platform type trait specialization |
CPlatformType< TPlatform, std::enable_if_t< interface::ImplementsInterface< ConceptPlatform, TPlatform >::value > > | |
CPopcount | The popcount trait |
CQueueType | Queue for an accelerator |
CQueueType< DevCpu, Blocking > | |
CQueueType< DevCpu, NonBlocking > | |
CQueueType< TAcc, TProperty, std::enable_if_t< interface::ImplementsInterface< ConceptAcc, TAcc >::value > > | |
CQueueType< TPlatform, TProperty, std::enable_if_t< interface::ImplementsInterface< ConceptPlatform, TPlatform >::value > > | |
CReset | The device reset trait |
CReverseVec | Trait for reversing a vector |
CReverseVec< Vec< TDim, TVal > > | ReverseVec specialization for Vec |
CSubVecFromIndices | Trait for selecting a sub-vector |
CSubVecFromIndices< Vec< TDim, TVal >, std::index_sequence< TIndices... > > | Specialization for selecting a sub-vector |
CSyncBlockThreads | The block synchronization operation trait |
CSyncBlockThreads< BlockSyncBarrierOmp > | |
CSyncBlockThreads< BlockSyncBarrierThread< TIdx > > | |
CSyncBlockThreadsPredicate | The block synchronization and predicate operation trait |
CSyncBlockThreadsPredicate< TOp, BlockSyncBarrierOmp > | |
CSyncBlockThreadsPredicate< TOp, BlockSyncBarrierThread< TIdx > > | |
CTagToAcc | |
CTagToAcc< alpaka::TagGpuCudaRt, TDim, TIdx > | |
CWaiterWaitFor | The waiter wait trait |
CWaiterWaitFor< alpaka::generic::detail::QueueGenericThreadsBlockingImpl< TDev >, EventGenericThreads< TDev > > | The CPU blocking device queue event wait trait specialization |
CWaiterWaitFor< alpaka::generic::detail::QueueGenericThreadsNonBlockingImpl< TDev >, EventGenericThreads< TDev > > | The CPU non-blocking device queue event wait trait specialization |
CWaiterWaitFor< cpu::detail::QueueCpuOmp2CollectiveImpl, EventCpu > | The CPU OpenMP2 collective device queue event wait trait specialization |
CWaiterWaitFor< DevUniformCudaHipRt< TApi >, EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT device event wait trait specialization |
CWaiterWaitFor< QueueCpuOmp2Collective, EventCpu > | The CPU OpenMP2 collective queue event wait trait specialization |
CWaiterWaitFor< QueueGenericThreadsBlocking< TDev >, EventGenericThreads< TDev > > | The CPU blocking device queue event wait trait specialization |
CWaiterWaitFor< QueueGenericThreadsNonBlocking< TDev >, EventGenericThreads< TDev > > | The CPU non-blocking device queue event wait trait specialization |
CWaiterWaitFor< QueueUniformCudaHipRtBlocking< TApi >, EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT queue event wait trait specialization |
CWaiterWaitFor< QueueUniformCudaHipRtNonBlocking< TApi >, EventUniformCudaHipRt< TApi > > | The CUDA/HIP RT queue event wait trait specialization |
CWaiterWaitFor< TDev, EventGenericThreads< TDev > > | The CPU non-blocking device event wait trait specialization |
CWarpSize | The trait for getting the warp size required by a kernel |
►Nuniform_cuda_hip | |
►Ndetail | |
CEventUniformCudaHipImpl | The CUDA/HIP RT device event implementation |
CQueueUniformCudaHipRt | The CUDA/HIP RT queue |
CQueueUniformCudaHipRtImpl | The CUDA/HIP RT queue implementation |
Nunit | Defines the units available for getting extent and indices of kernel executions |
►Nwarp | |
►Ntrait | The warp traits |
CActivemask | The active mask trait |
CAll | The all warp vote trait |
CAny | The any warp vote trait |
CBallot | The ballot warp vote trait |
CGetSize | The warp size trait |
CShfl | The shfl warp swizzling trait |
CShflDown | The shfl down warp swizzling trait |
CShflUp | The shfl up warp swizzling trait |
CShflXor | The shfl xor warp swizzling trait |
CConceptWarp | |
CWarpSingleThread | The single-threaded warp to emulate it on CPUs |
CWarpUniformCudaHipBuiltIn | The GPU CUDA/HIP warp |
CAccCpuOmp2Blocks | The CPU OpenMP 2.0 block accelerator |
CAccCpuOmp2Threads | The CPU OpenMP 2.0 thread accelerator |
CAccCpuSerial | The CPU serial accelerator |
CAccCpuThreads | The CPU threads accelerator |
CAccDevProps | The acceleration properties on a device |
CAccGpuUniformCudaHipRt | The GPU CUDA accelerator |
CAccIsEnabled | Check if the accelerator is enabled for a given tag |
CAccIsEnabled< TTag, std::void_t< TagToAcc< TTag, alpaka::DimInt< 1 >, int > > > | |
CAllocCpuAligned | The CPU boost aligned allocator |
CAllocCpuNew | The CPU new allocator |
►CApiCudaRt | |
CHostFnAdaptor | |
CAtomicAdd | The addition function object |
CAtomicAnd | The and function object |
CAtomicAtomicRef | The atomic ops based on atomic_ref for CPU accelerators |
CAtomicCas | The compare and swap function object |
CAtomicDec | The decrement function object |
CAtomicExch | The exchange function object |
CAtomicInc | The increment function object |
CAtomicMax | The maximum function object |
CAtomicMin | The minimum function object |
CAtomicNoOp | The NoOp atomic ops |
CAtomicOmpBuiltIn | The OpenMP accelerators atomic ops |
CAtomicOr | The or function object |
CAtomicSub | The subtraction function object |
CAtomicUniformCudaHipBuiltIn | The GPU CUDA/HIP accelerator atomic ops |
CAtomicXor | The exclusive or function object |
CBlockAnd | The logical and function object |
CBlockCount | The counting function object |
CBlockOr | The logical or function object |
CBlockSharedMemDynMember | Dynamic block shared memory provider using fixed-size member array to allocate memory on the stack or in shared memory |
CBlockSharedMemDynUniformCudaHipBuiltIn | The GPU CUDA/HIP block shared memory allocator |
CBlockSharedMemStMember | Static block shared memory provider using a pointer to externally allocated fixed-size memory, likely provided by BlockSharedMemDynMember |
CBlockSharedMemStMemberMasterSync | |
CBlockSharedMemStUniformCudaHipBuiltIn | The GPU CUDA/HIP block shared memory allocator |
CBlockSyncBarrierOmp | The OpenMP barrier block synchronization |
CBlockSyncBarrierThread | The thread id map barrier block synchronization |
CBlockSyncNoOp | The no op block synchronization |
CBlockSyncUniformCudaHipBuiltIn | The GPU CUDA/HIP block synchronization |
CBufCpu | The CPU memory buffer |
CBufUniformCudaHipRt | The CUDA/HIP memory buffer |
CConceptAcc | |
CConceptAtomicBlocks | |
CConceptAtomicGrids | |
CConceptAtomicThreads | |
CConceptBlockSharedDyn | |
CConceptBlockSharedSt | |
CConceptBlockSync | |
CConceptCurrentThreadWaitFor | |
CConceptIdxBt | |
CConceptIdxGb | |
CConceptIntrinsic | |
CConceptMemAlloc | |
CConceptMemFence | |
CConceptPlatform | |
CConceptWorkDiv | |
CDevCpu | The CPU device handle |
CDevUniformCudaHipRt | The CUDA/HIP RT device handle |
CElementIndex | |
CEventGenericThreads | The CPU device event |
CEventUniformCudaHipRt | The CUDA/HIP RT device event |
CIGenericThreadsQueue | The CPU queue interface |
CInterfaceTag | |
CIntrinsicCpu | The CPU intrinsic |
CIntrinsicFallback | The Fallback intrinsic |
CIntrinsicUniformCudaHipBuiltIn | The GPU CUDA/HIP intrinsic |
CIsKernelArgumentTriviallyCopyable | Check if a type used as kernel argument is trivially copyable |
CIsKernelTriviallyCopyable | Check if the kernel type is trivially copyable |
CKernelCfg | Kernel start configuration to determine a valid work division |
CKernelFunctionAttributes | Kernel function attributes struct. Attributes are filled by calling the API of the accelerator using the kernel function as an argument. In case of a CPU backend, maxThreadsPerBlock is set to 1 and other values remain zero since there are no corresponding API functions to get the values |
CMemFenceCpu | The default CPU memory fence |
CMemFenceCpuSerial | The serial CPU memory fence |
CMemFenceOmp2Blocks | The CPU OpenMP 2.0 block memory fence |
CMemFenceOmp2Threads | The CPU OpenMP 2.0 thread memory fence |
CMemFenceUniformCudaHipBuiltIn | The GPU CUDA/HIP memory fence |
CMemSetKernel | Any device ND memory set kernel |
CPlatformCpu | The CPU device platform |
CPlatformUniformCudaHipRt | The CUDA/HIP RT platform |
CQueueCpuOmp2Collective | The CPU collective device queue |
CQueueGenericThreadsBlocking | The CPU device queue |
CQueueGenericThreadsNonBlocking | The CPU device queue |
Cremove_restrict | Removes restrict from a type |
Cremove_restrict< T *__restrict__ > | |
CTagCpuOmp2Blocks | |
CTagCpuOmp2Threads | |
CTagCpuSerial | |
CTagCpuSycl | |
CTagCpuTbbBlocks | |
CTagCpuThreads | |
CTagFpgaSyclIntel | |
CTagGenericSycl | |
CTagGpuCudaRt | |
CTagGpuHipRt | |
CTagGpuSyclIntel | |
CTaskKernelCpuOmp2Blocks | The CPU OpenMP 2.0 block accelerator execution task |
CTaskKernelCpuOmp2Threads | The CPU OpenMP 2.0 thread accelerator execution task |
CTaskKernelCpuSerial | The CPU serial execution task implementation |
CTaskKernelCpuThreads | The CPU threads execution task |
CTaskKernelGpuUniformCudaHipRt | The GPU CUDA/HIP accelerator execution task |
CVec | An n-dimensional vector |
CViewConst | A non-modifiable wrapper around a view. This view acts as the wrapped view, but the underlying data is only exposed const-qualified |
CViewPlainPtr | The memory view to wrap plain pointers |
CViewSubView | A sub-view to a view |
CWorkDivMembers | A basic class holding the work division as grid block extent, block thread extent, and thread element extent |
CWorkDivUniformCudaHipBuiltIn | The GPU CUDA/HIP accelerator work division |
NalpakaGlobal | These types must be in the global namespace for checking existence of respective functions in global namespace via SFINAE, so we use an inline namespace |
►Nstd | STL namespace |
Ctuple_element< I, alpaka::Vec< TDim, TVal > > | |
Ctuple_size< alpaka::Vec< TDim, TVal > > |