alpaka
Abstraction Library for Parallel Kernel Acceleration
The alpaka accelerator library. More...
Namespaces | |
bt | |
concepts | |
core | |
cpu | |
cuda | |
detail | |
gb | |
generic | |
hierarchy | |
Defines the parallelism hierarchy levels of alpaka. | |
interface | |
internal | |
math | |
memory_scope | |
meta | |
omp | |
origin | |
Defines the origins available for getting extent and indices of kernel executions. | |
property | |
Properties to define queue behavior. | |
rand | |
test | |
The test specifics. | |
trait | |
The accelerator traits. | |
uniform_cuda_hip | |
unit | |
Defines the units available for getting extent and indices of kernel executions. | |
warp | |
Classes | |
class | AccCpuOmp2Blocks |
The CPU OpenMP 2.0 block accelerator. More... | |
class | AccCpuOmp2Threads |
The CPU OpenMP 2.0 thread accelerator. More... | |
class | AccCpuSerial |
The CPU serial accelerator. More... | |
class | AccCpuThreads |
The CPU threads accelerator. More... | |
struct | AccDevProps |
The acceleration properties on a device. More... | |
class | AccGpuUniformCudaHipRt |
The GPU CUDA/HIP accelerator. More... | |
struct | AccIsEnabled |
Checks whether the accelerator is enabled for a given tag. More... | |
struct | AccIsEnabled< TTag, std::void_t< TagToAcc< TTag, alpaka::DimInt< 1 >, int > > > |
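The partial specialization above follows the `std::void_t` detection idiom: the primary template reports "not enabled", and the specialization is selected only when the tag maps to a valid accelerator type. The following is a minimal, library-independent sketch of that idiom. `TagA`, `TagB`, `AccA`, `ToAccTrait`, `ToAcc` and `IsEnabled` are hypothetical stand-ins, not alpaka names.

```cpp
#include <type_traits>

// Hypothetical tags and accelerator type, for illustration only.
struct TagA {};
struct TagB {};
struct AccA {};

// Hypothetical tag-to-accelerator mapping: only TagA has a nested ::type.
template<typename TTag>
struct ToAccTrait {};

template<>
struct ToAccTrait<TagA>
{
    using type = AccA;
};

template<typename TTag>
using ToAcc = typename ToAccTrait<TTag>::type;

// Primary template: assume the accelerator is not enabled.
template<typename TTag, typename = void>
struct IsEnabled : std::false_type {};

// Selected only if ToAcc<TTag> is well-formed (SFINAE through std::void_t).
template<typename TTag>
struct IsEnabled<TTag, std::void_t<ToAcc<TTag>>> : std::true_type {};

static_assert(IsEnabled<TagA>::value);
static_assert(!IsEnabled<TagB>::value);
```

The check is resolved entirely at compile time, so a disabled back end costs nothing at run time.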
class | AllocCpuAligned |
The CPU boost aligned allocator. More... | |
class | AllocCpuNew |
The CPU new allocator. More... | |
struct | ApiCudaRt |
struct | AtomicAdd |
The addition function object. More... | |
struct | AtomicAnd |
The and function object. More... | |
class | AtomicAtomicRef |
The atomic ops based on atomic_ref for CPU accelerators. More... | |
struct | AtomicCas |
The compare and swap function object. More... | |
struct | AtomicDec |
The decrement function object. More... | |
struct | AtomicExch |
The exchange function object. More... | |
struct | AtomicInc |
The increment function object. More... | |
struct | AtomicMax |
The maximum function object. More... | |
struct | AtomicMin |
The minimum function object. More... | |
class | AtomicNoOp |
The NoOp atomic ops. More... | |
class | AtomicOmpBuiltIn |
The OpenMP accelerators atomic ops. More... | |
struct | AtomicOr |
The or function object. More... | |
struct | AtomicSub |
The subtraction function object. More... | |
class | AtomicUniformCudaHipBuiltIn |
The GPU CUDA/HIP accelerator atomic ops. More... | |
struct | AtomicXor |
The exclusive or function object. More... | |
struct | BlockAnd |
The logical and function object. More... | |
struct | BlockCount |
The counting function object. More... | |
struct | BlockOr |
The logical or function object. More... | |
class | BlockSharedMemDynMember |
Dynamic block shared memory provider using fixed-size member array to allocate memory on the stack or in shared memory. More... | |
class | BlockSharedMemDynUniformCudaHipBuiltIn |
The GPU CUDA/HIP block shared memory allocator. More... | |
class | BlockSharedMemStMember |
Static block shared memory provider using a pointer to externally allocated fixed-size memory, likely provided by BlockSharedMemDynMember. More... | |
class | BlockSharedMemStMemberMasterSync |
class | BlockSharedMemStUniformCudaHipBuiltIn |
The GPU CUDA/HIP block shared memory allocator. More... | |
class | BlockSyncBarrierOmp |
The OpenMP barrier block synchronization. More... | |
class | BlockSyncBarrierThread |
The thread id map barrier block synchronization. More... | |
class | BlockSyncNoOp |
The no op block synchronization. More... | |
class | BlockSyncUniformCudaHipBuiltIn |
The GPU CUDA/HIP block synchronization. More... | |
class | BufCpu |
The CPU memory buffer. More... | |
struct | BufUniformCudaHipRt |
The CUDA/HIP memory buffer. More... | |
struct | ConceptAcc |
struct | ConceptAtomicBlocks |
struct | ConceptAtomicGrids |
struct | ConceptAtomicThreads |
struct | ConceptBlockSharedDyn |
struct | ConceptBlockSharedSt |
struct | ConceptBlockSync |
struct | ConceptCurrentThreadWaitFor |
struct | ConceptIdxBt |
struct | ConceptIdxGb |
struct | ConceptIntrinsic |
struct | ConceptMemAlloc |
struct | ConceptMemFence |
struct | ConceptPlatform |
struct | ConceptWorkDiv |
class | DevCpu |
The CPU device handle. More... | |
class | DevUniformCudaHipRt |
The CUDA/HIP RT device handle. More... | |
struct | ElementIndex |
class | EventGenericThreads |
The CPU device event. More... | |
class | EventUniformCudaHipRt |
The CUDA/HIP RT device event. More... | |
class | IGenericThreadsQueue |
The CPU queue interface. More... | |
struct | InterfaceTag |
class | IntrinsicCpu |
The CPU intrinsic. More... | |
class | IntrinsicFallback |
The Fallback intrinsic. More... | |
class | IntrinsicUniformCudaHipBuiltIn |
The GPU CUDA/HIP intrinsic. More... | |
struct | IsKernelArgumentTriviallyCopyable |
Check if a type used as kernel argument is trivially copyable. More... | |
struct | IsKernelTriviallyCopyable |
Check if the kernel type is trivially copyable. More... | |
struct | KernelCfg |
Kernel start configuration to determine a valid work division. More... | |
struct | KernelFunctionAttributes |
Kernel function attributes struct. Attributes are filled by calling the API of the accelerator using the kernel function as an argument. In case of a CPU backend, maxThreadsPerBlock is set to 1 and other values remain zero since there are no corresponding API functions to get the values. More... | |
class | MemFenceCpu |
The default CPU memory fence. More... | |
class | MemFenceCpuSerial |
The serial CPU memory fence. More... | |
class | MemFenceOmp2Blocks |
The CPU OpenMP 2.0 block memory fence. More... | |
class | MemFenceOmp2Threads |
The CPU OpenMP 2.0 thread memory fence. More... | |
class | MemFenceUniformCudaHipBuiltIn |
The GPU CUDA/HIP memory fence. More... | |
class | MemSetKernel |
The N-dimensional memory set kernel for any device. More... | |
struct | PlatformCpu |
The CPU device platform. More... | |
struct | PlatformUniformCudaHipRt |
The CUDA/HIP RT platform. More... | |
struct | QueueCpuOmp2Collective |
The CPU collective device queue. More... | |
class | QueueGenericThreadsBlocking |
The CPU device queue. More... | |
class | QueueGenericThreadsNonBlocking |
The CPU device queue. More... | |
struct | remove_restrict |
Removes restrict from a type. More... | |
struct | remove_restrict< T *__restrict__ > |
struct | TagCpuOmp2Blocks |
struct | TagCpuOmp2Threads |
struct | TagCpuSerial |
struct | TagCpuSycl |
struct | TagCpuTbbBlocks |
struct | TagCpuThreads |
struct | TagFpgaSyclIntel |
struct | TagGenericSycl |
struct | TagGpuCudaRt |
struct | TagGpuHipRt |
struct | TagGpuSyclIntel |
class | TaskKernelCpuOmp2Blocks |
The CPU OpenMP 2.0 block accelerator execution task. More... | |
class | TaskKernelCpuOmp2Threads |
The CPU OpenMP 2.0 thread accelerator execution task. More... | |
class | TaskKernelCpuSerial |
The CPU serial execution task implementation. More... | |
class | TaskKernelCpuThreads |
The CPU threads execution task. More... | |
class | TaskKernelGpuUniformCudaHipRt |
The GPU CUDA/HIP accelerator execution task. More... | |
class | Vec |
An n-dimensional vector. More... | |
struct | ViewConst |
A non-modifiable wrapper around a view. This view acts as the wrapped view, but the underlying data is only exposed const-qualified. More... | |
struct | ViewPlainPtr |
The memory view to wrap plain pointers. More... | |
class | ViewSubView |
A sub-view to a view. More... | |
class | WorkDivMembers |
A basic class holding the work division as grid block extent, block thread extent, and thread element extent. More... | |
class | WorkDivUniformCudaHipBuiltIn |
The GPU CUDA/HIP accelerator work division. More... | |
Typedefs | |
template<typename T > | |
using | Acc = typename trait::AccType< T >::type |
The accelerator type trait alias template to remove the ::type. More... | |
template<typename TDim , typename TIdx > | |
using | AccGpuCudaRt = AccGpuUniformCudaHipRt< ApiCudaRt, TDim, TIdx > |
using | AccTags = std::tuple< alpaka::TagCpuSerial, alpaka::TagCpuThreads, alpaka::TagCpuTbbBlocks, alpaka::TagCpuOmp2Blocks, alpaka::TagCpuOmp2Threads, alpaka::TagGpuCudaRt, alpaka::TagGpuHipRt, alpaka::TagCpuSycl, alpaka::TagFpgaSyclIntel, alpaka::TagGpuSyclIntel > |
List of all available tags. More... | |
template<typename TAcc > | |
using | AccToTag = typename trait::AccToTag< TAcc >::type |
Maps an accelerator type to a tag type. More... | |
using | AtomicCpu = AtomicAtomicRef |
template<typename TGridAtomic , typename TBlockAtomic , typename TThreadAtomic > | |
using | AtomicHierarchy = alpaka::meta::InheritFromList< alpaka::meta::Unique< std::tuple< TGridAtomic, TBlockAtomic, TThreadAtomic, interface::Implements< ConceptAtomicGrids, TGridAtomic >, interface::Implements< ConceptAtomicBlocks, TBlockAtomic >, interface::Implements< ConceptAtomicThreads, TThreadAtomic > >> > |
Builds a single class that inherits from different atomic implementations. More... | |
template<typename THierarchy > | |
using | AtomicHierarchyConcept = typename detail::AtomicHierarchyConceptType< THierarchy >::type |
template<typename TDev , typename TElem , typename TDim , typename TIdx > | |
using | Buf = typename trait::BufType< alpaka::Dev< TDev >, TElem, TDim, TIdx >::type |
The memory buffer type trait alias template to remove the ::type. More... | |
template<typename TElem , typename TDim , typename TIdx > | |
using | BufCudaRt = BufUniformCudaHipRt< ApiCudaRt, TElem, TDim, TIdx > |
template<typename T > | |
using | Dev = typename trait::DevType< T >::type |
The device type trait alias template to remove the ::type. More... | |
using | DevCudaRt = DevUniformCudaHipRt< ApiCudaRt > |
The CUDA RT device handle. More... | |
template<typename TAcc , typename T > | |
using | DevGlobal = typename detail::DevGlobalTrait< typename alpaka::trait::AccToTag< TAcc >::type, T >::Type |
template<typename T > | |
using | Dim = typename trait::DimType< T >::type |
The dimension type trait alias template to remove the ::type. More... | |
template<std::size_t N> | |
using | DimInt = std::integral_constant< std::size_t, N > |
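Several aliases in this list are described as "alias template to remove the ::type". The sketch below shows that pattern together with `DimInt`, which is reproduced as listed above. `DimTypeTrait`, `DimOf` and `Image2D` are hypothetical names used only for illustration; they are not part of alpaka.

```cpp
#include <cstddef>
#include <type_traits>

// Same definition as the DimInt alias listed above.
template<std::size_t N>
using DimInt = std::integral_constant<std::size_t, N>;

// Hypothetical trait in the style of trait::DimType, specialized per type.
template<typename T>
struct DimTypeTrait;

// Hypothetical 2-dimensional type.
struct Image2D {};

template<>
struct DimTypeTrait<Image2D>
{
    using type = DimInt<2>;
};

// Alias template that hides the "typename ...::type" boilerplate.
template<typename T>
using DimOf = typename DimTypeTrait<T>::type;

static_assert(std::is_same_v<DimOf<Image2D>, DimInt<2>>);
static_assert(DimOf<Image2D>::value == 2);
```

Because the dimension is a type rather than a run-time value, it can be used in template arguments and `static_assert`s.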
template<typename TView > | |
using | Elem = std::remove_volatile_t< typename trait::ElemType< TView >::type > |
The element type trait alias template to remove the ::type. More... | |
using | EnabledAccTags = alpaka::meta::Filter< AccTags, alpaka::AccIsEnabled > |
List of all tags whose related accelerator is enabled. More... | |
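`EnabledAccTags` is built by filtering a tuple of tag types with a predicate. As a rough illustration of how such a type-level filter can work, the sketch below keeps only the tuple elements for which a predicate is true. `FilterSketch`, `IsOn` and the three tags are hypothetical; this is not the implementation of `alpaka::meta::Filter`.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical tags.
struct TagX {};
struct TagY {};
struct TagZ {};

// Hypothetical predicate: pretend every tag except TagY is enabled.
template<typename T>
struct IsOn : std::bool_constant<!std::is_same_v<T, TagY>> {};

template<typename TList, template<typename> class TPred>
struct FilterSketchImpl;

// Each element becomes tuple<T> (kept) or tuple<> (dropped); tuple_cat joins them.
template<template<typename> class TPred, typename... Ts>
struct FilterSketchImpl<std::tuple<Ts...>, TPred>
{
    using type = decltype(std::tuple_cat(
        std::declval<std::conditional_t<TPred<Ts>::value, std::tuple<Ts>, std::tuple<>>>()...));
};

template<typename TList, template<typename> class TPred>
using FilterSketch = typename FilterSketchImpl<TList, TPred>::type;

using AllTags = std::tuple<TagX, TagY, TagZ>;
static_assert(std::is_same_v<FilterSketch<AllTags, IsOn>, std::tuple<TagX, TagZ>>);
```

The relative order of the surviving tags is preserved, which matters when the first enabled entry is used as a default.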
template<typename T > | |
using | Event = typename trait::EventType< T >::type |
The event type trait alias template to remove the ::type. More... | |
using | EventCpu = EventGenericThreads< DevCpu > |
using | EventCudaRt = EventUniformCudaHipRt< ApiCudaRt > |
The CUDA RT device event. More... | |
template<class TDim , class TIdx > | |
using | ExampleDefaultAcc = alpaka::AccGpuCudaRt< TDim, TIdx > |
Alias for the default accelerator used by examples. From a list of all accelerators the first one which is enabled is chosen. AccCpuSerial is selected last. More... | |
template<typename T > | |
using | Idx = typename trait::IdxType< T >::type |
template<typename TImpl > | |
using | NativeHandle = decltype(getNativeHandle(std::declval< TImpl >())) |
Alias to the type of the native handle. More... | |
template<typename T > | |
using | Platform = typename trait::PlatformType< T >::type |
The platform type trait alias template to remove the ::type. More... | |
using | PlatformCudaRt = PlatformUniformCudaHipRt< ApiCudaRt > |
The CUDA RT platform. More... | |
template<typename TEnv , typename TProperty > | |
using | Queue = typename trait::QueueType< TEnv, TProperty >::type |
Queue based on the environment and a property. More... | |
using | QueueCpuBlocking = QueueGenericThreadsBlocking< DevCpu > |
using | QueueCpuNonBlocking = QueueGenericThreadsNonBlocking< DevCpu > |
using | QueueCudaRtBlocking = QueueUniformCudaHipRtBlocking< ApiCudaRt > |
The CUDA RT blocking queue. More... | |
using | QueueCudaRtNonBlocking = QueueUniformCudaHipRtNonBlocking< ApiCudaRt > |
The CUDA RT non-blocking queue. More... | |
template<typename TApi > | |
using | QueueUniformCudaHipRtBlocking = uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, true > |
The CUDA/HIP RT blocking queue. More... | |
template<typename TApi > | |
using | QueueUniformCudaHipRtNonBlocking = uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, false > |
The CUDA/HIP RT non-blocking queue. More... | |
template<typename T > | |
using | remove_restrict_t = typename remove_restrict< T >::type |
Helper to remove restrict from a type. More... | |
template<concepts::Tag TTag, typename TDim , typename TIdx > | |
using | TagToAcc = typename trait::TagToAcc< TTag, TDim, TIdx >::type |
Maps a tag type to an accelerator type. More... | |
template<typename TAcc , typename TDev , typename TDim , typename TIdx , typename TKernelFnObj , typename... TArgs> | |
using | TaskKernelGpuCudaRt = TaskKernelGpuUniformCudaHipRt< ApiCudaRt, TAcc, TDev, TDim, TIdx, TKernelFnObj, TArgs... > |
Enumerations | |
enum class | GridBlockExtentSubDivRestrictions { EqualExtent , CloseToEqualExtent , Unrestricted } |
The grid block extent subdivision restrictions. More... | |
Functions | |
template<typename TElem , typename TIdx , typename TExtent , typename TQueue > | |
ALPAKA_FN_HOST auto | allocAsyncBuf (TQueue queue, TExtent const &extent=TExtent()) |
Allocates stream-ordered memory on the given device. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TQueue > | |
ALPAKA_FN_HOST auto | allocAsyncBufIfSupported (TQueue queue, TExtent const &extent=TExtent()) |
If supported, allocates stream-ordered memory on the given queue and the associated device. Otherwise, allocates regular memory on the device associated to the queue. Please note that stream-ordered and regular memory have different semantics: this function is provided for convenience in the cases where the difference is not relevant, and the stream-ordered memory is only used as a performance optimisation. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TDev > | |
ALPAKA_FN_HOST auto | allocBuf (TDev const &dev, TExtent const &extent=TExtent()) |
Allocates memory on the given device. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TPlatform > | |
ALPAKA_FN_HOST auto | allocMappedBuf (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent()) |
Allocates pinned/mapped host memory, accessible by all devices in the given platform. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TPlatform > | |
ALPAKA_FN_HOST auto | allocMappedBufIfSupported (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent()) |
If supported, allocates pinned/mapped host memory, accessible by all devices in the given platform. Otherwise, allocates regular host memory. Please note that pinned/mapped and regular memory may have different semantics: this function is provided for convenience in the cases where the difference is not relevant, and the pinned/mapped memory is only used as a performance optimisation. More... | |
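The two `...IfSupported` functions choose between a specialized and a regular allocation path depending on what the target supports. One common way to express such a choice is a capability trait combined with `if constexpr`, sketched below. `DevWithPool`, `DevPlain`, `HasAsyncAlloc` and `allocIfSupportedSketch` are hypothetical names; the sketch returns a label instead of a buffer and does not reflect how alpaka implements these functions.

```cpp
#include <string>
#include <type_traits>

// Hypothetical device types.
struct DevWithPool {};
struct DevPlain {};

// Hypothetical capability trait; defaults to "not supported".
template<typename TDev>
struct HasAsyncAlloc : std::false_type {};

template<>
struct HasAsyncAlloc<DevWithPool> : std::true_type {};

// Reports which path was taken; real code would return a buffer instead.
template<typename TDev>
std::string allocIfSupportedSketch(TDev const&)
{
    if constexpr(HasAsyncAlloc<TDev>::value)
        return "stream-ordered";
    else
        return "regular";
}
```

As the descriptions above warn, the two paths may have different semantics, so this kind of fallback is only appropriate when the caller does not rely on the difference.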
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicAdd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic add operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicAnd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic and operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicCas (TAtomic const &atomic, T *const addr, T const &compare, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic compare-and-swap operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicDec (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic decrement operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicExch (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic exchange operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicInc (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic increment operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicMax (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic max operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicMin (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic min operation. More... | |
template<typename TOp , typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicOp (TAtomic const &atomic, T *const addr, T const &compare, T const &value, THierarchy const &=THierarchy()) -> T |
Executes the given operation atomically. More... | |
template<typename TOp , typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicOp (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &=THierarchy()) -> T |
Executes the given operation atomically. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicOr (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic or operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicSub (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic sub operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicXor (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic xor operation. More... | |
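The atomic functions above all return a `T`. Assuming CUDA-like semantics, where that return value is the value stored before the update, the general shape of "apply an operation atomically" can be sketched with a compare-exchange loop on `std::atomic`. `OpAdd`, `OpMax` and `atomicOpSketch` are hypothetical, and the sketch takes a `std::atomic<T>&` rather than the `T*` used by the signatures above.

```cpp
#include <atomic>

// Hypothetical operation objects, loosely in the spirit of AtomicAdd / AtomicMax.
struct OpAdd
{
    template<typename T>
    T operator()(T old, T value) const
    {
        return old + value;
    }
};

struct OpMax
{
    template<typename T>
    T operator()(T old, T value) const
    {
        return old < value ? value : old;
    }
};

// Applies TOp atomically and returns the value stored before the update.
template<typename TOp, typename T>
T atomicOpSketch(std::atomic<T>& target, T value)
{
    T old = target.load();
    // On failure, compare_exchange_weak reloads `old`, so the next
    // iteration recomputes the new value from fresh data.
    while(!target.compare_exchange_weak(old, TOp{}(old, value)))
    {
    }
    return old;
}
```

For example, calling `atomicOpSketch<OpAdd>(counter, 3)` on a `std::atomic<int>` holding 5 leaves 8 in `counter` and returns 5.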
template<typename TVal , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | castVec (TVec const &vec) |
template<typename TVecL , typename TVecR > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | concatVec (TVecL const &vecL, TVecR const &vecR) |
template<typename TView , typename TExtent , typename TOffsets > | |
auto | createSubView (TView &view, TExtent const &extent, TOffsets const &offset=TExtent()) |
Creates a sub view to an existing view. More... | |
template<typename TAcc , typename TWorkDiv , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_FN_HOST auto | createTaskKernel (TWorkDiv const &workDiv, TKernelFnObj const &kernelFnObj, TArgs &&... args) |
Creates a kernel execution task. More... | |
template<typename TExtent , typename TViewSrc , typename TViewDstFwd > | |
ALPAKA_FN_HOST auto | createTaskMemcpy (TViewDstFwd &&viewDst, TViewSrc const &viewSrc, TExtent const &extent) |
Creates a memory copy task. More... | |
template<typename TExtent , typename TViewFwd > | |
ALPAKA_FN_HOST auto | createTaskMemset (TViewFwd &&view, std::uint8_t const &byte, TExtent const &extent) |
Creates a memory set task. More... | |
template<typename TDev , typename TContainer > | |
auto | createView (TDev const &dev, TContainer &con) |
Creates a view to a contiguous container of device-accessible memory. More... | |
template<typename TDev , typename TContainer , typename TExtent > | |
auto | createView (TDev const &dev, TContainer &con, TExtent const &extent) |
Creates a view to a contiguous container of device-accessible memory. More... | |
template<typename TDev , typename TElem , typename TExtent > | |
auto | createView (TDev const &dev, TElem *pMem, TExtent const &extent) |
Creates a view to a device pointer. More... | |
template<typename TDev , typename TElem , typename TExtent , typename TPitch > | |
auto | createView (TDev const &dev, TElem *pMem, TExtent const &extent, TPitch pitch) |
Creates a view to a device pointer. More... | |
template<typename T , std::size_t TuniqueId, typename TBlockSharedMemSt > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | declareSharedVar (TBlockSharedMemSt const &blockSharedMemSt) -> T & |
Declare a block shared variable. More... | |
template<typename TDim , typename TVal , typename... Vecs, typename = std::enable_if_t<(std::is_same_v<Vec<TDim, TVal>, Vecs> && ...)>> | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | elementwise_max (Vec< TDim, TVal > const &p, Vecs const &... qs) -> Vec< TDim, TVal > |
template<typename TDim , typename TVal , typename... Vecs, typename = std::enable_if_t<(std::is_same_v<Vec<TDim, TVal>, Vecs> && ...)>> | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | elementwise_min (Vec< TDim, TVal > const &p, Vecs const &... qs) -> Vec< TDim, TVal > |
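The `elementwise_max` and `elementwise_min` signatures accept one vector plus any number of further vectors, and the `enable_if_t` fold expression requires all of them to have the same type. The sketch below reproduces that constraint on top of `std::array`. `VecSketch` and `elementwiseMaxSketch` are hypothetical stand-ins for `Vec` and `elementwise_max`.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for Vec<TDim, TVal>.
template<std::size_t N, typename T>
using VecSketch = std::array<T, N>;

// Accepts one or more vectors of identical type; mixed types are rejected
// by the enable_if_t fold expression, as in the signature above.
template<
    std::size_t N,
    typename T,
    typename... Vecs,
    typename = std::enable_if_t<(std::is_same_v<VecSketch<N, T>, Vecs> && ...)>>
constexpr VecSketch<N, T> elementwiseMaxSketch(VecSketch<N, T> const& p, Vecs const&... qs)
{
    VecSketch<N, T> r{};
    for(std::size_t i = 0; i < N; ++i)
        r[i] = std::max({p[i], qs[i]...}); // maximum over component i of all inputs
    return r;
}
```

With inputs `{1, 5, 3}`, `{4, 2, 6}` and `{0, 9, 1}`, each output component is the largest of the three corresponding inputs, giving `{4, 9, 6}`.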
template<typename TQueue > | |
ALPAKA_FN_HOST auto | empty (TQueue const &queue) -> bool |
Tests if the queue is empty (all ops in the given queue have been completed). More... | |
template<typename TQueue , typename TTask > | |
ALPAKA_FN_HOST auto | enqueue (TQueue &queue, TTask &&task) -> void |
Queues the given task in the given queue. More... | |
template<typename TAcc , typename TQueue , typename TWorkDiv , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_FN_HOST auto | exec (TQueue &queue, TWorkDiv const &workDiv, TKernelFnObj const &kernelFnObj, TArgs &&... args) -> void |
Executes the given kernel in the given queue. More... | |
template<typename TCallable > | |
auto | executeForEachAccTag (TCallable &&callable) |
Executes a callable for each active accelerator tag. More... | |
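Calling a function once per tag type amounts to iterating over a tuple of types. A minimal sketch of that idea with `std::apply` and a fold expression follows. `TagOne`, `TagTwo`, `EnabledTagsSketch`, `forEachTagSketch` and `collectTagNames` are hypothetical names and do not describe the actual implementation of `executeForEachAccTag`.

```cpp
#include <string>
#include <tuple>
#include <vector>

// Hypothetical tags, each able to report a name.
struct TagOne
{
    static std::string name() { return "one"; }
};

struct TagTwo
{
    static std::string name() { return "two"; }
};

using EnabledTagsSketch = std::tuple<TagOne, TagTwo>;

// Calls callable(tag) once per tag in the tuple, in declaration order.
template<typename TCallable>
void forEachTagSketch(TCallable&& callable)
{
    std::apply([&](auto const&... tags) { (callable(tags), ...); }, EnabledTagsSketch{});
}

// Example use: collect the names of all tags in the list.
inline std::vector<std::string> collectTagNames()
{
    std::vector<std::string> names;
    forEachTagSketch([&](auto const& tag) { names.push_back(tag.name()); });
    return names;
}
```

The callable must be generic (for example a lambda with an `auto` parameter), because it is invoked with a different tag type on each call.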
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | ffs (TIntrinsic const &intrinsic, std::int32_t value) -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 32-bit value. Returns 0 for input value 0. More... | |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | ffs (TIntrinsic const &intrinsic, std::int64_t value) -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 64-bit value. Returns 0 for input value 0. More... | |
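The contract stated for `ffs` (1-based position of the lowest set bit, 0 for an input of 0) can be checked against a simple portable reference. `ffsSketch` below is a hypothetical host-only helper written for illustration; back ends would normally map this to a hardware or compiler intrinsic instead of a loop.

```cpp
#include <cstdint>

// Portable reference for the documented contract:
// 1-based position of the least significant set bit, or 0 if no bit is set.
inline std::int32_t ffsSketch(std::int64_t value)
{
    // Work on the unsigned bit pattern so shifting is well defined for negative inputs.
    auto bits = static_cast<std::uint64_t>(value);
    if(bits == 0)
        return 0;

    std::int32_t pos = 1;
    while((bits & 1u) == 0)
    {
        bits >>= 1;
        ++pos;
    }
    return pos;
}
```

For instance, 12 is `1100` in binary, so its lowest set bit is at position 3.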
template<typename TAlloc , typename T > | |
ALPAKA_FN_HOST auto | free (TAlloc const &alloc, T const *const ptr) -> void |
Frees the memory identified by the given pointer. More... | |
template<typename TBlockSharedMemSt > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | freeSharedVars (TBlockSharedMemSt &blockSharedMemSt) -> void |
Frees all memory used by block shared variables. More... | |
template<typename TAcc , typename TDev > | |
ALPAKA_FN_HOST auto | getAccDevProps (TDev const &dev) -> AccDevProps< Dim< TAcc >, Idx< TAcc >> |
template<typename TAcc > | |
ALPAKA_FN_HOST auto | getAccName () -> std::string |
template<typename TAcc , typename TKernelFnObj , typename TDim , typename... TArgs> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getBlockSharedMemDynSizeBytes (TKernelFnObj const &kernelFnObj, Vec< TDim, Idx< TAcc >> const &blockThreadExtent, Vec< TDim, Idx< TAcc >> const &threadElemExtent, TArgs const &... args) -> std::size_t |
template<typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getDepth (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename T > | |
ALPAKA_FN_HOST auto | getDev (T const &t) |
template<typename TPlatform > | |
ALPAKA_FN_HOST auto | getDevByIdx (TPlatform const &platform, std::size_t const &devIdx) -> Dev< TPlatform > |
template<typename TPlatform > | |
ALPAKA_FN_HOST auto | getDevCount (TPlatform const &platform) |
template<typename TPlatform > | |
ALPAKA_FN_HOST auto | getDevs (TPlatform const &platform) -> std::vector< Dev< TPlatform >> |
template<typename T , typename TBlockSharedMemDyn > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | getDynSharedMem (TBlockSharedMemDyn const &blockSharedMemDyn) -> T * |
Get block shared dynamic memory. More... | |
template<std::size_t Tidx, typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getExtent (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getExtentProduct (T const &object) -> Idx< T > |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getExtents (T const &object) -> Vec< Dim< T >, Idx< T >> |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getExtentVec (T const &object={}) -> Vec< Dim< T >, Idx< T >> |
template<typename TDim , typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getExtentVecEnd (T const &object={}) -> Vec< TDim, Idx< T >> |
template<typename TDev > | |
ALPAKA_FN_HOST auto | getFreeMemBytes (TDev const &dev) -> std::size_t |
template<typename TAcc , typename TDev , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST auto | getFunctionAttributes (TDev const &dev, TKernelFnObj const &kernelFnObj, TArgs &&... args) -> alpaka::KernelFunctionAttributes |
template<typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getHeight (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename TOrigin , typename TUnit , typename TIdx , typename TWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdx (TIdx const &idx, TWorkDiv const &workDiv) -> Vec< Dim< TWorkDiv >, Idx< TIdx >> |
Get the indices requested. More... | |
template<typename TOrigin , typename TUnit , typename TIdxWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdx (TIdxWorkDiv const &idxWorkDiv) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the indices requested. More... | |
template<typename TIdxWorkDiv , typename TGridThreadIdx , typename TThreadElemExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdxThreadFirstElem ([[maybe_unused]] TIdxWorkDiv const &idxWorkDiv, TGridThreadIdx const &gridThreadIdx, TThreadElemExtent const &threadElemExtent) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the index of the first element this thread computes. More... | |
template<typename TIdxWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdxThreadFirstElem (TIdxWorkDiv const &idxWorkDiv) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the index of the first element this thread computes. More... | |
template<typename TIdxWorkDiv , typename TGridThreadIdx > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdxThreadFirstElem (TIdxWorkDiv const &idxWorkDiv, TGridThreadIdx const &gridThreadIdx) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the index of the first element this thread computes. More... | |
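The index functions above relate the levels of the parallelism hierarchy to each other. Assuming the usual CUDA-style mapping, a thread's grid-wide index is its block index times the block thread extent plus its index inside the block, and the first element it computes is that grid-wide index times the thread element extent. The sketch below spells out this arithmetic with `std::array`; `IdxVec`, `gridThreadIdxSketch` and `firstElemIdxSketch` are hypothetical helpers, not alpaka functions.

```cpp
#include <array>
#include <cstddef>

// Hypothetical N-dimensional index vector.
template<std::size_t N>
using IdxVec = std::array<std::size_t, N>;

// Grid-wide thread index: blockIdx * blockThreadExtent + threadIdx, per dimension.
template<std::size_t N>
constexpr IdxVec<N> gridThreadIdxSketch(
    IdxVec<N> const& gridBlockIdx,
    IdxVec<N> const& blockThreadExtent,
    IdxVec<N> const& blockThreadIdx)
{
    IdxVec<N> r{};
    for(std::size_t d = 0; d < N; ++d)
        r[d] = gridBlockIdx[d] * blockThreadExtent[d] + blockThreadIdx[d];
    return r;
}

// First element handled by a thread: gridThreadIdx * threadElemExtent, per dimension.
template<std::size_t N>
constexpr IdxVec<N> firstElemIdxSketch(IdxVec<N> const& gridThreadIdx, IdxVec<N> const& threadElemExtent)
{
    IdxVec<N> r{};
    for(std::size_t d = 0; d < N; ++d)
        r[d] = gridThreadIdx[d] * threadElemExtent[d];
    return r;
}
```

With block index `{1, 2}`, block thread extent `{4, 8}` and thread index `{3, 5}`, the grid-wide thread index is `{7, 21}`; with a thread element extent of `{1, 4}`, the first element index is `{7, 84}`.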
template<typename TDev > | |
ALPAKA_FN_HOST auto | getMemBytes (TDev const &dev) -> std::size_t |
template<typename TDev > | |
ALPAKA_FN_HOST auto | getName (TDev const &dev) -> std::string |
template<typename TImpl > | |
ALPAKA_FN_HOST auto | getNativeHandle (TImpl const &impl) |
Gets the native handle of the alpaka object. Returns the handle if the object has one; otherwise a compile-time error is generated. More... | |
template<std::size_t Tidx, typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffset (TOffsets const &offsets) -> Idx< TOffsets > |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsets (T const &object) -> Vec< Dim< T >, Idx< T >> |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getOffsetVec (T const &object={}) -> Vec< Dim< T >, Idx< T >> |
template<typename TDim , typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getOffsetVecEnd (T const &object={}) -> Vec< TDim, Idx< T >> |
template<typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsetX (TOffsets const &offsets=TOffsets()) -> Idx< TOffsets > |
template<typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsetY (TOffsets const &offsets=TOffsets()) -> Idx< TOffsets > |
template<typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsetZ (TOffsets const &offsets=TOffsets()) -> Idx< TOffsets > |
template<typename TAcc , typename TKernelFnObj , typename TDim , typename... TArgs> | |
ALPAKA_FN_HOST auto | getOmpSchedule (TKernelFnObj const &kernelFnObj, Vec< TDim, Idx< TAcc >> const &blockThreadExtent, Vec< TDim, Idx< TAcc >> const &threadElemExtent, TArgs const &... args) |
template<std::size_t Tidx, typename TView > | |
ALPAKA_FN_HOST auto | getPitchBytes (TView const &view) -> Idx< TView > |
template<typename TView > | |
auto | getPitchBytesVec (TView const &view) -> Vec< Dim< TView >, Idx< TView >> |
template<typename TDim , typename TView > | |
ALPAKA_FN_HOST auto | getPitchBytesVecEnd (TView const &view=TView()) -> Vec< TDim, Idx< TView >> |
template<typename TView > | |
ALPAKA_FN_HOST auto | getPitchesInBytes (TView const &view) -> Vec< Dim< TView >, Idx< TView >> |
template<typename TDev > | |
constexpr ALPAKA_FN_HOST auto | getPreferredWarpSize (TDev const &dev) -> std::size_t |
template<typename TView , typename TDev > | |
ALPAKA_FN_HOST auto | getPtrDev (TView &view, TDev const &dev) -> Elem< TView > * |
Gets the pointer to the view on the given device. More... | |
template<typename TView , typename TDev > | |
ALPAKA_FN_HOST auto | getPtrDev (TView const &view, TDev const &dev) -> Elem< TView > const * |
Gets the pointer to the view on the given device. More... | |
template<typename TView > | |
ALPAKA_FN_HOST auto | getPtrNative (TView &view) -> Elem< TView > * |
Gets the native pointer of the memory view. More... | |
template<typename TView > | |
ALPAKA_FN_HOST auto | getPtrNative (TView const &view) -> Elem< TView > const * |
Gets the native pointer of the memory view. More... | |
template<typename TAcc , typename TDev , typename TGridElemExtent , typename TThreadElemExtent , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_FN_HOST auto | getValidWorkDiv (KernelCfg< TAcc, TGridElemExtent, TThreadElemExtent > const &kernelCfg, [[maybe_unused]] TDev const &dev, TKernelFnObj const &kernelFnObj, TArgs &&... args) -> WorkDivMembers< Dim< TAcc >, Idx< TAcc >> |
template<typename TDev > | |
ALPAKA_FN_HOST auto | getWarpSizes (TDev const &dev) -> std::vector< std::size_t > |
template<typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getWidth (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename TOrigin , typename TUnit , typename TWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getWorkDiv (TWorkDiv const &workDiv) -> Vec< Dim< TWorkDiv >, Idx< TWorkDiv >> |
Get the extent requested. More... | |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value == 1>> | |
ALPAKA_FN_ACC auto | independentGroupElements (TAcc const &acc, TArgs... args) |
template<std::size_t Dim, typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value >= Dim> | |
ALPAKA_FN_ACC auto | independentGroupElementsAlong (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | independentGroupElementsAlongX (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 1)>> | |
ALPAKA_FN_ACC auto | independentGroupElementsAlongY (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 2)>> | |
ALPAKA_FN_ACC auto | independentGroupElementsAlongZ (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value == 1>> | |
ALPAKA_FN_ACC auto | independentGroups (TAcc const &acc, TArgs... args) |
template<std::size_t Dim, typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value >= Dim> | |
ALPAKA_FN_ACC auto | independentGroupsAlong (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | independentGroupsAlongX (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 1)>> | |
ALPAKA_FN_ACC auto | independentGroupsAlongY (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 2)>> | |
ALPAKA_FN_ACC auto | independentGroupsAlongZ (TAcc const &acc, TArgs... args) |
template<typename TEvent > | |
ALPAKA_FN_HOST auto | isComplete (TEvent const &event) -> bool |
Tests if the given event has already been completed. More... | |
template<typename T > | |
void | isSupportedByAtomicAtomicRef () |
template<typename TDim , typename TIdx > | |
ALPAKA_FN_HOST auto | isValidAccDevProps (AccDevProps< TDim, TIdx > const &accDevProps) -> bool |
template<typename TWorkDiv , typename TDim , typename TIdx > | |
ALPAKA_FN_HOST auto | isValidWorkDiv (TWorkDiv const &workDiv, AccDevProps< TDim, TIdx > const &accDevProps) -> bool |
Checks if the work division is supported. More... | |
template<typename TAcc , typename TWorkDiv , typename TDim , typename TIdx > | |
ALPAKA_FN_HOST auto | isValidWorkDiv (TWorkDiv const &workDiv, AccDevProps< TDim, TIdx > const &accDevProps, KernelFunctionAttributes const &kernelFunctionAttributes) -> bool |
Checks if the work division is supported. More... | |
template<typename TAcc , typename TWorkDiv , typename TDev > | |
ALPAKA_FN_HOST auto | isValidWorkDiv (TWorkDiv const &workDiv, TDev const &dev) -> bool |
Checks if the work division is supported by the device. More... | |
template<typename TAcc , typename TWorkDiv , typename TDev , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_FN_HOST auto | isValidWorkDiv (TWorkDiv const &workDiv, TDev const &dev, TKernelFnObj const &kernelFnObj, TArgs &&... args) -> bool |
Checks if the work division is supported for the kernel on the device. More... | |
template<typename T , typename TAlloc > | |
ALPAKA_FN_HOST auto | malloc (TAlloc const &alloc, std::size_t const &sizeElems) -> T * |
template<std::size_t TDimOut, std::size_t TDimIn, std::size_t TDimExtents, typename TElem > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | mapIdx (Vec< DimInt< TDimIn >, TElem > const &in, Vec< DimInt< TDimExtents >, TElem > const &extent) -> Vec< DimInt< TDimOut >, TElem > |
Maps an N-dimensional index to an N-dimensional position. At least one dimension must always be 1 or zero. More... | |
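As an illustration of what mapping an index between dimensionalities can look like, the following sketch converts between a 2-D index and a 1-D (linear) index. It is not alpaka's implementation: `Idx2`, `linearize` and `delinearize` are hypothetical names on plain `std::array`, and the row-major convention (last component varies fastest) is an assumption made for this example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in, not an alpaka type: a 2-D index or extent stored as {y, x}.
using Idx2 = std::array<std::size_t, 2>;

// 2-D -> 1-D: row-major linearization, assuming the last component varies fastest.
constexpr std::size_t linearize(Idx2 idx, Idx2 extent)
{
    return idx[0] * extent[1] + idx[1];
}

// 1-D -> 2-D: the inverse of linearize for the same extent.
constexpr Idx2 delinearize(std::size_t linear, Idx2 extent)
{
    return Idx2{linear / extent[1], linear % extent[1]};
}

// Element (y = 2, x = 3) of a 4 x 5 extent sits at linear position 2 * 5 + 3.
static_assert(linearize(Idx2{2, 3}, Idx2{4, 5}) == 13);

constexpr Idx2 roundTrip = delinearize(13, Idx2{4, 5});
static_assert(roundTrip[0] == 2 && roundTrip[1] == 3);
```

The two functions are inverses of each other for any index that lies inside the extent.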
template<std::size_t TDimOut, std::size_t TDimIn, std::size_t TidxDimPitch, typename TElem > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | mapIdxPitchBytes (Vec< DimInt< TDimIn >, TElem > const &in, Vec< DimInt< TidxDimPitch >, TElem > const &pitches) -> Vec< DimInt< TDimOut >, TElem > |
Maps an N-dimensional index to an N-dimensional position based on the pitches of a view without padding or a byte view. At least one dimension must always be 1 or zero. More... | |
template<typename TMemFence , typename TMemScope > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | mem_fence (TMemFence const &fence, TMemScope const &scope) -> void |
Issues memory fence instructions. More... | |
template<concepts::Tag TTag, typename TViewSrc , typename TTypeDst , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc) -> void |
template<concepts::Tag TTag, typename TExtent , typename TViewSrc , typename TTypeDst , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc, TExtent const &extent) -> void |
template<concepts::Tag TTag, typename TTypeSrc , typename TViewDstFwd , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc) -> void |
template<concepts::Tag TTag, typename TExtent , typename TTypeSrc , typename TViewDstFwd , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc, TExtent const &extent) -> void |
template<typename TViewSrc , typename TViewDstFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, TViewSrc const &viewSrc) -> void |
Copies the entire memory of viewSrc to viewDst. Possibly copies between different memory spaces. More... | |
template<typename TExtent , typename TViewSrc , typename TViewDstFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, TViewSrc const &viewSrc, TExtent const &extent) -> void |
Copies memory from a part of viewSrc to viewDst, described by extent. Possibly copies between different memory spaces. More... | |
template<concepts::Tag TTag, typename TApi , bool TBlocking, typename TTypeDst , typename TViewSrc , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc) |
template<concepts::Tag TTag, typename TApi , bool TBlocking, typename TTypeDst , typename TViewSrc , typename TExtent , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc, TExtent extent) |
template<concepts::Tag TTag, typename TApi , bool TBlocking, typename TViewDst , typename TTypeSrc , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, TViewDst &viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc) |
template<concepts::Tag TTag, typename TApi , bool TBlocking, typename TViewDst , typename TTypeSrc , typename TExtent , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, TViewDst &viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc, TExtent extent) |
template<typename TViewFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memset (TQueue &queue, TViewFwd &&view, std::uint8_t const &byte) -> void |
Sets each byte of the memory of the entire view to the given value. More... | |
template<typename TExtent , typename TViewFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memset (TQueue &queue, TViewFwd &&view, std::uint8_t const &byte, TExtent const &extent) -> void |
Sets the bytes of the memory of view, described by extent, to the given value. More... | |
template<typename TAcc , typename = std::enable_if_t<isAccelerator<TAcc>>> | |
constexpr ALPAKA_FN_ACC bool | oncePerBlock (TAcc const &acc) |
template<typename TAcc , typename = std::enable_if_t<isAccelerator<TAcc>>> | |
constexpr ALPAKA_FN_ACC bool | oncePerGrid (TAcc const &acc) |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | popcount (TIntrinsic const &intrinsic, std::uint32_t value) -> std::int32_t |
Returns the number of 1 bits in the given 32-bit value. More... | |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | popcount (TIntrinsic const &intrinsic, std::uint64_t value) -> std::int32_t |
Returns the number of 1 bits in the given 64-bit value. More... | |
template<typename TView > | |
ALPAKA_FN_HOST auto | print (TView const &view, std::ostream &os, std::string const &elementSeparator=", ", std::string const &rowSeparator="\n", std::string const &rowPrefix="[", std::string const &rowSuffix="]") -> void |
Prints the content of the view to the given output stream. More... | |
template<typename TTuple > | |
void | printTagNames () |
Function to print the names of each tag in the given tuple of tags. More... | |
template<typename TDev > | |
ALPAKA_FN_HOST auto | reset (TDev const &dev) -> void |
Resets the device. What this method does is dependent on the accelerator. More... | |
template<typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | reverseVec (TVec const &vec) |
template<typename TDim , typename TIdx > | |
ALPAKA_FN_HOST auto | subDivideGridElems (Vec< TDim, TIdx > const &gridElemExtent, Vec< TDim, TIdx > const &threadElemExtent, AccDevProps< TDim, TIdx > const &accDevProps, TIdx kernelBlockThreadCountMax=static_cast< TIdx >(0u), bool blockThreadMustDivideGridThreadExtent=true, GridBlockExtentSubDivRestrictions gridBlockExtentSubDivRestrictions=GridBlockExtentSubDivRestrictions::Unrestricted) -> WorkDivMembers< TDim, TIdx > |
Subdivides the given grid thread extent into blocks restricted by the maxima allowed. More... | |
template<typename TSubDim , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | subVecBegin (TVec const &vec) |
template<typename TSubDim , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | subVecEnd (TVec const &vec) |
template<typename TIndexSequence , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | subVecFromIndices (TVec const &vec) |
Builds a new vector by selecting the elements of the source vector in the given order. Repeating and swizzling elements is allowed. More... | |
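The selection by index sequence described above can be sketched on a plain `std::array`. This is only an illustration of the technique, not alpaka's code: `selectByIndices` is a hypothetical helper, and passing the indices as a `std::index_sequence` function argument is a choice made for this example.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical helper on std::array, not alpaka's implementation.
// Builds a new array whose i-th element is src[Is_i]; indices may repeat.
template<typename T, std::size_t N, std::size_t... Is>
constexpr auto selectByIndices(std::array<T, N> const& src, std::index_sequence<Is...>)
    -> std::array<T, sizeof...(Is)>
{
    static_assert(((Is < N) && ...), "every selected index must be in range");
    return std::array<T, sizeof...(Is)>{{src[Is]...}};
}

constexpr std::array<int, 3> source{10, 20, 30};

// Swizzle and repeat: take element 2, then element 0 twice.
constexpr auto swizzled = selectByIndices(source, std::index_sequence<2, 0, 0>{});
static_assert(swizzled[0] == 30 && swizzled[1] == 10 && swizzled[2] == 10);
```

Because the indices are template parameters, out-of-range selections are rejected at compile time.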
template<typename TBlockSync > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | syncBlockThreads (TBlockSync const &blockSync) -> void |
Synchronizes all threads within the current block (independently for all blocks). More... | |
template<typename TOp , typename TBlockSync > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | syncBlockThreadsPredicate (TBlockSync const &blockSync, int predicate) -> int |
Synchronizes all threads within the current block (independently for all blocks), evaluates the predicate for all threads and returns the combination of all the results computed via TOp. More... | |
template<typename TDim , typename TVal > | |
constexpr ALPAKA_FN_HOST_ACC auto | toArray (Vec< TDim, TVal > const &v) -> std::array< TVal, TDim::value > |
Converts a Vec to a std::array. More... | |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value == 1>> | |
ALPAKA_FN_ACC auto | uniformElements (TAcc const &acc, TArgs... args) |
template<std::size_t Dim, typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value >= Dim> | |
ALPAKA_FN_ACC auto | uniformElementsAlong (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | uniformElementsAlongX (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 1)>> | |
ALPAKA_FN_ACC auto | uniformElementsAlongY (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 2)>> | |
ALPAKA_FN_ACC auto | uniformElementsAlongZ (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | uniformElementsND (TAcc const &acc) |
template<typename TAcc , typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | uniformElementsND (TAcc const &acc, alpaka::Vec< alpaka::Dim< TAcc >, alpaka::Idx< TAcc >> extent) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value == 1>> | |
ALPAKA_FN_ACC auto | uniformGroupElements (TAcc const &acc, TArgs... args) |
template<std::size_t Dim, typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value >= Dim> | |
ALPAKA_FN_ACC auto | uniformGroupElementsAlong (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | uniformGroupElementsAlongX (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 1)>> | |
ALPAKA_FN_ACC auto | uniformGroupElementsAlongY (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 2)>> | |
ALPAKA_FN_ACC auto | uniformGroupElementsAlongZ (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value == 1>> | |
ALPAKA_FN_ACC auto | uniformGroups (TAcc const &acc, TArgs... args) |
template<std::size_t Dim, typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and alpaka::Dim<TAcc>::value >= Dim> | |
ALPAKA_FN_ACC auto | uniformGroupsAlong (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 0)>> | |
ALPAKA_FN_ACC auto | uniformGroupsAlongX (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 1)>> | |
ALPAKA_FN_ACC auto | uniformGroupsAlongY (TAcc const &acc, TArgs... args) |
template<typename TAcc , typename... TArgs, typename = std::enable_if_t<alpaka::isAccelerator<TAcc> and (alpaka::Dim<TAcc>::value > 2)>> | |
ALPAKA_FN_ACC auto | uniformGroupsAlongZ (TAcc const &acc, TArgs... args) |
template<typename TFirstIndex , typename... TRestIndices> | |
ALPAKA_FN_HOST_ACC | Vec (TFirstIndex &&, TRestIndices &&...) -> Vec< DimInt< 1+sizeof...(TRestIndices)>, std::decay_t< TFirstIndex >> |
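To show what a deduction guide of this shape achieves, here is a stripped-down sketch. `MiniVec` is a hypothetical stand-in and not `alpaka::Vec`; only the guide mirrors the signature listed above, deducing the dimension from the number of arguments and the value type from the first argument.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

template<std::size_t N>
using DimInt = std::integral_constant<std::size_t, N>;

// Hypothetical, minimal vector type used only for this illustration.
template<typename TDim, typename TVal>
struct MiniVec
{
    template<typename... TArgs>
    constexpr MiniVec(TArgs&&... args) : values{static_cast<TVal>(std::forward<TArgs>(args))...}
    {
        static_assert(sizeof...(TArgs) == TDim::value, "one argument per dimension");
    }

    TVal values[TDim::value];
};

// Deduction guide with the same shape as the one listed above.
template<typename TFirst, typename... TRest>
MiniVec(TFirst&&, TRest&&...) -> MiniVec<DimInt<1 + sizeof...(TRest)>, std::decay_t<TFirst>>;

// No template arguments are spelled out; both are deduced from the call.
constexpr MiniVec extent{4u, 5u, 6u};
static_assert(std::is_same_v<decltype(extent), MiniVec<DimInt<3>, unsigned> const>);
static_assert(extent.values[1] == 5u);
```

Note that the value type comes from the first argument only, so mixed argument types are converted to it.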
template<typename TView > | |
ViewConst (TView) -> ViewConst< std::decay_t< TView >> | |
template<typename TAwaited > | |
ALPAKA_FN_HOST auto | wait (TAwaited const &awaited) -> void |
Blocks the calling thread until the given awaited action completes. More... | |
template<typename TWaiter , typename TAwaited > | |
ALPAKA_FN_HOST auto | wait (TWaiter &waiter, TAwaited const &awaited) -> void |
The waiter waits for the given awaited action to complete. More... | |
template<typename TDim , typename TIdx > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC | WorkDivMembers (alpaka::Vec< TDim, TIdx > const &gridBlockExtent, alpaka::Vec< TDim, TIdx > const &blockThreadExtent, alpaka::Vec< TDim, TIdx > const &elemExtent) -> WorkDivMembers< TDim, TIdx > |
Deduction guide for the constructor which can be called without explicit template type parameters. More... | |
Variables | |
template<typename TAcc , concepts::Tag... TTag> | |
constexpr bool | accMatchesTags = (std::is_same_v<alpaka::AccToTag<TAcc>, TTag> || ...) |
constexpr std::uint32_t | BlockSharedDynMemberAllocKiB = 47u |
template<typename TDev , typename TDim > | |
constexpr bool | hasAsyncBufSupport = trait::HasAsyncBufSupport<TDim, TDev>::value |
Checks if the given device can allocate a stream-ordered memory buffer of the given dimensionality. More... | |
template<typename TPlatform > | |
constexpr bool | hasMappedBufSupport = trait::HasMappedBufSupport<TPlatform>::value |
Checks if the host can allocate a pinned/mapped host memory, accessible by all devices in the given platform. More... | |
template<typename T , typename U > | |
constexpr auto | is_decayed_v = std::is_same_v<std::decay_t<T>, std::decay_t<U>> |
Provides a decaying wrapper around std::is_same. Example: is_decayed_v<volatile float, float> returns true. More... | |
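The behaviour can be checked with a few compile-time assertions. The snippet restates the definition shown above locally (with `inline` added) so that it compiles without alpaka; the extra test cases beyond `volatile float` follow from the standard rules of `std::decay`.

```cpp
#include <type_traits>

// Local restatement of the variable template, so the snippet is self-contained.
template<typename T, typename U>
inline constexpr auto is_decayed_v = std::is_same_v<std::decay_t<T>, std::decay_t<U>>;

static_assert(is_decayed_v<volatile float, float>); // cv-qualifiers are dropped
static_assert(is_decayed_v<int const&, int>);       // references are dropped
static_assert(is_decayed_v<int[3], int*>);          // arrays decay to pointers
static_assert(!is_decayed_v<int, long>);            // distinct types stay distinct
```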
template<typename TAcc > | |
constexpr bool | isAccelerator = interface::ImplementsInterface<ConceptAcc, TAcc>::value |
True if TAcc is an accelerator, i.e. if it implements the ConceptAcc concept. More... | |
template<typename TDev > | |
constexpr bool | isDevice = interface::ImplementsInterface<ConceptDev, std::decay_t<TDev>>::value |
True if TDev is a device, i.e. if it implements the ConceptDev concept. More... | |
template<typename TAcc > | |
constexpr bool | isMultiThreadAcc = trait::IsMultiThreadAcc<TAcc>::value |
True if TAcc is an accelerator that supports multiple threads per block, false otherwise. More... | |
template<typename TPlatform > | |
constexpr bool | isPlatform = interface::ImplementsInterface<ConceptPlatform, TPlatform>::value |
True if TPlatform is a platform, i.e. if it implements the ConceptPlatform concept. More... | |
template<typename TQueue > | |
constexpr bool | isQueue = interface::ImplementsInterface<ConceptQueue, std::decay_t<TQueue>>::value |
True if TQueue is a queue, i.e. if it implements the ConceptQueue concept. More... | |
template<typename TAcc > | |
constexpr bool | isSingleThreadAcc = trait::IsSingleThreadAcc<TAcc>::value |
True if TAcc is an accelerator that supports only a single thread per block, false otherwise. More... | |
template<typename T > | |
constexpr bool | isVec = false |
template<typename TDim , typename TVal > | |
constexpr bool | isVec< Vec< TDim, TVal > > = true |
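The pair of declarations above follows the usual variable-template pattern: a primary template that is `false`, plus a partial specialization that is `true` for the class template of interest. A minimal sketch of the same pattern, using the hypothetical names `ToyVec` and `isToyVec` instead of the alpaka ones:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a vector class template (not alpaka::Vec).
template<typename TDim, typename TVal>
struct ToyVec
{
};

// Primary template: an arbitrary type is not a vector.
template<typename T>
inline constexpr bool isToyVec = false;

// Partial specialization: any instantiation of ToyVec is one.
template<typename TDim, typename TVal>
inline constexpr bool isToyVec<ToyVec<TDim, TVal>> = true;

using Dim2 = std::integral_constant<std::size_t, 2>;

static_assert(isToyVec<ToyVec<Dim2, int>>);
static_assert(!isToyVec<int>);
// In this sketch a cv-qualified type does not match the specialization.
static_assert(!isToyVec<ToyVec<Dim2, int> const>);
```

With this pattern, callers holding a possibly qualified type would typically apply `std::remove_cv_t` or `std::decay_t` before the check.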
template<typename T > | |
constexpr bool | isKernelArgumentTriviallyCopyable = IsKernelArgumentTriviallyCopyable<T>::value |
template<typename T > | |
constexpr bool | isKernelTriviallyCopyable = IsKernelTriviallyCopyable<T>::value |
The alpaka accelerator library.
The alpaka library.
using alpaka::Acc = typedef typename trait::AccType<T>::type |
The accelerator type trait alias template to remove the ::type.
Definition at line 78 of file Traits.hpp.
using alpaka::AccGpuCudaRt = typedef AccGpuUniformCudaHipRt<ApiCudaRt, TDim, TIdx> |
Definition at line 16 of file AccGpuCudaRt.hpp.
using alpaka::AccToTag = typedef typename trait::AccToTag<TAcc>::type |
using alpaka::AtomicCpu = typedef AtomicAtomicRef |
Definition at line 25 of file AtomicCpu.hpp.
using alpaka::AtomicHierarchy = typedef alpaka::meta::InheritFromList<alpaka::meta::Unique<std::tuple< TGridAtomic, TBlockAtomic, TThreadAtomic, interface::Implements<ConceptAtomicGrids, TGridAtomic>, interface::Implements<ConceptAtomicBlocks, TBlockAtomic>, interface::Implements<ConceptAtomicThreads, TThreadAtomic> >> > |
Builds a single class that inherits from the different atomic implementations.
Definition at line 27 of file AtomicHierarchy.hpp.
using alpaka::AtomicHierarchyConcept = typedef typename detail::AtomicHierarchyConceptType<THierarchy>::type |
Definition at line 53 of file Traits.hpp.
using alpaka::Buf = typedef typename trait::BufType<alpaka::Dev<TDev>, TElem, TDim, TIdx>::type |
The memory buffer type trait alias template to remove the ::type.
Definition at line 52 of file Traits.hpp.
using alpaka::BufCudaRt = typedef BufUniformCudaHipRt<ApiCudaRt, TElem, TDim, TIdx> |
Definition at line 15 of file BufCudaRt.hpp.
using alpaka::Dev = typedef typename trait::DevType<T>::type |
The device type trait alias template to remove the ::type.
Definition at line 56 of file Traits.hpp.
using alpaka::DevCudaRt = typedef DevUniformCudaHipRt<ApiCudaRt> |
The CUDA RT device handle.
Definition at line 15 of file DevCudaRt.hpp.
using alpaka::DevGlobal = typedef typename detail::DevGlobalTrait<typename alpaka::trait::AccToTag<TAcc>::type, T>::Type |
Definition at line 44 of file Traits.hpp.
using alpaka::Dim = typedef typename trait::DimType<T>::type |
The dimension type trait alias template to remove the ::type.
Definition at line 19 of file Traits.hpp.
using alpaka::DimInt = typedef std::integral_constant<std::size_t, N> |
Definition at line 15 of file DimIntegralConst.hpp.
using alpaka::Elem = typedef std::remove_volatile_t<typename trait::ElemType<TView>::type> |
The element type trait alias template to remove the ::type.
Definition at line 21 of file Traits.hpp.
using alpaka::EnabledAccTags = typedef alpaka::meta::Filter<AccTags, alpaka::AccIsEnabled> |
List of all tags whose related accelerator is enabled.
Definition at line 35 of file TagAccIsEnabled.hpp.
using alpaka::Event = typedef typename trait::EventType<T>::type |
The event type trait alias template to remove the ::type.
Definition at line 26 of file Traits.hpp.
using alpaka::EventCpu = typedef EventGenericThreads<DevCpu> |
Definition at line 12 of file EventCpu.hpp.
using alpaka::EventCudaRt = typedef EventUniformCudaHipRt<ApiCudaRt> |
The CUDA RT device event.
Definition at line 15 of file EventCudaRt.hpp.
using alpaka::ExampleDefaultAcc = typedef alpaka::AccGpuCudaRt<TDim, TIdx> |
Alias for the default accelerator used by examples. From a list of all accelerators the first one which is enabled is chosen. AccCpuSerial is selected last.
Definition at line 16 of file ExampleDefaultAcc.hpp.
using alpaka::Idx = typedef typename trait::IdxType<T>::type |
Definition at line 29 of file Traits.hpp.
using alpaka::NativeHandle = typedef decltype(getNativeHandle(std::declval<TImpl>())) |
Alias to the type of the native handle.
Definition at line 36 of file Traits.hpp.
using alpaka::Platform = typedef typename trait::PlatformType<T>::type |
The platform type trait alias template to remove the ::type.
Definition at line 51 of file Traits.hpp.
using alpaka::PlatformCudaRt = typedef PlatformUniformCudaHipRt<ApiCudaRt> |
The CUDA RT platform.
Definition at line 15 of file PlatformCudaRt.hpp.
using alpaka::Queue = typedef typename trait::QueueType<TEnv, TProperty>::type |
Queue based on the environment and a property.
TEnv | Environment type, e.g. accelerator, device or a platform. trait::QueueType must be specialized for TEnv. |
TProperty | Property to define the behavior of TEnv. |
Definition at line 70 of file Traits.hpp.
Definition at line 191 of file DevCpu.hpp.
Definition at line 190 of file DevCpu.hpp.
using alpaka::QueueCudaRtBlocking = typedef QueueUniformCudaHipRtBlocking<ApiCudaRt> |
The CUDA RT blocking queue.
Definition at line 15 of file QueueCudaRtBlocking.hpp.
using alpaka::QueueCudaRtNonBlocking = typedef QueueUniformCudaHipRtNonBlocking<ApiCudaRt> |
The CUDA RT non-blocking queue.
Definition at line 15 of file QueueCudaRtNonBlocking.hpp.
using alpaka::QueueUniformCudaHipRtBlocking = typedef uniform_cuda_hip::detail::QueueUniformCudaHipRt<TApi, true> |
The CUDA/HIP RT blocking queue.
Definition at line 43 of file DevUniformCudaHipRt.hpp.
using alpaka::QueueUniformCudaHipRtNonBlocking = typedef uniform_cuda_hip::detail::QueueUniformCudaHipRt<TApi, false> |
The CUDA/HIP RT non-blocking queue.
Definition at line 46 of file DevUniformCudaHipRt.hpp.
using alpaka::remove_restrict_t = typedef typename remove_restrict<T>::type |
Helper to remove restrict from a type.
Definition at line 34 of file RemoveRestrict.hpp.
using alpaka::TagToAcc = typedef typename trait::TagToAcc<TTag, TDim, TIdx>::type |
using alpaka::TaskKernelGpuCudaRt = typedef TaskKernelGpuUniformCudaHipRt<ApiCudaRt, TAcc, TDev, TDim, TIdx, TKernelFnObj, TArgs...> |
Definition at line 15 of file TaskKernelGpuCudaRt.hpp.
enum alpaka::GridBlockExtentSubDivRestrictions | strong |
The grid block extent subdivision restrictions.
Definition at line 34 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::allocAsyncBuf (TQueue queue, TExtent const &extent=TExtent())
Allocates stream-ordered memory on the device associated with the given queue.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TQueue | The type of queue used to order the buffer allocation. |
queue | The queue used to order the buffer allocation. |
extent | The extent of the buffer. |
Definition at line 79 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocAsyncBufIfSupported (TQueue queue, TExtent const &extent=TExtent())
If supported, allocates stream-ordered memory on the given queue and the associated device. Otherwise, allocates regular memory on the device associated with the queue. Please note that stream-ordered and regular memory have different semantics: this function is provided for convenience in cases where the difference is not relevant and the stream-ordered memory is only used as a performance optimisation.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TQueue | The type of queue used to order the buffer allocation. |
queue | The queue used to order the buffer allocation. |
extent | The extent of the buffer. |
Definition at line 114 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocBuf (TDev const &dev, TExtent const &extent=TExtent())
Allocates memory on the given device.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TDev | The type of device the buffer is allocated on. |
dev | The device to allocate the buffer on. |
extent | The extent of the buffer. |
Definition at line 64 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocMappedBuf (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent())
Allocates pinned/mapped host memory, accessible by all devices in the given platform.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TPlatform | The platform from which the buffer is accessible. |
host | The host device to allocate the buffer on. |
extent | The extent of the buffer. |
Definition at line 138 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocMappedBufIfSupported (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent())
If supported, allocates pinned/mapped host memory, accessible by all devices in the given platform. Otherwise, allocates regular host memory. Please note that pinned/mapped and regular memory may have different semantics: this function is provided for convenience in the cases where the difference is not relevant, and the pinned/mapped memory is only used as a performance optimisation.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TPlatform | The platform from which the buffer is accessible. |
host | The host device to allocate the buffer on. |
extent | The extent of the buffer. |
Definition at line 175 of file Traits.hpp.
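The "if supported, otherwise fall back" behaviour described above is typically a compile-time branch on a trait. The sketch below shows that pattern only; it is not alpaka's implementation. The platform tags, `ToyHasMappedBufSupport`, `BufKind` and `chooseBufKind` are all hypothetical, and a real function would allocate a buffer where this one merely reports which path was taken.

```cpp
#include <type_traits>

// Hypothetical platform tags standing in for real platform types.
struct PlatformWithPinning
{
};

struct PlatformWithoutPinning
{
};

// Hypothetical trait: false by default, specialized to true where supported.
template<typename TPlatform>
struct ToyHasMappedBufSupport : std::false_type
{
};

template<>
struct ToyHasMappedBufSupport<PlatformWithPinning> : std::true_type
{
};

template<typename TPlatform>
inline constexpr bool toyHasMappedBufSupport = ToyHasMappedBufSupport<TPlatform>::value;

enum class BufKind
{
    Mapped,
    Regular
};

// Selects the allocation path at compile time.
template<typename TPlatform>
constexpr BufKind chooseBufKind(TPlatform const&)
{
    if constexpr(toyHasMappedBufSupport<TPlatform>)
        return BufKind::Mapped; // pinned/mapped allocation would go here
    else
        return BufKind::Regular; // regular host allocation would go here
}

static_assert(chooseBufKind(PlatformWithPinning{}) == BufKind::Mapped);
static_assert(chooseBufKind(PlatformWithoutPinning{}) == BufKind::Regular);
```

Since the branch is resolved at compile time, the unsupported path is never instantiated for a platform that lacks it.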
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicAdd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T
Executes an atomic add operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 114 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicAnd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T
Executes an atomic bitwise AND operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 240 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicCas (TAtomic const &atomic, T *const addr, T const &compare, T const &value, THierarchy const &hier=THierarchy()) -> T
Executes an atomic compare-and-swap operation.
TAtomic | The atomic implementation type. |
T | The value type. |
atomic | The atomic implementation. |
addr | The value to change atomically. |
compare | The comparison value used in the atomic operation. |
value | The value used in the atomic operation. |
Definition at line 295 of file Traits.hpp.
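For readers unfamiliar with compare-and-swap, the following sketch shows the operation on a `std::atomic`. It is not alpaka's implementation. `casReturningOld` and `casDemo` are hypothetical names, and the assumption that the returned `T` is the value stored before the operation (as with CUDA's `atomicCAS`) is not stated in the text above.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper built on std::atomic, not alpaka's implementation.
// Stores `value` only if the current content equals `compare`.
// Returns the content observed before the operation in either case.
template<typename T>
T casReturningOld(std::atomic<T>& target, T compare, T value)
{
    // On failure, compare_exchange_strong writes the observed value into `compare`.
    // On success, `compare` already equals the previous content.
    target.compare_exchange_strong(compare, value);
    return compare;
}

// Small usage demonstration.
inline void casDemo()
{
    std::atomic<int> cell{5};

    // The comparison succeeds: 5 is replaced by 9 and the old value 5 is returned.
    assert(casReturningOld(cell, 5, 9) == 5);
    assert(cell.load() == 9);

    // The comparison fails: the cell keeps 9 and 9 is returned.
    assert(casReturningOld(cell, 5, 1) == 9);
    assert(cell.load() == 9);
}
```

Comparing the returned value with `compare` tells the caller whether the swap took place, which is the basis of the usual retry loop around compare-and-swap.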
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicDec (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T
Executes an atomic decrement operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 222 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicExch (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T
Executes an atomic exchange operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 186 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicInc | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic increment operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 204 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicMax | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic max operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 168 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicMin | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic min operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 150 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicOp | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | compare, | ||
T const & | value, | ||
THierarchy const & | = THierarchy() |
||
) | -> T |
Executes the given operation atomically.
TOp | The operation type. |
TAtomic | The atomic implementation type. |
T | The value type. |
atomic | The atomic implementation. |
addr | The value to change atomically. |
compare | The comparison value used in the atomic operation. |
value | The value used in the atomic operation. |
Definition at line 94 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicOp | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | = THierarchy() |
||
) | -> T |
Executes the given operation atomically.
TOp | The operation type. |
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 73 of file Traits.hpp.
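The generic atomicOp selects the operation through the TOp type (for example AtomicAdd or AtomicMax). As a rough illustration of that design, the sketch below applies an operation function object atomically with a compare-and-swap retry loop on std::atomic. AddOp, MaxOp and atomicOpSketch are hypothetical names, and returning the previous value is an assumption of this sketch, not a statement about alpaka's implementation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical operation function objects: each maps (old, value) -> new.
struct AddOp
{
    template<typename T>
    T operator()(T old, T value) const
    {
        return old + value;
    }
};

struct MaxOp
{
    template<typename T>
    T operator()(T old, T value) const
    {
        return old < value ? value : old;
    }
};

// Applies TOp atomically and returns the value stored before the update.
template<typename TOp, typename T>
T atomicOpSketch(std::atomic<T>& target, T value)
{
    T old = target.load();
    // compare_exchange_weak reloads `old` on failure, so the new value is
    // recomputed from the latest observation on every retry.
    while(!target.compare_exchange_weak(old, TOp{}(old, value)))
    {
    }
    return old;
}

inline void atomicOpSketchUsage()
{
    std::atomic<int> counter{3};
    assert(atomicOpSketch<AddOp>(counter, 4) == 3); // counter becomes 7
    assert(atomicOpSketch<MaxOp>(counter, 5) == 7); // 7 remains the maximum
    assert(counter.load() == 7);
}
```

Passing the operation as a type keeps a single entry point for all operations while letting each backend specialize per operation.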
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicOr | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic or operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 258 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicSub | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic sub operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 132 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicXor | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic xor operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 276 of file Traits.hpp.
|
constexpr |
Definition at line 82 of file Traits.hpp.
|
constexpr |
Definition at line 98 of file Traits.hpp.
auto alpaka::createSubView | ( | TView & | view, |
TExtent const & | extent, | ||
TOffsets const & | offset = TExtent() |
||
) |
Creates a sub view to an existing view.
view | The view this view is a sub-view of. |
extent | Number of elements the resulting view holds. |
offset | Number of elements skipped in view for the new origin of the resulting view. |
Definition at line 494 of file Traits.hpp.
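The relation between extent and offset can be sketched in one dimension with a plain pointer-and-size view. ViewSketch and subViewSketch are hypothetical names used only for illustration; alpaka views are multi-dimensional and carry a device, which this sketch omits.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning 1D view: a pointer plus an element count.
template<typename T>
struct ViewSketch
{
    T* data;
    std::size_t extent;
};

// Returns a view of `extent` elements that starts `offset` elements into
// `view`. No data is copied; both views alias the same memory.
template<typename T>
ViewSketch<T> subViewSketch(ViewSketch<T> view, std::size_t extent, std::size_t offset = 0)
{
    // The sub-view must lie completely inside the parent view.
    assert(offset <= view.extent && extent <= view.extent - offset);
    return ViewSketch<T>{view.data + offset, extent};
}
```

Because the sub-view aliases the parent, a write through the sub-view is visible through the parent view as well.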
ALPAKA_FN_HOST auto alpaka::createTaskKernel | ( | TWorkDiv const & | workDiv, |
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) |
Creates a kernel execution task.
TAcc | The accelerator type. |
workDiv | The index domain work division. |
kernelFnObj | The kernel function object which should be executed. |
args,... | The kernel invocation arguments. |
Definition at line 332 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::createTaskMemcpy | ( | TViewDstFwd && | viewDst, |
TViewSrc const & | viewSrc, | ||
TExtent const & | extent | ||
) |
Creates a memory copy task.
viewDst | The destination memory view. |
viewSrc | The source memory view. |
extent | The extent of the view to copy. |
Definition at line 253 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::createTaskMemset | ( | TViewFwd && | view, |
std::uint8_t const & | byte, | ||
TExtent const & | extent | ||
) |
Creates a memory set task.
view | The memory view to fill. |
byte | Value to set for each byte of the specified view. |
extent | The extent of the view to fill. |
Definition at line 207 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TContainer & | con | ||
) |
Creates a view to a contiguous container of device-accessible memory.
dev | Device from which the container can be accessed. |
con | Contiguous container. The container must provide a data() method. The data held by the container must be accessible from the given device. The GetExtent trait must be defined for the container. |
Definition at line 468 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TContainer & | con, | ||
TExtent const & | extent | ||
) |
Creates a view to a contiguous container of device-accessible memory.
dev | Device from which the container can be accessed. |
con | Contiguous container. The container must provide a data() method. The data held by the container must be accessible from the given device. The GetExtent trait must be defined for the container. |
extent | Number of elements held by the container. Using a multi-dimensional extent will result in a multi-dimensional view to the memory represented by the container. |
Definition at line 482 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TElem * | pMem, | ||
TExtent const & | extent | ||
) |
Creates a view to a device pointer.
dev | Device from which pMem can be accessed. |
pMem | Pointer to memory. The pointer must be accessible from the given device. |
extent | Number of elements represented by pMem. Using a multi-dimensional extent will result in a multi-dimensional view to the memory represented by pMem. |
Definition at line 434 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TElem * | pMem, | ||
TExtent const & | extent, | ||
TPitch | pitch | ||
) |
Creates a view to a device pointer.
dev | Device from which pMem can be accessed. |
pMem | Pointer to memory. The pointer must be accessible from the given device. |
extent | Number of elements represented by pMem. Using a multi-dimensional extent will result in a multi-dimensional view to the memory represented by pMem. |
pitch | Pitch in bytes for each dimension. Dimensionality must be equal to extent. |
Definition at line 456 of file Traits.hpp.
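What a pitch in bytes means for addressing can be sketched for the two-dimensional case. The helper below is hypothetical and independent of alpaka; it assumes a single row pitch and a mutable element type, and it does not model how alpaka orders the components of its pitch vector.

```cpp
#include <cassert>
#include <cstddef>

// Returns a pointer to element (row, col) of a 2D buffer whose rows start
// `rowPitchBytes` apart. The pitch may exceed cols * sizeof(T) when rows are
// padded for alignment.
template<typename T>
T* pitchedElemSketch(T* base, std::size_t rowPitchBytes, std::size_t row, std::size_t col)
{
    // Every row must start at an address that is suitably aligned for T.
    assert(rowPitchBytes % alignof(T) == 0);
    auto* bytes = reinterpret_cast<unsigned char*>(base);
    return reinterpret_cast<T*>(bytes + row * rowPitchBytes + col * sizeof(T));
}
```

For a buffer with three used columns and one padding element per row, the row pitch is 4 * sizeof(T) even though only three elements per row belong to the view.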
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::declareSharedVar | ( | TBlockSharedMemSt const & | blockSharedMemSt | ) | -> T& |
Declare a block shared variable.
The variable is uninitialized and not default constructed! The variable can be accessed by all threads within a block. Access to the variable is not thread safe.
T | The element type. |
TuniqueId | An id that is unique inside a kernel. |
TBlockSharedMemSt | The block shared allocator implementation type. |
blockSharedMemSt | The block shared allocator implementation. |
Definition at line 42 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::empty | ( | TQueue const & | queue | ) | -> bool |
Tests if the queue is empty (all ops in the given queue have been completed).
Definition at line 58 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::enqueue | ( | TQueue & | queue, |
TTask && | task | ||
) | -> void |
Queues the given task in the given queue.
Special handling for events: if the event has previously been queued, this call overwrites any existing state of the event. Subsequent calls that examine the status of the event only examine the completion of this most recent call to enqueue. If a queue is waiting for an event, the state of the event at the time of the API call to wait() is used to release the queue.
Definition at line 47 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::exec | ( | TQueue & | queue, |
TWorkDiv const & | workDiv, | ||
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) | -> void |
Executes the given kernel in the given queue.
TAcc | The accelerator type. |
queue | The queue to enqueue the kernel execution task into. |
workDiv | The index domain work division. |
kernelFnObj | The kernel function object which should be executed. |
args,... | The kernel invocation arguments. |
Definition at line 378 of file Traits.hpp.
|
inline |
Executes a callable for each active accelerator tag.
Definition at line 21 of file ExecuteForEachAccTag.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::ffs | ( | TIntrinsic const & | intrinsic, |
std::int32_t | value | ||
) | -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 32-bit value. Returns 0 for input value 0.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 65 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::ffs | ( | TIntrinsic const & | intrinsic, |
std::int64_t | value | ||
) | -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 64-bit value. Returns 0 for input value 0.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 79 of file Traits.hpp.
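The documented return values of ffs (1-based position of the lowest set bit, 0 for an input of 0) can be reproduced with a portable loop. ffsSketch is a hypothetical reference helper, not the intrinsic that alpaka dispatches to.

```cpp
#include <cassert>
#include <cstdint>

// Returns the 1-based position of the least significant set bit, or 0 when
// no bit is set. The loop works on the unsigned bit pattern so that negative
// inputs are never shifted as signed values.
inline std::int32_t ffsSketch(std::int64_t value)
{
    auto bits = static_cast<std::uint64_t>(value);
    if(bits == 0)
        return 0;
    std::int32_t pos = 1;
    while((bits & 1u) == 0)
    {
        bits >>= 1;
        ++pos;
    }
    return pos;
}

inline void ffsSketchUsage()
{
    assert(ffsSketch(0) == 0);  // no bit set
    assert(ffsSketch(1) == 1);  // bit 0 is position 1
    assert(ffsSketch(12) == 3); // 0b1100: lowest set bit is bit 2
}
```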
ALPAKA_FN_HOST auto alpaka::free | ( | TAlloc const & | alloc, |
T const *const | ptr | ||
) | -> void |
Frees the memory identified by the given pointer.
Definition at line 41 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::freeSharedVars | ( | TBlockSharedMemSt & | blockSharedMemSt | ) | -> void |
Frees all memory used by block shared variables.
TBlockSharedMemSt | The block shared allocator implementation type. |
blockSharedMemSt | The block shared allocator implementation. |
Definition at line 54 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getAccDevProps | ( | TDev const & | dev | ) | -> AccDevProps<Dim<TAcc>, Idx<TAcc>> |
Definition at line 90 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getAccName | ( | ) | -> std::string |
TAcc | The accelerator type. |
Definition at line 100 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getBlockSharedMemDynSizeBytes | ( | TKernelFnObj const & | kernelFnObj, |
Vec< TDim, Idx< TAcc >> const & | blockThreadExtent, | ||
Vec< TDim, Idx< TAcc >> const & | threadElemExtent, | ||
TArgs const &... | args | ||
) | -> std::size_t |
TAcc | The accelerator type. |
kernelFnObj | The kernel object for which the block shared memory size should be calculated. |
blockThreadExtent | The block thread extent. |
threadElemExtent | The thread element extent. |
args,... | The kernel invocation arguments. |
Definition at line 181 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getDepth | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 121 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDev | ( | T const & | t | ) |
Definition at line 68 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDevByIdx | ( | TPlatform const & | platform, |
std::size_t const & | devIdx | ||
) | -> Dev<TPlatform> |
Definition at line 62 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDevCount | ( | TPlatform const & | platform | ) |
Definition at line 55 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDevs | ( | TPlatform const & | platform | ) | -> std::vector<Dev<TPlatform>> |
Definition at line 69 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::getDynSharedMem | ( | TBlockSharedMemDyn const & | blockSharedMemDyn | ) | -> T* |
Get block shared dynamic memory.
The available size of the memory can be defined by specializing the trait BlockSharedMemDynSizeBytes for a kernel. The memory can be accessed by all threads within a block. Access to the memory is not thread safe.
T | The element type. |
TBlockSharedMemDyn | The block shared dynamic memory implementation type. |
blockSharedMemDyn | The block shared dynamic memory implementation. |
Definition at line 39 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getExtent | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 43 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getExtentProduct | ( | T const & | object | ) | -> Idx<T> |
Definition at line 134 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getExtents | ( | T const & | object | ) | -> Vec<Dim<T>, Idx<T>> |
Definition at line 59 of file Traits.hpp.
|
constexpr |
T | has to specialize GetExtent. |
Definition at line 68 of file Traits.hpp.
|
constexpr |
T | has to specialize GetExtent. |
Definition at line 78 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getFreeMemBytes | ( | TDev const & | dev | ) | -> std::size_t |
Definition at line 104 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST auto alpaka::getFunctionAttributes | ( | TDev const & | dev, |
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) | -> alpaka::KernelFunctionAttributes |
TAcc | The accelerator type. |
TDev | The device type. |
dev | The device instance |
kernelFnObj | The kernel function object which should be executed. |
args | The kernel invocation arguments. |
Definition at line 204 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getHeight | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 108 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdx | ( | TIdx const & | idx, |
TWorkDiv const & | workDiv | ||
) | -> Vec<Dim<TWorkDiv>, Idx<TIdx>> |
Get the indices requested.
Definition at line 23 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdx | ( | TIdxWorkDiv const & | idxWorkDiv | ) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the indices requested.
Definition at line 31 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdxThreadFirstElem | ( | [[maybe_unused] ] TIdxWorkDiv const & | idxWorkDiv, |
TGridThreadIdx const & | gridThreadIdx, | ||
TThreadElemExtent const & | threadElemExtent | ||
) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the index of the first element this thread computes.
Definition at line 89 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdxThreadFirstElem | ( | TIdxWorkDiv const & | idxWorkDiv | ) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the index of the first element this thread computes.
Definition at line 110 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdxThreadFirstElem | ( | TIdxWorkDiv const & | idxWorkDiv, |
TGridThreadIdx const & | gridThreadIdx | ||
) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the index of the first element this thread computes.
Definition at line 100 of file Accessors.hpp.
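The index of the first element a thread computes follows from its grid thread index and the per-thread element extent. The sketch below assumes a component-wise product, which matches the parameter names but is not confirmed by the text above; VecSketch and firstElemIdxSketch are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template<std::size_t N>
using VecSketch = std::array<std::size_t, N>;

// Assumed relation: firstElem[d] = gridThreadIdx[d] * threadElemExtent[d].
template<std::size_t N>
VecSketch<N> firstElemIdxSketch(VecSketch<N> const& gridThreadIdx, VecSketch<N> const& threadElemExtent)
{
    VecSketch<N> first{};
    for(std::size_t d = 0; d < N; ++d)
        first[d] = gridThreadIdx[d] * threadElemExtent[d];
    return first;
}

inline void firstElemIdxSketchUsage()
{
    VecSketch<2> const threadIdx{{3, 5}};
    VecSketch<2> const elemExtent{{1, 4}};
    // Thread (3, 5) with 1 x 4 elements per thread starts at element (3, 20).
    assert((firstElemIdxSketch(threadIdx, elemExtent) == VecSketch<2>{{3, 20}}));
}
```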
ALPAKA_FN_HOST auto alpaka::getMemBytes | ( | TDev const & | dev | ) | -> std::size_t |
Definition at line 95 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getName | ( | TDev const & | dev | ) | -> std::string |
Definition at line 87 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getNativeHandle | ( | TImpl const & | impl | ) |
Gets the native handle of the alpaka object. Returns the native handle if the object has one; otherwise a compile-time error is generated.
Definition at line 29 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffset | ( | TOffsets const & | offsets | ) | -> Idx<TOffsets> |
Definition at line 39 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsets | ( | T const & | object | ) | -> Vec<Dim<T>, Idx<T>> |
Definition at line 55 of file Traits.hpp.
|
constexpr |
T | has to specialize GetOffsets. |
Definition at line 64 of file Traits.hpp.
|
constexpr |
T | has to specialize GetOffsets. |
Definition at line 73 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsetX | ( | TOffsets const & | offsets = TOffsets() | ) | -> Idx<TOffsets> |
Definition at line 87 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsetY | ( | TOffsets const & | offsets = TOffsets() | ) | -> Idx<TOffsets> |
Definition at line 95 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsetZ | ( | TOffsets const & | offsets = TOffsets() | ) | -> Idx<TOffsets> |
Definition at line 103 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getOmpSchedule | ( | TKernelFnObj const & | kernelFnObj, |
Vec< TDim, Idx< TAcc >> const & | blockThreadExtent, | ||
Vec< TDim, Idx< TAcc >> const & | threadElemExtent, | ||
TArgs const &... | args | ||
) |
TAcc | The accelerator type. |
kernelFnObj | The kernel object for which the OpenMP schedule should be returned. |
blockThreadExtent | The block thread extent. |
threadElemExtent | The thread element extent. |
args,... | The kernel invocation arguments. |
Definition at line 229 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPitchBytes | ( | TView const & | view | ) | -> Idx<TView> |
Definition at line 176 of file Traits.hpp.
auto alpaka::getPitchBytesVec | ( | TView const & | view | ) | -> Vec<Dim<TView>, Idx<TView>> |
Definition at line 412 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPitchBytesVecEnd | ( | TView const & | view = TView() | ) | -> Vec<TDim, Idx<TView>> |
Definition at line 420 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPitchesInBytes | ( | TView const & | view | ) | -> Vec<Dim<TView>, Idx<TView>> |
Definition at line 196 of file Traits.hpp.
|
constexpr |
Definition at line 118 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrDev | ( | TView & | view, |
TDev const & | dev | ||
) | -> Elem<TView>* |
Gets the pointer to the view on the given device.
view | The memory view. |
dev | The device. |
Definition at line 168 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrDev | ( | TView const & | view, |
TDev const & | dev | ||
) | -> Elem<TView> const* |
Gets the pointer to the view on the given device.
view | The memory view. |
dev | The device. |
Definition at line 157 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrNative | ( | TView & | view | ) | -> Elem<TView>* |
Gets the native pointer of the memory view.
view | The memory view. |
Definition at line 146 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrNative | ( | TView const & | view | ) | -> Elem<TView> const* |
Gets the native pointer of the memory view.
view | The memory view. |
Definition at line 136 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getValidWorkDiv | ( | KernelCfg< TAcc, TGridElemExtent, TThreadElemExtent > const & | kernelCfg, |
[[maybe_unused] ] TDev const & | dev, | ||
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) | -> WorkDivMembers<Dim<TAcc>, Idx<TAcc>> |
TDev | The type of the device. |
TGridElemExtent | The type of the grid element extent. |
TThreadElemExtent | The type of the thread element extent. |
dev | The device the work division should be valid for. |
kernelFnObj | The kernel function object which should be executed. |
args | The kernel invocation arguments. |
Definition at line 362 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::getWarpSizes | ( | TDev const & | dev | ) | -> std::vector<std::size_t> |
Definition at line 111 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getWidth | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 95 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getWorkDiv | ( | TWorkDiv const & | workDiv | ) | -> Vec<Dim<TWorkDiv>, Idx<TWorkDiv>> |
Get the extent requested.
Definition at line 33 of file Traits.hpp.
|
inline |
Definition at line 389 of file IndependentElements.hpp.
|
inline |
Definition at line 406 of file IndependentElements.hpp.
|
inline |
Definition at line 422 of file IndependentElements.hpp.
|
inline |
Definition at line 434 of file IndependentElements.hpp.
|
inline |
Definition at line 446 of file IndependentElements.hpp.
|
inline |
Definition at line 194 of file IndependentElements.hpp.
|
inline |
Definition at line 211 of file IndependentElements.hpp.
|
inline |
Definition at line 227 of file IndependentElements.hpp.
|
inline |
Definition at line 237 of file IndependentElements.hpp.
|
inline |
Definition at line 247 of file IndependentElements.hpp.
ALPAKA_FN_HOST auto alpaka::isComplete | ( | TEvent const & | event | ) | -> bool |
Tests if the given event has already been completed.
Definition at line 34 of file Traits.hpp.
void alpaka::isSupportedByAtomicAtomicRef | ( | ) |
Definition at line 42 of file AtomicAtomicRef.hpp.
ALPAKA_FN_HOST auto alpaka::isValidAccDevProps | ( | AccDevProps< TDim, TIdx > const & | accDevProps | ) | -> bool |
TDim | The dimensionality of the accelerator device properties. |
TIdx | The idx type of the accelerator device properties. |
accDevProps | The maxima for the work division. |
Definition at line 91 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::isValidWorkDiv | ( | TWorkDiv const & | workDiv, |
AccDevProps< TDim, TIdx > const & | accDevProps | ||
) | -> bool |
Checks if the work division is supported.
TWorkDiv | The type of the work division. |
TDim | The dimensionality of the accelerator device properties. |
TIdx | The idx type of the accelerator device properties. |
workDiv | The work division to test for validity. |
accDevProps | The maxima for the work division. |
Definition at line 407 of file WorkDivHelpers.hpp.
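Checking a work division against device maxima amounts to a few comparisons per dimension plus a limit on the total thread count per block. The sketch below is a simplified two-dimensional model: WorkDivSketch, MaximaSketch and their members are hypothetical and cover only a subset of the limits a real device reports.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct WorkDivSketch
{
    std::array<std::size_t, 2> gridBlockExtent;
    std::array<std::size_t, 2> blockThreadExtent;
};

struct MaximaSketch
{
    std::array<std::size_t, 2> gridBlockExtentMax;
    std::array<std::size_t, 2> blockThreadExtentMax;
    std::size_t blockThreadCountMax; // limit on the product of the block extents
};

inline bool isValidWorkDivSketch(WorkDivSketch const& wd, MaximaSketch const& max)
{
    std::size_t threadsPerBlock = 1;
    for(std::size_t d = 0; d < 2; ++d)
    {
        // Empty extents are rejected.
        if(wd.gridBlockExtent[d] == 0 || wd.blockThreadExtent[d] == 0)
            return false;
        // Each dimension must respect its own maximum.
        if(wd.gridBlockExtent[d] > max.gridBlockExtentMax[d])
            return false;
        if(wd.blockThreadExtent[d] > max.blockThreadExtentMax[d])
            return false;
        threadsPerBlock *= wd.blockThreadExtent[d];
    }
    // The per-dimension limits alone are not sufficient.
    return threadsPerBlock <= max.blockThreadCountMax;
}
```

A 64 x 32 block passes both per-dimension checks against a 1024 x 1024 maximum but still fails when at most 1024 threads per block are allowed.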
ALPAKA_FN_HOST auto alpaka::isValidWorkDiv | ( | TWorkDiv const & | workDiv, |
AccDevProps< TDim, TIdx > const & | accDevProps, | ||
KernelFunctionAttributes const & | kernelFunctionAttributes | ||
) | -> bool |
Checks if the work division is supported.
TWorkDiv | The type of the work division. |
TDim | The dimensionality of the accelerator device properties. |
TIdx | The idx type of the accelerator device properties. |
workDiv | The work division to test for validity. |
accDevProps | The maxima for the work division. |
kernelFunctionAttributes | Kernel attributes, including the maximum number of threads per block that can be used by this kernel on the given device. This number can be equal to or smaller than the number of threads per block supported by the device. |
Definition at line 464 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::isValidWorkDiv | ( | TWorkDiv const & | workDiv, |
TDev const & | dev | ||
) | -> bool |
Checks if the work division is supported by the device.
TAcc | The accelerator to test the validity on. |
workDiv | The work division to test for validity. |
dev | The device to test the work division for validity on. |
Definition at line 546 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::isValidWorkDiv | ( | TWorkDiv const & | workDiv, |
TDev const & | dev, | ||
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) | -> bool |
Checks if the work division is supported for the kernel on the device.
TAcc | The accelerator to test the validity on. |
TDev | The type of the device. |
TWorkDiv | The type of work division to test for validity. |
workDiv | The work division to test for validity. |
dev | The device to test the work division for validity on. |
kernelFnObj | The kernel function object which should be executed. |
args | The kernel invocation arguments. |
Definition at line 527 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::malloc | ( | TAlloc const & | alloc, |
std::size_t const & | sizeElems | ||
) | -> T* |
Definition at line 33 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::mapIdx | ( | Vec< DimInt< TDimIn >, TElem > const & | in, |
Vec< DimInt< TDimExtents >, TElem > const & | extent | ||
) | -> Vec<DimInt<TDimOut>, TElem> |
Maps an N-dimensional index to an N-dimensional position. At least one dimension must always be 1 or zero.
TDimOut | Dimension of the index vector to map to. |
in | The index vector to map from. |
extent | The extents of the input or output space, whichever has more than one dimension. |
Definition at line 26 of file MapIdx.hpp.
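The typical use of such a mapping is converting between a multi-dimensional index and a linear one. The sketch below shows the 2D/1D case under the assumption that the last component varies fastest (row-major order); the helper names are hypothetical and the component order used by alpaka is not stated in the text above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Idx2 = std::array<std::size_t, 2>; // {row, col}; col varies fastest

// 2D -> 1D: linear = row * width + col.
inline std::size_t mapIdxTo1dSketch(Idx2 const& in, Idx2 const& extent)
{
    assert(in[0] < extent[0] && in[1] < extent[1]);
    return in[0] * extent[1] + in[1];
}

// 1D -> 2D: the inverse of the mapping above.
inline Idx2 mapIdxTo2dSketch(std::size_t in, Idx2 const& extent)
{
    assert(in < extent[0] * extent[1]);
    return Idx2{{in / extent[1], in % extent[1]}};
}
```

In both directions the extent passed is that of the two-dimensional space, which corresponds to the extent parameter describing the space with more than one dimension.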
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::mapIdxPitchBytes | ( | Vec< DimInt< TDimIn >, TElem > const & | in, |
Vec< DimInt< TidxDimPitch >, TElem > const & | pitches | ||
) | -> Vec<DimInt<TDimOut>, TElem> |
Maps an N-dimensional index to an N-dimensional position based on the pitches of a view without padding or a byte view. At least one dimension must always be 1 or zero.
TDimOut | Dimension of the index vector to map to. |
in | The index vector to map from. |
pitches | The pitches of the input or output space, whichever has more than one dimension. |
Definition at line 66 of file MapIdx.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::mem_fence | ( | TMemFence const & | fence, |
TMemScope const & | scope | ||
) | -> void |
Issues memory fence instructions.
TMemFence | The memory fence implementation type. |
TMemScope | The memory scope type. |
fence | The memory fence implementation. |
scope | The memory scope. |
Definition at line 61 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc | ||
) | -> void |
Definition at line 61 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc, | ||
TExtent const & | extent | ||
) | -> void |
Definition at line 112 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc | ||
) | -> void |
Definition at line 86 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc, | ||
TExtent const & | extent | ||
) | -> void |
Definition at line 138 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
TViewSrc const & | viewSrc | ||
) | -> void |
Copies the entire memory of viewSrc to viewDst. Possibly copies between different memory spaces.
queue | The queue to enqueue the view copy task into. | |
[in,out] | viewDst | The destination memory view. May be a temporary object. |
viewSrc | The source memory view. May be a temporary object. |
Definition at line 307 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
TViewSrc const & | viewSrc, | ||
TExtent const & | extent | ||
) | -> void |
Copies memory from a part of viewSrc to viewDst, described by extent. Possibly copies between different memory spaces.
queue | The queue to enqueue the view copy task into. | |
[in,out] | viewDst | The destination memory view. May be a temporary object. |
viewSrc | The source memory view. May be a temporary object. | |
extent | The extent of the view to copy. |
Definition at line 294 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc | ||
) |
Definition at line 94 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc, | ||
TExtent | extent | ||
) |
Definition at line 167 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
TViewDst & | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc | ||
) |
Definition at line 58 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
TViewDst & | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc, | ||
TExtent | extent | ||
) |
Definition at line 131 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memset | ( | TQueue & | queue, |
TViewFwd && | view, | ||
std::uint8_t const & | byte | ||
) | -> void |
Sets each byte of the memory of the entire view to the given value.
queue | The queue to enqueue the view fill task into. | |
[in,out] | view | The memory view to fill. May be a temporary object. |
byte | Value to set for each byte of the specified view. |
Definition at line 242 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memset | ( | TQueue & | queue, |
TViewFwd && | view, | ||
std::uint8_t const & | byte, | ||
TExtent const & | extent | ||
) | -> void |
Sets the bytes of the memory of view, described by extent, to the given value.
queue | The queue to enqueue the view fill task into. | |
[in,out] | view | The memory view to fill. May be a temporary object. |
byte | Value to set for each byte of the specified view. | |
extent | The extent of the view to fill. |
Definition at line 231 of file Traits.hpp.
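Because the value is applied per byte, filling a buffer of wider elements does not set each element to the given value. The host-only sketch below uses std::memset to show the effect; memsetBytesSketch is a hypothetical helper and no alpaka queue or view is involved.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Sets every byte of the buffer to `byte`, independent of the element type.
inline void memsetBytesSketch(std::vector<std::uint32_t>& buf, std::uint8_t byte)
{
    if(!buf.empty())
        std::memset(buf.data(), byte, buf.size() * sizeof(std::uint32_t));
}

inline void memsetBytesSketchUsage()
{
    std::vector<std::uint32_t> buf(4, 0u);
    memsetBytesSketch(buf, 0x01);
    // Each 32-bit element consists of four bytes 0x01, not the value 1.
    assert(buf[0] == 0x01010101u);
}
```

Only byte patterns such as 0x00 or 0xFF produce the same result for every element width, which is why this kind of fill is mostly used for zero-initialization.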
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::popcount | ( | TIntrinsic const & | intrinsic, |
std::uint32_t | value | ||
) | -> std::int32_t |
Returns the number of 1 bits in the given 32-bit value.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 38 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::popcount | ( | TIntrinsic const & | intrinsic, |
std::uint64_t | value | ||
) | -> std::int32_t |
Returns the number of 1 bits in the given 64-bit value.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 51 of file Traits.hpp.
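A portable reference for the documented result of popcount can be written with the classic trick of clearing the lowest set bit until no bits remain. popcountSketch is a hypothetical helper for checking expectations on the host, not the intrinsic used by any accelerator back end.

```cpp
#include <cassert>
#include <cstdint>

// Counts the 1 bits by repeatedly clearing the lowest set bit.
inline std::int32_t popcountSketch(std::uint64_t value)
{
    std::int32_t count = 0;
    while(value != 0)
    {
        value &= value - 1; // clears the lowest set bit
        ++count;
    }
    return count;
}

inline void popcountSketchUsage()
{
    assert(popcountSketch(0u) == 0);
    assert(popcountSketch(11u) == 3);   // 0b1011
    assert(popcountSketch(0xFFu) == 8);
}
```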
ALPAKA_FN_HOST auto alpaka::print | ( | TView const & | view, |
std::ostream & | os, | ||
std::string const & | elementSeparator = ", " , |
||
std::string const & | rowSeparator = "\n" , |
||
std::string const & | rowPrefix = "[" , |
||
std::string const & | rowSuffix = "]" |
||
) | -> void |
Prints the content of the view to the given stream.
Definition at line 391 of file Traits.hpp.
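How the four string parameters could shape the output is sketched below for a two-dimensional container on the host. printRowsSketch is hypothetical; it reuses the default values from the signature above, but the exact placement of separators in alpaka's output is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

inline void printRowsSketch(
    std::vector<std::vector<int>> const& rows,
    std::ostream& os,
    std::string const& elementSeparator = ", ",
    std::string const& rowSeparator = "\n",
    std::string const& rowPrefix = "[",
    std::string const& rowSuffix = "]")
{
    for(std::size_t r = 0; r < rows.size(); ++r)
    {
        if(r != 0)
            os << rowSeparator; // between rows only, not after the last one
        os << rowPrefix;
        for(std::size_t c = 0; c < rows[r].size(); ++c)
        {
            if(c != 0)
                os << elementSeparator; // between elements only
            os << rows[r][c];
        }
        os << rowSuffix;
    }
}

inline void printRowsSketchUsage()
{
    std::vector<std::vector<int>> const rows{{1, 2}, {3, 4}};
    std::ostringstream os;
    printRowsSketch(rows, os);
    assert(os.str() == "[1, 2]\n[3, 4]");
}
```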
void alpaka::printTagNames()
ALPAKA_FN_HOST auto alpaka::reset(TDev const& dev) -> void
Resets the device. The exact effect depends on the accelerator.
Definition at line 126 of file Traits.hpp.
|
constexpr |
Definition at line 90 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::subDivideGridElems(
    Vec<TDim, TIdx> const& gridElemExtent,
    Vec<TDim, TIdx> const& threadElemExtent,
    AccDevProps<TDim, TIdx> const& accDevProps,
    TIdx kernelBlockThreadCountMax = static_cast<TIdx>(0u),
    bool blockThreadMustDivideGridThreadExtent = true,
    GridBlockExtentSubDivRestrictions gridBlockExtentSubDivRestrictions = GridBlockExtentSubDivRestrictions::Unrestricted
) -> WorkDivMembers<TDim, TIdx>
Subdivides the given grid thread extent into blocks restricted by the maxima allowed.
gridElemExtent | The full extent of elements in the grid. |
threadElemExtent | The number of elements computed per thread. |
accDevProps | The maxima for the work division. |
kernelBlockThreadCountMax | The maximum number of threads per block. If zero, this argument is ignored and the device hard limits are used. |
blockThreadMustDivideGridThreadExtent | If true, the grid thread extent will be a multiple of the corresponding block thread extent. NOTE: If this is true and gridThreadExtent is prime (or otherwise badly chosen) in a dimension, the block thread extent will be one in this dimension. |
gridBlockExtentSubDivRestrictions | The grid block extent subdivision restrictions. |
Definition at line 134 of file WorkDivHelpers.hpp.
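The interaction of the parameters is easier to see in one dimension. The sketch below is a simplified model, not the algorithm in WorkDivHelpers.hpp: `WorkDiv1D`, `subDivide1D` and `deviceBlockThreadCountMax` are hypothetical names, the device properties are reduced to a single thread-count limit, and the grid block extent restrictions are ignored.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical 1D result type, loosely modeled on WorkDivMembers.
struct WorkDiv1D
{
    std::uint32_t gridBlockExtent; // number of blocks in the grid
    std::uint32_t blockThreadExtent; // number of threads per block
    std::uint32_t threadElemExtent; // number of elements per thread
};

// Simplified 1D sketch of the subdivision (assumption: one device limit only).
inline WorkDiv1D subDivide1D(
    std::uint32_t gridElemExtent,
    std::uint32_t threadElemExtent,
    std::uint32_t deviceBlockThreadCountMax,
    std::uint32_t kernelBlockThreadCountMax = 0u,
    bool blockThreadMustDivideGridThreadExtent = true)
{
    // Clamp to at least one so the divisions below stay well defined.
    gridElemExtent = std::max<std::uint32_t>(gridElemExtent, 1u);
    threadElemExtent = std::max<std::uint32_t>(threadElemExtent, 1u);
    deviceBlockThreadCountMax = std::max<std::uint32_t>(deviceBlockThreadCountMax, 1u);

    // Threads needed to cover all elements, rounded up.
    std::uint32_t const gridThreadExtent
        = gridElemExtent / threadElemExtent + (gridElemExtent % threadElemExtent != 0u ? 1u : 0u);

    // A kernel limit of zero means "not used": only the device limit applies.
    std::uint32_t limit = deviceBlockThreadCountMax;
    if(kernelBlockThreadCountMax != 0u)
        limit = std::min(limit, kernelBlockThreadCountMax);

    std::uint32_t blockThreadExtent = std::min(limit, gridThreadExtent);
    if(blockThreadMustDivideGridThreadExtent)
    {
        // Shrink until the block size divides the grid thread extent.
        // For a prime extent larger than the limit this ends at 1.
        while(gridThreadExtent % blockThreadExtent != 0u)
            --blockThreadExtent;
    }

    std::uint32_t const gridBlockExtent
        = gridThreadExtent / blockThreadExtent + (gridThreadExtent % blockThreadExtent != 0u ? 1u : 0u);
    return {gridBlockExtent, blockThreadExtent, threadElemExtent};
}
```

For 1000 elements and a limit of 256 threads, the divisibility requirement yields 4 blocks of 250 threads; for the prime extent 1009 it degenerates to 1009 blocks of a single thread, which is the case the NOTE above warns about.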
|
constexpr |
TVec | has to specialize SubVecFromIndices. |
A sequence of integers from 0 to dim-1.
Definition at line 51 of file Traits.hpp.
|
constexpr |
TVec | has to specialize SubVecFromIndices. |
A sequence of integers from 0 to dim-1.
Definition at line 66 of file Traits.hpp.
|
constexpr |
Builds a new vector by selecting the elements of the source vector in the given order. Repeating and swizzling elements is allowed.
Definition at line 42 of file Traits.hpp.
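The selection semantics (arbitrary order, repeats allowed) can be sketched with `std::array`. `selectElems` is a hypothetical helper, not the alpaka function, and it takes the indices as a plain template parameter pack rather than through alpaka's own index-sequence machinery.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of alpaka): builds a new array from the
// elements of `src` at the given compile-time indices, in the given order.
template<std::size_t... TIndices, typename T, std::size_t N>
constexpr std::array<T, sizeof...(TIndices)> selectElems(std::array<T, N> const& src)
{
    static_assert(((TIndices < N) && ...), "index out of range");
    return {src[TIndices]...};
}
```

For example, `selectElems<2, 0, 2>(v)` on a three-element `v` returns the elements at positions 2, 0 and 2, so the result may be longer or shorter than the source and may repeat entries.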
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::syncBlockThreads(TBlockSync const& blockSync) -> void
Synchronizes all threads within the current block (independently for all blocks).
TBlockSync | The block synchronization implementation type. |
blockSync | The block synchronization implementation. |
Definition at line 36 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::syncBlockThreadsPredicate(TBlockSync const& blockSync, int predicate) -> int
Synchronizes all threads within the current block (independently for all blocks), evaluates the predicate for all threads and returns the combination of all the results computed via TOp.
TOp | The operation used to combine the predicate values of all threads. |
TBlockSync | The block synchronization implementation type. |
blockSync | The block synchronization implementation. |
predicate | The predicate value of the current thread. |
Definition at line 100 of file Traits.hpp.
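The value each thread receives can be modeled sequentially. The sketch below assumes that `TOp` is a count, logical AND, or logical OR reduction, analogous to CUDA's `__syncthreads_count`, `__syncthreads_and` and `__syncthreads_or`; `CountOp`, `AndOp`, `OrOp` and `combinePredicates` are hypothetical names, and the barrier itself is not modeled.

```cpp
#include <vector>

// Hypothetical reduction operations over the per-thread predicate values.
struct CountOp
{
    static int init() { return 0; }
    static int combine(int acc, int pred) { return acc + (pred != 0 ? 1 : 0); }
};

struct AndOp
{
    static int init() { return 1; }
    static int combine(int acc, int pred) { return (acc != 0 && pred != 0) ? 1 : 0; }
};

struct OrOp
{
    static int init() { return 0; }
    static int combine(int acc, int pred) { return (acc != 0 || pred != 0) ? 1 : 0; }
};

// Models the result that every thread of one block would observe after the
// barrier: the predicates of all threads in the block combined via TOp.
template<typename TOp>
int combinePredicates(std::vector<int> const& blockPredicates)
{
    int result = TOp::init();
    for(int const pred : blockPredicates)
        result = TOp::combine(result, pred);
    return result;
}
```

Each block is combined independently, so two blocks with different predicate values generally see different results.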
|
inline |
Definition at line 262 of file UniformElements.hpp.
|
inline |
Definition at line 279 of file UniformElements.hpp.
|
inline |
Definition at line 295 of file UniformElements.hpp.
|
inline |
Definition at line 305 of file UniformElements.hpp.
|
inline |
Definition at line 315 of file UniformElements.hpp.
|
inline |
Definition at line 595 of file UniformElements.hpp.
|
inline |
Definition at line 603 of file UniformElements.hpp.
|
inline |
Definition at line 1086 of file UniformElements.hpp.
|
inline |
Definition at line 1103 of file UniformElements.hpp.
|
inline |
Definition at line 1119 of file UniformElements.hpp.
|
inline |
Definition at line 1129 of file UniformElements.hpp.
|
inline |
Definition at line 1139 of file UniformElements.hpp.
|
inline |
Definition at line 819 of file UniformElements.hpp.
|
inline |
Definition at line 836 of file UniformElements.hpp.
|
inline |
Definition at line 852 of file UniformElements.hpp.
|
inline |
Definition at line 862 of file UniformElements.hpp.
|
inline |
Definition at line 872 of file UniformElements.hpp.
ALPAKA_FN_HOST_ACC alpaka::Vec(TFirstIndex&&, TRestIndices&&...) -> Vec<DimInt<1 + sizeof...(TRestIndices)>, std::decay_t<TFirstIndex>>
alpaka::ViewConst(TView) -> ViewConst<std::decay_t<TView>>
ALPAKA_FN_HOST auto alpaka::wait(TAwaited const& awaited) -> void
Blocks the calling thread until the given awaited action completes.
Special handling for events: if the event has been re-enqueued, wait() returns once the re-enqueued event is ready; previously enqueued states of the event are ignored.
Definition at line 34 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::wait(TWaiter& waiter, TAwaited const& awaited) -> void
The waiter waits for the given awaited action to complete.
Special handling if waiter is a queue and awaited is an event: the waiter waits for the event state captured at the time of the API call to become ready, even if the event is re-enqueued later.
Definition at line 46 of file Traits.hpp.
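The difference between the two overloads for re-enqueued events can be modeled with generation counters. This is a conceptual sketch of the documented behavior only; `EventModel`, `QueueWait` and the free functions are hypothetical and say nothing about how alpaka implements events.

```cpp
#include <cstdint>

// Hypothetical model of an event that can be enqueued several times.
struct EventModel
{
    std::uint64_t enqueuedGen = 0; // incremented on every (re-)enqueue
    std::uint64_t completedGen = 0; // number of enqueued generations that finished
};

inline void enqueue(EventModel& e)
{
    ++e.enqueuedGen;
}

inline void completeOldest(EventModel& e)
{
    if(e.completedGen < e.enqueuedGen)
        ++e.completedGen;
}

// Host-side wait(event): satisfied only once the most recent enqueue is done,
// so earlier enqueued states are ignored.
inline bool hostWaitSatisfied(EventModel const& e)
{
    return e.completedGen == e.enqueuedGen;
}

// Queue-side wait(queue, event): the generation is captured at call time.
struct QueueWait
{
    std::uint64_t targetGen;
};

inline QueueWait captureWait(EventModel const& e)
{
    return QueueWait{e.enqueuedGen};
}

// Satisfied once the captured generation is done, even if the event has
// been re-enqueued in the meantime.
inline bool queueWaitSatisfied(QueueWait w, EventModel const& e)
{
    return e.completedGen >= w.targetGen;
}
```

In this model, a queue wait captured before a re-enqueue is released by the first completion, while a host wait issued after the re-enqueue needs the second one.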
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC alpaka::WorkDivMembers(
    alpaka::Vec<TDim, TIdx> const& gridBlockExtent,
    alpaka::Vec<TDim, TIdx> const& blockThreadExtent,
    alpaka::Vec<TDim, TIdx> const& elemExtent
) -> WorkDivMembers<TDim, TIdx>
Deduction guide for the constructor which can be called without explicit template type parameters.
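What such a guide buys the caller can be shown with stand-in types. `VecSketch` and `WorkDivSketch` are hypothetical simplifications (the dimension is a plain integer here rather than a `DimInt` type), and in a class this small the implicit guides would already suffice; the explicit guide is written out only to mirror the declaration above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for alpaka::Vec.
template<std::size_t TDim, typename TIdx>
struct VecSketch
{
    std::array<TIdx, TDim> v;
};

// Hypothetical stand-in for alpaka::WorkDivMembers.
template<std::size_t TDim, typename TIdx>
struct WorkDivSketch
{
    WorkDivSketch(
        VecSketch<TDim, TIdx> const& gridBlock,
        VecSketch<TDim, TIdx> const& blockThread,
        VecSketch<TDim, TIdx> const& elem)
        : gridBlockExtent(gridBlock)
        , blockThreadExtent(blockThread)
        , elemExtent(elem)
    {
    }

    VecSketch<TDim, TIdx> gridBlockExtent;
    VecSketch<TDim, TIdx> blockThreadExtent;
    VecSketch<TDim, TIdx> elemExtent;
};

// Explicit deduction guide: TDim and TIdx are taken from the argument types.
template<std::size_t TDim, typename TIdx>
WorkDivSketch(VecSketch<TDim, TIdx> const&, VecSketch<TDim, TIdx> const&, VecSketch<TDim, TIdx> const&)
    -> WorkDivSketch<TDim, TIdx>;
```

With the guide in place, `WorkDivSketch wd(grid, block, elem);` compiles under C++17 without spelling out `<2, int>`.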
|
constexpr |
Definition at line 14 of file BlockSharedDynMemberAllocKiB.hpp.
|
inline constexpr
Checks if the given device can allocate a stream-ordered memory buffer of the given dimensionality.
TDev | The type of device to allocate the buffer on. |
TDim | The dimensionality of the buffer to allocate. |
Definition at line 95 of file Traits.hpp.
|
inline constexpr
Checks if the host can allocate pinned/mapped host memory, accessible by all devices in the given platform.
TPlatform | The platform from which the buffer is accessible. |
Definition at line 156 of file Traits.hpp.
|
inline constexpr
True if TAcc is an accelerator, i.e. if it implements the ConceptAcc concept.
Definition at line 30 of file Traits.hpp.
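The general shape of such a boolean trait variable can be sketched as follows. This assumes membership is marked by inheriting from a tag type; `ConceptAccSketch`, `MyAcc`, `NotAnAcc` and `isAcceleratorSketch` are hypothetical names, and alpaka's own concept-checking machinery may work differently.

```cpp
#include <type_traits>

// Hypothetical concept tag: types that derive from it "implement" the concept.
struct ConceptAccSketch
{
};

struct MyAcc : ConceptAccSketch
{
};

struct NotAnAcc
{
};

// Compile-time boolean, usable in static_assert, if constexpr and SFINAE.
template<typename T>
inline constexpr bool isAcceleratorSketch = std::is_base_of_v<ConceptAccSketch, T>;
```

Because the variable is `inline constexpr`, it can be used directly as a constant expression, for example `static_assert(isAcceleratorSketch<MyAcc>);`.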
|
inline constexpr
True if TDev is a device, i.e. if it implements the ConceptDev concept.
Definition at line 64 of file Traits.hpp.
|
inline constexpr
Definition at line 267 of file Traits.hpp.
|
inline constexpr
Definition at line 317 of file Traits.hpp.
|
inline constexpr
True if TAcc is an accelerator that supports multiple threads per block, false otherwise.
Definition at line 86 of file Traits.hpp.
|
inline constexpr
True if TPlatform is a platform, i.e. if it implements the ConceptPlatform concept.
Definition at line 23 of file Traits.hpp.
|
inline constexpr
True if TQueue is a queue, i.e. if it implements the ConceptQueue concept.
Definition at line 20 of file Traits.hpp.
|
inline constexpr
True if TAcc is an accelerator that supports only a single thread per block, false otherwise.
Definition at line 82 of file Traits.hpp.
|
inline constexpr
|
inline constexpr