alpaka
Abstraction Library for Parallel Kernel Acceleration
The alpaka accelerator library. More...
Namespaces | |
bt | |
concepts | |
core | |
cpu | |
cuda | |
detail | |
gb | |
generic | |
hierarchy | |
Defines the parallelism hierarchy levels of alpaka. | |
internal | |
math | |
memory_scope | |
meta | |
omp | |
origin | |
Defines the origins available for getting extent and indices of kernel executions. | |
property | |
Properties to define queue behavior. | |
rand | |
test | |
The test specifics. | |
trait | |
The accelerator traits. | |
uniform_cuda_hip | |
unit | |
Defines the units available for getting extent and indices of kernel executions. | |
warp | |
Classes | |
class | AccCpuOmp2Blocks |
The CPU OpenMP 2.0 block accelerator. More... | |
class | AccCpuOmp2Threads |
The CPU OpenMP 2.0 thread accelerator. More... | |
class | AccCpuSerial |
The CPU serial accelerator. More... | |
class | AccCpuThreads |
The CPU threads accelerator. More... | |
struct | AccDevProps |
The acceleration properties on a device. More... | |
class | AccGpuUniformCudaHipRt |
The GPU CUDA/HIP accelerator. More... | |
class | AllocCpuAligned |
The CPU boost aligned allocator. More... | |
class | AllocCpuNew |
The CPU new allocator. More... | |
struct | ApiCudaRt |
struct | AtomicAdd |
The addition function object. More... | |
struct | AtomicAnd |
The and function object. More... | |
class | AtomicAtomicRef |
The atomic ops based on atomic_ref for CPU accelerators. More... | |
struct | AtomicCas |
The compare and swap function object. More... | |
struct | AtomicDec |
The decrement function object. More... | |
struct | AtomicExch |
The exchange function object. More... | |
struct | AtomicInc |
The increment function object. More... | |
struct | AtomicMax |
The maximum function object. More... | |
struct | AtomicMin |
The minimum function object. More... | |
class | AtomicNoOp |
The NoOp atomic ops. More... | |
class | AtomicOmpBuiltIn |
The OpenMP accelerators atomic ops. More... | |
struct | AtomicOr |
The or function object. More... | |
struct | AtomicSub |
The subtraction function object. More... | |
class | AtomicUniformCudaHipBuiltIn |
The GPU CUDA/HIP accelerator atomic ops. More... | |
struct | AtomicXor |
The exclusive or function object. More... | |
struct | BlockAnd |
The logical and function object. More... | |
struct | BlockCount |
The counting function object. More... | |
struct | BlockOr |
The logical or function object. More... | |
class | BlockSharedMemDynMember |
Dynamic block shared memory provider that uses a fixed-size member array to allocate memory on the stack or in shared memory. More... | |
class | BlockSharedMemDynUniformCudaHipBuiltIn |
The GPU CUDA/HIP block shared memory allocator. More... | |
class | BlockSharedMemStMember |
Static block shared memory provider using a pointer to externally allocated fixed-size memory, likely provided by BlockSharedMemDynMember. More... | |
class | BlockSharedMemStMemberMasterSync |
class | BlockSharedMemStUniformCudaHipBuiltIn |
The GPU CUDA/HIP block shared memory allocator. More... | |
class | BlockSyncBarrierOmp |
The OpenMP barrier block synchronization. More... | |
class | BlockSyncBarrierThread |
The thread id map barrier block synchronization. More... | |
class | BlockSyncNoOp |
The no op block synchronization. More... | |
class | BlockSyncUniformCudaHipBuiltIn |
The GPU CUDA/HIP block synchronization. More... | |
class | BufCpu |
The CPU memory buffer. More... | |
struct | BufUniformCudaHipRt |
The CUDA/HIP memory buffer. More... | |
class | Complex |
Implementation of a complex number usable on host and device. More... | |
struct | ConceptAcc |
struct | ConceptAtomicBlocks |
struct | ConceptAtomicGrids |
struct | ConceptAtomicThreads |
struct | ConceptBlockSharedDyn |
struct | ConceptBlockSharedSt |
struct | ConceptBlockSync |
struct | ConceptCurrentThreadWaitFor |
struct | ConceptIdxBt |
struct | ConceptIdxGb |
struct | ConceptIntrinsic |
struct | ConceptMemAlloc |
struct | ConceptMemFence |
struct | ConceptPlatform |
struct | ConceptWorkDiv |
class | DevCpu |
The CPU device handle. More... | |
class | DevUniformCudaHipRt |
The CUDA/HIP RT device handle. More... | |
class | EventGenericThreads |
The CPU device event. More... | |
class | EventUniformCudaHipRt |
The CUDA/HIP RT device event. More... | |
class | IGenericThreadsQueue |
The CPU queue interface. More... | |
class | IntrinsicCpu |
The CPU intrinsic. More... | |
class | IntrinsicFallback |
The Fallback intrinsic. More... | |
class | IntrinsicUniformCudaHipBuiltIn |
The GPU CUDA/HIP intrinsic. More... | |
struct | IsKernelArgumentTriviallyCopyable |
Check if a type used as kernel argument is trivially copyable. More... | |
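The check described above can be illustrated with a small standalone sketch. It does not use alpaka: `IsArgTriviallyCopyable` and `Params` are hypothetical names, and the assumption is that such a trait defaults to `std::is_trivially_copyable` and can be specialized by users for types they know are safe to copy byte-wise.

```cpp
#include <string>
#include <type_traits>

namespace sketch
{
    // Hypothetical trait, not alpaka's definition: defaults to the standard check.
    // A user could specialize it for a type that is known to be safe to copy.
    template<typename T>
    struct IsArgTriviallyCopyable : std::is_trivially_copyable<T>
    {
    };

    // Plain aggregate of scalars: can be copied byte-wise to a device.
    struct Params
    {
        float scale;
        int count;
    };

    static_assert(IsArgTriviallyCopyable<Params>::value, "scalar aggregates pass the check");
    static_assert(!IsArgTriviallyCopyable<std::string>::value, "std::string owns memory and fails the check");
} // namespace sketch
```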
class | MemFenceCpu |
The default CPU memory fence. More... | |
class | MemFenceCpuSerial |
The serial CPU memory fence. More... | |
class | MemFenceOmp2Blocks |
The CPU OpenMP 2.0 block memory fence. More... | |
class | MemFenceOmp2Threads |
The CPU OpenMP 2.0 thread memory fence. More... | |
class | MemFenceUniformCudaHipBuiltIn |
The GPU CUDA/HIP memory fence. More... | |
class | MemSetKernel |
The N-dimensional memory set kernel for any device. More... | |
struct | PlatformCpu |
The CPU device platform. More... | |
struct | PlatformUniformCudaHipRt |
The CUDA/HIP RT platform. More... | |
struct | QueueCpuOmp2Collective |
The CPU collective device queue. More... | |
class | QueueGenericThreadsBlocking |
The CPU device queue. More... | |
class | QueueGenericThreadsNonBlocking |
The CPU device queue. More... | |
struct | remove_restrict |
Removes restrict from a type. More... | |
struct | remove_restrict< T *__restrict__ > |
struct | TagCpuOmp2Blocks |
struct | TagCpuOmp2Threads |
struct | TagCpuSerial |
struct | TagCpuSycl |
struct | TagCpuTbbBlocks |
struct | TagCpuThreads |
struct | TagFpgaSyclIntel |
struct | TagGenericSycl |
struct | TagGpuCudaRt |
struct | TagGpuHipRt |
struct | TagGpuSyclIntel |
class | TaskKernelCpuOmp2Blocks |
The CPU OpenMP 2.0 block accelerator execution task. More... | |
class | TaskKernelCpuOmp2Threads |
The CPU OpenMP 2.0 thread accelerator execution task. More... | |
class | TaskKernelCpuSerial |
The CPU serial execution task implementation. More... | |
class | TaskKernelCpuThreads |
The CPU threads execution task. More... | |
class | TaskKernelGpuUniformCudaHipRt |
The GPU CUDA/HIP accelerator execution task. More... | |
class | Vec |
A n-dimensional vector. More... | |
struct | ViewConst |
A non-modifiable wrapper around a view. This view acts as the wrapped view, but the underlying data is only exposed const-qualified. More... | |
struct | ViewPlainPtr |
The memory view to wrap plain pointers. More... | |
class | ViewSubView |
A sub-view to a view. More... | |
class | WorkDivMembers |
A basic class holding the work division as grid block extent, block thread extent, and thread element extent. More... | |
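How the three extents of a work division relate can be shown with a one-dimensional standalone sketch. It does not use alpaka: `WorkDiv1D`, `coveredElems`, and `makeWorkDiv` are hypothetical names, and rounding the block count up is an assumption about how one might cover a problem size.

```cpp
#include <cstddef>

namespace sketch
{
    // Hypothetical 1D stand-in for a work division (not alpaka's WorkDivMembers).
    struct WorkDiv1D
    {
        std::size_t gridBlockExtent;   // number of blocks in the grid
        std::size_t blockThreadExtent; // number of threads per block
        std::size_t threadElemExtent;  // number of elements per thread
    };

    // Total number of elements the work division can process.
    constexpr std::size_t coveredElems(WorkDiv1D const& wd)
    {
        return wd.gridBlockExtent * wd.blockThreadExtent * wd.threadElemExtent;
    }

    // Picks enough blocks to cover numElems, rounding up.
    constexpr WorkDiv1D makeWorkDiv(std::size_t numElems, std::size_t threadsPerBlock, std::size_t elemsPerThread)
    {
        std::size_t const elemsPerBlock = threadsPerBlock * elemsPerThread;
        std::size_t const blocks = (numElems + elemsPerBlock - 1) / elemsPerBlock;
        return WorkDiv1D{blocks, threadsPerBlock, elemsPerThread};
    }
} // namespace sketch
```

Because the block count is rounded up, the covered range can exceed the problem size, so a kernel written against such a division would still need a bounds check.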
class | WorkDivUniformCudaHipBuiltIn |
The GPU CUDA/HIP accelerator work division. More... | |
Typedefs | |
template<typename T > | |
using | Acc = typename trait::AccType< T >::type |
The accelerator type trait alias template to remove the ::type. More... | |
template<typename TDim , typename TIdx > | |
using | AccGpuCudaRt = AccGpuUniformCudaHipRt< ApiCudaRt, TDim, TIdx > |
template<typename TAcc > | |
using | AccToTag = typename trait::AccToTag< TAcc >::type |
Maps an accelerator type to a tag type. More... | |
using | AtomicCpu = AtomicStdLibLock< 16 > |
template<typename TGridAtomic , typename TBlockAtomic , typename TThreadAtomic > | |
using | AtomicHierarchy = alpaka::meta::InheritFromList< alpaka::meta::Unique< std::tuple< TGridAtomic, TBlockAtomic, TThreadAtomic, concepts::Implements< ConceptAtomicGrids, TGridAtomic >, concepts::Implements< ConceptAtomicBlocks, TBlockAtomic >, concepts::Implements< ConceptAtomicThreads, TThreadAtomic > >> > |
Builds a single class that inherits from different atomic implementations. More... | |
template<typename THierarchy > | |
using | AtomicHierarchyConcept = typename detail::AtomicHierarchyConceptType< THierarchy >::type |
template<typename TDev , typename TElem , typename TDim , typename TIdx > | |
using | Buf = typename trait::BufType< alpaka::Dev< TDev >, TElem, TDim, TIdx >::type |
The memory buffer type trait alias template to remove the ::type. More... | |
template<typename TElem , typename TDim , typename TIdx > | |
using | BufCudaRt = BufUniformCudaHipRt< ApiCudaRt, TElem, TDim, TIdx > |
template<typename T > | |
using | Dev = typename trait::DevType< T >::type |
The device type trait alias template to remove the ::type. More... | |
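Several aliases in this list follow the same pattern: a trait struct in `trait` exposes a nested `type`, and an alias template hides the `typename ...::type` boilerplate. The standalone sketch below shows that pattern only. `MyDev` and `MyQueue` are hypothetical stand-ins, and the specializations are assumptions made for illustration, not alpaka's own.

```cpp
#include <type_traits>

namespace sketch
{
    // Hypothetical stand-ins; these are not alpaka types.
    struct MyDev
    {
    };
    struct MyQueue
    {
    };

    namespace trait
    {
        // Primary template left undefined, so unsupported types fail to compile.
        template<typename T>
        struct DevType;

        // A device is its own device type.
        template<>
        struct DevType<MyDev>
        {
            using type = MyDev;
        };

        // A queue reports the device type it runs on.
        template<>
        struct DevType<MyQueue>
        {
            using type = MyDev;
        };
    } // namespace trait

    // Alias template that removes the need to write `typename ...::type`.
    template<typename T>
    using Dev = typename trait::DevType<T>::type;

    static_assert(std::is_same_v<Dev<MyQueue>, MyDev>, "the queue maps to its device type");
} // namespace sketch
```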
using | DevCudaRt = DevUniformCudaHipRt< ApiCudaRt > |
The CUDA RT device handle. More... | |
template<typename TAcc , typename T > | |
using | DevGlobal = typename detail::DevGlobalTrait< typename alpaka::trait::AccToTag< TAcc >::type, T >::Type |
template<typename T > | |
using | Dim = typename trait::DimType< T >::type |
The dimension type trait alias template to remove the ::type. More... | |
template<std::size_t N> | |
using | DimInt = std::integral_constant< std::size_t, N > |
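Since `DimInt` is an alias for `std::integral_constant`, the dimensionality is available as a compile-time value. The standalone sketch below repeats the alias so it compiles without alpaka; `dimCount` is a hypothetical helper introduced only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch
{
    // Same definition as listed above, repeated so the example is standalone.
    template<std::size_t N>
    using DimInt = std::integral_constant<std::size_t, N>;

    // Hypothetical helper: the number of dimensions encoded by a dimension type.
    template<typename TDim>
    inline constexpr std::size_t dimCount = TDim::value;

    static_assert(dimCount<DimInt<3>> == 3, "DimInt<3> encodes three dimensions");
    static_assert(std::is_same_v<DimInt<2>::value_type, std::size_t>, "the value type is std::size_t");
} // namespace sketch
```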
template<typename TView > | |
using | Elem = std::remove_volatile_t< typename trait::ElemType< TView >::type > |
The element type trait alias template to remove the ::type. More... | |
template<typename T > | |
using | Event = typename trait::EventType< T >::type |
The event type trait alias template to remove the ::type. More... | |
using | EventCpu = EventGenericThreads< DevCpu > |
using | EventCudaRt = EventUniformCudaHipRt< ApiCudaRt > |
The CUDA RT device event. More... | |
template<class TDim , class TIdx > | |
using | ExampleDefaultAcc = alpaka::AccGpuCudaRt< TDim, TIdx > |
Alias for the default accelerator used by examples. From a list of all accelerators the first one which is enabled is chosen. AccCpuSerial is selected last. More... | |
template<typename T > | |
using | Idx = typename trait::IdxType< T >::type |
template<typename TImpl > | |
using | NativeHandle = decltype(getNativeHandle(std::declval< TImpl >())) |
Alias to the type of the native handle. More... | |
template<typename T > | |
using | Platform = typename trait::PlatformType< T >::type |
The platform type trait alias template to remove the ::type. More... | |
using | PlatformCudaRt = PlatformUniformCudaHipRt< ApiCudaRt > |
The CUDA RT platform. More... | |
template<typename TEnv , typename TProperty > | |
using | Queue = typename trait::QueueType< TEnv, TProperty >::type |
Queue based on the environment and a property. More... | |
using | QueueCpuBlocking = QueueGenericThreadsBlocking< DevCpu > |
using | QueueCpuNonBlocking = QueueGenericThreadsNonBlocking< DevCpu > |
using | QueueCudaRtBlocking = QueueUniformCudaHipRtBlocking< ApiCudaRt > |
The CUDA RT blocking queue. More... | |
using | QueueCudaRtNonBlocking = QueueUniformCudaHipRtNonBlocking< ApiCudaRt > |
The CUDA RT non-blocking queue. More... | |
template<typename TApi > | |
using | QueueUniformCudaHipRtBlocking = uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, true > |
The CUDA/HIP RT blocking queue. More... | |
template<typename TApi > | |
using | QueueUniformCudaHipRtNonBlocking = uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, false > |
The CUDA/HIP RT non-blocking queue. More... | |
template<typename T > | |
using | remove_restrict_t = typename remove_restrict< T >::type |
Helper to remove restrict from a type. More... | |
template<typename TTag , typename TDim , typename TIdx > | |
using | TagToAcc = typename trait::TagToAcc< TTag, TDim, TIdx >::type |
Maps a tag type to an accelerator type. More... | |
template<typename TAcc , typename TDim , typename TIdx , typename TKernelFnObj , typename... TArgs> | |
using | TaskKernelGpuCudaRt = TaskKernelGpuUniformCudaHipRt< ApiCudaRt, TAcc, TDim, TIdx, TKernelFnObj, TArgs... > |
Enumerations | |
enum class | GridBlockExtentSubDivRestrictions { EqualExtent , CloseToEqualExtent , Unrestricted } |
The grid block extent subdivision restrictions. More... | |
Functions | |
template<typename TElem , typename TIdx , typename TExtent , typename TQueue > | |
ALPAKA_FN_HOST auto | allocAsyncBuf (TQueue queue, TExtent const &extent=TExtent()) |
Allocates stream-ordered memory on the given device. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TQueue > | |
ALPAKA_FN_HOST auto | allocAsyncBufIfSupported (TQueue queue, TExtent const &extent=TExtent()) |
If supported, allocates stream-ordered memory on the given queue and the associated device. Otherwise, allocates regular memory on the device associated with the queue. Please note that stream-ordered and regular memory have different semantics: this function is provided for convenience in cases where the difference is not relevant and the stream-ordered memory is used only as a performance optimisation. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TDev > | |
ALPAKA_FN_HOST auto | allocBuf (TDev const &dev, TExtent const &extent=TExtent()) |
Allocates memory on the given device. More... | |
template<typename TPlatform , typename TElem , typename TIdx , typename TExtent > | |
ALPAKA_FN_HOST auto | allocMappedBuf (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent()) |
Allocates pinned/mapped host memory, accessible by all devices in the given platform. More... | |
template<typename TElem , typename TIdx , typename TExtent , typename TPlatform > | |
ALPAKA_FN_HOST auto | allocMappedBufIfSupported (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent()) |
If supported, allocates pinned/mapped host memory, accessible by all devices in the given platform. Otherwise, allocates regular host memory. Please note that pinned/mapped and regular memory may have different semantics: this function is provided for convenience in the cases where the difference is not relevant, and the pinned/mapped memory is only used as a performance optimisation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicAdd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic add operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicAnd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic and operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicCas (TAtomic const &atomic, T *const addr, T const &compare, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic compare-and-swap operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicDec (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic decrement operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicExch (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic exchange operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicInc (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic increment operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicMax (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic max operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicMin (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic min operation. More... | |
template<typename TOp , typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicOp (TAtomic const &atomic, T *const addr, T const &compare, T const &value, THierarchy const &=THierarchy()) -> T |
Executes the given operation atomically. More... | |
template<typename TOp , typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicOp (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &=THierarchy()) -> T |
Executes the given operation atomically. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicOr (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic or operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicSub (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic sub operation. More... | |
template<typename TAtomic , typename T , typename THierarchy = hierarchy::Grids> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | atomicXor (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic xor operation. More... | |
template<typename TVal , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | castVec (TVec const &vec) |
template<typename TVecL , typename TVecR > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | concatVec (TVecL const &vecL, TVecR const &vecR) |
template<typename TView , typename TExtent , typename TOffsets > | |
auto | createSubView (TView &view, TExtent const &extent, TOffsets const &offset=TExtent()) |
Creates a sub view to an existing view. More... | |
template<typename TAcc , typename TWorkDiv , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_FN_HOST auto | createTaskKernel (TWorkDiv const &workDiv, TKernelFnObj const &kernelFnObj, TArgs &&... args) |
Creates a kernel execution task. More... | |
template<typename TExtent , typename TViewSrc , typename TViewDstFwd > | |
ALPAKA_FN_HOST auto | createTaskMemcpy (TViewDstFwd &&viewDst, TViewSrc const &viewSrc, TExtent const &extent) |
Creates a memory copy task. More... | |
template<typename TExtent , typename TViewFwd > | |
ALPAKA_FN_HOST auto | createTaskMemset (TViewFwd &&view, std::uint8_t const &byte, TExtent const &extent) |
Creates a memory set task. More... | |
template<typename TDev , typename TContainer > | |
auto | createView (TDev const &dev, TContainer &con) |
Creates a view to a contiguous container of device-accessible memory. More... | |
template<typename TDev , typename TContainer , typename TExtent > | |
auto | createView (TDev const &dev, TContainer &con, TExtent const &extent) |
Creates a view to a contiguous container of device-accessible memory. More... | |
template<typename TDev , typename TElem , typename TExtent > | |
auto | createView (TDev const &dev, TElem *pMem, TExtent const &extent) |
Creates a view to a device pointer. More... | |
template<typename TDev , typename TElem , typename TExtent , typename TPitch > | |
auto | createView (TDev const &dev, TElem *pMem, TExtent const &extent, TPitch pitch) |
Creates a view to a device pointer. More... | |
template<typename T , std::size_t TuniqueId, typename TBlockSharedMemSt > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | declareSharedVar (TBlockSharedMemSt const &blockSharedMemSt) -> T & |
Declare a block shared variable. More... | |
template<typename TDim , typename TVal , typename... Vecs, typename = std::enable_if_t<(std::is_same_v<Vec<TDim, TVal>, Vecs> && ...)>> | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | elementwise_max (Vec< TDim, TVal > const &p, Vecs const &... qs) -> Vec< TDim, TVal > |
template<typename TDim , typename TVal , typename... Vecs, typename = std::enable_if_t<(std::is_same_v<Vec<TDim, TVal>, Vecs> && ...)>> | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | elementwise_min (Vec< TDim, TVal > const &p, Vecs const &... qs) -> Vec< TDim, TVal > |
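The listing gives no description for `elementwise_max` and `elementwise_min`. Assuming the names mean a component-wise maximum and minimum, the idea can be sketched without alpaka on `std::array`. The two-argument functions `elementwiseMax` and `elementwiseMin` below are hypothetical and only illustrate that reading.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    // Component-wise maximum of two equally sized vectors (hypothetical helper).
    template<typename T, std::size_t N>
    constexpr std::array<T, N> elementwiseMax(std::array<T, N> const& p, std::array<T, N> const& q)
    {
        std::array<T, N> result{};
        for(std::size_t i = 0; i < N; ++i)
            result[i] = p[i] < q[i] ? q[i] : p[i];
        return result;
    }

    // Component-wise minimum of two equally sized vectors (hypothetical helper).
    template<typename T, std::size_t N>
    constexpr std::array<T, N> elementwiseMin(std::array<T, N> const& p, std::array<T, N> const& q)
    {
        std::array<T, N> result{};
        for(std::size_t i = 0; i < N; ++i)
            result[i] = q[i] < p[i] ? q[i] : p[i];
        return result;
    }
} // namespace sketch
```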
template<typename TQueue > | |
ALPAKA_FN_HOST auto | empty (TQueue const &queue) -> bool |
Tests if the queue is empty (all ops in the given queue have been completed). More... | |
template<typename TQueue , typename TTask > | |
ALPAKA_FN_HOST auto | enqueue (TQueue &queue, TTask &&task) -> void |
Queues the given task in the given queue. More... | |
template<typename TAcc , typename TQueue , typename TWorkDiv , typename TKernelFnObj , typename... TArgs> | |
ALPAKA_FN_HOST auto | exec (TQueue &queue, TWorkDiv const &workDiv, TKernelFnObj const &kernelFnObj, TArgs &&... args) -> void |
Executes the given kernel in the given queue. More... | |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | ffs (TIntrinsic const &intrinsic, std::int32_t value) -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 32-bit value. Returns 0 for input value 0. More... | |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | ffs (TIntrinsic const &intrinsic, std::int64_t value) -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 64-bit value. Returns 0 for input value 0. More... | |
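The semantics described for both `ffs` overloads can be checked against a portable reference. The standalone sketch below does not call alpaka or any compiler intrinsic; `ffs32` and `ffs64` are hypothetical names for a plain loop that follows the description above.

```cpp
#include <cstdint>

namespace sketch
{
    // 1-based position of the least significant set bit, or 0 if no bit is set.
    constexpr std::int32_t ffs64(std::int64_t value)
    {
        auto bits = static_cast<std::uint64_t>(value);
        std::int32_t pos = 0;
        while(bits != 0)
        {
            ++pos;
            if((bits & 1u) != 0)
                return pos;
            bits >>= 1;
        }
        return 0;
    }

    // 32-bit variant: converts through uint32_t so negative inputs are not sign-extended.
    constexpr std::int32_t ffs32(std::int32_t value)
    {
        return ffs64(static_cast<std::int64_t>(static_cast<std::uint32_t>(value)));
    }
} // namespace sketch
```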
template<typename TAlloc , typename T > | |
ALPAKA_FN_HOST auto | free (TAlloc const &alloc, T const *const ptr) -> void |
Frees the memory identified by the given pointer. More... | |
template<typename TBlockSharedMemSt > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | freeSharedVars (TBlockSharedMemSt &blockSharedMemSt) -> void |
Frees all memory used by block shared variables. More... | |
template<typename TAcc , typename TDev > | |
ALPAKA_FN_HOST auto | getAccDevProps (TDev const &dev) -> AccDevProps< Dim< TAcc >, Idx< TAcc >> |
template<typename TAcc > | |
ALPAKA_FN_HOST auto | getAccName () -> std::string |
template<typename TAcc , typename TKernelFnObj , typename TDim , typename... TArgs> | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getBlockSharedMemDynSizeBytes (TKernelFnObj const &kernelFnObj, Vec< TDim, Idx< TAcc >> const &blockThreadExtent, Vec< TDim, Idx< TAcc >> const &threadElemExtent, TArgs const &... args) -> std::size_t |
template<typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getDepth (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename T > | |
ALPAKA_FN_HOST auto | getDev (T const &t) |
template<typename TPlatform > | |
ALPAKA_FN_HOST auto | getDevByIdx (TPlatform const &platform, std::size_t const &devIdx) -> Dev< TPlatform > |
template<typename TPlatform > | |
ALPAKA_FN_HOST auto | getDevCount (TPlatform const &platform) |
template<typename TPlatform > | |
ALPAKA_FN_HOST auto | getDevs (TPlatform const &platform) -> std::vector< Dev< TPlatform >> |
template<typename T , typename TBlockSharedMemDyn > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | getDynSharedMem (TBlockSharedMemDyn const &blockSharedMemDyn) -> T * |
Get block shared dynamic memory. More... | |
template<std::size_t Tidx, typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getExtent (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getExtentProduct (T const &object) -> Idx< T > |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getExtents (T const &object) -> Vec< Dim< T >, Idx< T >> |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getExtentVec (T const &object={}) -> Vec< Dim< T >, Idx< T >> |
template<typename TDim , typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getExtentVecEnd (T const &object={}) -> Vec< TDim, Idx< T >> |
template<typename TDev > | |
ALPAKA_FN_HOST auto | getFreeMemBytes (TDev const &dev) -> std::size_t |
template<typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getHeight (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename TOrigin , typename TUnit , typename TIdx , typename TWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdx (TIdx const &idx, TWorkDiv const &workDiv) -> Vec< Dim< TWorkDiv >, Idx< TIdx >> |
Get the indices requested. More... | |
template<typename TOrigin , typename TUnit , typename TIdxWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdx (TIdxWorkDiv const &idxWorkDiv) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the indices requested. More... | |
template<typename TIdxWorkDiv , typename TGridThreadIdx , typename TThreadElemExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdxThreadFirstElem ([[maybe_unused]] TIdxWorkDiv const &idxWorkDiv, TGridThreadIdx const &gridThreadIdx, TThreadElemExtent const &threadElemExtent) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the index of the first element this thread computes. More... | |
template<typename TIdxWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdxThreadFirstElem (TIdxWorkDiv const &idxWorkDiv) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the index of the first element this thread computes. More... | |
template<typename TIdxWorkDiv , typename TGridThreadIdx > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getIdxThreadFirstElem (TIdxWorkDiv const &idxWorkDiv, TGridThreadIdx const &gridThreadIdx) -> Vec< Dim< TIdxWorkDiv >, Idx< TIdxWorkDiv >> |
Get the index of the first element this thread computes. More... | |
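One plausible reading of "first element this thread computes" is the grid-wide thread index multiplied component-wise by the per-thread element extent. The standalone sketch below shows that arithmetic under this assumption. It does not use alpaka, and `Index` and `firstElemOfThread` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    // Hypothetical N-dimensional index type.
    template<std::size_t N>
    using Index = std::array<std::size_t, N>;

    // Index of the first element handled by a thread, assuming each thread
    // owns a contiguous tile of threadElemExtent elements per dimension.
    template<std::size_t N>
    constexpr Index<N> firstElemOfThread(Index<N> const& gridThreadIdx, Index<N> const& threadElemExtent)
    {
        Index<N> first{};
        for(std::size_t i = 0; i < N; ++i)
            first[i] = gridThreadIdx[i] * threadElemExtent[i];
        return first;
    }
} // namespace sketch
```

For example, with a thread at grid index (2, 3) and a tile of (1, 4) elements per thread, the first element under this assumption is (2, 12).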
template<typename TDev > | |
ALPAKA_FN_HOST auto | getMemBytes (TDev const &dev) -> std::size_t |
template<typename TDev > | |
ALPAKA_FN_HOST auto | getName (TDev const &dev) -> std::string |
template<typename TImpl > | |
ALPAKA_FN_HOST auto | getNativeHandle (TImpl const &impl) |
Gets the native handle of the alpaka object. Returns the object's handle if it has one; otherwise a compile-time error is generated. More... | |
template<std::size_t Tidx, typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffset (TOffsets const &offsets) -> Idx< TOffsets > |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsets (T const &object) -> Vec< Dim< T >, Idx< T >> |
template<typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getOffsetVec (T const &object={}) -> Vec< Dim< T >, Idx< T >> |
template<typename TDim , typename T > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | getOffsetVecEnd (T const &object={}) -> Vec< TDim, Idx< T >> |
template<typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsetX (TOffsets const &offsets=TOffsets()) -> Idx< TOffsets > |
template<typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsetY (TOffsets const &offsets=TOffsets()) -> Idx< TOffsets > |
template<typename TOffsets > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getOffsetZ (TOffsets const &offsets=TOffsets()) -> Idx< TOffsets > |
template<typename TAcc , typename TKernelFnObj , typename TDim , typename... TArgs> | |
ALPAKA_FN_HOST auto | getOmpSchedule (TKernelFnObj const &kernelFnObj, Vec< TDim, Idx< TAcc >> const &blockThreadExtent, Vec< TDim, Idx< TAcc >> const &threadElemExtent, TArgs const &... args) |
template<std::size_t Tidx, typename TView > | |
ALPAKA_FN_HOST auto | getPitchBytes (TView const &view) -> Idx< TView > |
template<typename TView > | |
auto | getPitchBytesVec (TView const &view) -> Vec< Dim< TView >, Idx< TView >> |
template<typename TDim , typename TView > | |
ALPAKA_FN_HOST auto | getPitchBytesVecEnd (TView const &view=TView()) -> Vec< TDim, Idx< TView >> |
template<typename TView > | |
ALPAKA_FN_HOST auto | getPitchesInBytes (TView const &view) -> Vec< Dim< TView >, Idx< TView >> |
template<typename TDev > | |
constexpr ALPAKA_FN_HOST auto | getPreferredWarpSize (TDev const &dev) -> std::size_t |
template<typename TView , typename TDev > | |
ALPAKA_FN_HOST auto | getPtrDev (TView &view, TDev const &dev) -> Elem< TView > * |
Gets the pointer to the view on the given device. More... | |
template<typename TView , typename TDev > | |
ALPAKA_FN_HOST auto | getPtrDev (TView const &view, TDev const &dev) -> Elem< TView > const * |
Gets the pointer to the view on the given device. More... | |
template<typename TView > | |
ALPAKA_FN_HOST auto | getPtrNative (TView &view) -> Elem< TView > * |
Gets the native pointer of the memory view. More... | |
template<typename TView > | |
ALPAKA_FN_HOST auto | getPtrNative (TView const &view) -> Elem< TView > const * |
Gets the native pointer of the memory view. More... | |
template<typename TAcc , typename TDev , typename TGridElemExtent = Vec<Dim<TAcc>, Idx<TAcc>>, typename TThreadElemExtent = Vec<Dim<TAcc>, Idx<TAcc>>> | |
ALPAKA_FN_HOST auto | getValidWorkDiv ([[maybe_unused]] TDev const &dev, [[maybe_unused]] TGridElemExtent const &gridElemExtent=Vec< Dim< TAcc >, Idx< TAcc >>::ones(), [[maybe_unused]] TThreadElemExtent const &threadElemExtents=Vec< Dim< TAcc >, Idx< TAcc >>::ones(), [[maybe_unused]] bool blockThreadMustDivideGridThreadExtent=true, [[maybe_unused]] GridBlockExtentSubDivRestrictions gridBlockExtentSubDivRestrictions=GridBlockExtentSubDivRestrictions::Unrestricted) -> WorkDivMembers< Dim< TGridElemExtent >, Idx< TGridElemExtent >> |
template<typename TDev > | |
ALPAKA_FN_HOST auto | getWarpSizes (TDev const &dev) -> std::vector< std::size_t > |
template<typename TExtent > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getWidth (TExtent const &extent=TExtent()) -> Idx< TExtent > |
template<typename TOrigin , typename TUnit , typename TWorkDiv > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | getWorkDiv (TWorkDiv const &workDiv) -> Vec< Dim< TWorkDiv >, Idx< TWorkDiv >> |
Get the extent requested. More... | |
template<typename TEvent > | |
ALPAKA_FN_HOST auto | isComplete (TEvent const &event) -> bool |
Tests if the given event has already been completed. More... | |
template<typename T > | |
void | isSupportedByAtomicAtomicRef () |
template<typename TDim , typename TIdx > | |
ALPAKA_FN_HOST auto | isValidAccDevProps (AccDevProps< TDim, TIdx > const &accDevProps) -> bool |
template<typename TDim , typename TIdx , typename TWorkDiv > | |
ALPAKA_FN_HOST auto | isValidWorkDiv (AccDevProps< TDim, TIdx > const &accDevProps, TWorkDiv const &workDiv) -> bool |
template<typename TAcc , typename TDev , typename TWorkDiv > | |
ALPAKA_FN_HOST auto | isValidWorkDiv (TDev const &dev, TWorkDiv const &workDiv) -> bool |
template<typename T , typename TAlloc > | |
ALPAKA_FN_HOST auto | malloc (TAlloc const &alloc, std::size_t const &sizeElems) -> T * |
template<std::size_t TDimOut, std::size_t TDimIn, std::size_t TDimExtents, typename TElem > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | mapIdx (Vec< DimInt< TDimIn >, TElem > const &in, Vec< DimInt< TDimExtents >, TElem > const &extent) -> Vec< DimInt< TDimOut >, TElem > |
Maps an N-dimensional index to an N-dimensional position. At least one dimension must always be 1 or zero. More... | |
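The mapping between a linear index and a multi-dimensional position can be sketched for the 1D and 2D case. The standalone code below does not use alpaka. It assumes a row-major layout in which the last component varies fastest, and `mapIdx1dTo2d` and `mapIdx2dTo1d` are hypothetical names.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
    using Index2 = std::array<std::size_t, 2>;

    // Linear index -> (row, column), assuming row-major order with extent = (rows, columns).
    constexpr Index2 mapIdx1dTo2d(std::size_t linear, Index2 const& extent)
    {
        return Index2{linear / extent[1], linear % extent[1]};
    }

    // (row, column) -> linear index, the inverse of the mapping above.
    constexpr std::size_t mapIdx2dTo1d(Index2 const& idx, Index2 const& extent)
    {
        return idx[0] * extent[1] + idx[1];
    }
} // namespace sketch
```

With an extent of 3 rows by 4 columns, linear index 7 lands at row 1, column 3 under this layout, and mapping that position back gives 7 again.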
template<std::size_t TDimOut, std::size_t TDimIn, std::size_t TidxDimPitch, typename TElem > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | mapIdxPitchBytes (Vec< DimInt< TDimIn >, TElem > const &in, Vec< DimInt< TidxDimPitch >, TElem > const &pitches) -> Vec< DimInt< TDimOut >, TElem > |
Maps an index of dimensionality TDimIn to a position of dimensionality TDimOut based on the pitches of a view without padding or a byte view. At least one of the two dimensionalities must be 1 or zero. More... | |
template<typename TMemFence , typename TMemScope > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | mem_fence (TMemFence const &fence, TMemScope const &scope) -> void |
Issues memory fence instructions. More... | |
template<typename TTag , typename TViewSrc , typename TTypeDst , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc) -> void |
template<typename TTag , typename TExtent , typename TViewSrc , typename TTypeDst , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc, TExtent const &extent) -> void |
template<typename TTag , typename TTypeSrc , typename TViewDstFwd , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc) -> void |
template<typename TTag , typename TExtent , typename TTypeSrc , typename TViewDstFwd , typename TQueue , typename std::enable_if_t< std::is_same_v< TTag, TagCpuOmp2Blocks >||std::is_same_v< TTag, TagCpuOmp2Threads >||std::is_same_v< TTag, TagCpuSerial >||std::is_same_v< TTag, TagCpuTbbBlocks >||std::is_same_v< TTag, TagCpuThreads >, int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc, TExtent const &extent) -> void |
template<typename TViewSrc , typename TViewDstFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, TViewSrc const &viewSrc) -> void |
Copies the entire memory of viewSrc to viewDst. Possibly copies between different memory spaces. More... | |
template<typename TExtent , typename TViewSrc , typename TViewDstFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memcpy (TQueue &queue, TViewDstFwd &&viewDst, TViewSrc const &viewSrc, TExtent const &extent) -> void |
Copies memory from a part of viewSrc to viewDst, described by extent. Possibly copies between different memory spaces. More... | |
template<typename TTag , typename TApi , bool TBlocking, typename TTypeDst , typename TViewSrc , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc) |
template<typename TTag , typename TApi , bool TBlocking, typename TTypeDst , typename TViewSrc , typename TExtent , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > &viewDst, TViewSrc const &viewSrc, TExtent extent) |
template<typename TTag , typename TApi , bool TBlocking, typename TViewDst , typename TTypeSrc , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, TViewDst &viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc) |
template<typename TTag , typename TApi , bool TBlocking, typename TViewDst , typename TTypeSrc , typename TExtent , typename std::enable_if_t<(std::is_same_v< TTag, TagGpuCudaRt > &&std::is_same_v< TApi, ApiCudaRt >), int > = 0> | |
ALPAKA_FN_HOST auto | memcpy (uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > &queue, TViewDst &viewDst, alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > &viewSrc, TExtent extent) |
template<typename TViewFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memset (TQueue &queue, TViewFwd &&view, std::uint8_t const &byte) -> void |
Sets each byte of the memory of the entire view to the given value. More... | |
template<typename TExtent , typename TViewFwd , typename TQueue > | |
ALPAKA_FN_HOST auto | memset (TQueue &queue, TViewFwd &&view, std::uint8_t const &byte, TExtent const &extent) -> void |
Sets the bytes of the memory of view, described by extent, to the given value. More... | |
template<typename T , typename TChar , typename TTraits > | |
std::basic_ostream< TChar, TTraits > & | operator<< (std::basic_ostream< TChar, TTraits > &os, Complex< T > const &x) |
Host-only output of a complex number. More... | |
template<typename T , typename TChar , typename TTraits > | |
std::basic_istream< TChar, TTraits > & | operator>> (std::basic_istream< TChar, TTraits > &is, Complex< T > const &x) |
Host-only input of a complex number. More... | |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | popcount (TIntrinsic const &intrinsic, std::uint32_t value) -> std::int32_t |
Returns the number of 1 bits in the given 32-bit value. More... | |
template<typename TIntrinsic > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | popcount (TIntrinsic const &intrinsic, std::uint64_t value) -> std::int32_t |
Returns the number of 1 bits in the given 64-bit value. More... | |
template<typename TView > | |
ALPAKA_FN_HOST auto | print (TView const &view, std::ostream &os, std::string const &elementSeparator=", ", std::string const &rowSeparator="\n", std::string const &rowPrefix="[", std::string const &rowSuffix="]") -> void |
Prints the content of the view to the given output stream. More... | |
template<typename TDev > | |
ALPAKA_FN_HOST auto | reset (TDev const &dev) -> void |
Resets the device. What this method does is dependent on the accelerator. More... | |
template<typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | reverseVec (TVec const &vec) |
template<typename TDim , typename TIdx > | |
ALPAKA_FN_HOST auto | subDivideGridElems (Vec< TDim, TIdx > const &gridElemExtent, Vec< TDim, TIdx > const &threadElemExtent, AccDevProps< TDim, TIdx > const &accDevProps, bool blockThreadMustDivideGridThreadExtent=true, GridBlockExtentSubDivRestrictions gridBlockExtentSubDivRestrictions=GridBlockExtentSubDivRestrictions::Unrestricted) -> WorkDivMembers< TDim, TIdx > |
Subdivides the given grid thread extent into blocks restricted by the maxima allowed. More... | |
template<typename TSubDim , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | subVecBegin (TVec const &vec) |
template<typename TSubDim , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | subVecEnd (TVec const &vec) |
template<typename TIndexSequence , typename TVec > | |
ALPAKA_NO_HOST_ACC_WARNING constexpr ALPAKA_FN_HOST_ACC auto | subVecFromIndices (TVec const &vec) |
Builds a new vector by selecting the elements of the source vector in the given order. Repeating and swizzling elements is allowed. More... | |
template<typename TBlockSync > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | syncBlockThreads (TBlockSync const &blockSync) -> void |
Synchronizes all threads within the current block (independently for all blocks). More... | |
template<typename TOp , typename TBlockSync > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto | syncBlockThreadsPredicate (TBlockSync const &blockSync, int predicate) -> int |
Synchronizes all threads within the current block (independently for all blocks), evaluates the predicate for all threads and returns the combination of all the results computed via TOp. More... | |
template<typename TDim , typename TVal > | |
constexpr ALPAKA_FN_HOST_ACC auto | toArray (Vec< TDim, TVal > const &v) -> std::array< TVal, TDim::value > |
Converts a Vec to a std::array. More... | |
template<typename TFirstIndex , typename... TRestIndices> | |
Vec (TFirstIndex &&, TRestIndices &&...) -> Vec< DimInt< 1+sizeof...(TRestIndices)>, std::decay_t< TFirstIndex >> | |
template<typename TView > | |
ViewConst (TView) -> ViewConst< std::decay_t< TView >> | |
template<typename TAwaited > | |
ALPAKA_FN_HOST auto | wait (TAwaited const &awaited) -> void |
Blocks the calling thread until the given awaited action completes. More... | |
template<typename TWaiter , typename TAwaited > | |
ALPAKA_FN_HOST auto | wait (TWaiter &waiter, TAwaited const &awaited) -> void |
The waiter waits for the given awaited action to complete. More... | |
template<typename TDim , typename TIdx > | |
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC | WorkDivMembers (alpaka::Vec< TDim, TIdx > const &gridBlockExtent, alpaka::Vec< TDim, TIdx > const &blockThreadExtent, alpaka::Vec< TDim, TIdx > const &elemExtent) -> WorkDivMembers< TDim, TIdx > |
Deduction guide for the constructor which can be called without explicit template type parameters. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator+ (Complex< T > const &val) |
Host-device arithmetic operations matching std::complex<T>. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator- (Complex< T > const &val) |
Unary minus. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator+ (Complex< T > const &lhs, Complex< T > const &rhs) |
Addition of two complex numbers. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator+ (Complex< T > const &lhs, T const &rhs) |
Addition of a complex and a real number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator+ (T const &lhs, Complex< T > const &rhs) |
Addition of a real and a complex number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator- (Complex< T > const &lhs, Complex< T > const &rhs) |
Subtraction of two complex numbers. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator- (Complex< T > const &lhs, T const &rhs) |
Subtraction of a complex and a real number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator- (T const &lhs, Complex< T > const &rhs) |
Subtraction of a real and a complex number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator* (Complex< T > const &lhs, Complex< T > const &rhs) |
Multiplication of two complex numbers. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator* (Complex< T > const &lhs, T const &rhs) |
Multiplication of a complex and a real number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator* (T const &lhs, Complex< T > const &rhs) |
Multiplication of a real and a complex number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator/ (Complex< T > const &lhs, Complex< T > const &rhs) |
Division of two complex numbers. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator/ (Complex< T > const &lhs, T const &rhs) |
Division of a complex and a real number. More... | |
template<typename T > | |
ALPAKA_FN_HOST_ACC Complex< T > | operator/ (T const &lhs, Complex< T > const &rhs) |
Division of a real and a complex number. More... | |
template<typename T > | |
constexpr ALPAKA_FN_HOST_ACC bool | operator== (Complex< T > const &lhs, Complex< T > const &rhs) |
Equality of two complex numbers. More... | |
template<typename T > | |
constexpr ALPAKA_FN_HOST_ACC bool | operator== (Complex< T > const &lhs, T const &rhs) |
Equality of a complex and a real number. More... | |
template<typename T > | |
constexpr ALPAKA_FN_HOST_ACC bool | operator== (T const &lhs, Complex< T > const &rhs) |
Equality of a real and a complex number. More... | |
template<typename T > | |
constexpr ALPAKA_FN_HOST_ACC bool | operator!= (Complex< T > const &lhs, Complex< T > const &rhs) |
Inequality of two complex numbers. More... | |
template<typename T > | |
constexpr ALPAKA_FN_HOST_ACC bool | operator!= (Complex< T > const &lhs, T const &rhs) |
Inequality of a complex and a real number. More... | |
template<typename T > | |
constexpr ALPAKA_FN_HOST_ACC bool | operator!= (T const &lhs, Complex< T > const &rhs) |
Inequality of a real and a complex number. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC T | abs (Complex< T > const &x) |
Host-only math functions matching std::complex<T>. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | acos (Complex< T > const &x) |
Arc cosine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | acosh (Complex< T > const &x) |
Arc hyperbolic cosine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC T | arg (Complex< T > const &x) |
Argument. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | asin (Complex< T > const &x) |
Arc sine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | asinh (Complex< T > const &x) |
Arc hyperbolic sine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | atan (Complex< T > const &x) |
Arc tangent. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | atanh (Complex< T > const &x) |
Arc hyperbolic tangent. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | conj (Complex< T > const &x) |
Complex conjugate. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | cos (Complex< T > const &x) |
Cosine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | cosh (Complex< T > const &x) |
Hyperbolic cosine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | exp (Complex< T > const &x) |
Exponential. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | log (Complex< T > const &x) |
Natural logarithm. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | log10 (Complex< T > const &x) |
Base 10 logarithm. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC T | norm (Complex< T > const &x) |
Squared magnitude. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | polar (T const &r, T const &theta=T()) |
Get a complex number with given magnitude and phase angle. More... | |
template<typename T , typename U > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | pow (Complex< T > const &x, Complex< U > const &y) |
Complex power of a complex number. More... | |
template<typename T , typename U > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | pow (Complex< T > const &x, U const &y) |
Real power of a complex number. More... | |
template<typename T , typename U > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto | pow (T const &x, Complex< U > const &y) |
Complex power of a real number. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | proj (Complex< T > const &x) |
Projection onto the Riemann sphere. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | sin (Complex< T > const &x) |
Sine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | sinh (Complex< T > const &x) |
Hyperbolic sine. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | sqrt (Complex< T > const &x) |
Square root. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | tan (Complex< T > const &x) |
Tangent. More... | |
template<typename T > | |
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > | tanh (Complex< T > const &x) |
Hyperbolic tangent. More... | |
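As a rough illustration of the kind of index mapping that the mapIdx entry in the list above describes, the following standalone sketch converts a linear (1D) index into a 2D position and back. It does not use alpaka. The names `Pos2`, `linearTo2d` and `toLinear` are hypothetical, and the row-major convention (the last component varies fastest) is an assumption made for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 2D position/extent type: component 0 is the row (slow),
// component 1 is the column (fast).
using Pos2 = std::array<std::size_t, 2>;

// Maps a linear index to a 2D position inside `extent` (row-major assumption).
inline Pos2 linearTo2d(std::size_t idx, Pos2 const& extent)
{
    assert(extent[1] > 0); // a zero-width row cannot hold any element
    return {idx / extent[1], idx % extent[1]};
}

// Inverse mapping: 2D position back to the linear index.
inline std::size_t toLinear(Pos2 const& pos, Pos2 const& extent)
{
    return pos[0] * extent[1] + pos[1];
}
```

With an extent of 3 rows by 4 columns, linear index 7 lands in row 1, column 3, and mapping that position back yields 7 again.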
Variables | |
template<typename TAcc , typename... TTag> | |
constexpr bool | accMatchesTags = (std::is_same_v<alpaka::AccToTag<TAcc>, TTag> || ...) |
constexpr std::uint32_t | BlockSharedDynMemberAllocKiB = 47u |
template<typename TDev , typename TDim > | |
constexpr bool | hasAsyncBufSupport = trait::HasAsyncBufSupport<TDim, TDev>::value |
Checks if the given device can allocate a stream-ordered memory buffer of the given dimensionality. More... | |
template<typename TPlatform > | |
constexpr bool | hasMappedBufSupport = trait::HasMappedBufSupport<TPlatform>::value |
Checks if the host can allocate pinned/mapped host memory, accessible by all devices in the given platform. More... | |
template<typename T , typename U > | |
constexpr auto | is_decayed_v = std::is_same_v<std::decay_t<T>, std::decay_t<U>> |
Provides a decaying wrapper around std::is_same. Example: is_decayed_v<volatile float, float> returns true. More... | |
template<typename TAcc > | |
constexpr bool | isAccelerator = concepts::ImplementsConcept<ConceptAcc, TAcc>::value |
True if TAcc is an accelerator, i.e. if it implements the ConceptAcc concept. More... | |
template<typename TDev > | |
constexpr bool | isDevice = concepts::ImplementsConcept<ConceptDev, TDev>::value |
True if TDev is a device, i.e. if it implements the ConceptDev concept. More... | |
template<typename TPlatform > | |
constexpr bool | isPlatform = concepts::ImplementsConcept<ConceptPlatform, TPlatform>::value |
True if TPlatform is a platform, i.e. if it implements the ConceptPlatform concept. More... | |
template<typename TQueue > | |
constexpr bool | isQueue = concepts::ImplementsConcept<ConceptQueue, TQueue>::value |
True if TQueue is a queue, i.e. if it implements the ConceptQueue concept. More... | |
template<typename T > | |
constexpr bool | isVec = false |
template<typename TDim , typename TVal > | |
constexpr bool | isVec< Vec< TDim, TVal > > = true |
template<typename T > | |
constexpr bool | isKernelArgumentTriviallyCopyable = IsKernelArgumentTriviallyCopyable<T>::value |
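Two of the variable templates listed above can be reproduced in plain C++17 to show how they behave. This is a sketch only: `is_decayed_v` copies the definition shown in the list, while the tag types, `FakeAccSerial` and the simplified `AccToTag` stand-in are hypothetical and exist only for this example.

```cpp
#include <type_traits>

// Same definition as listed above: compare two types after std::decay.
template<typename T, typename U>
constexpr auto is_decayed_v = std::is_same_v<std::decay_t<T>, std::decay_t<U>>;

// Hypothetical tags and accelerator type, only for this sketch.
struct TagSerial {};
struct TagThreads {};
struct TagGpu {};
struct FakeAccSerial
{
    using Tag = TagSerial;
};

// Simplified stand-in for alpaka::AccToTag: reads a nested Tag alias.
template<typename TAcc>
using AccToTag = typename TAcc::Tag;

// Fold expression over the tag list: true if the accelerator's tag equals any of them.
template<typename TAcc, typename... TTag>
constexpr bool accMatchesTags = (std::is_same_v<AccToTag<TAcc>, TTag> || ...);

static_assert(is_decayed_v<volatile float, float>);
static_assert(!is_decayed_v<int, float>);
static_assert(accMatchesTags<FakeAccSerial, TagThreads, TagSerial>);
static_assert(!accMatchesTags<FakeAccSerial, TagGpu>);
```

Because both checks are evaluated at compile time, they can be used in `static_assert` or `if constexpr` without any runtime cost.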
The alpaka accelerator library.
The alpaka library.
using alpaka::Acc = typedef typename trait::AccType<T>::type |
The accelerator type trait alias template to remove the ::type.
Definition at line 58 of file Traits.hpp.
using alpaka::AccGpuCudaRt = typedef AccGpuUniformCudaHipRt<ApiCudaRt, TDim, TIdx> |
Definition at line 16 of file AccGpuCudaRt.hpp.
using alpaka::AccToTag = typedef typename trait::AccToTag<TAcc>::type |
using alpaka::AtomicCpu = typedef AtomicStdLibLock<16> |
Definition at line 21 of file AtomicCpu.hpp.
using alpaka::AtomicHierarchy = typedef alpaka::meta::InheritFromList<alpaka::meta::Unique<std::tuple< TGridAtomic, TBlockAtomic, TThreadAtomic, concepts::Implements<ConceptAtomicGrids, TGridAtomic>, concepts::Implements<ConceptAtomicBlocks, TBlockAtomic>, concepts::Implements<ConceptAtomicThreads, TThreadAtomic> >> > |
Builds a single class that inherits from the different atomic implementations.
Definition at line 27 of file AtomicHierarchy.hpp.
using alpaka::AtomicHierarchyConcept = typedef typename detail::AtomicHierarchyConceptType<THierarchy>::type |
Definition at line 53 of file Traits.hpp.
using alpaka::Buf = typedef typename trait::BufType<alpaka::Dev<TDev>, TElem, TDim, TIdx>::type |
The memory buffer type trait alias template to remove the ::type.
Definition at line 52 of file Traits.hpp.
using alpaka::BufCudaRt = typedef BufUniformCudaHipRt<ApiCudaRt, TElem, TDim, TIdx> |
Definition at line 15 of file BufCudaRt.hpp.
using alpaka::Dev = typedef typename trait::DevType<T>::type |
The device type trait alias template to remove the ::type.
Definition at line 56 of file Traits.hpp.
using alpaka::DevCudaRt = typedef DevUniformCudaHipRt<ApiCudaRt> |
The CUDA RT device handle.
Definition at line 15 of file DevCudaRt.hpp.
using alpaka::DevGlobal = typedef typename detail::DevGlobalTrait<typename alpaka::trait::AccToTag<TAcc>::type, T>::Type |
Definition at line 44 of file Traits.hpp.
using alpaka::Dim = typedef typename trait::DimType<T>::type |
The dimension type trait alias template to remove the ::type.
Definition at line 19 of file Traits.hpp.
using alpaka::DimInt = typedef std::integral_constant<std::size_t, N> |
Definition at line 15 of file DimIntegralConst.hpp.
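The "alias template to remove the ::type" pattern used by Dim, together with DimInt, can be sketched in a few lines of standalone C++17. The shapes of `DimInt` and `Dim` follow the aliases documented above; the user type `MyExtent3` and its specialization are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Same shape as the DimInt alias documented above.
template<std::size_t N>
using DimInt = std::integral_constant<std::size_t, N>;

namespace trait
{
    // Primary template is left undefined; each type opts in by specializing it.
    template<typename T>
    struct DimType;
} // namespace trait

// Alias template that hides the trailing ::type from the caller.
template<typename T>
using Dim = typename trait::DimType<T>::type;

// Hypothetical user type describing a 3D extent.
struct MyExtent3 {};

namespace trait
{
    template<>
    struct DimType<MyExtent3>
    {
        using type = DimInt<3>;
    };
} // namespace trait

static_assert(std::is_same_v<Dim<MyExtent3>, DimInt<3>>);
static_assert(Dim<MyExtent3>::value == 3);
```

The same two-step layout (a trait struct in a `trait` namespace plus a short alias) applies to the other "alias template to remove the ::type" entries in this list.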
using alpaka::Elem = typedef std::remove_volatile_t<typename trait::ElemType<TView>::type> |
The element type trait alias template to remove the ::type.
Definition at line 21 of file Traits.hpp.
using alpaka::Event = typedef typename trait::EventType<T>::type |
The event type trait alias template to remove the ::type.
Definition at line 26 of file Traits.hpp.
using alpaka::EventCpu = typedef EventGenericThreads<DevCpu> |
Definition at line 12 of file EventCpu.hpp.
using alpaka::EventCudaRt = typedef EventUniformCudaHipRt<ApiCudaRt> |
The CUDA RT device event.
Definition at line 15 of file EventCudaRt.hpp.
using alpaka::ExampleDefaultAcc = typedef alpaka::AccGpuCudaRt<TDim, TIdx> |
Alias for the default accelerator used by examples. From a list of all accelerators the first one which is enabled is chosen. AccCpuSerial is selected last.
Definition at line 16 of file ExampleDefaultAcc.hpp.
using alpaka::Idx = typedef typename trait::IdxType<T>::type |
Definition at line 29 of file Traits.hpp.
using alpaka::NativeHandle = typedef decltype(getNativeHandle(std::declval<TImpl>())) |
Alias to the type of the native handle.
Definition at line 36 of file Traits.hpp.
using alpaka::Platform = typedef typename trait::PlatformType<T>::type |
The platform type trait alias template to remove the ::type.
Definition at line 51 of file Traits.hpp.
using alpaka::PlatformCudaRt = typedef PlatformUniformCudaHipRt<ApiCudaRt> |
The CUDA RT platform.
Definition at line 15 of file PlatformCudaRt.hpp.
using alpaka::Queue = typedef typename trait::QueueType<TEnv, TProperty>::type |
Queue based on the environment and a property.
TEnv | Environment type, e.g. accelerator, device or a platform. trait::QueueType must be specialized for TEnv |
TProperty | Property to define the behavior of TEnv. |
Definition at line 70 of file Traits.hpp.
Definition at line 191 of file DevCpu.hpp.
Definition at line 190 of file DevCpu.hpp.
using alpaka::QueueCudaRtBlocking = typedef QueueUniformCudaHipRtBlocking<ApiCudaRt> |
The CUDA RT blocking queue.
Definition at line 15 of file QueueCudaRtBlocking.hpp.
using alpaka::QueueCudaRtNonBlocking = typedef QueueUniformCudaHipRtNonBlocking<ApiCudaRt> |
The CUDA RT non-blocking queue.
Definition at line 15 of file QueueCudaRtNonBlocking.hpp.
using alpaka::QueueUniformCudaHipRtBlocking = typedef uniform_cuda_hip::detail::QueueUniformCudaHipRt<TApi, true> |
The CUDA/HIP RT blocking queue.
Definition at line 43 of file DevUniformCudaHipRt.hpp.
using alpaka::QueueUniformCudaHipRtNonBlocking = typedef uniform_cuda_hip::detail::QueueUniformCudaHipRt<TApi, false> |
The CUDA/HIP RT non-blocking queue.
Definition at line 46 of file DevUniformCudaHipRt.hpp.
using alpaka::remove_restrict_t = typedef typename remove_restrict<T>::type |
Helper to remove restrict from a type.
Definition at line 34 of file RemoveRestrict.hpp.
using alpaka::TagToAcc = typedef typename trait::TagToAcc<TTag, TDim, TIdx>::type |
using alpaka::TaskKernelGpuCudaRt = typedef TaskKernelGpuUniformCudaHipRt<ApiCudaRt, TAcc, TDim, TIdx, TKernelFnObj, TArgs...> |
Definition at line 15 of file TaskKernelGpuCudaRt.hpp.
enum class alpaka::GridBlockExtentSubDivRestrictions | strong |
The grid block extent subdivision restrictions.
Definition at line 27 of file WorkDivHelpers.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC T alpaka::abs (Complex< T > const &x) |
Host-only math functions matching std::complex<T>.
Due to issue #1688, these functions are technically marked host-device and suppress the related warnings. However, they must only be called from host code.
They take and return alpaka::Complex (or a real number when appropriate). Internally, they cast to std::complex, call the std::complex implementation, and cast the result back. These functions can be used directly on the host side. They are also picked up by ADL in math traits for CPU backends.
On the device side, alpaka math traits must be used instead. Note that the set of traits is currently somewhat smaller.
Absolute value.
Definition at line 371 of file Complex.hpp.
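The cast, delegate and cast-back approach described for these host-only functions can be sketched without alpaka as follows. `MiniComplex` is a hypothetical stand-in for alpaka::Complex, and the `sketch` namespace and the helper names `toStd` and `fromStd` are assumptions made for this example, not alpaka API.

```cpp
#include <cassert>
#include <complex>

namespace sketch
{
    // Hypothetical minimal stand-in for alpaka::Complex<T>.
    template<typename T>
    struct MiniComplex
    {
        T re;
        T im;

        // Cast to the standard library type.
        std::complex<T> toStd() const
        {
            return {re, im};
        }

        // Cast back from the standard library type.
        static MiniComplex fromStd(std::complex<T> const& z)
        {
            return {z.real(), z.imag()};
        }
    };

    // Absolute value: the result is a real number, so no cast back is needed.
    template<typename T>
    T abs(MiniComplex<T> const& x)
    {
        return std::abs(x.toStd());
    }

    // Complex conjugate: cast, call std::conj, cast the result back.
    template<typename T>
    MiniComplex<T> conj(MiniComplex<T> const& x)
    {
        return MiniComplex<T>::fromStd(std::conj(x.toStd()));
    }
} // namespace sketch
```

Since everything here relies on `std::complex`, the sketch is host code only, which mirrors the restriction stated above.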
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > alpaka::acos (Complex< T > const &x) |
Arc cosine.
Definition at line 379 of file Complex.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > alpaka::acosh (Complex< T > const &x) |
Arc hyperbolic cosine.
Definition at line 387 of file Complex.hpp.
ALPAKA_FN_HOST auto alpaka::allocAsyncBuf (TQueue queue, TExtent const &extent=TExtent()) |
Allocates stream-ordered memory on the given device.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TQueue | The type of queue used to order the buffer allocation. |
queue | The queue used to order the buffer allocation. |
extent | The extent of the buffer. |
Definition at line 79 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocAsyncBufIfSupported (TQueue queue, TExtent const &extent=TExtent()) |
If supported, allocates stream-ordered memory on the given queue and the associated device. Otherwise, allocates regular memory on the device associated to the queue. Please note that stream-ordered and regular memory have different semantics: this function is provided for convenience in the cases where the difference is not relevant, and the stream-ordered memory is only used as a performance optimisation.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TQueue | The type of queue used to order the buffer allocation. |
queue | The queue used to order the buffer allocation. |
extent | The extent of the buffer. |
Definition at line 114 of file Traits.hpp.
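The "use the feature if supported, otherwise fall back" behaviour described above is typically a compile-time dispatch on a trait. The following standalone sketch shows that idea with `if constexpr`. The device types, the two allocation stand-ins and `allocIfSupported` are hypothetical, and the trait is simplified to a single template parameter, whereas the hasAsyncBufSupport variable documented on this page also takes a dimension.

```cpp
#include <cassert>
#include <string>

// Hypothetical device types for this sketch.
struct DevWithAsync {};
struct DevWithoutAsync {};

namespace trait
{
    // Defaults to "not supported"; devices opt in by specializing.
    template<typename TDev>
    struct HasAsyncBufSupport
    {
        static constexpr bool value = false;
    };

    template<>
    struct HasAsyncBufSupport<DevWithAsync>
    {
        static constexpr bool value = true;
    };
} // namespace trait

template<typename TDev>
constexpr bool hasAsyncBufSupport = trait::HasAsyncBufSupport<TDev>::value;

// Stand-ins for the two allocation paths; they only report which path was taken.
template<typename TDev>
std::string allocAsync(TDev const&)
{
    return "stream-ordered";
}

template<typename TDev>
std::string allocRegular(TDev const&)
{
    return "regular";
}

// Picks the allocation path at compile time; the untaken branch is discarded.
template<typename TDev>
std::string allocIfSupported(TDev const& dev)
{
    if constexpr(hasAsyncBufSupport<TDev>)
        return allocAsync(dev);
    else
        return allocRegular(dev);
}
```

Callers see a single entry point, which is why the documentation stresses that the two kinds of memory still differ in semantics even though the call looks the same.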
ALPAKA_FN_HOST auto alpaka::allocBuf (TDev const &dev, TExtent const &extent=TExtent()) |
Allocates memory on the given device.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TDev | The type of device the buffer is allocated on. |
dev | The device to allocate the buffer on. |
extent | The extent of the buffer. |
Definition at line 64 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocMappedBuf (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent()) |
Allocates pinned/mapped host memory, accessible by all devices in the given platform.
TPlatform | The platform from which the buffer is accessible. |
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
host | The host device to allocate the buffer on. |
extent | The extent of the buffer. |
Definition at line 138 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::allocMappedBufIfSupported (DevCpu const &host, TPlatform const &platform, TExtent const &extent=TExtent()) |
If supported, allocates pinned/mapped host memory, accessible by all devices in the given platform. Otherwise, allocates regular host memory. Please note that pinned/mapped and regular memory may have different semantics: this function is provided for convenience in the cases where the difference is not relevant, and the pinned/mapped memory is only used as a performance optimisation.
TElem | The element type of the returned buffer. |
TIdx | The linear index type of the buffer. |
TExtent | The extent type of the buffer. |
TPlatform | The platform from which the buffer is accessible. |
host | The host device to allocate the buffer on. |
extent | The extent of the buffer. |
Definition at line 175 of file Traits.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC T alpaka::arg (Complex< T > const &x) |
Argument.
Definition at line 395 of file Complex.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > alpaka::asin (Complex< T > const &x) |
Arc sine.
Definition at line 403 of file Complex.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > alpaka::asinh (Complex< T > const &x) |
Arc hyperbolic sine.
Definition at line 411 of file Complex.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > alpaka::atan (Complex< T > const &x) |
Arc tangent.
Definition at line 419 of file Complex.hpp.
constexpr ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC Complex< T > alpaka::atanh (Complex< T > const &x) |
Arc hyperbolic tangent.
Definition at line 427 of file Complex.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicAdd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic add operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 114 of file Traits.hpp.
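For readers unfamiliar with fetch-and-add semantics, the following host-only sketch shows the behaviour with the C++ standard library instead of alpaka. It assumes, as is usual for this family of operations, that the call returns the value stored before the addition; the helper name `addAndFetchOld` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Adds `value` to `target` atomically and returns the value that was stored
// before the addition (assumed to match the "-> T" result of an atomic add).
inline int addAndFetchOld(std::atomic<int>& target, int value)
{
    return target.fetch_add(value);
}
```

Returning the previous value lets each caller claim a unique slot, for example when several threads append to a shared output array through one counter.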
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicAnd (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic and operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 240 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicCas (TAtomic const &atomic, T *const addr, T const &compare, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic compare-and-swap operation.
TAtomic | The atomic implementation type. |
T | The value type. |
atomic | The atomic implementation. |
addr | The value to change atomically. |
compare | The comparison value used in the atomic operation. |
value | The value used in the atomic operation. |
Definition at line 295 of file Traits.hpp.
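Compare-and-swap takes both a comparison value and a new value. A minimal host-only sketch with `std::atomic` illustrates how the two parameters interact. It is not alpaka code; the helper name `casReturnOld` is hypothetical, and returning the previously stored value is an assumption modeled on the usual CAS convention.

```cpp
#include <atomic>
#include <cassert>

// Stores `value` only if `target` currently holds `compare`.
// Always returns the value that was seen before the call.
inline int casReturnOld(std::atomic<int>& target, int compare, int value)
{
    int expected = compare;
    // On failure, compare_exchange_strong writes the current content into `expected`;
    // on success, `expected` still equals the old value.
    target.compare_exchange_strong(expected, value);
    return expected;
}
```

A caller can tell whether the swap happened by checking whether the returned value equals `compare`.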
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicDec (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic decrement operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 222 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicExch (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic exchange operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 186 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicInc (TAtomic const &atomic, T *const addr, T const &value, THierarchy const &hier=THierarchy()) -> T |
Executes an atomic increment operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 204 of file Traits.hpp.
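The role of value in atomicInc and atomicDec is easy to misread, so the update rules are sketched here as plain functions. This assumes the wrap-around convention known from CUDA, which is not stated on this page. incRule and decRule are hypothetical names, and the functions are not atomic themselves.

```cpp
#include <cassert>
#include <cstdint>

// Assumed increment rule (CUDA convention): count up to `value`, then wrap to 0.
constexpr auto incRule(std::uint32_t old, std::uint32_t value) -> std::uint32_t
{
    return (old >= value) ? 0u : old + 1u;
}

// Assumed decrement rule (CUDA convention): count down to 0, then wrap to `value`.
constexpr auto decRule(std::uint32_t old, std::uint32_t value) -> std::uint32_t
{
    return (old == 0u || old > value) ? value : old - 1u;
}
```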
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicMax | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic max operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 168 of file Traits.hpp.
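An atomic maximum can be built from a compare-exchange loop when no native instruction exists. The sketch below shows that technique with std::atomic from standard C++. It is not alpaka's implementation, maxSketch is a hypothetical name, and returning the previously stored value is an assumed convention.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: atomic maximum built from a compare-exchange loop.
// Returns the value observed before the update (assumed return convention).
template<typename T>
auto maxSketch(std::atomic<T>& target, T value) -> T
{
    T old = target.load();
    // Retry until the stored value is already >= value or the exchange succeeds.
    while(old < value && !target.compare_exchange_weak(old, value))
    {
        // compare_exchange_weak refreshed `old`; the loop condition re-checks it.
    }
    return old;
}
```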
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicMin | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic min operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 150 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicOp | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | compare, | ||
T const & | value, | ||
THierarchy const & | = THierarchy() |
||
) | -> T |
Executes the given operation atomically.
TOp | The operation type. |
TAtomic | The atomic implementation type. |
T | The value type. |
atomic | The atomic implementation. |
addr | The value to change atomically. |
compare | The comparison value used in the atomic operation. |
value | The value used in the atomic operation. |
Definition at line 94 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicOp | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | = THierarchy() |
||
) | -> T |
Executes the given operation atomically.
TOp | The operation type. |
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 73 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicOr | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic or operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 258 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicSub | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic sub operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 132 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::atomicXor | ( | TAtomic const & | atomic, |
T *const | addr, | ||
T const & | value, | ||
THierarchy const & | hier = THierarchy() |
||
) | -> T |
Executes an atomic xor operation.
T | The value type. |
TAtomic | The atomic implementation type. |
addr | The value to change atomically. |
value | The value used in the atomic operation. |
atomic | The atomic implementation. |
Definition at line 276 of file Traits.hpp.
|
constexpr |
Definition at line 82 of file Traits.hpp.
|
constexpr |
Definition at line 98 of file Traits.hpp.
|
constexpr |
Complex conjugate.
Definition at line 435 of file Complex.hpp.
|
constexpr |
Cosine.
Definition at line 443 of file Complex.hpp.
|
constexpr |
Hyperbolic cosine.
Definition at line 451 of file Complex.hpp.
auto alpaka::createSubView | ( | TView & | view, |
TExtent const & | extent, | ||
TOffsets const & | offset = TExtent() |
||
) |
Creates a sub-view to an existing view.
view | The view that the resulting view is a sub-view of. |
extent | Number of elements the resulting view holds. |
offset | Number of elements skipped in view for the new origin of the resulting view. |
Definition at line 494 of file Traits.hpp.
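The relation between extent and offset can be pictured in one dimension. The following sketch uses a hypothetical pointer-plus-size struct instead of alpaka views. ViewSketch and subViewSketch are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning 1D view: a pointer plus an element count.
struct ViewSketch
{
    int* data;
    std::size_t extent;
};

// Skips `offset` elements of `view` and exposes the next `extent` elements.
// The caller must ensure offset + extent <= view.extent.
inline auto subViewSketch(ViewSketch view, std::size_t extent, std::size_t offset = 0) -> ViewSketch
{
    assert(offset + extent <= view.extent);
    return ViewSketch{view.data + offset, extent};
}
```

Writes through the sub-view land in the parent's memory because no copy is made.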
ALPAKA_FN_HOST auto alpaka::createTaskKernel | ( | TWorkDiv const & | workDiv, |
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) |
Creates a kernel execution task.
TAcc | The accelerator type. |
workDiv | The index domain work division. |
kernelFnObj | The kernel function object which should be executed. |
args,... | The kernel invocation arguments. |
Definition at line 262 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::createTaskMemcpy | ( | TViewDstFwd && | viewDst, |
TViewSrc const & | viewSrc, | ||
TExtent const & | extent | ||
) |
Creates a memory copy task.
viewDst | The destination memory view. |
viewSrc | The source memory view. |
extent | The extent of the view to copy. |
Definition at line 253 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::createTaskMemset | ( | TViewFwd && | view, |
std::uint8_t const & | byte, | ||
TExtent const & | extent | ||
) |
Creates a memory set task.
view | The memory view to fill. |
byte | Value to set for each byte of the specified view. |
extent | The extent of the view to fill. |
Definition at line 207 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TContainer & | con | ||
) |
Creates a view to a contiguous container of device-accessible memory.
dev | Device from which the container can be accessed. |
con | Contiguous container. The container must provide a data() method. The data held by the container must be accessible from the given device. The GetExtent trait must be defined for the container. |
Definition at line 468 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TContainer & | con, | ||
TExtent const & | extent | ||
) |
Creates a view to a contiguous container of device-accessible memory.
dev | Device from which the container can be accessed. |
con | Contiguous container. The container must provide a data() method. The data held by the container must be accessible from the given device. The GetExtent trait must be defined for the container. |
extent | Number of elements held by the container. Using a multi-dimensional extent will result in a multi-dimensional view to the memory represented by the container. |
Definition at line 482 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TElem * | pMem, | ||
TExtent const & | extent | ||
) |
Creates a view to a device pointer.
dev | Device from which pMem can be accessed. |
pMem | Pointer to memory. The pointer must be accessible from the given device. |
extent | Number of elements represented by pMem. Using a multi-dimensional extent will result in a multi-dimensional view to the memory represented by pMem. |
Definition at line 434 of file Traits.hpp.
auto alpaka::createView | ( | TDev const & | dev, |
TElem * | pMem, | ||
TExtent const & | extent, | ||
TPitch | pitch | ||
) |
Creates a view to a device pointer.
dev | Device from which pMem can be accessed. |
pMem | Pointer to memory. The pointer must be accessible from the given device. |
extent | Number of elements represented by pMem. Using a multi-dimensional extent will result in a multi-dimensional view to the memory represented by pMem. |
pitch | Pitch in bytes for each dimension. Its dimensionality must equal that of extent. |
Definition at line 456 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::declareSharedVar | ( | TBlockSharedMemSt const & | blockSharedMemSt | ) | -> T& |
Declare a block shared variable.
The variable is uninitialized and not default constructed! The variable can be accessed by all threads within a block. Access to the variable is not thread safe.
T | The element type. |
TuniqueId | An id that is unique inside a kernel. |
TBlockSharedMemSt | The block shared allocator implementation type. |
blockSharedMemSt | The block shared allocator implementation. |
Definition at line 42 of file Traits.hpp.
|
constexpr |
|
constexpr |
ALPAKA_FN_HOST auto alpaka::empty | ( | TQueue const & | queue | ) | -> bool |
Tests if the queue is empty (all ops in the given queue have been completed).
Definition at line 58 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::enqueue | ( | TQueue & | queue, |
TTask && | task | ||
) | -> void |
Queues the given task in the given queue.
Special handling for events: If the event has previously been queued, this call overwrites any existing state of the event. Subsequent calls that examine the status of the event only examine the completion of this most recent call to enqueue. If a queue is waiting for an event, the event's state at the time of the API call to wait() is used to release the queue.
Definition at line 47 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::exec | ( | TQueue & | queue, |
TWorkDiv const & | workDiv, | ||
TKernelFnObj const & | kernelFnObj, | ||
TArgs &&... | args | ||
) | -> void |
Executes the given kernel in the given queue.
TAcc | The accelerator type. |
queue | The queue to enqueue the kernel execution task into. |
workDiv | The index domain work division. |
kernelFnObj | The kernel function object which should be executed. |
args,... | The kernel invocation arguments. |
Definition at line 309 of file Traits.hpp.
|
constexpr |
Exponential.
Definition at line 459 of file Complex.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::ffs | ( | TIntrinsic const & | intrinsic, |
std::int32_t | value | ||
) | -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 32-bit value. Returns 0 for input value 0.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 65 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::ffs | ( | TIntrinsic const & | intrinsic, |
std::int64_t | value | ||
) | -> std::int32_t |
Returns the 1-based position of the least significant bit set to 1 in the given 64-bit value. Returns 0 for input value 0.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 79 of file Traits.hpp.
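The documented contract of ffs (1-based position of the least significant set bit, 0 for input 0) can be checked against a portable reference. The sketch below is plain C++ without intrinsics. ffsSketch is a hypothetical name and not part of alpaka.

```cpp
#include <cassert>
#include <cstdint>

// Portable reference for "find first set":
// 1-based position of the lowest set bit, 0 for input 0.
inline auto ffsSketch(std::int64_t value) -> std::int32_t
{
    auto bits = static_cast<std::uint64_t>(value);
    if(bits == 0)
        return 0;
    std::int32_t pos = 1;
    while((bits & 1u) == 0)
    {
        bits >>= 1;
        ++pos;
    }
    return pos;
}
```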
ALPAKA_FN_HOST auto alpaka::free | ( | TAlloc const & | alloc, |
T const *const | ptr | ||
) | -> void |
Frees the memory identified by the given pointer.
Definition at line 41 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::freeSharedVars | ( | TBlockSharedMemSt & | blockSharedMemSt | ) | -> void |
Frees all memory used by block shared variables.
TBlockSharedMemSt | The block shared allocator implementation type. |
blockSharedMemSt | The block shared allocator implementation. |
Definition at line 54 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getAccDevProps | ( | TDev const & | dev | ) | -> AccDevProps<Dim<TAcc>, Idx<TAcc>> |
Definition at line 62 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getAccName | ( | ) | -> std::string |
TAcc | The accelerator type. |
Definition at line 72 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getBlockSharedMemDynSizeBytes | ( | TKernelFnObj const & | kernelFnObj, |
Vec< TDim, Idx< TAcc >> const & | blockThreadExtent, | ||
Vec< TDim, Idx< TAcc >> const & | threadElemExtent, | ||
TArgs const &... | args | ||
) | -> std::size_t |
TAcc | The accelerator type. |
kernelFnObj | The kernel object for which the block shared memory size should be calculated. |
blockThreadExtent | The block thread extent. |
threadElemExtent | The thread element extent. |
args,... | The kernel invocation arguments. |
Definition at line 157 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getDepth | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 121 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDev | ( | T const & | t | ) |
Definition at line 68 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDevByIdx | ( | TPlatform const & | platform, |
std::size_t const & | devIdx | ||
) | -> Dev<TPlatform> |
Definition at line 62 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDevCount | ( | TPlatform const & | platform | ) |
Definition at line 55 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getDevs | ( | TPlatform const & | platform | ) | -> std::vector<Dev<TPlatform>> |
Definition at line 69 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::getDynSharedMem | ( | TBlockSharedMemDyn const & | blockSharedMemDyn | ) | -> T* |
Get block shared dynamic memory.
The available size of the memory can be defined by specializing the trait BlockSharedMemDynSizeBytes for a kernel. The memory can be accessed by all threads within a block. Access to the memory is not thread-safe.
T | The element type. |
TBlockSharedMemDyn | The block shared dynamic memory implementation type. |
blockSharedMemDyn | The block shared dynamic memory implementation. |
Definition at line 39 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getExtent | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 43 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getExtentProduct | ( | T const & | object | ) | -> Idx<T> |
Definition at line 134 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getExtents | ( | T const & | object | ) | -> Vec<Dim<T>, Idx<T>> |
Definition at line 59 of file Traits.hpp.
|
constexpr |
T | has to specialize GetExtent. |
Definition at line 68 of file Traits.hpp.
|
constexpr |
T | has to specialize GetExtent. |
Definition at line 78 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getFreeMemBytes | ( | TDev const & | dev | ) | -> std::size_t |
Definition at line 104 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getHeight | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 108 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdx | ( | TIdx const & | idx, |
TWorkDiv const & | workDiv | ||
) | -> Vec<Dim<TWorkDiv>, Idx<TIdx>> |
Get the indices requested.
Definition at line 23 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdx | ( | TIdxWorkDiv const & | idxWorkDiv | ) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the indices requested.
Definition at line 31 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdxThreadFirstElem | ( | [[maybe_unused] ] TIdxWorkDiv const & | idxWorkDiv, |
TGridThreadIdx const & | gridThreadIdx, | ||
TThreadElemExtent const & | threadElemExtent | ||
) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the index of the first element this thread computes.
Definition at line 89 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdxThreadFirstElem | ( | TIdxWorkDiv const & | idxWorkDiv | ) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the index of the first element this thread computes.
Definition at line 110 of file Accessors.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getIdxThreadFirstElem | ( | TIdxWorkDiv const & | idxWorkDiv, |
TGridThreadIdx const & | gridThreadIdx | ||
) | -> Vec<Dim<TIdxWorkDiv>, Idx<TIdxWorkDiv>> |
Get the index of the first element this thread computes.
Definition at line 100 of file Accessors.hpp.
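The index of the first element a thread computes can be sketched as a component-wise product. This assumes that every thread owns a contiguous tile of threadElemExtent elements per dimension. firstElemSketch is a hypothetical name, and std::array stands in for alpaka's Vec.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumed relation: first element = gridThreadIdx * threadElemExtent (component-wise).
template<std::size_t N>
auto firstElemSketch(
    std::array<std::size_t, N> const& gridThreadIdx,
    std::array<std::size_t, N> const& threadElemExtent) -> std::array<std::size_t, N>
{
    std::array<std::size_t, N> first{};
    for(std::size_t d = 0; d < N; ++d)
        first[d] = gridThreadIdx[d] * threadElemExtent[d];
    return first;
}
```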
ALPAKA_FN_HOST auto alpaka::getMemBytes | ( | TDev const & | dev | ) | -> std::size_t |
Definition at line 95 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getName | ( | TDev const & | dev | ) | -> std::string |
Definition at line 87 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getNativeHandle | ( | TImpl const & | impl | ) |
Gets the native handle of the alpaka object. Returns the handle if the object has one; otherwise a compile-time error is generated.
Definition at line 29 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffset | ( | TOffsets const & | offsets | ) | -> Idx<TOffsets> |
Definition at line 39 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsets | ( | T const & | object | ) | -> Vec<Dim<T>, Idx<T>> |
Definition at line 55 of file Traits.hpp.
|
constexpr |
T | has to specialize GetOffsets. |
Definition at line 64 of file Traits.hpp.
|
constexpr |
T | has to specialize GetOffsets. |
Definition at line 73 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsetX | ( | TOffsets const & | offsets = TOffsets() | ) | -> Idx<TOffsets> |
Definition at line 87 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsetY | ( | TOffsets const & | offsets = TOffsets() | ) | -> Idx<TOffsets> |
Definition at line 95 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getOffsetZ | ( | TOffsets const & | offsets = TOffsets() | ) | -> Idx<TOffsets> |
Definition at line 103 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getOmpSchedule | ( | TKernelFnObj const & | kernelFnObj, |
Vec< TDim, Idx< TAcc >> const & | blockThreadExtent, | ||
Vec< TDim, Idx< TAcc >> const & | threadElemExtent, | ||
TArgs const &... | args | ||
) |
TAcc | The accelerator type. |
kernelFnObj | The kernel object for which the OpenMP schedule should be returned. |
blockThreadExtent | The block thread extent. |
threadElemExtent | The thread element extent. |
args,... | The kernel invocation arguments. |
Definition at line 186 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPitchBytes | ( | TView const & | view | ) | -> Idx<TView> |
Definition at line 176 of file Traits.hpp.
auto alpaka::getPitchBytesVec | ( | TView const & | view | ) | -> Vec<Dim<TView>, Idx<TView>> |
Definition at line 412 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPitchBytesVecEnd | ( | TView const & | view = TView() | ) | -> Vec<TDim, Idx<TView>> |
Definition at line 420 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPitchesInBytes | ( | TView const & | view | ) | -> Vec<Dim<TView>, Idx<TView>> |
Definition at line 196 of file Traits.hpp.
|
constexpr |
Definition at line 118 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrDev | ( | TView & | view, |
TDev const & | dev | ||
) | -> Elem<TView>* |
Gets the pointer to the view on the given device.
view | The memory view. |
dev | The device. |
Definition at line 168 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrDev | ( | TView const & | view, |
TDev const & | dev | ||
) | -> Elem<TView> const* |
Gets the pointer to the view on the given device.
view | The memory view. |
dev | The device. |
Definition at line 157 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrNative | ( | TView & | view | ) | -> Elem<TView>* |
Gets the native pointer of the memory view.
view | The memory view. |
Definition at line 146 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getPtrNative | ( | TView const & | view | ) | -> Elem<TView> const* |
Gets the native pointer of the memory view.
view | The memory view. |
Definition at line 136 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::getValidWorkDiv | ( | [[maybe_unused] ] TDev const & | dev, |
[[maybe_unused] ] TGridElemExtent const & | gridElemExtent = Vec<Dim<TAcc>, Idx<TAcc>>::ones() , |
||
[[maybe_unused] ] TThreadElemExtent const & | threadElemExtents = Vec<Dim<TAcc>, Idx<TAcc>>::ones() , |
||
[[maybe_unused] ] bool | blockThreadMustDivideGridThreadExtent = true , |
||
[[maybe_unused] ] GridBlockExtentSubDivRestrictions | gridBlockExtentSubDivRestrictions = GridBlockExtentSubDivRestrictions::Unrestricted |
||
) | -> WorkDivMembers<Dim<TGridElemExtent>, Idx<TGridElemExtent>> |
TAcc | The accelerator for which this work division has to be valid. |
TGridElemExtent | The type of the grid element extent. |
TThreadElemExtent | The type of the thread element extent. |
TDev | The type of the device. |
dev | The device the work division should be valid for. |
gridElemExtent | The full extent of elements in the grid. |
threadElemExtents | The number of elements computed per thread. |
blockThreadMustDivideGridThreadExtent | If this is true, the grid thread extent will be a multiple of the corresponding block thread extent. NOTE: If this is true and gridThreadExtent is prime (or otherwise badly chosen) in a dimension, the block thread extent will be one in this dimension. |
gridBlockExtentSubDivRestrictions | The grid block extent subdivision restrictions. |
Definition at line 284 of file WorkDivHelpers.hpp.
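The NOTE on blockThreadMustDivideGridThreadExtent follows from divisibility alone: a prime extent has no divisor other than 1 and itself. The sketch below illustrates this for a single dimension. It is not alpaka's algorithm, and largestDivisorAtMost is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not alpaka's algorithm: the largest divisor of gridThreadExtent
// that does not exceed maxBlockThreads. Both arguments must be >= 1.
inline auto largestDivisorAtMost(std::size_t gridThreadExtent, std::size_t maxBlockThreads) -> std::size_t
{
    for(std::size_t candidate = maxBlockThreads; candidate > 1; --candidate)
        if(gridThreadExtent % candidate == 0)
            return candidate;
    return 1; // 1 divides every extent, so this is the fallback for primes.
}
```

Padding the grid extent to a value with more divisors, or passing false for the flag, avoids the degenerate block size.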
ALPAKA_FN_HOST auto alpaka::getWarpSizes | ( | TDev const & | dev | ) | -> std::vector<std::size_t> |
Definition at line 111 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getWidth | ( | TExtent const & | extent = TExtent() | ) | -> Idx<TExtent> |
Definition at line 95 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::getWorkDiv | ( | TWorkDiv const & | workDiv | ) | -> Vec<Dim<TWorkDiv>, Idx<TWorkDiv>> |
Get the extent requested.
Definition at line 33 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::isComplete | ( | TEvent const & | event | ) | -> bool |
Tests if the given event has already been completed.
Definition at line 34 of file Traits.hpp.
void alpaka::isSupportedByAtomicAtomicRef | ( | ) |
Definition at line 42 of file AtomicAtomicRef.hpp.
ALPAKA_FN_HOST auto alpaka::isValidAccDevProps | ( | AccDevProps< TDim, TIdx > const & | accDevProps | ) | -> bool |
TDim | The dimensionality of the accelerator device properties. |
TIdx | The idx type of the accelerator device properties. |
accDevProps | The maxima for the work division. |
Definition at line 84 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::isValidWorkDiv | ( | AccDevProps< TDim, TIdx > const & | accDevProps, |
TWorkDiv const & | workDiv | ||
) | -> bool |
TDim | The dimensionality of the accelerator device properties. |
TIdx | The idx type of the accelerator device properties. |
TWorkDiv | The type of the work division. |
accDevProps | The maxima for the work division. |
workDiv | The work division to test for validity. |
Definition at line 331 of file WorkDivHelpers.hpp.
ALPAKA_FN_HOST auto alpaka::isValidWorkDiv | ( | TDev const & | dev, |
TWorkDiv const & | workDiv | ||
) | -> bool |
TAcc | The accelerator to test the validity on. |
dev | The device to test the work division for validity on. |
workDiv | The work division to test for validity. |
Definition at line 380 of file WorkDivHelpers.hpp.
|
constexpr |
Natural logarithm.
Definition at line 467 of file Complex.hpp.
|
constexpr |
Base 10 logarithm.
Definition at line 475 of file Complex.hpp.
ALPAKA_FN_HOST auto alpaka::malloc | ( | TAlloc const & | alloc, |
std::size_t const & | sizeElems | ||
) | -> T* |
Definition at line 33 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::mapIdx | ( | Vec< DimInt< TDimIn >, TElem > const & | in, |
Vec< DimInt< TDimExtents >, TElem > const & | extent | ||
) | -> Vec<DimInt<TDimOut>, TElem> |
Maps an N-dimensional index to an N-dimensional position. At least one dimension must always be 1 or zero.
TDimOut | Dimension of the index vector to map to. |
in | The index vector to map from. |
extent | The extents of the input or output space, whichever has more than one dimension. |
Definition at line 26 of file MapIdx.hpp.
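A mapping between a linear index and a 2D position can be sketched as follows. The code assumes a row-major layout in which the last component varies fastest. That convention is an assumption and is not stated on this page. linearTo2d and toLinear are hypothetical names, and std::array stands in for alpaka's Vec.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Assumption: row-major layout, i.e. the last component varies fastest.
// Maps a linear index to a (row, column) position inside `extent`.
inline auto linearTo2d(std::size_t linear, std::array<std::size_t, 2> const& extent) -> std::array<std::size_t, 2>
{
    return {linear / extent[1], linear % extent[1]};
}

// Inverse mapping: (row, column) back to the linear index.
inline auto toLinear(std::array<std::size_t, 2> const& pos, std::array<std::size_t, 2> const& extent) -> std::size_t
{
    return pos[0] * extent[1] + pos[1];
}
```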
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC auto alpaka::mapIdxPitchBytes | ( | Vec< DimInt< TDimIn >, TElem > const & | in, |
Vec< DimInt< TidxDimPitch >, TElem > const & | pitches | ||
) | -> Vec<DimInt<TDimOut>, TElem> |
Maps an N-dimensional index to an N-dimensional position based on the pitches of a view without padding or a byte view. At least one dimension must always be 1 or zero.
TDimOut | Dimension of the index vector to map to. |
in | The index vector to map from. |
pitches | The pitches of the input or output space, whichever has more than one dimension. |
Definition at line 66 of file MapIdx.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::mem_fence | ( | TMemFence const & | fence, |
TMemScope const & | scope | ||
) | -> void |
Issues memory fence instructions.
TMemFence | The memory fence implementation type. |
TMemScope | The memory scope type. |
fence | The memory fence implementation. |
scope | The memory scope. |
Definition at line 61 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc | ||
) | -> void |
Definition at line 61 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc, | ||
TExtent const & | extent | ||
) | -> void |
Definition at line 112 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc | ||
) | -> void |
Definition at line 86 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc, | ||
TExtent const & | extent | ||
) | -> void |
Definition at line 138 of file DeviceGlobalCpu.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
TViewSrc const & | viewSrc | ||
) | -> void |
Copies the entire memory of viewSrc to viewDst. Possibly copies between different memory spaces.
queue | The queue to enqueue the view copy task into. |
viewDst | [in,out] The destination memory view. May be a temporary object. |
viewSrc | The source memory view. May be a temporary object. |
Definition at line 307 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | TQueue & | queue, |
TViewDstFwd && | viewDst, | ||
TViewSrc const & | viewSrc, | ||
TExtent const & | extent | ||
) | -> void |
Copies memory from a part of viewSrc to viewDst, described by extent. Possibly copies between different memory spaces.
queue | The queue to enqueue the view copy task into. |
viewDst | [in,out] The destination memory view. May be a temporary object. |
viewSrc | The source memory view. May be a temporary object. |
extent | The extent of the view to copy. |
Definition at line 294 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc | ||
) |
Definition at line 86 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeDst > & | viewDst, | ||
TViewSrc const & | viewSrc, | ||
TExtent | extent | ||
) |
Definition at line 159 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
TViewDst & | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc | ||
) |
Definition at line 50 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memcpy | ( | uniform_cuda_hip::detail::QueueUniformCudaHipRt< TApi, TBlocking > & | queue, |
TViewDst & | viewDst, | ||
alpaka::detail::DevGlobalImplGeneric< TTag, TTypeSrc > & | viewSrc, | ||
TExtent | extent | ||
) |
Definition at line 123 of file DeviceGlobalUniformCudaHipBuiltIn.hpp.
ALPAKA_FN_HOST auto alpaka::memset | ( | TQueue & | queue, |
TViewFwd && | view, | ||
std::uint8_t const & | byte | ||
) | -> void |
Sets each byte of the memory of the entire view to the given value.
queue | The queue to enqueue the view fill task into. |
view | [in,out] The memory view to fill. May be a temporary object. |
byte | Value to set for each byte of the specified view. |
Definition at line 242 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::memset | ( | TQueue & | queue, |
TViewFwd && | view, | ||
std::uint8_t const & | byte, | ||
TExtent const & | extent | ||
) | -> void |
Sets the bytes of the memory of view, described by extent, to the given value.
queue | The queue to enqueue the view fill task into. |
view | [in,out] The memory view to fill. May be a temporary object. |
byte | Value to set for each byte of the specified view. |
extent | The extent of the view to fill. |
Definition at line 231 of file Traits.hpp.
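Because both memset overloads write byte values, filling elements wider than one byte repeats that byte in every byte position of each element. The sketch below shows the effect with std::memset from the C++ standard library on host memory. fillBytes is a hypothetical helper and does not use alpaka.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Fills a buffer of 32-bit elements byte by byte, the way a byte-wise memset does.
inline auto fillBytes(std::size_t count, std::uint8_t byte) -> std::vector<std::uint32_t>
{
    std::vector<std::uint32_t> buf(count);
    if(!buf.empty())
        std::memset(buf.data(), byte, buf.size() * sizeof(std::uint32_t));
    return buf;
}
```

With byte 0x01 each 32-bit element therefore reads 0x01010101, not 1. Only 0x00 and 0xFF give the same result at every element width.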
|
constexpr |
Squared magnitude.
Definition at line 483 of file Complex.hpp.
|
constexpr |
Inequality of two complex numbers.
Definition at line 311 of file Complex.hpp.
|
constexpr |
Inequality of a complex and a real number.
Definition at line 318 of file Complex.hpp.
|
constexpr |
Inequality of a real and a complex number.
Definition at line 326 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator* | ( | Complex< T > const & | lhs, |
Complex< T > const & | rhs | ||
) |
Multiplication of two complex numbers.
Definition at line 237 of file Complex.hpp.
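For reference, the textbook product of two complex numbers is (a + bi)(c + di) = (ac - bd) + (ad + bc)i. The sketch below writes it out with std::complex from standard C++. It is not alpaka's implementation, and mulSketch is a hypothetical name.

```cpp
#include <cassert>
#include <complex>

// Textbook product (a + bi)(c + di) = (ac - bd) + (ad + bc)i, written out by hand.
inline auto mulSketch(std::complex<double> const& lhs, std::complex<double> const& rhs) -> std::complex<double>
{
    return {
        lhs.real() * rhs.real() - lhs.imag() * rhs.imag(),
        lhs.real() * rhs.imag() + lhs.imag() * rhs.real()};
}
```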
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator* | ( | Complex< T > const & | lhs, |
T const & | rhs | ||
) |
Multiplication of a complex and a real number.
Definition at line 246 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator* | ( | T const & | lhs, |
Complex< T > const & | rhs | ||
) |
Multiplication of a real and a complex number.
Definition at line 253 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator+ | ( | Complex< T > const & | lhs, |
Complex< T > const & | rhs | ||
) |
Addition of two complex numbers.
Definition at line 195 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator+ | ( | Complex< T > const & | lhs, |
T const & | rhs | ||
) |
Addition of a complex and a real number.
Definition at line 202 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator+ | ( | Complex< T > const & | val | ) |
Unary plus (added for compatibility with std::complex).
This is one of the host-device arithmetic operations matching std::complex<T>. They take and return alpaka::Complex.
Definition at line 181 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator+ | ( | T const & | lhs, |
Complex< T > const & | rhs | ||
) |
Addition of a real and a complex number.
Definition at line 209 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator- | ( | Complex< T > const & | lhs, |
Complex< T > const & | rhs | ||
) |
Subtraction of two complex numbers.
Definition at line 216 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator- | ( | Complex< T > const & | lhs, |
T const & | rhs | ||
) |
Subtraction of a complex and a real number.
Definition at line 223 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator- | ( | Complex< T > const & | val | ) |
Unary minus.
Definition at line 188 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator- | ( | T const & | lhs, |
Complex< T > const & | rhs | ||
) |
Subtraction of a real and a complex number.
Definition at line 230 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator/ | ( | Complex< T > const & | lhs, |
Complex< T > const & | rhs | ||
) |
Division of two complex numbers.
Definition at line 260 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator/ | ( | Complex< T > const & | lhs, |
T const & | rhs | ||
) |
Division of a complex and a real number.
Definition at line 269 of file Complex.hpp.
ALPAKA_FN_HOST_ACC Complex<T> alpaka::operator/ | ( | T const & | lhs, |
Complex< T > const & | rhs | ||
) |
Division of a real and a complex number.
Definition at line 276 of file Complex.hpp.
std::basic_ostream<TChar, TTraits>& alpaka::operator<< | ( | std::basic_ostream< TChar, TTraits > & | os, |
Complex< T > const & | x | ||
) |
Host-only output of a complex number.
Definition at line 326 of file Complex.hpp.
|
constexpr |
Equality of two complex numbers.
Definition at line 285 of file Complex.hpp.
|
constexpr |
Equality of a complex and a real number.
Definition at line 293 of file Complex.hpp.
|
constexpr |
Equality of a real and a complex number.
Definition at line 301 of file Complex.hpp.
std::basic_istream<TChar, TTraits>& alpaka::operator>> | ( | std::basic_istream< TChar, TTraits > & | is, |
Complex< T > const & | x | ||
) |
Host-only input of a complex number.
Definition at line 344 of file Complex.hpp.
|
constexpr |
Get a complex number with given magnitude and phase angle.
Definition at line 491 of file Complex.hpp.
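As a reminder of what "magnitude and phase angle" means, a complex number with magnitude r and phase phi has real part r*cos(phi) and imaginary part r*sin(phi). The sketch below builds it by hand with std::complex; the helper name `fromPolar` is hypothetical and only illustrates the formula, it is not the alpaka implementation.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper: r * (cos(phi) + i * sin(phi)).
inline std::complex<double> fromPolar(double r, double phi)
{
    assert(r >= 0.0); // a magnitude is non-negative
    return {r * std::cos(phi), r * std::sin(phi)};
}
```

The result should agree with std::polar(r, phi) up to rounding.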
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::popcount | ( | TIntrinsic const & | intrinsic, |
std::uint32_t | value | ||
) | -> std::int32_t |
Returns the number of 1 bits in the given 32-bit value.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 38 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::popcount | ( | TIntrinsic const & | intrinsic, |
std::uint64_t | value | ||
) | -> std::int32_t |
Returns the number of 1 bits in the given 64-bit value.
TIntrinsic | The intrinsic implementation type. |
intrinsic | The intrinsic implementation. |
value | The input value. |
Definition at line 51 of file Traits.hpp.
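For checking results on the host, the documented behavior of both popcount overloads (the number of 1 bits) can be reproduced with a portable loop. This is a reference sketch only; `popcountReference` is a hypothetical name, and the alpaka functions themselves dispatch to the intrinsic implementation passed as the first argument.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side reference: repeatedly clears the lowest set bit
// and counts how many times that is possible.
inline std::int32_t popcountReference(std::uint64_t value)
{
    std::int32_t count = 0;
    while(value != 0)
    {
        value &= value - 1; // clears the lowest set bit
        ++count;
    }
    return count;
}
```

A std::uint32_t argument is widened to 64 bits without changing its bit count, so the same helper covers both widths.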
|
constexpr |
Complex power of a complex number.
Definition at line 499 of file Complex.hpp.
|
constexpr |
Real power of a complex number.
Definition at line 510 of file Complex.hpp.
|
constexpr |
Complex power of a real number.
Definition at line 518 of file Complex.hpp.
ALPAKA_FN_HOST auto alpaka::print | ( | TView const & | view, |
std::ostream & | os, | ||
std::string const & | elementSeparator = ", " , |
||
std::string const & | rowSeparator = "\n" , |
||
std::string const & | rowPrefix = "[" , |
||
std::string const & | rowSuffix = "]" |
||
) | -> void |
Prints the content of the view to the given output stream.
Definition at line 391 of file Traits.hpp.
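To make the role of the four string parameters concrete, here is a standalone sketch that formats a two-dimensional container with the same defaults. It assumes each row is wrapped in rowPrefix/rowSuffix, elements are joined by elementSeparator, and rows are joined by rowSeparator; the exact layout produced by alpaka::print may differ. `printRows` and `rowsToString` are hypothetical names, and a vector of rows stands in for a view.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for printing a 2D view, using the documented defaults.
inline void printRows(
    std::vector<std::vector<int>> const& rows,
    std::ostream& os,
    std::string const& elementSeparator = ", ",
    std::string const& rowSeparator = "\n",
    std::string const& rowPrefix = "[",
    std::string const& rowSuffix = "]")
{
    for(std::size_t r = 0; r < rows.size(); ++r)
    {
        if(r != 0)
            os << rowSeparator; // between rows only, not after the last one
        os << rowPrefix;
        for(std::size_t c = 0; c < rows[r].size(); ++c)
        {
            if(c != 0)
                os << elementSeparator; // between elements only
            os << rows[r][c];
        }
        os << rowSuffix;
    }
}

// Convenience wrapper that collects the output in a string.
inline std::string rowsToString(std::vector<std::vector<int>> const& rows)
{
    std::ostringstream os;
    printRows(rows, os);
    return os.str();
}
```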
|
constexpr |
Projection onto the Riemann sphere.
Definition at line 526 of file Complex.hpp.
ALPAKA_FN_HOST auto alpaka::reset | ( | TDev const & | dev | ) | -> void |
Resets the device. What this function does depends on the accelerator.
Definition at line 126 of file Traits.hpp.
|
constexpr |
Definition at line 90 of file Traits.hpp.
|
constexpr |
Sine.
Definition at line 534 of file Complex.hpp.
|
constexpr |
Hyperbolic sine.
Definition at line 542 of file Complex.hpp.
|
constexpr |
Square root.
Definition at line 550 of file Complex.hpp.
ALPAKA_FN_HOST auto alpaka::subDivideGridElems | ( | Vec< TDim, TIdx > const & | gridElemExtent, |
Vec< TDim, TIdx > const & | threadElemExtent, | ||
AccDevProps< TDim, TIdx > const & | accDevProps, | ||
bool | blockThreadMustDivideGridThreadExtent = true , |
||
GridBlockExtentSubDivRestrictions | gridBlockExtentSubDivRestrictions = GridBlockExtentSubDivRestrictions::Unrestricted |
||
) | -> WorkDivMembers<TDim, TIdx> |
Subdivides the given grid thread extent into blocks restricted by the maxima allowed.
gridElemExtent | The full extent of elements in the grid. |
threadElemExtent | The number of elements computed per thread. |
accDevProps | The maxima for the work division. |
blockThreadMustDivideGridThreadExtent | If this is true, the grid thread extent will be a multiple of the corresponding block thread extent. NOTE: If this is true and gridThreadExtent is prime (or otherwise badly chosen) in a dimension, the block thread extent will be one in this dimension. |
gridBlockExtentSubDivRestrictions | The grid block extent subdivision restrictions. |
Definition at line 129 of file WorkDivHelpers.hpp.
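The NOTE on blockThreadMustDivideGridThreadExtent follows from simple arithmetic: if the block thread extent must divide the grid thread extent and may not exceed a maximum, a prime grid thread extent above that maximum leaves 1 as the only choice. The one-dimensional sketch below illustrates the constraint only. It is not the algorithm used by subDivideGridElems, and `largestDividingBlockExtent` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: the largest extent that is <= maxBlockExtent
// and divides gridThreadExtent without remainder.
inline std::size_t largestDividingBlockExtent(std::size_t gridThreadExtent, std::size_t maxBlockExtent)
{
    assert(gridThreadExtent > 0 && maxBlockExtent > 0);
    std::size_t candidate = gridThreadExtent < maxBlockExtent ? gridThreadExtent : maxBlockExtent;
    while(gridThreadExtent % candidate != 0)
        --candidate; // terminates at 1 at the latest, since 1 divides every extent
    return candidate;
}
```

With a maximum of 256, a grid thread extent of 1024 allows a block thread extent of 256 and 1000 allows 250, but the prime 1021 only allows 1.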
|
constexpr |
TVec | has to specialize SubVecFromIndices. |
A sequence of integers from 0 to dim-1.
Definition at line 51 of file Traits.hpp.
|
constexpr |
TVec | has to specialize SubVecFromIndices. |
A sequence of integers from 0 to dim-1.
Definition at line 66 of file Traits.hpp.
|
constexpr |
Builds a new vector by selecting the elements of the source vector in the given order. Repeating and swizzling elements is allowed.
Definition at line 42 of file Traits.hpp.
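Selecting elements "in the given order", with repeats allowed, can be pictured with std::array and a compile-time index pack. This is an illustrative sketch (C++17) under the assumption that the result has one element per index; `selectByIndices` is a hypothetical name and does not use alpaka's Vec or its SubVecFromIndices trait.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: result[i] = source[Is_i] for each index in the pack.
// Indices may repeat and may appear in any order.
template<std::size_t... Is, typename T, std::size_t N>
constexpr std::array<T, sizeof...(Is)> selectByIndices(std::array<T, N> const& source)
{
    static_assert(((Is < N) && ...), "every index must be within the source");
    return {{source[Is]...}};
}
```

For a source of {10, 20, 30}, the index pack <2, 0, 0> yields {30, 10, 10}.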
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::syncBlockThreads | ( | TBlockSync const & | blockSync | ) | -> void |
Synchronizes all threads within the current block (independently for all blocks).
TBlockSync | The block synchronization implementation type. |
blockSync | The block synchronization implementation. |
Definition at line 36 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_ACC auto alpaka::syncBlockThreadsPredicate | ( | TBlockSync const & | blockSync, |
int | predicate | ||
) | -> int |
Synchronizes all threads within the current block (independently for all blocks), evaluates the predicate for all threads and returns the combination of all the results computed via TOp.
TOp | The operation used to combine the predicate values of all threads. |
TBlockSync | The block synchronization implementation type. |
blockSync | The block synchronization implementation. |
predicate | The predicate value of the current thread. |
Definition at line 100 of file Traits.hpp.
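The combination step can be modelled sequentially on the host: every thread contributes one predicate value, and TOp folds them into the single result that each thread receives. The three operations below (count of non-zero predicates, logical AND, logical OR) are assumed examples modelled on CUDA's __syncthreads_count, __syncthreads_and and __syncthreads_or; the names `CountNonZero`, `LogicalAnd`, `LogicalOr` and `combinePredicates` are hypothetical and are not alpaka identifiers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical combination operations.
struct CountNonZero
{
    static int identity() { return 0; }
    static int combine(int acc, int predicate) { return acc + (predicate != 0 ? 1 : 0); }
};
struct LogicalAnd
{
    static int identity() { return 1; }
    static int combine(int acc, int predicate) { return (acc != 0 && predicate != 0) ? 1 : 0; }
};
struct LogicalOr
{
    static int identity() { return 0; }
    static int combine(int acc, int predicate) { return (acc != 0 || predicate != 0) ? 1 : 0; }
};

// Sequential model of the block-wide reduction: one entry per thread in the block.
template<typename TOp>
int combinePredicates(std::vector<int> const& threadPredicates)
{
    assert(!threadPredicates.empty()); // a block has at least one thread
    int result = TOp::identity();
    for(int const predicate : threadPredicates)
        result = TOp::combine(result, predicate);
    return result;
}
```

On a real accelerator the combination happens as part of the barrier, so all threads of the block observe the same return value.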
|
constexpr |
Tangent.
Definition at line 558 of file Complex.hpp.
|
constexpr |
Hyperbolic tangent.
Definition at line 566 of file Complex.hpp.
|
constexpr |
alpaka::Vec | ( | TFirstIndex && | , |
TRestIndices && | ... | ||
) | -> Vec< DimInt< 1+sizeof...(TRestIndices)>, std::decay_t< TFirstIndex >> |
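The deduction guide above takes the dimension from the number of constructor arguments and the value type from the decayed type of the first argument. The sketch below reproduces that pattern on minimal stand-in types (C++17). `VecSketch` and `DimSketch` are hypothetical and much simpler than alpaka's Vec and DimInt.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a compile-time dimension.
template<std::size_t N>
using DimSketch = std::integral_constant<std::size_t, N>;

// Hypothetical stand-in for a fixed-size vector.
template<typename TDim, typename TVal>
struct VecSketch
{
    // Converts every argument to TVal and stores it.
    template<typename... TArgs>
    constexpr VecSketch(TArgs&&... args) : values{static_cast<TVal>(std::forward<TArgs>(args))...}
    {
    }

    TVal values[TDim::value];
};

// Same shape as the documented guide: dimension = argument count,
// value type = decayed type of the first argument.
template<typename TFirst, typename... TRest>
VecSketch(TFirst&&, TRest&&...) -> VecSketch<DimSketch<1 + sizeof...(TRest)>, std::decay_t<TFirst>>;
```

With this guide, `VecSketch(1, 2L, 3)` deduces `VecSketch<DimSketch<3>, int>`: three arguments give dimension 3, and the first argument fixes the value type even though the second is a long.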
alpaka::ViewConst | ( | TView | ) | -> ViewConst< std::decay_t< TView >> |
ALPAKA_FN_HOST auto alpaka::wait | ( | TAwaited const & | awaited | ) | -> void |
Blocks the calling thread until the given awaited action completes.
Special handling for events: if the event is re-enqueued, wait() returns when the re-enqueued event becomes ready; previously enqueued states of the event are ignored.
Definition at line 34 of file Traits.hpp.
ALPAKA_FN_HOST auto alpaka::wait | ( | TWaiter & | waiter, |
TAwaited const & | awaited | ||
) | -> void |
The waiter waits for the given awaited action to complete.
Special handling if waiter is a queue and awaited is an event: the waiter waits for the event state to become ready based on the recently captured event state at the time of the API call, even if the event is being re-enqueued later.
Definition at line 46 of file Traits.hpp.
ALPAKA_NO_HOST_ACC_WARNING ALPAKA_FN_HOST_ACC alpaka::WorkDivMembers | ( | alpaka::Vec< TDim, TIdx > const & | gridBlockExtent, |
alpaka::Vec< TDim, TIdx > const & | blockThreadExtent, | ||
alpaka::Vec< TDim, TIdx > const & | elemExtent | ||
) | -> WorkDivMembers< TDim, TIdx > |
Deduction guide for the constructor which can be called without explicit template type parameters.
|
inline constexpr |
|
constexpr |
Definition at line 14 of file BlockSharedDynMemberAllocKiB.hpp.
|
inline constexpr |
Checks if the given device can allocate a stream-ordered memory buffer of the given dimensionality.
TDev | The type of device to allocate the buffer on. |
TDim | The dimensionality of the buffer to allocate. |
Definition at line 95 of file Traits.hpp.
|
inline constexpr |
Checks if the host can allocate pinned/mapped host memory, accessible by all devices in the given platform.
TPlatform | The platform from which the buffer is accessible. |
Definition at line 156 of file Traits.hpp.
|
inline constexpr |
|
inline constexpr |
True if TAcc is an accelerator, i.e. if it implements the ConceptAcc concept.
Definition at line 30 of file Traits.hpp.
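The boolean traits in this part of the page are inline constexpr variable templates that are usable in static_assert and if constexpr. How alpaka decides that a type implements a concept is not stated here, so the sketch below assumes a simple tag base class. All names (`AccTagSketch`, `MyAccSketch`, `NotAnAcc`, `isAcceleratorSketch`) are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tag that marks a type as implementing the accelerator concept.
struct AccTagSketch
{
};

struct MyAccSketch : AccTagSketch
{
};

struct NotAnAcc
{
};

// Hypothetical trait: true if the decayed type derives from the tag.
template<typename T>
inline constexpr bool isAcceleratorSketch = std::is_base_of_v<AccTagSketch, std::decay_t<T>>;

// Typical use: reject unsupported types at compile time.
static_assert(isAcceleratorSketch<MyAccSketch>, "MyAccSketch should be detected");
static_assert(!isAcceleratorSketch<NotAnAcc>, "NotAnAcc should be rejected");
```

Decaying the type first means references and cv-qualified types are classified like the underlying type.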
|
inline constexpr |
True if TDev is a device, i.e. if it implements the ConceptDev concept.
Definition at line 64 of file Traits.hpp.
|
inline constexpr |
Definition at line 224 of file Traits.hpp.
|
inline constexpr |
True if TPlatform is a platform, i.e. if it implements the ConceptPlatform concept.
Definition at line 23 of file Traits.hpp.
|
inline constexpr |
True if TQueue is a queue, i.e. if it implements the ConceptQueue concept.
Definition at line 20 of file Traits.hpp.
|
inline constexpr |
|
inline constexpr |