Module cublas

Sys-level safe wrappers for the cuBLAS entry points cudarc 0.19 doesn’t expose through its safe layer.

Wrapped today (Phase 1 cuBLAS slice):

  • cublasGemmEx, cublasGemmStridedBatchedEx
  • cublasAxpyEx, cublasScalEx, cublasNrm2Ex, cublasDotEx
  • cublasIamaxEx, cublasIaminEx, cublasAsumEx
  • cublasCopyEx, cublasSwapEx, cublasRotEx
  • cublasSgemv_v2/cublasDgemv_v2, cublasSger_v2/cublasDger_v2
  • cublasSgeam/cublasDgeam
  • cublasSsyrk_v2/cublasDsyrk_v2
  • cublasStrsm_v2/cublasDtrsm_v2
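
Of the entry points listed above, cublasGemmStridedBatchedEx has the least obvious memory layout: batch p reads and writes at a fixed element offset of p times the stride in each buffer, and the storage type can be narrower than the type used for accumulation. The CPU reference below is a minimal sketch of that layout, not this module's wrapper. The name `gemm_strided_batched_ref`, its parameter list, the choice of i8 storage with i32 accumulation, and the omission of transpose flags are all assumptions made for illustration.

```rust
/// Hypothetical CPU reference (not the wrapper in this module) for a
/// strided-batched GEMM without transposes, column-major:
///   C_p = alpha * A_p * B_p + beta * C_p   for p in 0..batch
/// where X_p begins at element offset p * stride_x in its buffer.
/// Inputs are stored as i8; products are accumulated and written as i32.
pub fn gemm_strided_batched_ref(
    m: usize, n: usize, k: usize,
    alpha: i32,
    a: &[i8], lda: usize, stride_a: usize,
    b: &[i8], ldb: usize, stride_b: usize,
    beta: i32,
    c: &mut [i32], ldc: usize, stride_c: usize,
    batch: usize,
) {
    for p in 0..batch {
        // Each batch is a plain GEMM on sub-slices starting at p * stride.
        let a_p = &a[p * stride_a..];
        let b_p = &b[p * stride_b..];
        let c_p = &mut c[p * stride_c..];
        for j in 0..n {
            for i in 0..m {
                // Widen every operand before multiplying so the sum
                // is carried in the accumulation type, not the storage type.
                let mut acc: i32 = 0;
                for l in 0..k {
                    acc += i32::from(a_p[i + l * lda]) * i32::from(b_p[l + j * ldb]);
                }
                c_p[i + j * ldc] = alpha * acc + beta * c_p[i + j * ldc];
            }
        }
    }
}
```

A reference like this can serve as an oracle in tests that compare a GPU result against a CPU result on small inputs.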

Every caller must keep the cuBLAS handle’s stream current on the OS thread that issues the call. The atomr-accel-cuda actor pipeline guarantees this through GpuDispatcher.
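
One common way to satisfy a same-thread requirement like this is to confine all handle use to a single worker thread and send it closures over a channel. The sketch below illustrates that pattern using only the standard library. `Dispatcher`, `run`, and `Job` are hypothetical names; this page does not show GpuDispatcher's API or implementation, and the sketch makes no CUDA calls.

```rust
use std::sync::mpsc;
use std::thread::{self, JoinHandle};

/// A unit of work to execute on the worker thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Hypothetical single-thread dispatcher: every job runs on one OS thread,
/// so thread-bound state (such as a handle and its stream) is never
/// touched from anywhere else.
pub struct Dispatcher {
    tx: Option<mpsc::Sender<Job>>,
    worker: Option<JoinHandle<()>>,
}

impl Dispatcher {
    pub fn new() -> Self {
        let (tx, rx) = mpsc::channel::<Job>();
        let worker = thread::spawn(move || {
            // Assumption: a real implementation would create the handle
            // and bind its stream here, on this thread.
            for job in rx {
                job();
            }
        });
        Dispatcher { tx: Some(tx), worker: Some(worker) }
    }

    /// Run `f` on the worker thread and block until its result arrives.
    pub fn run<R, F>(&self, f: F) -> R
    where
        F: FnOnce() -> R + Send + 'static,
        R: Send + 'static,
    {
        let (result_tx, result_rx) = mpsc::channel();
        let job: Job = Box::new(move || {
            let _ = result_tx.send(f());
        });
        self.tx
            .as_ref()
            .expect("dispatcher is running")
            .send(job)
            .expect("worker thread is alive");
        result_rx.recv().expect("worker dropped the job")
    }
}

impl Drop for Dispatcher {
    fn drop(&mut self) {
        // Dropping the sender closes the channel and ends the worker loop;
        // then wait for the worker thread to exit.
        self.tx.take();
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}
```

With this shape, callers on any thread submit work through `run`, and the closure body is the only place where thread-bound calls would be made.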

Functions§

asum_ex
axpy_ex
copy_ex
dgeam
dgemv
dger
dot_ex
dsyrk
dtrsm
gemm_ex
cublasGemmEx — type-erased gemm with a separate compute type.
gemm_strided_batched_ex
cublasGemmStridedBatchedEx — type-erased strided-batched gemm.
iamax_ex
iamin_ex
nrm2_ex
rot_ex
scal_ex
sgeam
cublasSgeam / cublasDgeam — matrix add/scale: C = α·op(A) + β·op(B).
sgemv
sger
ssyrk
strsm
swap_ex
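
The formula given for sgeam, C = α·op(A) + β·op(B), can be made concrete with a small CPU reference. The sketch below assumes column-major storage with explicit leading dimensions, as cuBLAS uses. `Op`, `at`, and `geam_ref` are hypothetical names for illustration and do not reflect the signatures of `sgeam` or `dgeam` in this module.

```rust
/// Transpose flag, standing in for CUBLAS_OP_N / CUBLAS_OP_T.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Op {
    N,
    T,
}

/// Element (i, j) of op(M) for a column-major matrix M with leading
/// dimension `ld`.
fn at(m: &[f32], ld: usize, op: Op, i: usize, j: usize) -> f32 {
    match op {
        Op::N => m[i + j * ld],
        // Reading the transpose swaps the roles of row and column.
        Op::T => m[j + i * ld],
    }
}

/// Hypothetical CPU reference for geam:
///   C = alpha * op(A) + beta * op(B)
/// where C is rows x cols, column-major, with leading dimension `ldc`.
pub fn geam_ref(
    op_a: Op, op_b: Op,
    rows: usize, cols: usize,
    alpha: f32, a: &[f32], lda: usize,
    beta: f32, b: &[f32], ldb: usize,
    c: &mut [f32], ldc: usize,
) {
    for j in 0..cols {
        for i in 0..rows {
            c[i + j * ldc] =
                alpha * at(a, lda, op_a, i, j) + beta * at(b, ldb, op_b, i, j);
        }
    }
}
```

For example, with a 2×3 matrix A, a 3×2 matrix B passed with `Op::T`, α = 2 and β = 1, each output element is 2·A[i][j] + B[j][i].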