Module dtype

CudaDtype — CUDA-side dtype mappings and capability markers.

The backend-agnostic AccelDtype trait (in atomr-accel) names the dtype and gives identity values. CudaDtype adds the cudarc enum mappings every kernel actor needs:

  • cuda_data_type → cudaDataType_t (consumed by cuBLAS, cuBLASLt, cuSPARSE, cuSOLVER, cuTENSOR).
  • cublas_compute_type — the natural cublasComputeType_t for matmul accumulation.
  • cudnn_data_type → cudnnDataType_t (cuDNN tensor descriptor element type), gated on cudnn.
  • nccl_data_type → ncclDataType_t (collective-op element type), gated on nccl.
  • cuda_type_name — CUDA C++ type name ("float", "__half", "__nv_bfloat16", …) for NVRTC kernel source generation.
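The layering described above can be sketched as follows. This is a minimal illustration, not the crate's actual definition: `DataType` and `ComputeType` are hypothetical local stand-ins for the cudarc enums, the `NAME`/`ZERO`/`ONE` constants and the method-style signatures are assumptions, and `fill_kernel_source` is an invented helper showing how a type name might feed NVRTC source generation.

```rust
// Hypothetical stand-ins for cudaDataType_t and cublasComputeType_t.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DataType {
    R32F,
    R64F,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ComputeType {
    Compute32F,
    Compute64F,
}

/// Simplified stand-in for the backend-agnostic trait:
/// it names the dtype and gives identity values.
pub trait AccelDtype: Copy + 'static {
    const NAME: &'static str;
    const ZERO: Self;
    const ONE: Self;
}

/// Simplified stand-in for the CUDA-specific layer (signatures assumed).
pub trait CudaDtype: AccelDtype {
    fn cuda_data_type() -> DataType;
    fn cublas_compute_type() -> ComputeType;
    fn cuda_type_name() -> &'static str;
}

impl AccelDtype for f32 {
    const NAME: &'static str = "f32";
    const ZERO: Self = 0.0;
    const ONE: Self = 1.0;
}

impl CudaDtype for f32 {
    fn cuda_data_type() -> DataType { DataType::R32F }
    fn cublas_compute_type() -> ComputeType { ComputeType::Compute32F }
    fn cuda_type_name() -> &'static str { "float" }
}

impl AccelDtype for f64 {
    const NAME: &'static str = "f64";
    const ZERO: Self = 0.0;
    const ONE: Self = 1.0;
}

impl CudaDtype for f64 {
    fn cuda_data_type() -> DataType { DataType::R64F }
    fn cublas_compute_type() -> ComputeType { ComputeType::Compute64F }
    fn cuda_type_name() -> &'static str { "double" }
}

/// Hypothetical helper: builds CUDA C++ source for a fill kernel,
/// substituting the element type name for the chosen dtype.
pub fn fill_kernel_source<T: CudaDtype>() -> String {
    format!(
        "extern \"C\" __global__ void fill({ty}* out, {ty} value, int n) {{\n    int i = blockIdx.x * blockDim.x + threadIdx.x;\n    if (i < n) out[i] = value;\n}}\n",
        ty = T::cuda_type_name()
    )
}
```

Keeping the mappings on a trait lets generic kernel code ask the element type for its CUDA-side description instead of matching on a runtime tag.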

Capability markers (GemmSupported, CudnnSupported, …) act as a compile-time gate that keeps operations from being dispatched against unsupported dtypes: BlasMsg::gemm::<i64>(...) does not compile because GemmSupported is not implemented for i64.
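A minimal sketch of the marker-trait gate, assuming a free function `gemm` in place of the real `BlasMsg::gemm` constructor. The function body is a naive host-side matrix multiply used only to make the example self-contained; the extra arithmetic bounds are assumptions of this sketch, not part of the crate's marker.

```rust
use std::ops::{Add, Mul};

/// Marker trait: it has no methods, only membership.
pub trait GemmSupported {}

impl GemmSupported for f32 {}
impl GemmSupported for f64 {}
// Deliberately no `impl GemmSupported for i64`.

/// Hypothetical GEMM entry point: C = A (m x k) * B (k x n), row-major.
/// The `T: GemmSupported` bound is the compile-time gate.
pub fn gemm<T>(m: usize, n: usize, k: usize, a: &[T], b: &[T]) -> Vec<T>
where
    T: GemmSupported + Copy + Default + Add<Output = T> + Mul<Output = T>,
{
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut c = vec![T::default(); m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = T::default();
            for p in 0..k {
                acc = acc + a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    c
}

// Uncommenting the next function is rejected by the compiler, because the
// bound `i64: GemmSupported` is not satisfied:
//
// pub fn rejected() -> Vec<i64> {
//     gemm::<i64>(1, 1, 1, &[1], &[1])
// }
```

Because the check is a trait bound, an unsupported dtype is reported at the call site during compilation rather than as a runtime error from the library.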

Structs§

C32
64-bit interleaved complex ({re, im} of f32). Layout matches cufft_sys::float2 and numpy.complex64. Phase 1.5++. Local fp8 / fp4 wrappers (#[repr(transparent)] over u8) that satisfy cudarc’s orphan-rule constraint for unsafe impl DeviceRepr. Convertible from/to the backend-agnostic atomr_accel::dtype::* equivalents.
C64
128-bit interleaved complex ({re, im} of f64). Layout matches cufft_sys::double2 and numpy.complex128. Phase 1.5++.
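The interleaved layout can be illustrated with a small sketch. The field names `re` and `im` follow the descriptions above; the `#[repr(C)]` attribute, the derives, and the `as_interleaved` helper are assumptions of this example rather than the crate's confirmed definitions.

```rust
/// Sketch of a 64-bit interleaved complex value: two f32 fields, re then im.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct C32 {
    pub re: f32,
    pub im: f32,
}

/// Sketch of a 128-bit interleaved complex value: two f64 fields, re then im.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Default)]
pub struct C64 {
    pub re: f64,
    pub im: f64,
}

/// Hypothetical helper: flattens complex values into the
/// [re0, im0, re1, im1, ...] order that an interleaved buffer uses.
pub fn as_interleaved(values: &[C32]) -> Vec<f32> {
    values.iter().flat_map(|c| [c.re, c.im]).collect()
}
```

With `#[repr(C)]`, field order and size are fixed, so a slice of such values has the same byte layout as a flat array of alternating real and imaginary parts.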

Enums§

DType
Re-export atomr_accel::DType so existing crate::dtype::DType imports inside atomr-accel-cuda (added by Phase 0.4) keep working without changing every call site. Compact discriminant for AccelDtype::KIND.
DTypeKind
Alias used by BlasLtDispatch::dtype_kind and other Phase 1 dispatchers. Compact discriminant for AccelDtype::KIND.

Traits§

AccelDtype
Re-export so crate::dtype::AccelDtype resolves for actor modules that prefer the unified import path. Marker for any numeric type that can be a typed device buffer element across atomr-accel backends.
AxpyDotNrm2Supported
CudaDtype
CUDA-specific layer over AccelDtype.
CudnnSupported
Capability marker — type may be a cuDNN tensor element.
FftSupported
Capability marker — type may be a cuFFT element.
GeamSupported
GemmSupported
Capability marker — type may be a cuBLAS GEMM operand.
GemvSupported
GerSupported
NcclReduceSupported
Capability marker — type may be an NCCL collective-op element.
RngFloatSupported
Capability marker — type may be a cuRAND distribution element (Self is one of the float dtypes accepted by curandGenerate*).
RngIntSupported
Capability marker — cuRAND integer-fill operand. curandGenerate produces u32, curandGenerateLongLong produces u64. Used by Discrete and raw-bit paths.
SolverSupported
Capability marker — type may be a cuSOLVER dense factorization element (real or complex float).
SparseIndex
Phase 4 cuSPARSE index-type marker. Only i32 and i64 are representable cuSPARSE row/col index dtypes.
SparseSupported
Capability marker — type may be a cuSPARSE SpMV/SpMM/SpGEMM element.
SyrkSupported
TensorSupported
Capability marker — type may be a cuTENSOR contraction operand.
TrsmSupported
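As an illustration of how an index marker and an element marker can combine, the sketch below runs a host-side CSR sparse matrix-vector product that is generic over both. The `to_usize` method, the arithmetic supertraits, and the `csr_spmv` function are hypothetical; the crate's own markers may carry no methods at all. Indices are assumed to be non-negative.

```rust
use std::ops::{Add, Mul};

/// Sketch of an index marker: only i32 and i64 implement it.
/// `to_usize` is a hypothetical convenience for this host-side example.
pub trait SparseIndex: Copy {
    fn to_usize(self) -> usize;
}

impl SparseIndex for i32 {
    fn to_usize(self) -> usize { self as usize }
}

impl SparseIndex for i64 {
    fn to_usize(self) -> usize { self as usize }
}

/// Sketch of an element marker (arithmetic bounds assumed for the example).
pub trait SparseSupported: Copy + Default + Add<Output = Self> + Mul<Output = Self> {}

impl SparseSupported for f32 {}
impl SparseSupported for f64 {}

/// y = A * x for a matrix A stored in CSR form.
pub fn csr_spmv<I: SparseIndex, T: SparseSupported>(
    row_ptr: &[I],
    col_idx: &[I],
    vals: &[T],
    x: &[T],
) -> Vec<T> {
    let rows = row_ptr.len().saturating_sub(1);
    let mut y = vec![T::default(); rows];
    for r in 0..rows {
        let start = row_ptr[r].to_usize();
        let end = row_ptr[r + 1].to_usize();
        let mut acc = T::default();
        for k in start..end {
            acc = acc + vals[k] * x[col_idx[k].to_usize()];
        }
        y[r] = acc;
    }
    y
}
```

Calling `csr_spmv` with, say, `u8` indices fails to compile in this sketch, which mirrors the way the markers above restrict index and element types.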