pub struct MatmulRequest<T: GemmSupported> {
pub a: GpuRef<T>,
pub b: GpuRef<T>,
pub c: GpuRef<T>,
pub d: Option<GpuRef<T>>,
pub m: i32,
pub n: i32,
pub k: i32,
pub alpha: T::Scalar,
pub beta: T::Scalar,
pub transa: bool,
pub transb: bool,
pub lda: i64,
pub ldb: i64,
pub ldc: i64,
pub ldd: i64,
pub epilogue: Epilogue,
pub bias: Option<GpuRef<T>>,
pub gelu_aux: Option<GpuRef<T>>,
pub scales: ScaleSet,
pub workspace_size: usize,
pub reply: Sender<Result<(), GpuError>>,
}
Typed matmul request. Public surface; instantiated by callers.
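For orientation, the fields appear to follow the usual cuBLASLt GEMM form, D = alpha * op(A) * op(B) + beta * C, where op transposes its operand when `transa` or `transb` is set. The sketch below is a CPU reference for that formula, not this crate's code. `gemm_ref` and its plain-slice arguments are hypothetical; it assumes column-major storage and `f32`, and it ignores `epilogue`, `bias`, `gelu_aux`, and `scales`.

```rust
/// Hypothetical CPU reference for D = alpha * op(A) * op(B) + beta * C.
/// Column-major storage: element (row, col) lives at `row + col * ld`.
#[allow(clippy::too_many_arguments)]
fn gemm_ref(
    transa: bool,
    transb: bool,
    m: usize,
    n: usize,
    k: usize,
    alpha: f32,
    a: &[f32],
    lda: usize,
    b: &[f32],
    ldb: usize,
    beta: f32,
    c: &[f32],
    ldc: usize,
    d: &mut [f32],
    ldd: usize,
) {
    for col in 0..n {
        for row in 0..m {
            let mut acc = 0.0f32;
            for p in 0..k {
                // op(A)[row, p]: read A[p, row] when transposed.
                let a_val = if transa { a[p + row * lda] } else { a[row + p * lda] };
                // op(B)[p, col]: read B[col, p] when transposed.
                let b_val = if transb { b[col + p * ldb] } else { b[p + col * ldb] };
                acc += a_val * b_val;
            }
            // Out-of-place: C is only read, the result is written to D.
            d[row + col * ldd] = alpha * acc + beta * c[row + col * ldc];
        }
    }
}

fn main() {
    // A = [[1, 2], [3, 4]], B = identity, C = all ones (column-major).
    let a: [f32; 4] = [1.0, 3.0, 2.0, 4.0];
    let b: [f32; 4] = [1.0, 0.0, 0.0, 1.0];
    let c: [f32; 4] = [1.0; 4];
    let mut d = [0.0f32; 4];
    gemm_ref(false, false, 2, 2, 2, 2.0, &a, 2, &b, 2, 0.5, &c, 2, &mut d, 2);
    println!("{:?}", d);
}
```

Writing to a separate `d` slice mirrors the out-of-place case described for the `d` field. An in-place call would accumulate into the buffer that holds C.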
Fields§
a: GpuRef<T>
b: GpuRef<T>
c: GpuRef<T>
d: Option<GpuRef<T>>
Optional explicit D output buffer. cuBLASLt allows out-of-place matmul where the result lands in D rather than in-place into C. Required for fp8 (the scale-back step produces a different dtype than the accumulator).
m: i32
n: i32
k: i32
alpha: T::Scalar
beta: T::Scalar
transa: bool
transb: bool
lda: i64
ldb: i64
ldc: i64
ldd: i64
epilogue: Epilogue
bias: Option<GpuRef<T>>
gelu_aux: Option<GpuRef<T>>
scales: ScaleSet
workspace_size: usize
Hint for the heuristic: maximum workspace bytes the algorithm search may use. A reasonable default is 4 * 1024 * 1024 (cuBLASLt’s standard 4 MiB minimum).
reply: Sender<Result<(), GpuError>>
Trait Implementations§
impl BlasLtDispatch for MatmulRequest<f32>
fn dtype_kind(&self) -> DTypeKind
fn dispatch(self: Box<Self>, ctx: &BlasLtDispatchCtx<'_>)
impl BlasLtDispatch for MatmulRequest<f64>
f64 (cudarc 0.19.4 has no Matmul<f64> impl).
fn dtype_kind(&self) -> DTypeKind
fn dispatch(self: Box<Self>, _ctx: &BlasLtDispatchCtx<'_>)
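The two impls above share one trait with a `self: Box<Self>` receiver, which suggests that requests of different dtypes can sit in a single queue as trait objects. The sketch below illustrates that pattern only. `Request`, `Dispatch`, `Ctx`, `run_queue`, and the simplified `DTypeKind` and `GpuError` are hypothetical stand-ins, `Sender` is assumed to be `std::sync::mpsc::Sender`, and the f64 arm replying with an error is a guess at the behaviour rather than something this page states.

```rust
use std::marker::PhantomData;
use std::sync::mpsc::{channel, Sender};

// Hypothetical stand-ins; the real DTypeKind, GpuError, and context differ.
#[derive(Debug, PartialEq)]
enum DTypeKind {
    F32,
    F64,
}

#[derive(Debug, PartialEq)]
enum GpuError {
    Unsupported(&'static str),
}

struct Ctx; // stand-in for BlasLtDispatchCtx<'_>

struct Request<T> {
    reply: Sender<Result<(), GpuError>>,
    _dtype: PhantomData<T>,
}

trait Dispatch {
    fn dtype_kind(&self) -> DTypeKind;
    // Taking Box<Self> consumes the request, so it cannot be dispatched twice.
    fn dispatch(self: Box<Self>, ctx: &Ctx);
}

impl Dispatch for Request<f32> {
    fn dtype_kind(&self) -> DTypeKind {
        DTypeKind::F32
    }
    fn dispatch(self: Box<Self>, _ctx: &Ctx) {
        // A real implementation would launch the matmul before replying.
        let _ = self.reply.send(Ok(()));
    }
}

impl Dispatch for Request<f64> {
    fn dtype_kind(&self) -> DTypeKind {
        DTypeKind::F64
    }
    fn dispatch(self: Box<Self>, _ctx: &Ctx) {
        // Assumption: with no backend impl, report an error instead of running.
        let _ = self.reply.send(Err(GpuError::Unsupported("f64 matmul")));
    }
}

// Drain a mixed-dtype queue; each request answers on its own channel.
fn run_queue(queue: Vec<Box<dyn Dispatch>>, ctx: &Ctx) {
    for req in queue {
        req.dispatch(ctx);
    }
}

fn main() {
    let (tx32, rx32) = channel();
    let (tx64, rx64) = channel();
    let queue: Vec<Box<dyn Dispatch>> = vec![
        Box::new(Request::<f32> { reply: tx32, _dtype: PhantomData }),
        Box::new(Request::<f64> { reply: tx64, _dtype: PhantomData }),
    ];
    run_queue(queue, &Ctx);
    println!("{:?} {:?}", rx32.recv(), rx64.recv());
}
```

Type erasure keeps the queue's element type independent of `T`, while `dtype_kind` still lets the consumer inspect the element type before dispatching.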
Auto Trait Implementations§
impl<T> Freeze for MatmulRequest<T>
impl<T> !RefUnwindSafe for MatmulRequest<T>
impl<T> Send for MatmulRequest<T>
impl<T> Sync for MatmulRequest<T>
impl<T> Unpin for MatmulRequest<T>
impl<T> UnsafeUnpin for MatmulRequest<T>
impl<T> !UnwindSafe for MatmulRequest<T>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.