Epilogue enum — atomr-accel’s curated mapping over cuBLASLt’s
cublasLtEpilogue_t.
cuBLASLt fuses post-matmul ops (bias add, activation, and the
aux/pre-activation outputs saved for the backward pass) into the
matmul kernel itself. The full set is large and
version-dependent; we expose the variants that matter for
transformer training/inference: bias, ReLU/GeLU forward + aux,
ReLU/GeLU backward (drelu/dgelu) with optional bias gradient,
and the BGRADA/BGRADB reduction-only variants used by mixed
optimizer/data-parallel pipelines.
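
As a rough illustration of the grouping described above, the sketch below shows what such a curated enum could look like. The variant names and the helper methods (`cublaslt_name`, `has_bias`, `writes_bias_grad`) are assumptions made for this example and are not taken from atomr-accel; only the `CUBLASLT_EPILOGUE_*` constant names come from cuBLASLt itself.

```rust
/// Hypothetical stand-in for the curated enum; variant names are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Epilogue {
    Default,
    Bias,
    Relu,
    ReluBias,
    ReluAux,
    ReluAuxBias,
    Gelu,
    GeluBias,
    GeluAux,
    GeluAuxBias,
    DRelu,
    DReluBgrad,
    DGelu,
    DGeluBgrad,
    BgradA,
    BgradB,
}

impl Epilogue {
    /// Name of the matching `cublasLtEpilogue_t` constant.
    pub fn cublaslt_name(self) -> &'static str {
        match self {
            Epilogue::Default => "CUBLASLT_EPILOGUE_DEFAULT",
            Epilogue::Bias => "CUBLASLT_EPILOGUE_BIAS",
            Epilogue::Relu => "CUBLASLT_EPILOGUE_RELU",
            Epilogue::ReluBias => "CUBLASLT_EPILOGUE_RELU_BIAS",
            Epilogue::ReluAux => "CUBLASLT_EPILOGUE_RELU_AUX",
            Epilogue::ReluAuxBias => "CUBLASLT_EPILOGUE_RELU_AUX_BIAS",
            Epilogue::Gelu => "CUBLASLT_EPILOGUE_GELU",
            Epilogue::GeluBias => "CUBLASLT_EPILOGUE_GELU_BIAS",
            Epilogue::GeluAux => "CUBLASLT_EPILOGUE_GELU_AUX",
            Epilogue::GeluAuxBias => "CUBLASLT_EPILOGUE_GELU_AUX_BIAS",
            Epilogue::DRelu => "CUBLASLT_EPILOGUE_DRELU",
            Epilogue::DReluBgrad => "CUBLASLT_EPILOGUE_DRELU_BGRAD",
            Epilogue::DGelu => "CUBLASLT_EPILOGUE_DGELU",
            Epilogue::DGeluBgrad => "CUBLASLT_EPILOGUE_DGELU_BGRAD",
            Epilogue::BgradA => "CUBLASLT_EPILOGUE_BGRADA",
            Epilogue::BgradB => "CUBLASLT_EPILOGUE_BGRADB",
        }
    }

    /// True for forward epilogues that add a bias vector.
    pub fn has_bias(self) -> bool {
        matches!(
            self,
            Epilogue::Bias
                | Epilogue::ReluBias
                | Epilogue::ReluAuxBias
                | Epilogue::GeluBias
                | Epilogue::GeluAuxBias
        )
    }

    /// True for epilogues that produce a bias-gradient reduction.
    pub fn writes_bias_grad(self) -> bool {
        matches!(
            self,
            Epilogue::DReluBgrad
                | Epilogue::DGeluBgrad
                | Epilogue::BgradA
                | Epilogue::BgradB
        )
    }
}
```

Keeping the enum fieldless and `Copy` means a caller can pass it by value and match on it cheaply, for example `Epilogue::GeluAuxBias.cublaslt_name()` when logging which fused epilogue was selected.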
Cache key compatibility: Epilogue derives Hash + Eq so
HeuristicKey can fold it into the (m, n, k, dtype, layout, epilogue, arch)
cache key without a custom impl.
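
A minimal sketch of why the derives matter, assuming a hypothetical shape for `HeuristicKey` and its cache: the field types, the `DType` and `Layout` enums, the trimmed-down `Epilogue`, and the `HeuristicCache` wrapper are all invented for illustration and may differ from the crate's real definitions.

```rust
use std::collections::HashMap;

// Trimmed-down stand-in for the real enum (assumed variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Epilogue {
    Default,
    Bias,
    GeluAuxBias,
}

// Hypothetical element type and layout tags.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum DType {
    F16,
    Bf16,
    F32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Layout {
    RowMajor,
    ColMajor,
}

// Because every field derives Hash + Eq, the whole key can derive them too.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct HeuristicKey {
    pub m: u64,
    pub n: u64,
    pub k: u64,
    pub dtype: DType,
    pub layout: Layout,
    pub epilogue: Epilogue,
    pub arch: u32,
}

/// Hypothetical cache mapping a key to a chosen algorithm index.
pub struct HeuristicCache {
    entries: HashMap<HeuristicKey, usize>,
}

impl HeuristicCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Returns the cached choice, running `pick` only on a miss.
    pub fn get_or_insert_with(
        &mut self,
        key: HeuristicKey,
        pick: impl FnOnce() -> usize,
    ) -> usize {
        *self.entries.entry(key).or_insert_with(pick)
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With this shape, two GEMMs that differ only in their epilogue hash to different entries, so a heuristic chosen for a plain bias add is never reused for a GeLU-with-aux call of the same dimensions.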
Enums§
- Epilogue: Curated epilogue matrix.