pub fn run_kernel<O, KA, F>(
    lib_tag: &'static str,
    stream: &Arc<CudaStream>,
    completion: &Arc<dyn CompletionStrategy>,
    output: O,
    reply: Sender<Result<O, GpuError>>,
    enqueue: F,
)
Run the synchronous-enqueue + async-completion-await pipeline.
enqueue runs immediately on the calling actor’s task. On success
it returns the keep-alive tuple — anything that must outlive
the kernel (input Arc<CudaSlice<T>>s, the unwrapped write
target, descriptor handles, etc.). The envelope spawns a Tokio
task that awaits CompletionStrategy::await_completion for
stream, replies via reply, and drops the keep-alive only
after completion.
lib_tag populates the lib field of any error annotation. The
completion future emits its own typed errors. If completion fails,
output is discarded and only the error is sent on reply.
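The control flow described above can be illustrated with a std-only analogue. This is a minimal sketch, not the crate's implementation: it swaps the Tokio task for an OS thread and the CUDA stream for a channel-backed gate. The names Completion, Gate, KeepAlive, run_kernel_sketch and demo are hypothetical, String stands in for GpuError, and the handling of an enqueue failure (tag the error with lib_tag and reply immediately) is an assumption. The returned JoinHandle exists only so the demo can observe the drop deterministically.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for CompletionStrategy: blocks until the work is done.
pub trait Completion: Send + Sync {
    fn wait(&self) -> Result<(), String>;
}

/// Completion backed by a channel, so the demo decides when the "kernel" finishes.
pub struct Gate(pub Mutex<Receiver<Result<(), String>>>);

impl Completion for Gate {
    fn wait(&self) -> Result<(), String> {
        let rx = self.0.lock().unwrap();
        rx.recv().unwrap_or_else(|_| Err("gate closed".to_string()))
    }
}

/// Keep-alive value that records when it is dropped.
pub struct KeepAlive(pub Arc<AtomicBool>);

impl Drop for KeepAlive {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Synchronous enqueue, then a background wait for completion.
pub fn run_kernel_sketch<O, KA, F>(
    lib_tag: &'static str,
    completion: Arc<dyn Completion>,
    output: O,
    reply: Sender<Result<O, String>>,
    enqueue: F,
) -> Option<thread::JoinHandle<()>>
where
    O: Send + 'static,
    KA: Send + 'static,
    F: FnOnce() -> Result<KA, String>,
{
    // Step 1: enqueue runs right here, on the caller.
    let keep_alive = match enqueue() {
        Ok(ka) => ka,
        Err(e) => {
            // Assumption: enqueue errors are tagged with lib_tag and sent at once.
            let _ = reply.send(Err(format!("[{lib_tag}] enqueue failed: {e}")));
            return None;
        }
    };
    // Step 2: wait off the caller, reply, then release the keep-alive.
    Some(thread::spawn(move || {
        // On failure the closure never moves `output` out, so it is discarded.
        let result = completion.wait().map(|()| output);
        let _ = reply.send(result);
        drop(keep_alive);
    }))
}

/// Returns (dropped before completion, reply, dropped after completion).
pub fn demo() -> (bool, Result<u32, String>, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    let (gate_tx, gate_rx) = channel();
    let (reply_tx, reply_rx) = channel();
    let completion: Arc<dyn Completion> = Arc::new(Gate(Mutex::new(gate_rx)));
    let flag = Arc::clone(&dropped);
    let handle = run_kernel_sketch("demo", completion, 42u32, reply_tx, move || {
        Ok::<_, String>(KeepAlive(flag))
    });
    let before = dropped.load(Ordering::SeqCst);
    gate_tx.send(Ok(())).unwrap(); // signal that the "kernel" has completed
    let result = reply_rx.recv().unwrap();
    handle.unwrap().join().unwrap();
    let after = dropped.load(Ordering::SeqCst);
    (before, result, after)
}
```

In this sketch the keep-alive is moved into the waiting thread, so it cannot be released while the simulated kernel is still in flight. The documented function relies on the same ownership transfer, with a Tokio task in place of the thread.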
This is the trace-less, NVTX-less compatibility entry point used by
every actor that hasn’t migrated to KernelEnvelope. Behaviour
is byte-for-byte identical to the pre-Phase-0.7 implementation.