Skip to main content

run_kernel

Function run_kernel 

Source
pub fn run_kernel<O, KA, F>(
    lib_tag: &'static str,
    stream: &Arc<CudaStream>,
    completion: &Arc<dyn CompletionStrategy>,
    output: O,
    reply: Sender<Result<O, GpuError>>,
    enqueue: F,
)
where O: Send + 'static, KA: Send + 'static, F: FnOnce() -> Result<KA, GpuError>,
Expand description

Run the synchronous-enqueue + async-completion-await pipeline.

enqueue runs immediately on the calling actor’s task. On success it returns the keep-alive tuple — anything that must outlive the kernel (input Arc<CudaSlice<T>>s, the unwrapped write target, descriptor handles, etc.). The envelope spawns a Tokio task that awaits CompletionStrategy::await_completion for stream, replies via reply, and drops the keep-alive only after completion.

lib_tag populates the lib field of any error annotation. The completion future emits its own typed errors; on failure output is discarded.

This is the trace-less, NVTX-less compatibility entry point used by every actor that hasn’t migrated to KernelEnvelope. Behaviour is byte-for-byte identical to the pre-Phase-0.7 implementation.