Multi-stream pipeline pattern.
Lets users wire Source -> KernelStage -> ... -> Sink such that
stage K+1 begins as soon as stage K’s GPU work is complete, with
cross-stage handoff via [cudarc::driver::CudaEvent] — no host
roundtrip between stages.
§Stage shape
Implement PipelineStage for any kernel-actor adapter:
    struct BlasSgemmStage { /* ... */ }

    impl PipelineStage for BlasSgemmStage {
        type In = (GpuRef<f32>, GpuRef<f32>);
        type Out = GpuRef<f32>;

        // Parameter types are elided in this sketch.
        fn enqueue(
            &mut self, stream, wait_for, (a, b)
        ) -> Result<(CudaEvent, GpuRef<f32>), GpuError> {
            // Make this stage's stream wait on the previous stage's event, if any.
            if let Some(ev) = wait_for { stream.wait(ev)?; }
            /* enqueue cuBLAS gemm via record-mode contract, producing output `c` */
            // Record the event the next stage will wait on.
            let ev = stream.record_event(None)?;
            Ok((ev, c))
        }
    }

F2 ships the trait and a thin executor; the full
PipelineBuilder<I, O> type-state DSL with Source / Sink wrappers
lands in F3 once we have more concrete patterns demanding it.
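The event handoff described above can be sketched without a GPU. The following is a minimal CPU-only model, not this crate's API: `MockEvent`, `MockStream`, `MockStage`, `Scale`, `Sum`, and `run_two` are all hypothetical names introduced for illustration, and the real trait returns a `Result` and works with `CudaEvent` and `GpuRef` values. It shows only the ordering contract: stage K records an event on its own stream, and stage K+1 waits on that event on a different stream before enqueueing its work.

```rust
// CPU-only stand-ins. None of these types belong to the crate or to cudarc.

/// Stand-in for a recorded GPU event: which stream recorded it, and in what order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MockEvent {
    pub stream: usize,
    pub seq: u64,
}

/// Stand-in for a stream. It logs every event it was asked to wait on.
pub struct MockStream {
    id: usize,
    next_seq: u64,
    pub waited_on: Vec<MockEvent>,
}

impl MockStream {
    pub fn new(id: usize) -> Self {
        Self { id, next_seq: 0, waited_on: Vec::new() }
    }

    /// Model of "make this stream wait for `ev`": only logged here.
    pub fn wait(&mut self, ev: MockEvent) {
        self.waited_on.push(ev);
    }

    /// Model of "record an event after the work enqueued so far".
    pub fn record_event(&mut self) -> MockEvent {
        let ev = MockEvent { stream: self.id, seq: self.next_seq };
        self.next_seq += 1;
        ev
    }
}

/// Hypothetical, infallible simplification of the stage shape shown above.
pub trait MockStage {
    type In;
    type Out;
    fn enqueue(
        &mut self,
        stream: &mut MockStream,
        wait_for: Option<MockEvent>,
        input: Self::In,
    ) -> (MockEvent, Self::Out);
}

/// Example stage: multiply every element by a constant.
pub struct Scale(pub f32);

impl MockStage for Scale {
    type In = Vec<f32>;
    type Out = Vec<f32>;
    fn enqueue(
        &mut self,
        stream: &mut MockStream,
        wait_for: Option<MockEvent>,
        input: Vec<f32>,
    ) -> (MockEvent, Vec<f32>) {
        if let Some(ev) = wait_for {
            stream.wait(ev);
        }
        let out: Vec<f32> = input.iter().map(|x| x * self.0).collect();
        (stream.record_event(), out)
    }
}

/// Example stage: reduce a vector to its sum.
pub struct Sum;

impl MockStage for Sum {
    type In = Vec<f32>;
    type Out = f32;
    fn enqueue(
        &mut self,
        stream: &mut MockStream,
        wait_for: Option<MockEvent>,
        input: Vec<f32>,
    ) -> (MockEvent, f32) {
        if let Some(ev) = wait_for {
            stream.wait(ev);
        }
        let out = input.iter().sum::<f32>();
        (stream.record_event(), out)
    }
}

/// Two-stage chain: stage `b` on stream `s1` waits on the event that
/// stage `a` recorded on stream `s0`. The host never blocks in between.
pub fn run_two<A, B>(
    a: &mut A,
    b: &mut B,
    s0: &mut MockStream,
    s1: &mut MockStream,
    input: A::In,
) -> (MockEvent, B::Out)
where
    A: MockStage,
    B: MockStage<In = A::Out>,
{
    let (ev_a, mid) = a.enqueue(s0, None, input);
    b.enqueue(s1, Some(ev_a), mid)
}
```

Calling `run_two(&mut Scale(2.0), &mut Sum, &mut s0, &mut s1, vec![1.0, 2.0, 3.0])` leaves exactly one entry in `s1.waited_on`, the event recorded on stream 0, which is the cross-stream dependency the pattern relies on.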
Structs§
- PipelineExecutor - Two-stage type-state executor, the simplest non-trivial chain.
- PipelineExecutorN - N-stage heterogeneous executor.
- PipelineSink - Producer end. submit blocks (awaits) when the channel is full; that is the backpressure signal.
- PipelineSource - Consumer end. Returns a ReceiverStream<Result<O, GpuError>> that yields one item per processed input.
- StageBox - Adapter wrapping any typed PipelineStage into a BoxedStage.
Traits§
- BoxedStage - Heterogeneous N-stage executor.
- PipelineStage - One stage in a multi-stream GPU pipeline.
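Wrapping typed stages so they can sit in one heterogeneous list is a type-erasure problem. The sketch below shows one common way to do it in Rust, using `Box<dyn Any>` and a checked downcast. It is an assumption about the general technique, not a description of how StageBox or BoxedStage are implemented here; `TypedStage`, `ErasedStage`, `Erase`, `Len`, `Double`, and `run_chain` are hypothetical names, and the string error stands in for whatever error type a real executor would use.

```rust
use std::any::Any;

/// Hypothetical typed stage: input and output types are known at compile time.
pub trait TypedStage {
    type In: 'static;
    type Out: 'static;
    fn run(&mut self, input: Self::In) -> Self::Out;
}

/// Hypothetical erased stage: every stage has the same signature, so stages
/// with different In/Out types can live in one Vec.
pub trait ErasedStage {
    fn run_erased(&mut self, input: Box<dyn Any>) -> Result<Box<dyn Any>, String>;
}

/// Adapter from a typed stage to an erased one.
pub struct Erase<S>(pub S);

impl<S: TypedStage> ErasedStage for Erase<S> {
    fn run_erased(&mut self, input: Box<dyn Any>) -> Result<Box<dyn Any>, String> {
        // The type check that the compiler did for typed stages now happens at run time.
        let typed = input
            .downcast::<S::In>()
            .map_err(|_| "stage input type mismatch".to_string())?;
        let out: Box<dyn Any> = Box::new(self.0.run(*typed));
        Ok(out)
    }
}

/// Example stage: String -> usize.
pub struct Len;
impl TypedStage for Len {
    type In = String;
    type Out = usize;
    fn run(&mut self, input: String) -> usize {
        input.len()
    }
}

/// Example stage: usize -> usize.
pub struct Double;
impl TypedStage for Double {
    type In = usize;
    type Out = usize;
    fn run(&mut self, input: usize) -> usize {
        input * 2
    }
}

/// Feed the output of each stage into the next, stopping at the first mismatch.
pub fn run_chain(
    stages: &mut [Box<dyn ErasedStage>],
    input: Box<dyn Any>,
) -> Result<Box<dyn Any>, String> {
    let mut value = input;
    for stage in stages.iter_mut() {
        value = stage.run_erased(value)?;
    }
    Ok(value)
}
```

The trade-off in this style is that a mismatched pair of adjacent stages is reported as an error when the chain runs rather than rejected by the compiler, which is the usual cost of moving from a type-state chain to an N-stage list.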
Functions§
- run_pipeline - Run a homogeneous sequence of stages on streams[i] for stage i.
- spawn_pipeline - Spawn a backpressured async pipeline driver around an executor. Returns (PipelineSink<I>, PipelineSource<O>). The driver runs on the ambient tokio runtime.
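The sink/source split with backpressure can be modelled with the standard library alone. This sketch is a synchronous analogue, not the crate's implementation: `spawn_mock_pipeline` is a hypothetical name, a std thread stands in for the tokio task, `std::sync::mpsc::sync_channel` stands in for the bounded async channel, and a plain closure stands in for the executor. The point it illustrates is that a bounded channel makes the producer stall once the driver falls behind, and that results come out one per input, in order.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// Hypothetical std-only analogue of a backpressured pipeline driver.
/// Returns a producer handle and a consumer handle, in that order.
pub fn spawn_mock_pipeline<I, O, F>(
    capacity: usize,
    mut stage: F,
) -> (SyncSender<I>, Receiver<Result<O, String>>)
where
    I: Send + 'static,
    O: Send + 'static,
    F: FnMut(I) -> Result<O, String> + Send + 'static,
{
    // Bounded input queue: `send` blocks when `capacity` items are pending.
    let (in_tx, in_rx) = sync_channel::<I>(capacity);
    // Bounded output queue: a slow consumer also stalls the driver.
    let (out_tx, out_rx) = sync_channel::<Result<O, String>>(capacity);

    // Driver loop: one result per input, in submission order.
    thread::spawn(move || {
        for item in in_rx {
            if out_tx.send(stage(item)).is_err() {
                // The consumer hung up, so stop processing.
                break;
            }
        }
        // Dropping `out_tx` here ends the consumer's iteration.
    });

    (in_tx, out_rx)
}
```

A caller would `send` inputs on the first handle, drop it when done, and iterate the second handle to collect results. In the async version described above, the blocking `send` corresponds to awaiting submit, and the receiver corresponds to the returned stream.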