Struct GpuRef

pub struct GpuRef<T> { /* private fields */ }

A live device-buffer handle.

Holds a strong Arc to the slice, which keeps the underlying memory alive even if the DeviceActor has begun shutdown, and a Weak to the surrounding DeviceState, so that a reference cycle cannot keep the system from terminating. Call GpuRef::access before each use: it validates that the DeviceState is still alive and accepting operations, and that the context generation has not advanced.
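
The ownership layout described above can be sketched with standard-library types only. This is a minimal illustration, not the crate's implementation: MockState, MockRef, and their fields are hypothetical stand-ins (a Vec<T> plays the role of CudaSlice<T>), since the real private fields are not shown on this page.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for `DeviceState`; the real fields are private.
struct MockState {
    device_id: u32,
}

/// Hypothetical stand-in for `GpuRef<T>`; a `Vec<T>` plays the role of `CudaSlice<T>`.
struct MockRef<T> {
    slice: Arc<Vec<T>>,     // strong: keeps the buffer alive
    state: Weak<MockState>, // weak: does not keep the state alive
}

impl<T> MockRef<T> {
    fn new(slice: Arc<Vec<T>>, state: &Arc<MockState>) -> Self {
        MockRef { slice, state: Arc::downgrade(state) }
    }

    /// Mirrors the documented `device_id`: `None` once the state has been dropped.
    fn device_id(&self) -> Option<u32> {
        self.state.upgrade().map(|s| s.device_id)
    }

    /// The buffer stays readable because the strong `Arc` is still held.
    fn len(&self) -> usize {
        self.slice.len()
    }
}

/// Drops the state while a ref is still held and reports what the ref observes:
/// (device id before the drop, device id after the drop, buffer length after the drop).
fn observe_after_state_drop() -> (Option<u32>, Option<u32>, usize) {
    let state = Arc::new(MockState { device_id: 3 });
    let r = MockRef::new(Arc::new(vec![0.0f32; 4]), &state);
    let before = r.device_id();
    drop(state); // the Weak held by `r` does not prevent this drop
    (before, r.device_id(), r.len())
}
```

Because the handle holds only a Weak to the state, the state can be torn down while handles are still outstanding; the handle then reports the state as gone, while the buffer it owns remains allocated.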

Implementations

impl<T> GpuRef<T>

pub fn new(slice: Arc<CudaSlice<T>>, state: &Arc<DeviceState>) -> Self

Wraps a raw Arc<CudaSlice<T>> produced by a DeviceActor in a GpuRef<T>.

Only DeviceActor (and code reachable from its dispatcher) should call this; outside callers must obtain a GpuRef by asking the DeviceActor to allocate one.

pub fn access(&self) -> Result<&Arc<CudaSlice<T>>, GpuError>

Validate the reference and return access to the underlying slice.

Returns GpuError::GpuRefStale if any of the following holds:

  • the owning DeviceState has been dropped,
  • the device is no longer accepting operations, or
  • the context generation has advanced past the one this ref was minted with (i.e. a poisoned-context rebuild has happened).
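
The three staleness conditions listed above can be modeled as follows. This is a sketch under stated assumptions: MockState, MockRef, MockError, the `accepting` flag, and the atomic `generation` counter are all hypothetical names, and the real DeviceState may track these differently.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::{Arc, Weak};

#[derive(Debug, PartialEq)]
enum MockError {
    Stale, // plays the role of GpuError::GpuRefStale
}

/// Hypothetical state; both field names are assumptions.
struct MockState {
    accepting: AtomicBool,
    generation: AtomicU64,
}

impl MockState {
    fn new() -> Arc<Self> {
        Arc::new(MockState {
            accepting: AtomicBool::new(true),
            generation: AtomicU64::new(0),
        })
    }

    /// Models a poisoned-context rebuild: the generation advances.
    fn rebuild_context(&self) {
        self.generation.fetch_add(1, Ordering::AcqRel);
    }

    /// Models the device no longer accepting operations.
    fn begin_shutdown(&self) {
        self.accepting.store(false, Ordering::Release);
    }
}

struct MockRef {
    slice: Arc<Vec<u8>>,
    state: Weak<MockState>,
    generation: u64, // generation observed when the ref was minted
}

impl MockRef {
    fn new(slice: Arc<Vec<u8>>, state: &Arc<MockState>) -> Self {
        MockRef {
            slice,
            state: Arc::downgrade(state),
            generation: state.generation.load(Ordering::Acquire),
        }
    }

    fn access(&self) -> Result<&Arc<Vec<u8>>, MockError> {
        // 1. The owning state has been dropped.
        let state = self.state.upgrade().ok_or(MockError::Stale)?;
        // 2. The device is no longer accepting operations.
        if !state.accepting.load(Ordering::Acquire) {
            return Err(MockError::Stale);
        }
        // 3. The generation has advanced past the one this ref was minted with.
        if state.generation.load(Ordering::Acquire) > self.generation {
            return Err(MockError::Stale);
        }
        Ok(&self.slice)
    }
}
```

In this model a ref minted after a rebuild captures the new generation and validates again, while refs minted before it keep failing.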

pub fn generation(&self) -> u64

Generation token captured at construction. Exposed for tests.

pub fn len(&self) -> usize

Length in elements of the underlying slice.

pub fn is_empty(&self) -> bool

Returns true if the underlying slice has zero elements.

pub fn device_id(&self) -> Option<u32>

Device id this GpuRef was minted on, or None if the owning DeviceState has been dropped.

pub fn record_write(&self, stream: &Arc<CudaStream>)

Record the stream that most recently wrote to this buffer. Library actors (BlasActor, CudnnActor, FftActor, etc.) call this after enqueueing a kernel that mutates the slice so that downstream consumers can inject a cross-stream wait.

pub fn last_write_stream(&self) -> Option<Arc<CudaStream>>

Most recent producing stream, if any. Returns None when no kernel has been recorded against this buffer.
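
A rough model of the record_write / last_write_stream pair, assuming the last-writer slot uses interior mutability (both methods take &self). MockStream, MockBuf, the Mutex, and needs_cross_stream_wait are hypothetical. A real consumer would issue an event wait on its own stream instead of returning a bool.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `CudaStream`.
#[derive(Debug)]
struct MockStream {
    name: &'static str,
}

/// Hypothetical buffer handle that remembers its most recent writer.
struct MockBuf {
    last_write: Mutex<Option<Arc<MockStream>>>,
}

impl MockBuf {
    fn new() -> Self {
        MockBuf { last_write: Mutex::new(None) }
    }

    /// Called by a producer after it enqueues work that mutates the buffer.
    fn record_write(&self, stream: &Arc<MockStream>) {
        *self.last_write.lock().unwrap() = Some(Arc::clone(stream));
    }

    /// `None` until some producer has recorded a write.
    fn last_write_stream(&self) -> Option<Arc<MockStream>> {
        self.last_write.lock().unwrap().clone()
    }
}

/// Consumer-side decision: a wait is only needed when the last writer
/// exists and is a different stream from the one about to read.
fn needs_cross_stream_wait(buf: &MockBuf, consumer: &Arc<MockStream>) -> bool {
    match buf.last_write_stream() {
        Some(producer) => !Arc::ptr_eq(&producer, consumer),
        None => false,
    }
}
```

Comparing streams by Arc identity is a choice made for this sketch; work enqueued on the same stream is already ordered, so only a different producing stream calls for a wait.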

pub fn raw_device_ptr(&self) -> Result<u64, GpuError>

Phase 4.5++: returns the opaque CUdeviceptr (as a u64) for downstream raw-pointer FFI APIs (TensorRT enqueueV3, cuStreamWriteValue64, custom CUDA modules that aren’t fronted by cudarc).

Validates the GpuRef first via GpuRef::access. The pointer is captured against the slice’s own associated stream. The _guard returned by cudarc’s device_ptr() is dropped before the function returns, but the underlying allocation outlives this call because self holds the inner Arc<CudaSlice<T>>. Callers must not dispatch the resulting pointer on a stream that has already gone out of scope; in practice the pointer is consumed immediately by an FFI shim (TensorRT enqueueV3, etc.) on a stream the caller owns.

Returns GpuError::GpuRefStale if the underlying generation token is stale or the device is shutting down.

Trait Implementations

impl<T> Clone for GpuRef<T>

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<T> Debug for GpuRef<T>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<T> DevSliceArg for GpuRef<T>
where T: CudaDtype,

fn validate(&self) -> Result<Box<dyn Any + Send>, GpuError>

Validate the underlying GpuRef and return a keep-alive owner. The caller stores this Box<dyn Any + Send> in a Vec to keep the device buffer alive until the kernel completes.
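
The keep-alive pattern described here can be illustrated with a type-erased clone of the inner Arc. MockArg and collect_keep_alive are hypothetical, a Vec<f32> stands in for device memory, and the String error type is a placeholder for GpuError. Whether the real validate returns a clone of the Arc or some other owner is not stated on this page.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical kernel argument; a `Vec<f32>` stands in for device memory.
struct MockArg {
    slice: Arc<Vec<f32>>,
}

impl MockArg {
    /// Returns a type-erased owner that keeps the buffer alive for as long
    /// as the caller holds the box (assumption: a clone of the inner Arc).
    fn validate(&self) -> Result<Box<dyn Any + Send>, String> {
        Ok(Box::new(Arc::clone(&self.slice)))
    }
}

/// Validates every argument and gathers the keep-alive owners into one Vec,
/// stopping at the first error.
fn collect_keep_alive(args: &[MockArg]) -> Result<Vec<Box<dyn Any + Send>>, String> {
    args.iter().map(|a| a.validate()).collect()
}
```

Holding the returned Vec until the kernel has completed keeps each buffer's reference count above zero even if the original handles are dropped in the meantime; dropping the Vec releases them.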

fn push<'a>(&'a self, builder: &mut LaunchArgs<'a>) -> Result<(), GpuError>

Push the device-pointer reference onto builder. Implementors re-access() the GpuRef (cheap: a pointer-equality check against DeviceState.generation) and call PushKernelArg::arg with &CudaSlice<T>.

fn dtype(&self) -> Option<DType>

Element dtype for tracing / debugging. Always Some(..) for the default GpuRef<T: CudaDtype> impl.

fn len(&self) -> usize

Length of the underlying slice in elements.

fn is_empty(&self) -> bool

True iff the slice has zero elements.

Auto Trait Implementations

impl<T> Freeze for GpuRef<T>

impl<T> !RefUnwindSafe for GpuRef<T>

impl<T> Send for GpuRef<T>

impl<T> Sync for GpuRef<T>

impl<T> Unpin for GpuRef<T>

impl<T> UnsafeUnpin for GpuRef<T>

impl<T> !UnwindSafe for GpuRef<T>

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> Extension for T
where T: Any + Send + Sync,