Module nvrtc_cache


Persistent disk cache for NVRTC-compiled CUDA kernels (Phase 0.6).

Modern CUDA kernels (hand-rolled CUDA-C for Hopper/Blackwell, CUTLASS template instantiations, FlashAttention 2/3 variants) can take 10 to 60 seconds each to compile through NVRTC. A persistent disk cache turns subsequent runs into hot starts that take single-digit milliseconds.

§Design

  • Key: (source_hash, arch, options_hash), where arch is the SM compute capability (e.g. 80, 90, 100) and options_hash is the FNV-1a hash of the NVRTC compile options in their original order. Callers who want order-insensitive keys should sort the options beforehand; see hash_options.
  • Value: serialised PTX (and optional CUBIN) bytes wrapped in CachedKernel.
  • Storage: filesystem under $XDG_CACHE_HOME/atomr-accel/nvrtc/ (or $HOME/.cache/atomr-accel/nvrtc/, falling back to $TMPDIR/atomr-accel/nvrtc/). One file per cache entry, named {source_hash:016x}-{arch}-{options_hash:016x}.bin.
  • Format: bincode encoding of CachedKernel. Entries whose atomr_accel_version does not match env!("CARGO_PKG_VERSION") are rejected on load.
  • Concurrency: within a process, a HashMap behind an RwLock acts as a read-through cache. Across processes, safety comes from atomic file writes (write to <name>.tmp, then rename to <name>).
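
The storage and concurrency points above can be illustrated with a minimal, standard-library-only sketch. It is not this module's implementation: the helper names cache_dir and write_atomic are hypothetical, and the handling of empty environment variables is an assumption. It only shows the directory fallback order and the write-to-temporary-then-rename pattern.

```rust
use std::env;
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve the cache directory using the fallback
/// order listed above (XDG_CACHE_HOME, then HOME/.cache, then TMPDIR).
/// Treating an empty variable as unset is an assumption of this sketch.
fn cache_dir() -> PathBuf {
    let base = env::var_os("XDG_CACHE_HOME")
        .filter(|v| !v.is_empty())
        .map(PathBuf::from)
        .or_else(|| {
            env::var_os("HOME")
                .filter(|v| !v.is_empty())
                .map(|home| PathBuf::from(home).join(".cache"))
        })
        // std::env::temp_dir() honours TMPDIR on Unix platforms.
        .unwrap_or_else(env::temp_dir);
    base.join("atomr-accel").join("nvrtc")
}

/// Hypothetical helper: write `bytes` to `<path>.tmp`, then rename the
/// temporary file over `path`, so readers see either the old entry or the
/// complete new one, never a partially written file.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }

    // Build "<name>.tmp" next to the final file so the rename stays on
    // the same filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp = PathBuf::from(tmp_name);

    let mut file = fs::File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?; // flush to disk before the entry becomes visible
    drop(file);

    fs::rename(&tmp, path)
}
```

Calling write_atomic(&cache_dir().join("entry.bin"), &bytes) would publish an entry in one step. The rename replaces the destination atomically on POSIX filesystems when source and destination are on the same filesystem, which is why the temporary file sits in the same directory as the final one.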

§Usage

use atomr_accel_cuda::nvrtc_cache::{
    hash_options, hash_source, CachedKernel, NvrtcCache, NvrtcCacheKey,
};

let cache = NvrtcCache::new().unwrap();
let src = "extern \"C\" __global__ void noop() {}";
let key = NvrtcCacheKey {
    source_hash: hash_source(src),
    arch: 80,
    options_hash: hash_options(["-std=c++17", "--use_fast_math"]),
};
if let Some(entry) = cache.get(key) {
    println!("hot: {} bytes of PTX", entry.ptx.len());
} else {
    // ... NVRTC compile ...
    let ptx: Vec<u8> = b"PTX...".to_vec();
    cache.insert(key, CachedKernel::new(ptx, None)).unwrap();
}

Phase 5 will wire NvrtcActor through this cache; this module ships the storage layer alone.

Structs§

CachedKernel
On-disk and in-memory cache value.
NvrtcCache
Read-through disk cache for compiled NVRTC kernels.
NvrtcCacheKey
Composite cache key. source_hash and options_hash are produced by hash_source / hash_options; arch is the SM compute capability as an integer (e.g. 80, 90, 100).
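
To make the mapping from key to cache file concrete, here is a small sketch of the {source_hash:016x}-{arch}-{options_hash:016x}.bin naming scheme described in the Design section. SketchKey is a stand-in, not NvrtcCacheKey itself: its field types (u64, u32, u64) and the file_name method are assumptions made for illustration.

```rust
/// Stand-in mirroring the three key fields described above. The field
/// types are assumptions of this sketch, not the crate's definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SketchKey {
    source_hash: u64,
    arch: u32,
    options_hash: u64,
}

impl SketchKey {
    /// Hypothetical method: both hashes are rendered as 16 zero-padded
    /// lowercase hex digits; arch is rendered as a plain decimal integer.
    fn file_name(&self) -> String {
        format!(
            "{:016x}-{}-{:016x}.bin",
            self.source_hash, self.arch, self.options_hash
        )
    }
}
```

Fixed-width hex keeps every file name the same length for a given arch width, which makes directory listings easy to scan and sort.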

Functions§

hash_options
FNV-1a 64-bit hash of an iterable of NVRTC compile options.
hash_source
FNV-1a 64-bit hash of a kernel source string. Stable across processes and across crate compilations.
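
For readers unfamiliar with FNV-1a, the sketch below shows the standard 64-bit algorithm and one possible way to fold a list of options into a single hash. The constants are the published FNV-1a 64-bit parameters; the function names fnv1a_64 and sketch_hash_options are hypothetical, and the 0 separator byte between options is an assumption, so the values produced by sketch_hash_options need not match those of hash_options.

```rust
/// Standard FNV-1a 64-bit parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a: for each byte, XOR it into the state, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &byte| {
        (hash ^ u64::from(byte)).wrapping_mul(FNV_PRIME)
    })
}

/// Hypothetical options hash: feeds each option, followed by a 0 byte, into
/// one running FNV-1a state. The separator is an assumption of this sketch;
/// it keeps ["ab", "c"] and ["a", "bc"] from hashing the same byte stream.
/// The result depends on the order of the options.
fn sketch_hash_options<'a, I>(options: I) -> u64
where
    I: IntoIterator<Item = &'a str>,
{
    let mut hash = FNV_OFFSET_BASIS;
    for option in options {
        for &byte in option.as_bytes().iter().chain(std::iter::once(&0u8)) {
            hash = (hash ^ u64::from(byte)).wrapping_mul(FNV_PRIME);
        }
    }
    hash
}
```

Because the hash is order-sensitive, a caller who wants ["-std=c++17", "--use_fast_math"] and ["--use_fast_math", "-std=c++17"] to share a cache entry would sort the options before hashing them.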