Persistent disk cache for NVRTC-compiled CUDA kernels (Phase 0.6).
Modern CUDA kernels (hand-rolled Hopper/Blackwell CUDA C, CUTLASS template instantiations, FlashAttention 2/3 variants) take 10 to 60 seconds each to compile through NVRTC. A persistent disk cache turns subsequent runs into single-digit-millisecond hot starts.
§Design
- Key: (source_hash, arch, options_hash), where arch is the SM compute capability (e.g. 80, 90, 100) and options_hash is the FNV-1a hash of the NVRTC compile options in their original order. Callers that want order-insensitive keys should sort the options beforehand (see hash_options).
- Value: serialised PTX (and optional CUBIN) bytes wrapped in CachedKernel.
- Storage: filesystem under $XDG_CACHE_HOME/atomr-accel/nvrtc/ (or $HOME/.cache/atomr-accel/nvrtc/, falling back to $TMPDIR/atomr-accel/nvrtc/). One file per cache entry, named {source_hash:016x}-{arch}-{options_hash:016x}.bin.
- Format: bincode of CachedKernel. Entries whose atomr_accel_version does not match env!("CARGO_PKG_VERSION") are rejected on load.
- Concurrency: in-process RwLock-guarded HashMap read-through cache. Cross-process safety via atomic file write (<name>.tmp, then rename).
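The storage and concurrency rules above can be sketched with the standard library alone. The helper names (`sketch_cache_dir`, `sketch_entry_file_name`, `sketch_write_atomically`) and the `u32` type for `arch` are assumptions made for illustration, not this module's API. Only the directory fallback order, the file name pattern, and the `.tmp`-then-rename step are taken from the description above.

```rust
use std::env;
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve the cache directory using the documented
/// fallback order ($XDG_CACHE_HOME, then $HOME/.cache, then the temp dir).
fn sketch_cache_dir() -> PathBuf {
    let base = env::var_os("XDG_CACHE_HOME")
        .filter(|v| !v.is_empty())
        .map(PathBuf::from)
        .or_else(|| {
            env::var_os("HOME")
                .filter(|v| !v.is_empty())
                .map(|home| PathBuf::from(home).join(".cache"))
        })
        // On Unix, env::temp_dir() honours $TMPDIR and defaults to /tmp.
        .unwrap_or_else(env::temp_dir);
    base.join("atomr-accel").join("nvrtc")
}

/// Hypothetical helper: build the per-entry file name.
/// The integer types are assumptions; the pattern is the documented one.
fn sketch_entry_file_name(source_hash: u64, arch: u32, options_hash: u64) -> String {
    format!("{source_hash:016x}-{arch}-{options_hash:016x}.bin")
}

/// Hypothetical helper: write `bytes` to `<path>.tmp`, then rename onto `path`,
/// so a concurrent reader sees either the old file or the complete new one.
fn sketch_write_atomically(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);
    {
        let mut file = fs::File::create(&tmp_path)?;
        file.write_all(bytes)?;
        // Flush file contents to disk before the rename publishes them.
        file.sync_all()?;
    }
    fs::rename(&tmp_path, path)
}
```

The rename is what makes the write safe across processes: on POSIX filesystems a rename within one directory replaces the destination atomically, so a reader never observes a half-written entry.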
§Usage
use atomr_accel_cuda::nvrtc_cache::{
    hash_options, hash_source, CachedKernel, NvrtcCache, NvrtcCacheKey,
};

let cache = NvrtcCache::new().unwrap();
let src = "extern \"C\" __global__ void noop() {}";
let key = NvrtcCacheKey {
    source_hash: hash_source(src),
    arch: 80,
    options_hash: hash_options(["-std=c++17", "--use_fast_math"]),
};
if let Some(entry) = cache.get(key) {
    println!("hot: {} bytes of PTX", entry.ptx.len());
} else {
    // ... NVRTC compile ...
    let ptx: Vec<u8> = b"PTX...".to_vec();
    cache.insert(key, CachedKernel::new(ptx, None)).unwrap();
}

Phase 5 will wire NvrtcActor through this cache; this module ships the storage layer alone.
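To illustrate what "read-through" means for the in-process layer, here is a minimal sketch using only the standard library. `SketchReadThrough`, `get_with`, the tuple key, and the `load_from_disk` closure are hypothetical stand-ins and do not reflect how NvrtcCache is actually implemented; the closure takes the place of reading and decoding an entry file.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Assumed key shape: (source_hash, arch, options_hash).
type SketchKey = (u64, u32, u64);

/// Hypothetical in-process layer: a map guarded by an RwLock.
struct SketchReadThrough {
    hot: RwLock<HashMap<SketchKey, Arc<Vec<u8>>>>,
}

impl SketchReadThrough {
    fn new() -> Self {
        Self { hot: RwLock::new(HashMap::new()) }
    }

    /// Return the in-memory entry if present; otherwise call `load_from_disk`
    /// and remember its result so later lookups skip the loader.
    fn get_with<F>(&self, key: SketchKey, load_from_disk: F) -> Option<Arc<Vec<u8>>>
    where
        F: FnOnce() -> Option<Vec<u8>>,
    {
        // Fast path: many readers can hold the read lock at once.
        if let Some(hit) = self.hot.read().unwrap().get(&key) {
            return Some(Arc::clone(hit));
        }
        // Slow path: load without holding any lock, then publish.
        let loaded = Arc::new(load_from_disk()?);
        let mut hot = self.hot.write().unwrap();
        // If another thread inserted first, keep and return its value.
        let entry = hot.entry(key).or_insert(loaded);
        Some(Arc::clone(&*entry))
    }
}
```

Loading outside the lock means two threads may both hit the disk for the same key on a cold start, which is harmless here because both loads yield the same bytes.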
Structs§
- CachedKernel: On-disk and in-memory cache value.
- NvrtcCache: Read-through disk cache for compiled NVRTC kernels.
- NvrtcCacheKey: Composite cache key. source_hash and options_hash are produced by hash_source / hash_options; arch is the SM compute capability as an integer (e.g. 80, 90, 100).
Functions§
- hash_options: FNV-1a 64-bit hash of an iterable of NVRTC compile options.
- hash_source: FNV-1a 64-bit hash of a kernel source string. Stable across processes and across crate compilations.
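For reference, 64-bit FNV-1a is small enough to sketch in full. The constants below are the standard FNV offset basis and prime. The function names and the way multiple options are combined (each option's bytes followed by a zero separator byte) are assumptions for illustration; the crate's hash_options may combine options differently, so these values should not be expected to match its output.

```rust
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Fold `bytes` into a running FNV-1a state: XOR the byte in, then multiply.
fn fnv1a_update(mut hash: u64, bytes: &[u8]) -> u64 {
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// FNV-1a 64-bit hash of a byte string, e.g. kernel source text.
fn sketch_hash_bytes(bytes: &[u8]) -> u64 {
    fnv1a_update(FNV_OFFSET_BASIS, bytes)
}

/// Hypothetical option hashing: order-sensitive, with a 0 byte after each
/// option (an assumption) so that ["ab", "c"] and ["a", "bc"] differ.
fn sketch_hash_options<'a, I>(options: I) -> u64
where
    I: IntoIterator<Item = &'a str>,
{
    let mut hash = FNV_OFFSET_BASIS;
    for option in options {
        hash = fnv1a_update(hash, option.as_bytes());
        hash = fnv1a_update(hash, &[0]);
    }
    hash
}

/// Sort a copy of the options first to obtain an order-insensitive key.
fn sketch_hash_options_sorted(options: &[&str]) -> u64 {
    let mut sorted = options.to_vec();
    sorted.sort_unstable();
    sketch_hash_options(sorted)
}
```

Because the hash has no per-process seed, the same input produces the same value in every run, which is the property a persistent on-disk key needs.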