GPU Streams
GPU streams require a CUDA-capable PyTorch installation.
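One way to check this requirement at startup is to probe PyTorch directly. The sketch below relies only on PyTorch's `torch.cuda.is_available()`. The helper name `cuda_torch_available` is hypothetical and is not part of pyshmem.

```python
def cuda_torch_available() -> bool:
    """Return True only if PyTorch imports and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all.
        return False
    # A CPU-only PyTorch build imports fine but reports False here.
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA-capable PyTorch:", cuda_torch_available())
```

A check like this lets a program fail early with a clear message instead of failing when the first GPU stream is created.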
Creating a GPU stream
import numpy as np
import pyshmem

shm = pyshmem.create(
    "weights",
    shape=(4096, 4096),
    dtype=np.float32,
    gpu_device="cuda:0",
)
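It can help to know how much device memory a stream of this shape needs. The sketch below computes the raw payload size for the example above. It assumes 4 bytes per float32 element and does not count any metadata or header that pyshmem may allocate alongside the payload. The helper name `payload_nbytes` is hypothetical.

```python
import math
from typing import Tuple


def payload_nbytes(shape: Tuple[int, ...], itemsize: int) -> int:
    """Raw payload size in bytes: element count times bytes per element."""
    return math.prod(shape) * itemsize


# float32 occupies 4 bytes per element.
size = payload_nbytes((4096, 4096), 4)
print(f"{size} bytes = {size / 1024**2:.0f} MiB")  # 67108864 bytes = 64 MiB
```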
CPU mirroring modes
GPU streams support two intentionally different modes.
Performance mode
The default GPU configuration uses cpu_mirror=False. In this mode:
- It is the fastest path for GPU-heavy workloads.
- It avoids updating a CPU mirror on every write.
- CPU-only handles can still inspect metadata and take locks.
- Payload reads require a GPU attachment.
Compatibility mode
Set cpu_mirror=True when you need CPU-side reopen or CPU-side payload reads.
shm = pyshmem.create(
    "weights",
    shape=(4096, 4096),
    dtype=np.float32,
    gpu_device="cuda:0",
    cpu_mirror=True,
)
This mode trades throughput for compatibility and stronger safe-read semantics under concurrent writes.
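The choice between the two modes can be made explicit in code. The sketch below is a hypothetical helper, not part of pyshmem, that turns the two needs named above (CPU-side reopen, CPU-side payload reads) into the `gpu_device` and `cpu_mirror` keyword arguments shown in this section.

```python
from typing import Any, Dict


def gpu_create_kwargs(
    gpu_device: str = "cuda:0",
    *,
    cpu_reopen: bool = False,
    cpu_payload_reads: bool = False,
) -> Dict[str, Any]:
    """Build the GPU-related keyword arguments for creating a stream."""
    # Compatibility mode is only needed when some consumer works CPU-side;
    # otherwise stay in performance mode (cpu_mirror=False).
    cpu_mirror = cpu_reopen or cpu_payload_reads
    return {"gpu_device": gpu_device, "cpu_mirror": cpu_mirror}


print(gpu_create_kwargs())
print(gpu_create_kwargs(cpu_payload_reads=True))

# Intended use (requires pyshmem and NumPy):
#   shm = pyshmem.create("weights", shape=(4096, 4096), dtype=np.float32,
#                        **gpu_create_kwargs(cpu_payload_reads=True))
```

Keeping the decision in one place makes it harder to enable the mirror by accident and pay its throughput cost when no CPU-side consumer exists.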
Opening GPU streams
Always pass gpu_device when you want a CUDA tensor view:
reader = pyshmem.open("weights", gpu_device="cuda:0")
If you omit gpu_device, the handle remains CPU-only.
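A reader that may run on machines without CUDA can decide whether to pass `gpu_device` at open time. The sketch below is a hypothetical helper, not part of pyshmem. It takes the availability flag as an argument so that it stays independent of how CUDA is detected.

```python
from typing import Dict


def reader_open_kwargs(cuda_available: bool, gpu_device: str = "cuda:0") -> Dict[str, str]:
    """Return keyword arguments for opening a stream as a reader."""
    if cuda_available:
        # Request a CUDA tensor view on the given device.
        return {"gpu_device": gpu_device}
    # Omitting gpu_device leaves the handle CPU-only.
    return {}


print(reader_open_kwargs(True))
print(reader_open_kwargs(False))

# Intended use (requires pyshmem and PyTorch):
#   reader = pyshmem.open("weights", **reader_open_kwargs(torch.cuda.is_available()))
```

A CPU-only handle obtained this way can read the payload only if the stream was created with cpu_mirror=True. In performance mode it is limited to metadata and locks.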