nvalchemi.data.DataLoader
class nvalchemi.data.DataLoader(dataset, *, batch_size=1, shuffle=False, drop_last=False, sampler=None, prefetch_factor=2, num_streams=4, use_streams=True)
Batch-iterating data loader that yields Batch.

Wraps a Dataset and yields Batch objects built via Batch.from_data_list(). CUDA-stream prefetching is supported for overlapping I/O with computation.

Parameters:
dataset (Dataset) – AtomicData-native dataset to load from.
batch_size (int, default=1) – Number of samples per batch.
shuffle (bool, default=False) – Randomize sample order each epoch.
drop_last (bool, default=False) – Drop the last incomplete batch.
sampler (torch.utils.data.Sampler | None, default=None) – Custom sampler (overrides shuffle).
prefetch_factor (int, default=2) – How many batches to prefetch ahead.
num_streams (int, default=4) – Number of CUDA streams for prefetching.
use_streams (bool, default=True) – Enable CUDA-stream prefetching.
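The interaction between batch_size and drop_last can be sketched with plain arithmetic. The helper below, expected_num_batches, is hypothetical and not part of nvalchemi; it assumes the loader follows the conventional semantics in which drop_last=True discards a trailing partial batch and drop_last=False keeps it.

```python
import math


def expected_num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Batches per epoch under the conventional drop_last semantics (an assumption)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if drop_last:
        # Only full batches are counted; a trailing partial batch is discarded.
        return num_samples // batch_size
    # A trailing partial batch is kept as one extra, smaller batch.
    return math.ceil(num_samples / batch_size)


# 10 samples with batch_size=4 split into 4 + 4 + 2.
print(expected_num_batches(10, 4, drop_last=False))  # 3
print(expected_num_batches(10, 4, drop_last=True))   # 2
```

If the dataset length is an exact multiple of batch_size, both settings give the same count.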
Examples
>>> from nvalchemi.data.datapipes import AtomicDataZarrReader, Dataset, DataLoader
>>> reader = AtomicDataZarrReader("dataset.zarr")
>>> ds = Dataset(reader, device="cpu")
>>> loader = DataLoader(ds, batch_size=4)
>>> for batch in loader:
...     print(batch.positions.shape)
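To give a feel for what prefetch_factor controls, here is a minimal, library-independent sketch of prefetching with a bounded buffer. It is not how nvalchemi implements prefetching: the class documents CUDA streams, whereas this sketch swaps in a background thread and a queue purely for illustration. The function name prefetch and the sentinel object are hypothetical.

```python
import queue
import threading
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")
_SENTINEL = object()  # marks the end of the stream


def prefetch(batches: Iterable[T], prefetch_factor: int = 2) -> Iterator[T]:
    """Yield items from `batches`, producing up to `prefetch_factor` of them ahead."""
    buffer: queue.Queue = queue.Queue(maxsize=prefetch_factor)

    def producer() -> None:
        try:
            for batch in batches:
                # Blocks once `prefetch_factor` items are waiting to be consumed.
                buffer.put(batch)
        finally:
            # Always signal completion so the consumer does not wait forever.
            buffer.put(_SENTINEL)

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()
        if item is _SENTINEL:
            return
        yield item


# Stand-in "batches"; with a real loader these would be Batch objects.
for batch in prefetch(range(5), prefetch_factor=2):
    print(batch)
```

The bounded queue is the point of the sketch: a larger prefetch_factor lets the producer run further ahead of the consumer at the cost of holding more batches in memory.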