External Weight Manager

class ExternalWeightManager

Owns externalized model-weight tensors loaded from external weight files.

Export can move selected ONNX initializers into safetensors external weight files and expose those weights as engine inputs. This manager loads the external weight files once, validates them against the TensorRT engine in a separate engine-aware step, and registers their stable tensor addresses into a TensorMap for execution.
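The three phases described above (load once, validate once, register once) form a small state machine. The sketch below illustrates only that ordering contract, under stated assumptions: `LifecycleSketch` and `throwsRuntimeError` are hypothetical stand-ins, the method signatures are simplified, and no file I/O, CUDA, or TensorRT calls are made. It is not the library's implementation.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in that keeps only the phase bookkeeping.
class LifecycleSketch {
public:
    // Phase 1: accept the tensor names (stands in for reading weight files).
    void load(std::vector<std::string> names) {
        if (phase_ != Phase::Created) {
            throw std::runtime_error("load() has already been called");
        }
        names_ = std::move(names);
        phase_ = Phase::Loaded;
    }

    // Phase 2: engine-aware validation (no real checks in this sketch).
    void validateAgainstEngine() {
        if (phase_ != Phase::Loaded) {
            throw std::runtime_error("validateAgainstEngine() must follow load() exactly once");
        }
        phase_ = Phase::Validated;
    }

    // Phase 3: publish one entry per tensor name into the caller's map.
    void registerTensorMapEntries(std::unordered_map<std::string, std::size_t>& map) {
        if (phase_ != Phase::Validated) {
            throw std::runtime_error("registerTensorMapEntries() must follow validation exactly once");
        }
        for (std::size_t i = 0; i < names_.size(); ++i) {
            map[names_[i]] = i;  // the index stands in for a tensor address
        }
        phase_ = Phase::Registered;
    }

    std::size_t size() const noexcept { return names_.size(); }

private:
    enum class Phase { Created, Loaded, Validated, Registered };
    Phase phase_ = Phase::Created;
    std::vector<std::string> names_;
};

// Returns true if calling fn throws std::runtime_error.
template <typename Fn>
bool throwsRuntimeError(Fn&& fn) {
    try {
        fn();
    } catch (std::runtime_error const&) {
        return true;
    }
    return false;
}
```

With this stand-in, calling `validateAgainstEngine()` before `load()`, or calling any phase a second time, throws `std::runtime_error`, which mirrors the error conditions documented for each method of the real class.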

Public Functions

ExternalWeightManager() = default

Default constructor.

ExternalWeightManager(ExternalWeightManager const&) = delete

Deleted copy to prevent accidental duplication of GPU resources.

ExternalWeightManager &operator=(ExternalWeightManager const&) = delete

ExternalWeightManager(ExternalWeightManager&&) noexcept = default

Move construction and move assignment are allowed.
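Taken together, the deleted copy operations and defaulted `noexcept` move operations make the class move-only. A minimal sketch of the same declaration pattern on a hypothetical stand-in type (`MoveOnlyOwner` and its `int` handles are placeholders, not part of the library), checked at compile time with standard type traits:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in that mirrors the copy/move declarations of the class.
class MoveOnlyOwner {
public:
    MoveOnlyOwner() = default;
    MoveOnlyOwner(MoveOnlyOwner const&) = delete;
    MoveOnlyOwner& operator=(MoveOnlyOwner const&) = delete;
    MoveOnlyOwner(MoveOnlyOwner&&) noexcept = default;
    MoveOnlyOwner& operator=(MoveOnlyOwner&&) noexcept = default;

    void add(int handle) { handles_.push_back(handle); }
    std::size_t size() const noexcept { return handles_.size(); }

private:
    std::vector<int> handles_;  // placeholder for owned resources
};

// Copies are rejected at compile time; moves are available and non-throwing.
static_assert(!std::is_copy_constructible_v<MoveOnlyOwner>);
static_assert(!std::is_copy_assignable_v<MoveOnlyOwner>);
static_assert(std::is_nothrow_move_constructible_v<MoveOnlyOwner>);
static_assert(std::is_nothrow_move_assignable_v<MoveOnlyOwner>);
```

A move-only owner can still be returned from a function or stored in a container such as `std::vector`, but passing it by value requires an explicit `std::move`.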

ExternalWeightManager &operator=(ExternalWeightManager&&) noexcept = default

void load(std::filesystem::path const &engineDir, std::filesystem::path const &configPath, cudaStream_t stream)

Load external weight tensors listed in configPath from engineDir.

If external_weight_files is missing, it is treated as an empty external-weight set rather than an error.

Must be called exactly once before validateAgainstEngine(). Calling load() a second time is a programming error and throws.

Throws:

std::runtime_error – if an external weight file cannot be loaded or if load() has already been called.

void validateAgainstEngine(EngineExecutor const &executor, std::string_view engineLabel)

Validate loaded external weight tensors against executor.

Must be called exactly once after load() and before registerTensorMapEntries(). Calling it a second time is a programming error and throws.

Throws:

std::runtime_error – if load() has not been called first, if validation has already happened, or if an external weight tensor does not match the TensorRT engine input it feeds.

void registerTensorMapEntries(TensorMap &map)

Register all loaded weights in map using each tensor’s safetensors name.

This is separate from load() because TensorMap is constructed after SharedResources, matching the other state managers that publish stable tensor addresses after the runtime builds its engine binding map.

Must be called exactly once after validateAgainstEngine(). Calling it a second time is a programming error and throws.

Throws:

std::runtime_error – if validation has not run first or if registration has already happened.
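Registration publishes tensor addresses that must stay valid for as long as the map is used. The sketch below shows one way such stability can be achieved, as an assumption rather than a description of the library: each tensor lives in its own heap allocation, so its address survives both growth of the owning container and a move of the owner. `OwnedBuffer`, `OwnerSketch`, and `AddressMap` are hypothetical host-side stand-ins; the real manager owns GPU tensors and the real `TensorMap` type is not shown in this reference.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one externalized tensor.
struct OwnedBuffer {
    std::string name;
    std::unique_ptr<float[]> data;  // separate heap block per tensor
    std::size_t count;
};

// Hypothetical stand-in for a name-to-address map.
using AddressMap = std::unordered_map<std::string, void const*>;

class OwnerSketch {
public:
    // Allocate a zero-initialized buffer and take ownership of it.
    void add(std::string name, std::size_t count) {
        buffers_.push_back(
            OwnedBuffer{std::move(name), std::make_unique<float[]>(count), count});
    }

    // Publish each buffer's address under its name.
    void publish(AddressMap& map) const {
        for (auto const& buffer : buffers_) {
            map[buffer.name] = buffer.data.get();
        }
    }

    std::size_t size() const noexcept { return buffers_.size(); }

private:
    std::vector<OwnedBuffer> buffers_;
};
```

In this sketch, reallocating `buffers_` or moving the `OwnerSketch` object relocates only the `OwnedBuffer` records, not the heap blocks they point to, so addresses published earlier remain valid until the owner is destroyed.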

inline size_t size() const noexcept

Number of externalized tensors currently owned by the manager.