External Weight Manager
-
class ExternalWeightManager
Owns externalized model-weight tensors loaded from external weight files.
Export can move selected ONNX initializers into safetensors external weight files and expose those weights as engine inputs. This manager loads the external weight files once, validates them against the TensorRT engine in a separate engine-aware step, and registers their stable tensor addresses into a TensorMap for execution.
Public Functions
-
ExternalWeightManager() = default
Default constructor.
-
ExternalWeightManager(ExternalWeightManager const&) = delete
Deleted copy to prevent accidental duplication of GPU resources.
-
ExternalWeightManager &operator=(ExternalWeightManager const&) = delete
-
ExternalWeightManager(ExternalWeightManager&&) noexcept = default
Allow move.
-
ExternalWeightManager &operator=(ExternalWeightManager&&) noexcept = default
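The copy and move declarations above describe a move-only owner. As a minimal sketch, assuming a hypothetical stand-in class `MoveOnlyOwner` (not the real ExternalWeightManager, and using host vectors in place of GPU buffers), the same set of declarations can be checked at compile time with standard type traits:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in that mirrors only the copy/move declarations
// documented above. It is NOT the real ExternalWeightManager.
class MoveOnlyOwner {
public:
    MoveOnlyOwner() = default;
    MoveOnlyOwner(MoveOnlyOwner const&) = delete;
    MoveOnlyOwner& operator=(MoveOnlyOwner const&) = delete;
    MoveOnlyOwner(MoveOnlyOwner&&) noexcept = default;
    MoveOnlyOwner& operator=(MoveOnlyOwner&&) noexcept = default;

    // Take ownership of one buffer (assumption: host memory stands in
    // for the device memory a real manager would own).
    void add(std::vector<float> buffer) { mBuffers.push_back(std::move(buffer)); }

    // Number of buffers currently owned.
    std::size_t size() const noexcept { return mBuffers.size(); }

private:
    std::vector<std::vector<float>> mBuffers;
};

// Copying is rejected at compile time; moving is allowed and non-throwing.
static_assert(!std::is_copy_constructible_v<MoveOnlyOwner>);
static_assert(!std::is_copy_assignable_v<MoveOnlyOwner>);
static_assert(std::is_nothrow_move_constructible_v<MoveOnlyOwner>);
static_assert(std::is_nothrow_move_assignable_v<MoveOnlyOwner>);
```

With this shape, `MoveOnlyOwner b = std::move(a);` transfers the owned buffers to `b`, while `MoveOnlyOwner b = a;` fails to compile.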
-
void load(std::filesystem::path const &engineDir, std::filesystem::path const &configPath, cudaStream_t stream)
Load external weight tensors listed in configPath from engineDir.
Missing external_weight_files is treated as an empty external-weight set.
Must be called exactly once before validateAgainstEngine(). Calling load() a second time is a programming error and throws.
- Throws:
std::runtime_error – if an external weight file cannot be loaded or if load() has already been called.
-
void validateAgainstEngine(EngineExecutor const &executor, std::string_view engineLabel)
Validate loaded external weight tensors against executor.
Must be called exactly once after load() and before registerTensorMapEntries(). Calling it a second time is a programming error and throws.
- Throws:
std::runtime_error – if load() has not been called first, if validation has already happened, or if an external weight tensor does not match the TensorRT engine input it feeds.
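This reference does not spell out what "match" means for a weight and the engine input it feeds. The sketch below assumes it means the same name, data type, and shape. `TensorDesc` and `validateWeights` are hypothetical names introduced for illustration and are not part of the documented API.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Hypothetical tensor description; the real manager's types are not shown here.
struct TensorDesc {
    std::string dtype;                // e.g. "F16" (assumed spelling)
    std::vector<std::int64_t> shape;  // dimensions, outermost first
};

using TensorTable = std::unordered_map<std::string, TensorDesc>;

// Assumed rule: every loaded weight must have an engine input with the same
// name, dtype, and shape. Any violation throws std::runtime_error.
inline void validateWeights(TensorTable const& loaded,
                            TensorTable const& engineInputs,
                            std::string_view engineLabel)
{
    std::string const label(engineLabel);
    for (auto const& [name, weight] : loaded) {
        auto const it = engineInputs.find(name);
        if (it == engineInputs.end()) {
            throw std::runtime_error(label + ": no engine input named '" + name + "'");
        }
        if (it->second.dtype != weight.dtype || it->second.shape != weight.shape) {
            throw std::runtime_error(label + ": dtype or shape mismatch for '" + name + "'");
        }
    }
}
```

Including the engine label in the message is a design choice of this sketch: when several engines share one weight set, it identifies which engine rejected the weights.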
-
void registerTensorMapEntries(TensorMap &map)
Register all loaded weights in map using each tensor’s safetensors name.
This is separate from load() because TensorMap is constructed after SharedResources, matching the other state managers that publish stable tensor addresses after the runtime builds its engine binding map.
Must be called exactly once after validateAgainstEngine(). Calling it a second time is a programming error and throws.
- Throws:
std::runtime_error – if validation has not run first or if registration has already happened.
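The call-order contract documented for load(), validateAgainstEngine(), and registerTensorMapEntries() can be modeled as a small state machine. The following is a minimal sketch under that reading. `CallOrderGuard` is a hypothetical class that tracks only the ordering rules. It does no loading, validation, or registration, and may differ from how the real manager enforces them.

```cpp
#include <stdexcept>

// Hypothetical guard modeling only the documented call order.
class CallOrderGuard {
    enum class State { kCreated, kLoaded, kValidated, kRegistered };

    static void require(bool ok, char const* message) {
        if (!ok) {
            throw std::runtime_error(message);
        }
    }

    State mState{State::kCreated};

public:
    // Allowed exactly once, as the first step.
    void load() {
        require(mState == State::kCreated, "load() has already been called");
        mState = State::kLoaded;
    }

    // Allowed exactly once, after load().
    void validateAgainstEngine() {
        require(mState != State::kCreated, "load() has not been called");
        require(mState == State::kLoaded, "validation has already happened");
        mState = State::kValidated;
    }

    // Allowed exactly once, after validateAgainstEngine().
    void registerTensorMapEntries() {
        require(mState == State::kValidated || mState == State::kRegistered,
                "validation has not run");
        require(mState == State::kValidated, "registration has already happened");
        mState = State::kRegistered;
    }
};
```

A failed call leaves the state unchanged, so a caller that invokes validateAgainstEngine() too early can still call load() afterwards and continue in the documented order.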
-
inline size_t size() const noexcept
Number of externalized tensors currently owned by the manager.