cuda.core.experimental.LinkerOptions
- class cuda.core.experimental.LinkerOptions(arch: str | None = None, max_register_count: int | None = None, time: bool | None = None, verbose: bool | None = None, link_time_optimization: bool | None = None, ptx: bool | None = None, optimization_level: int | None = None, debug: bool | None = None, lineinfo: bool | None = None, ftz: bool | None = None, prec_div: bool | None = None, prec_sqrt: bool | None = None, fma: bool | None = None, kernels_used: List[str] | None = None, variables_used: List[str] | None = None, optimize_unused_variables: bool | None = None, xptxas: List[str] | None = None, split_compile: int | None = None, split_compile_extended: int | None = None, no_cache: bool | None = None)
Customizable Linker options. Since the linker chooses either nvJitLink or the driver APIs as the linking backend, not all options are applicable. When the system’s installed nvJitLink is too old (<12.3), or not installed, the driver APIs (cuLink) will be used instead.
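The backend rule described above can be sketched as a small decision function. This is only an illustration of the stated rule, not the library's detection code; `choose_backend` and `MIN_NVJITLINK` are hypothetical names introduced here.

```python
from typing import Optional, Tuple

# Minimum nvJitLink version named in the description above.
MIN_NVJITLINK = (12, 3)


def choose_backend(nvjitlink_version: Optional[Tuple[int, int]]) -> str:
    """Return which backend the stated rule would pick.

    `nvjitlink_version` is (major, minor), or None when nvJitLink
    is not installed. Hypothetical helper, for illustration only.
    """
    if nvjitlink_version is None or nvjitlink_version < MIN_NVJITLINK:
        return "driver"  # fall back to the driver APIs (cuLink)
    return "nvJitLink"
```

Because the backend may differ between systems, an option accepted on one machine may not be applicable on another.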
- arch
Pass the SM architecture value, such as sm_<CC> (for generating CUBIN) or compute_<CC> (for generating PTX). If not provided, the current device’s architecture will be used.
- Type: str, optional
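As a minimal sketch of the two forms of the arch value, the helper below builds the string from a compute capability pair. `arch_string` is a hypothetical helper, not part of cuda.core; the result is the kind of value one would pass as `arch`.

```python
def arch_string(major: int, minor: int, emit_ptx: bool = False) -> str:
    """Build an arch value from a compute capability (major, minor).

    Hypothetical helper: "sm_<CC>" targets CUBIN output,
    "compute_<CC>" targets PTX output.
    """
    prefix = "compute" if emit_ptx else "sm"
    return f"{prefix}_{major}{minor}"


cubin_arch = arch_string(8, 0)               # "sm_80"
ptx_arch = arch_string(9, 0, emit_ptx=True)  # "compute_90"
```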
- verbose
Print verbose messages to the info log. Maps to -verbose. Default: False.
- Type: bool, optional
- link_time_optimization
Perform link time optimization. Maps to -lto. Default: False.
- Type: bool, optional
- ptx
Emit PTX after linking instead of CUBIN; only supported with -lto. Maps to -ptx. Default: False.
- Type: bool, optional
- optimization_level
Set optimization level. Only 0 and 3 are accepted. Maps to -O<N>.
- Type: int, optional
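Two of the constraints stated so far (ptx requires -lto, and the optimization level must be 0 or 3) can be checked before any options object is built. The following is a sketch of such a pre-flight check over a plain dict of keyword arguments; `check_link_options` is a hypothetical helper, and whether the library itself raises for these cases is not claimed here.

```python
def check_link_options(options: dict) -> None:
    """Raise ValueError if the keyword arguments break a documented rule.

    Hypothetical pre-flight check; keys mirror the constructor's
    parameter names.
    """
    # PTX output is only supported together with link time optimization.
    if options.get("ptx") and not options.get("link_time_optimization"):
        raise ValueError("ptx=True requires link_time_optimization=True")

    # Only optimization levels 0 and 3 are accepted.
    level = options.get("optimization_level")
    if level is not None and level not in (0, 3):
        raise ValueError(f"optimization_level must be 0 or 3, got {level}")


kwargs = {"link_time_optimization": True, "ptx": True, "optimization_level": 3}
check_link_options(kwargs)  # passes silently; kwargs could then be unpacked into LinkerOptions
```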
- kernels_used
Pass a list of kernels that are used; any not in the list can be removed. This option can be specified multiple times. Maps to -kernels-used=<name>.
- Type: List[str], optional
- variables_used
Pass a list of variables that are used; any not in the list can be removed. Maps to -variables-used=<name>.
- Type: List[str], optional
- optimize_unused_variables
Assume that if a variable is not referenced in device code, it can be removed. Maps to -optimize-unused-variables. Default: False.
- Type: bool, optional
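To illustrate how the three symbol-pruning options relate to the flags they map to, the sketch below expands them into a flag list, one flag per name. `used_symbol_flags` is a hypothetical helper; the exact strings the library passes to its backend may differ, and emitting one -variables-used flag per name is an assumption made by analogy with -kernels-used.

```python
from typing import List, Optional


def used_symbol_flags(
    kernels_used: Optional[List[str]] = None,
    variables_used: Optional[List[str]] = None,
    optimize_unused_variables: bool = False,
) -> List[str]:
    """Expand the symbol-pruning options into linker-style flags.

    Hypothetical helper, for illustration only.
    """
    # One flag per kernel name (the option may be given multiple times).
    flags = [f"-kernels-used={name}" for name in (kernels_used or [])]
    # Assumption: variables are expanded the same way as kernels.
    flags += [f"-variables-used={name}" for name in (variables_used or [])]
    if optimize_unused_variables:
        flags.append("-optimize-unused-variables")
    return flags


flags = used_symbol_flags(["saxpy", "reduce"], ["lut"], True)
```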
- split_compile
Split compilation maximum thread count. Use 0 to use all available processors. A value of 1 disables split compilation (default). Maps to -split-compile=<N>. Default: 1.
- Type: int, optional
- split_compile_extended
A more aggressive form of split compilation available in LTO mode only. Accepts a maximum thread count value. Use 0 to use all available processors. A value of 1 disables extended split compilation (default). Note: This option can potentially impact performance of the compiled binary. Maps to -split-compile-extended=<N>. Default: 1.
- Type: int, optional
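The thread-count semantics of the two split-compile options (0 means all processors, 1 means disabled) can be sketched as follows. Both functions are hypothetical helpers; omitting the flag when the value is the default of 1, and rejecting negative values, are assumptions made here rather than documented behavior.

```python
import os
from typing import List


def resolve_thread_cap(n: int) -> int:
    """Translate a split-compile value into an effective thread cap."""
    if n < 0:
        raise ValueError("thread count must be >= 0")  # assumption
    if n == 0:
        return os.cpu_count() or 1  # 0 means all available processors
    return n  # 1 disables splitting; larger values cap the thread count


def split_compile_flags(
    split_compile: int = 1,
    split_compile_extended: int = 1,
    link_time_optimization: bool = False,
) -> List[str]:
    """Build the split-compile flags, enforcing the LTO-only rule."""
    if split_compile_extended != 1 and not link_time_optimization:
        raise ValueError("split_compile_extended is available in LTO mode only")
    flags = []
    if split_compile != 1:  # assumption: the default needs no flag
        flags.append(f"-split-compile={split_compile}")
    if split_compile_extended != 1:
        flags.append(f"-split-compile-extended={split_compile_extended}")
    return flags
```

Since extended split compilation can affect the performance of the compiled binary, it seems reasonable to leave it at the default unless link time is the bottleneck.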