cuda.core.experimental.LinkerOptions#

class cuda.core.experimental.LinkerOptions(
name: str | None = '<default linker>',
arch: str | None = None,
max_register_count: int | None = None,
time: bool | None = None,
verbose: bool | None = None,
link_time_optimization: bool | None = None,
ptx: bool | None = None,
optimization_level: int | None = None,
debug: bool | None = None,
lineinfo: bool | None = None,
ftz: bool | None = None,
prec_div: bool | None = None,
prec_sqrt: bool | None = None,
fma: bool | None = None,
kernels_used: str | Tuple[str] | List[str] | None = None,
variables_used: str | Tuple[str] | List[str] | None = None,
optimize_unused_variables: bool | None = None,
ptxas_options: str | Tuple[str] | List[str] | None = None,
split_compile: int | None = None,
split_compile_extended: int | None = None,
no_cache: bool | None = None,
)#

Customizable Linker options.

Because the linker uses either nvJitLink or the driver APIs as its linking backend, not all options are applicable to both. When the system’s installed nvJitLink is too old (<12.3) or not installed, the driver APIs (cuLink) are used instead.
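The fallback rule described above can be sketched as a small version check. The helper name `pick_backend` and its version-tuple input are assumptions made for illustration only; the library makes this choice internally, and its actual detection logic may differ.

```python
from typing import Optional, Tuple

# Minimum nvJitLink version stated in the description above.
MIN_NVJITLINK_VERSION = (12, 3)


def pick_backend(nvjitlink_version: Optional[Tuple[int, int]]) -> str:
    """Return "nvjitlink" or "driver" (hypothetical helper, not a cuda.core API).

    Pass None when nvJitLink is not installed.
    """
    if nvjitlink_version is None:
        return "driver"
    if nvjitlink_version < MIN_NVJITLINK_VERSION:
        return "driver"
    return "nvjitlink"


print(pick_backend((12, 2)))  # too old, falls back to the driver APIs
print(pick_backend((12, 6)))  # new enough for nvJitLink
```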

name#

Name of the linker. If the linking succeeds, the name is passed down to the generated ObjectCode.

Type:

str, optional

arch#

Pass the SM architecture value, such as sm_<CC> (for generating CUBIN) or compute_<CC> (for generating PTX). If not provided, the current device’s architecture will be used.

Type:

str, optional
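As an illustration of the two spellings, the following sketch builds an `arch` string from a compute capability pair. The helper `arch_string` is hypothetical and not part of cuda.core; it only shows the `sm_<CC>` and `compute_<CC>` formats named above.

```python
from typing import Tuple


def arch_string(compute_capability: Tuple[int, int], ptx: bool = False) -> str:
    """Format a (major, minor) compute capability as an arch value.

    Hypothetical helper: uses "compute_" when PTX output is wanted,
    otherwise "sm_" for CUBIN output.
    """
    major, minor = compute_capability
    prefix = "compute" if ptx else "sm"
    return f"{prefix}_{major}{minor}"


print(arch_string((8, 0)))            # value suitable for CUBIN generation
print(arch_string((9, 0), ptx=True))  # value suitable for PTX generation
```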

max_register_count#

Maximum register count.

Type:

int, optional

time#

Print timing information to the info log. Default: False.

Type:

bool, optional

verbose#

Print verbose messages to the info log. Default: False.

Type:

bool, optional

link_time_optimization#

Perform link time optimization. Default: False.

Type:

bool, optional

ptx#

Emit PTX after linking instead of CUBIN; only supported with link_time_optimization=True. Default: False.

Type:

bool, optional

optimization_level#

Set optimization level. Only 0 and 3 are accepted.

Type:

int, optional
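The two constraints stated above (`ptx` requires `link_time_optimization=True`, and `optimization_level` accepts only 0 and 3) can be checked before constructing the options. This is a minimal sketch; `check_link_options` is a hypothetical helper, and whether or how the library itself validates these values is not covered here.

```python
from typing import Any, Dict, List


def check_link_options(options: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a dict of LinkerOptions keyword arguments.

    Hypothetical pre-check, not a cuda.core API.
    """
    problems = []
    # ptx output is only supported together with link time optimization.
    if options.get("ptx") and not options.get("link_time_optimization"):
        problems.append("ptx=True requires link_time_optimization=True")
    # Only optimization levels 0 and 3 are accepted.
    level = options.get("optimization_level")
    if level is not None and level not in (0, 3):
        problems.append("optimization_level must be 0 or 3")
    return problems


print(check_link_options({"ptx": True, "optimization_level": 2}))
```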

debug#

Generate debug information. Default: False.

Type:

bool, optional

lineinfo#

Generate line information. Default: False.

Type:

bool, optional

ftz#

Flush denormal values to zero. Default: False.

Type:

bool, optional

prec_div#

Use precise division. Default: True.

Type:

bool, optional

prec_sqrt#

Use precise square root. Default: True.

Type:

bool, optional

fma#

Use fused multiply-add. Default: True.

Type:

bool, optional

kernels_used#

Pass a kernel or sequence of kernels that are used; any not in the list can be removed.

Type:

Union[str, Tuple[str], List[str]], optional

variables_used#

Pass a variable or sequence of variables that are used; any not in the list can be removed.

Type:

Union[str, Tuple[str], List[str]], optional
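Both `kernels_used` and `variables_used` accept a single name or a tuple or list of names. The sketch below shows one way to normalize the three input shapes to a list before passing them on. `as_name_list` is a hypothetical helper, and the names used are placeholders; how the library handles these values internally is not shown.

```python
from typing import List, Optional, Sequence, Union


def as_name_list(names: Union[str, Sequence[str], None]) -> Optional[List[str]]:
    """Normalize a single name or a sequence of names to a list (hypothetical helper)."""
    if names is None:
        return None
    # A bare string is one name, not a sequence of characters.
    if isinstance(names, str):
        return [names]
    return list(names)


print(as_name_list("my_kernel"))
print(as_name_list(("kernel_a", "kernel_b")))
```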

optimize_unused_variables#

Assume that if a variable is not referenced in device code, it can be removed. Default: False.

Type:

bool, optional

ptxas_options#

Pass options to PTXAS.

Type:

Union[str, Tuple[str], List[str]], optional

split_compile#

Maximum thread count for split compilation. Use 0 to use all available processors. A value of 1 disables split compilation. Default: 1.

Type:

int, optional

split_compile_extended#

A more aggressive form of split compilation, available in LTO mode only. Accepts a maximum thread count. Use 0 to use all available processors. A value of 1 disables extended split compilation. Note: This option can potentially impact the performance of the compiled binary. Default: 1.

Type:

int, optional
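The meaning of the `split_compile` and `split_compile_extended` values can be illustrated with a small helper that interprets the number as described above. `describe_split` is hypothetical, and using `os.cpu_count()` to stand in for "all available processors" is an assumption; the linker determines the real thread count itself.

```python
import os
from typing import Tuple


def describe_split(requested: int) -> Tuple[bool, int]:
    """Return (enabled, max_threads) for a split compilation value.

    Hypothetical helper: 0 means all available processors (approximated
    here with os.cpu_count()), and 1 means split compilation is disabled.
    """
    if requested == 0:
        max_threads = os.cpu_count() or 1
    else:
        max_threads = requested
    enabled = requested != 1
    return enabled, max_threads


print(describe_split(1))  # default: disabled, single thread
print(describe_split(4))  # enabled, at most 4 threads
```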

no_cache#

Do not cache the intermediate steps of nvJitLink. Default: False.

Type:

bool, optional
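A minimal usage sketch follows, using only keyword arguments from the signature above. The values (`"demo_linker"`, `"sm_80"`) are placeholders. The import is guarded because cuda.core may not be installed, and the broad `except` is an assumption that construction may also fail on systems without a usable CUDA driver or with an option the selected backend does not support.

```python
# Keyword arguments taken from the LinkerOptions signature; values are placeholders.
link_kwargs = dict(
    name="demo_linker",
    arch="sm_80",
    optimization_level=3,  # only 0 and 3 are accepted
    lineinfo=True,
)

try:
    from cuda.core.experimental import LinkerOptions

    options = LinkerOptions(**link_kwargs)
except Exception as exc:  # missing package, missing driver, or unsupported option
    options = None
    print(f"LinkerOptions unavailable: {exc}")

if options is not None:
    print(f"Created options for linker {options.name!r} targeting {options.arch}")
```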