cuda.core.experimental.LinkerOptions

class cuda.core.experimental.LinkerOptions(arch: str | None = None, max_register_count: int | None = None, time: bool | None = None, verbose: bool | None = None, link_time_optimization: bool | None = None, ptx: bool | None = None, optimization_level: int | None = None, debug: bool | None = None, lineinfo: bool | None = None, ftz: bool | None = None, prec_div: bool | None = None, prec_sqrt: bool | None = None, fma: bool | None = None, kernels_used: str | Tuple[str] | List[str] | None = None, variables_used: str | Tuple[str] | List[str] | None = None, optimize_unused_variables: bool | None = None, ptxas_options: str | Tuple[str] | List[str] | None = None, split_compile: int | None = None, split_compile_extended: int | None = None, no_cache: bool | None = None)

Customizable Linker options.

Because the linker uses either nvJitLink or the driver APIs as its linking backend, not all options are applicable. When the system’s installed nvJitLink is too old (<12.3) or not installed, the driver APIs (cuLink) are used instead.
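The fallback rule above can be sketched as a small decision function. This is only an illustration of the stated rule: `choose_backend` and its version-tuple argument are hypothetical names and are not part of cuda.core, which makes this decision internally.

```python
from typing import Optional, Tuple

# Minimum nvJitLink version stated in the description above.
MIN_NVJITLINK = (12, 3)


def choose_backend(nvjitlink_version: Optional[Tuple[int, int]]) -> str:
    """Return the backend implied by the installed nvJitLink version.

    Hypothetical helper: pass None when nvJitLink is not installed.
    """
    if nvjitlink_version is None or nvjitlink_version < MIN_NVJITLINK:
        # Too old or missing: fall back to the driver APIs (cuLink).
        return "driver"
    return "nvJitLink"
```

Options that only one backend understands may be unavailable when the other backend is selected, so it can help to know which case applies before setting them.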

arch

Pass the SM architecture value, such as sm_<CC> (for generating CUBIN) or compute_<CC> (for generating PTX). If not provided, the current device’s architecture will be used.

Type:

str, optional
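As a minimal sketch of the naming scheme described for arch, the helper below builds the string from a compute capability. `arch_string` is a hypothetical name introduced for illustration only; it assumes the capability is available as a (major, minor) pair.

```python
def arch_string(major: int, minor: int, ptx: bool = False) -> str:
    """Format an arch value such as "sm_80" or "compute_80".

    Hypothetical helper: use ptx=True when the link should produce PTX,
    and the default when it should produce a CUBIN.
    """
    prefix = "compute" if ptx else "sm"
    return f"{prefix}_{major}{minor}"
```

For example, `arch_string(8, 0)` yields a value suitable for CUBIN output on a compute capability 8.0 device, and `arch_string(8, 0, ptx=True)` yields the PTX form.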

max_register_count

Maximum register count.

Type:

int, optional

time

Print timing information to the info log. Default: False.

Type:

bool, optional

verbose

Print verbose messages to the info log. Default: False.

Type:

bool, optional

link_time_optimization

Perform link time optimization. Default: False.

Type:

bool, optional

ptx

Emit PTX after linking instead of CUBIN; only supported with link_time_optimization=True. Default: False.

Type:

bool, optional

optimization_level

Set optimization level. Only 0 and 3 are accepted.

Type:

int, optional
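The constraints stated above (ptx requires link time optimization, and optimization_level accepts only 0 and 3) can be checked before the options are built. The function below is a sketch under those stated rules; `check_link_options` is a hypothetical name, and whether cuda.core performs equivalent checks itself is not stated here.

```python
from typing import Optional


def check_link_options(
    link_time_optimization: Optional[bool] = None,
    ptx: Optional[bool] = None,
    optimization_level: Optional[int] = None,
) -> None:
    """Raise ValueError for combinations the descriptions above rule out."""
    if ptx and not link_time_optimization:
        raise ValueError("ptx=True requires link_time_optimization=True")
    if optimization_level is not None and optimization_level not in (0, 3):
        raise ValueError("optimization_level must be 0 or 3")


# A valid combination passes silently.
check_link_options(link_time_optimization=True, ptx=True, optimization_level=3)
```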

debug

Generate debug information. Default: False.

Type:

bool, optional

lineinfo

Generate line information. Default: False.

Type:

bool, optional

ftz

Flush denormal values to zero. Default: False.

Type:

bool, optional

prec_div

Use precise division. Default: True.

Type:

bool, optional

prec_sqrt

Use precise square root. Default: True.

Type:

bool, optional

fma

Use fused multiply-add. Default: True.

Type:

bool, optional

kernels_used

Pass a kernel or sequence of kernels that are used; any not in the list can be removed.

Type:

Union[str, Tuple[str], List[str]], optional

variables_used

Pass a variable or sequence of variables that are used; any not in the list can be removed.

Type:

Union[str, Tuple[str], List[str]], optional

optimize_unused_variables

Assume that if a variable is not referenced in device code, it can be removed. Default: False.

Type:

bool, optional

ptxas_options

Pass options to PTXAS.

Type:

Union[str, Tuple[str], List[str]], optional
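kernels_used, variables_used, and ptxas_options each accept a single string or a tuple or list of strings. Calling code that collects such values from several places may find it convenient to normalize them to one shape first. The helper below is a sketch; `as_name_list` is a hypothetical name and not part of cuda.core, and the kernel names in the usage lines are made up.

```python
from typing import List, Sequence, Union


def as_name_list(names: Union[str, Sequence[str], None]) -> List[str]:
    """Normalize a str, tuple, list, or None into a list of strings."""
    if names is None:
        return []
    if isinstance(names, str):
        # A bare string is one name, not a sequence of characters.
        return [names]
    return list(names)


# Hypothetical kernel names, merged from two sources into one list.
kernels = as_name_list("saxpy") + as_name_list(("reduce", "scan"))
```

The string check comes first because a str is itself a sequence, and iterating over it would split the name into characters.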

split_compile

Maximum thread count for split compilation. Use 0 to use all available processors; a value of 1 disables split compilation. Default: 1.

Type:

int, optional

split_compile_extended

A more aggressive form of split compilation, available in LTO mode only. Accepts a maximum thread count. Use 0 to use all available processors; a value of 1 disables extended split compilation. Note: this option can potentially impact the performance of the compiled binary. Default: 1.

Type:

int, optional
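The meaning of the special values 0 and 1 for split_compile and split_compile_extended can be illustrated as follows. This is a sketch only: `effective_threads` is a hypothetical name, and mapping 0 to `os.cpu_count()` is an assumption about what "all available processors" means, since the linker itself decides the actual thread count.

```python
import os


def effective_threads(split_compile: int = 1) -> int:
    """Interpret a split_compile value as an upper bound on threads.

    0 -> all available processors (approximated by os.cpu_count()),
    1 -> split compilation disabled (a single thread),
    n -> at most n threads.
    """
    if split_compile < 0:
        raise ValueError("split_compile must be non-negative")
    if split_compile == 0:
        # os.cpu_count() may return None, so fall back to 1.
        return os.cpu_count() or 1
    return split_compile
```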

no_cache

Do not cache the intermediate steps of nvJitLink. Default: False.

Type:

bool, optional
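Putting several of the options above together, the following sketch prepares keyword arguments for an LTO link that emits PTX. The argument names come from the signature documented above; the specific values (such as compute capability 8.0) are illustrative assumptions. The import is deferred into a function so that the snippet can be read and loaded without cuda.core or a working linking backend installed.

```python
# Keyword arguments drawn from the LinkerOptions signature documented above.
lto_ptx_kwargs = dict(
    arch="compute_80",            # compute_<CC> form because PTX is requested
    link_time_optimization=True,  # required for ptx=True
    ptx=True,                     # emit PTX instead of a CUBIN
    optimization_level=3,         # only 0 and 3 are accepted
    lineinfo=True,                # keep line information
)


def make_linker_options():
    """Build a LinkerOptions instance from the arguments above.

    Calling this requires cuda.core to be installed and a usable
    backend (nvJitLink or the driver APIs) to be present.
    """
    from cuda.core.experimental import LinkerOptions  # deferred import

    return LinkerOptions(**lto_ptx_kwargs)
```

Because not all options apply to both backends, a combination like this one may need adjusting when the driver APIs are used instead of nvJitLink.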