cuda.core.experimental.LinkerOptions#
- class cuda.core.experimental.LinkerOptions(
- name: str | None = '<default linker>',
- arch: str | None = None,
- max_register_count: int | None = None,
- time: bool | None = None,
- verbose: bool | None = None,
- link_time_optimization: bool | None = None,
- ptx: bool | None = None,
- optimization_level: int | None = None,
- debug: bool | None = None,
- lineinfo: bool | None = None,
- ftz: bool | None = None,
- prec_div: bool | None = None,
- prec_sqrt: bool | None = None,
- fma: bool | None = None,
- kernels_used: str | tuple[str] | list[str] | None = None,
- variables_used: str | tuple[str] | list[str] | None = None,
- optimize_unused_variables: bool | None = None,
- ptxas_options: str | tuple[str] | list[str] | None = None,
- split_compile: int | None = None,
- split_compile_extended: int | None = None,
- no_cache: bool | None = None,
- )
Customizable Linker options. Because the linker uses either nvJitLink or the driver APIs as the linking backend, not all options are applicable. When the system’s installed nvJitLink is too old (<12.3) or not installed, the driver APIs (cuLink) are used instead.
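The fallback rule described above can be sketched in a few lines. This only illustrates the documented behavior and is not the library's implementation: `choose_backend` and `MIN_NVJITLINK` are hypothetical names, and the installed nvJitLink version is assumed to be available as a `(major, minor)` tuple (or `None` when nvJitLink is not installed).

```python
from typing import Optional, Tuple

# Hypothetical constant: the minimum nvJitLink version named in the text above.
MIN_NVJITLINK = (12, 3)


def choose_backend(nvjitlink_version: Optional[Tuple[int, int]]) -> str:
    """Return "nvJitLink" or "driver" following the documented fallback rule.

    Hypothetical helper, not part of cuda.core.
    """
    if nvjitlink_version is None:
        # nvJitLink is not installed: fall back to the driver APIs (cuLink).
        return "driver"
    if nvjitlink_version < MIN_NVJITLINK:
        # nvJitLink is too old (<12.3): fall back to the driver APIs.
        return "driver"
    return "nvJitLink"


print(choose_backend(None))      # driver
print(choose_backend((12, 2)))   # driver
print(choose_backend((12, 4)))   # nvJitLink
```

Since not all options apply to both backends, knowing which one is active helps explain why a given option might be rejected or ignored.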
- name#
Name of the linker. If the linking succeeds, the name is passed down to the generated ObjectCode.
- Type:
str, optional
- arch#
Pass the SM architecture value, such as sm_<CC> (for generating CUBIN) or compute_<CC> (for generating PTX). If not provided, the current device’s architecture will be used.
- Type:
str, optional
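As a small illustration of the two `arch` spellings, the sketch below maps an architecture string to the kind of output it requests. `output_kind` is a hypothetical helper written for this example, not a cuda.core function, and it assumes the string has the `sm_<CC>` or `compute_<CC>` shape described above.

```python
def output_kind(arch: str) -> str:
    """Map an arch string to the output it requests ("CUBIN" or "PTX").

    Hypothetical helper, not part of cuda.core.
    """
    prefix, sep, cc = arch.partition("_")
    if not sep or not cc:
        raise ValueError(f"expected sm_<CC> or compute_<CC>, got {arch!r}")
    if prefix == "sm":
        return "CUBIN"  # sm_<CC> targets a real architecture
    if prefix == "compute":
        return "PTX"    # compute_<CC> targets a virtual architecture
    raise ValueError(f"expected sm_<CC> or compute_<CC>, got {arch!r}")


print(output_kind("sm_90"))       # CUBIN
print(output_kind("compute_80"))  # PTX
```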
- ptx#
Emit PTX after linking instead of CUBIN; only supported with link_time_optimization=True. Default: False.
- Type:
bool, optional
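The dependency between `ptx` and `link_time_optimization` can be checked before any options object is built. The sketch below is an assumption-laden illustration: `check_ptx_request` is a hypothetical helper, the `compute_80` value is an arbitrary example, and constructing `LinkerOptions` is attempted only if `cuda.core` is importable and the active backend accepts these options.

```python
from typing import Optional


def check_ptx_request(ptx: Optional[bool],
                      link_time_optimization: Optional[bool]) -> None:
    """Raise if PTX output is requested without link-time optimization.

    Hypothetical pre-check, not part of cuda.core.
    """
    if ptx and not link_time_optimization:
        raise ValueError("ptx=True requires link_time_optimization=True")


# Keyword arguments using only parameter names from the signature above.
kwargs = {"arch": "compute_80", "link_time_optimization": True, "ptx": True}
check_ptx_request(kwargs["ptx"], kwargs["link_time_optimization"])

try:
    from cuda.core.experimental import LinkerOptions
    options = LinkerOptions(**kwargs)
except Exception as exc:
    # cuda.core may be missing, or the active backend may reject an option.
    options = None
    print(f"LinkerOptions was not constructed: {exc}")
```

Running the check first gives a clear error message at the call site, regardless of which backend ends up handling the link.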
- kernels_used#
Pass the name of a kernel, or a sequence of kernel names, that are used; kernels not in the list can be removed.
- Type:
str | tuple[str] | list[str], optional
- variables_used#
Pass the name of a variable, or a sequence of variable names, that are used; variables not in the list can be removed.
- Type:
str | tuple[str] | list[str], optional
- optimize_unused_variables#
Assume that if a variable is not referenced in device code, it can be removed. Default: False.
- Type:
bool, optional
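Per the class signature, kernels_used and variables_used accept a single string, a tuple, or a list. Calling code that builds these values may want to treat all three forms uniformly; the sketch below shows one way to do so. `as_name_list` is a hypothetical helper and says nothing about how the library processes these options internally.

```python
from typing import List, Sequence, Union

Names = Union[str, Sequence[str], None]


def as_name_list(names: Names) -> List[str]:
    """Normalize a single name, a sequence of names, or None to a list.

    Hypothetical helper, not part of cuda.core.
    """
    if names is None:
        return []
    if isinstance(names, str):
        # A bare string is one name, not a sequence of characters.
        return [names]
    return list(names)


print(as_name_list("my_kernel"))           # ['my_kernel']
print(as_name_list(("kernel_a", "kernel_b")))  # ['kernel_a', 'kernel_b']
print(as_name_list(None))                  # []
```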
- split_compile#
Maximum thread count for split compilation. Use 0 to use all available processors; a value of 1 disables split compilation. Default: 1.
- Type:
int, optional
- split_compile_extended#
A more aggressive form of split compilation, available in LTO mode only. Accepts a maximum thread count. Use 0 to use all available processors; a value of 1 disables extended split compilation. Note: this option can potentially impact the performance of the compiled binary. Default: 1.
- Type:
int, optional
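The thread-count convention shared by split_compile and split_compile_extended (0 means all available processors, 1 means disabled) can be sketched as follows. `effective_threads` is a hypothetical helper, and `os.cpu_count()` is used here only as an approximation of "all available processors"; the linker backend may count processors differently.

```python
import os


def effective_threads(split_compile: int) -> int:
    """Interpret a split-compile value as an upper bound on threads.

    Hypothetical helper, not part of cuda.core.
    """
    if split_compile < 0:
        raise ValueError("thread count must be non-negative")
    if split_compile == 0:
        # 0 requests all available processors (approximated via os.cpu_count()).
        return os.cpu_count() or 1
    # 1 disables split compilation; larger values cap the thread count.
    return split_compile


print(effective_threads(1))  # 1
print(effective_threads(4))  # 4
```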