.. _Configuration:

Configuration
=============

Warp has settings at the global, module, and kernel level that can be used to fine-tune the compilation and verbosity
of Warp programs. When a setting can be changed at multiple levels (e.g., ``enable_backward``),
the setting at the most specific scope takes precedence.
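
As a minimal sketch of how the three scopes interact, the snippet below sets ``enable_backward`` at each level.
The kernel names ``square`` and ``add_one`` are hypothetical and chosen only for illustration.

.. code-block:: python

    import warp as wp

    # Global scope: generate backward (adjoint) code by default.
    wp.config.enable_backward = True

    # Module scope: kernels defined in this Python module skip the backward pass.
    wp.set_module_options({"enable_backward": False})


    # Kernel scope: this kernel opts back in, overriding the module-level setting.
    @wp.kernel(enable_backward=True)
    def square(x: wp.array(dtype=float), y: wp.array(dtype=float)):
        tid = wp.tid()
        y[tid] = x[tid] * x[tid]


    # No kernel-level setting: this kernel inherits the module-level value (False).
    @wp.kernel
    def add_one(x: wp.array(dtype=float), y: wp.array(dtype=float)):
        tid = wp.tid()
        y[tid] = x[tid] + 1.0

Under the precedence rule above, ``square`` should be compiled with a backward pass while ``add_one`` should not.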

.. _global-settings:

Global Settings
---------------

Settings can be modified by direct assignment before or after calling :func:`wp.init() <warp.init>`,
though some settings only take effect if set prior to initialization.

For example, the location of the user kernel cache can be changed with:

.. code-block:: python

    import os

    import warp as wp

    example_dir = os.path.dirname(os.path.realpath(__file__))

    # set default cache directory before wp.init()
    wp.config.kernel_cache_dir = os.path.join(example_dir, "tmp", "warpcache1")

    wp.init()
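
The following sketch contrasts a setting assigned before initialization with one assigned afterwards.
It assumes the ``quiet`` and ``mode`` attributes of ``wp.config``; consult the global settings reference
to confirm when a particular setting takes effect.

.. code-block:: python

    import warp as wp

    # Assigned before wp.init(): suppresses the banner printed during initialization.
    wp.config.quiet = True

    wp.init()

    # Assigned after wp.init(): intended to apply to modules compiled from this point on.
    wp.config.mode = "debug"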

See :doc:`../api_reference/warp_config` for a complete list of global settings.

.. _module-settings:

Module Settings
---------------

Module-level settings to control runtime compilation and code generation may be changed by passing a dictionary of
option pairs to :func:`wp.set_module_options() <warp.set_module_options>`.

For example, compilation of backward passes for all kernels in a module can be disabled with:

.. code-block:: python

    wp.set_module_options({"enable_backward": False})

The options for a module can also be queried using :func:`wp.get_module_options() <warp.get_module_options>`.

+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
| Field                                | Type    |Default Value| Description                                                              |
+======================================+=========+=============+==========================================================================+
|``mode``                              | String  | ``None``    | A module-level override of the :attr:`warp.config.mode` setting.         |
|                                      |         |             | ``None`` defers to the global setting at compile time.                   |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``optimization_level``                | Integer | ``None``    | A module-level override of the :attr:`warp.config.optimization_level`    |
|                                      |         |             | setting. ``None`` defers to the global setting at compile time.          |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``max_unroll``                        | Integer | Global      | A module-level override of the :attr:`warp.config.max_unroll` setting.   |
|                                      |         | setting     |                                                                          |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``enable_backward``                   | Boolean | Global      | A module-level override of the :attr:`warp.config.enable_backward`       |
|                                      |         | setting     | setting.                                                                 |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``fast_math``                         | Boolean | ``False``   | If ``True``, CUDA kernels will be compiled with the ``--use_fast_math``  |
|                                      |         |             | compiler option, which enables some fast math operations that are faster |
|                                      |         |             | but less accurate.                                                       |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``fuse_fp``                           | Boolean | ``True``    | If ``True``, allow compilers to emit fused floating point operations     |
|                                      |         |             | such as fused-multiply-add. This may improve numerical accuracy and      |
|                                      |         |             | is generally recommended. Setting to ``False`` can help ensure           |
|                                      |         |             | that functionally equivalent kernels will produce identical results      |
|                                      |         |             | unaffected by the presence or absence of fused operations.               |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``lineinfo``                          | Boolean | Global      | A module-level override of the :attr:`warp.config.lineinfo` setting.     |
|                                      |         | setting     |                                                                          |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``compile_time_trace``                | Boolean | Global      | A module-level override of the :attr:`warp.config.compile_time_trace`    |
|                                      |         | setting     | setting.                                                                 |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``cuda_output``                       | String  | ``None``    | A module-level override of the :attr:`warp.config.cuda_output` setting.  |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``block_dim``                         | Integer | 256         | The number of CUDA threads per block that kernels in the module will be  |
|                                      |         |             | compiled for.                                                            |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``strip_hash``                        | Boolean | ``False``   | If ``True``, avoids using a content-based hash to identify the module    |
|                                      |         |             | and its functions.                                                       |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
|``enable_mathdx_gemm``                | Boolean | ``None``    | A module-level override of the :attr:`warp.config.enable_mathdx_gemm`    |
|                                      |         |             | setting. ``None`` defers to the global setting at compile time.          |
+--------------------------------------+---------+-------------+--------------------------------------------------------------------------+
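
As a brief illustration, options from the table above can be set together and then read back.
The values chosen here are arbitrary and serve only as an example.

.. code-block:: python

    import warp as wp

    # Override two options for the calling Python module only.
    wp.set_module_options({"enable_backward": False, "max_unroll": 4})

    # Query the options dictionary of the calling module.
    options = wp.get_module_options()
    print(options["enable_backward"], options["max_unroll"])

Options that are not passed to :func:`wp.set_module_options() <warp.set_module_options>` keep their current values.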

Kernel Settings
---------------

Kernel-level settings can be passed as arguments to the :func:`@wp.kernel <warp.kernel>` decorator.

.. list-table::
   :header-rows: 1
   :widths: 20 20 10 50

   * - Field
     - Type
     - Default Value
     - Description
   * - ``enable_backward``
     - Boolean
     - ``None``
     - If ``False``, the backward pass will not be generated for this kernel.
       If ``None``, inherits from the module/global setting.
   * - ``module``
     - Module | ``"unique"`` | str
     - ``None``
     - Controls which module the kernel belongs to. If ``"unique"``, the kernel
       is assigned to a new module named after the kernel (with a hash suffix). If a
       plain string is provided, the kernel is registered in the module with
       that name. If ``None``, the kernel is assigned to the module in which the
       function is defined.
   * - ``launch_bounds``
     - int | tuple
     - ``None``
     - CUDA ``__launch_bounds__`` attribute for the kernel. Can be an int
       (``maxThreadsPerBlock``) or a tuple of 1--2 ints
       ``(maxThreadsPerBlock, minBlocksPerMultiprocessor)``. Only applies to
       CUDA kernels. The ``block_dim`` parameter in :func:`warp.launch` must
       not exceed the ``maxThreadsPerBlock`` value specified here.

The following kernels illustrate each option:

.. code-block:: python

    import warp as wp


    @wp.kernel(enable_backward=False)
    def scale_2(
        x: wp.array(dtype=float),
        y: wp.array(dtype=float),
    ):
        y[0] = x[0] ** 2.0


    @wp.kernel(module="unique")
    def isolated_kernel(a: wp.array(dtype=float), b: wp.array(dtype=float)):
        # This kernel will be registered in a new unique module created
        # just for this kernel and its dependent functions and structs
        tid = wp.tid()
        b[tid] = a[tid] + 1.0


    @wp.kernel(launch_bounds=(256, 1))
    def bounded_kernel(a: wp.array(dtype=float)):
        # CUDA __launch_bounds__ will be set to (256, 1)
        tid = wp.tid()
        a[tid] = a[tid] * 2.0
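
To relate ``launch_bounds`` to the launch configuration, here is a minimal sketch that launches a bounded kernel
with a ``block_dim`` no larger than the declared ``maxThreadsPerBlock``. The kernel name ``double`` and the array
size are hypothetical. On a CPU device the launch bounds are expected to have no effect.

.. code-block:: python

    import warp as wp


    @wp.kernel(launch_bounds=256)
    def double(a: wp.array(dtype=float)):
        tid = wp.tid()
        a[tid] = a[tid] * 2.0


    a = wp.ones(1024, dtype=float)

    # block_dim (128) does not exceed the maxThreadsPerBlock value (256) declared above.
    wp.launch(double, dim=a.shape[0], inputs=[a], block_dim=128)

    result = a.numpy()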