Executor Compatibility
MatX's executor design allows expressions to run on different targets while leaving user code largely unchanged. This page summarizes which public operators are expected to work with each executor family.
Legend:
✅ Fully supported.
🚧 Partially supported, or supported with executor-specific limitations described in the notes.
❌ Not supported.
Results can differ slightly between host executors and CUDA executors because floating-point arithmetic is performed by different libraries and devices.
HostExecutor covers SingleThreadedHostExecutor, SelectThreadsHostExecutor, and AllThreadsHostExecutor. Host executor support for FFT, BLAS, and solver routines depends on the corresponding CPU backend CMake options.
Optional backend dependencies, such as CPU BLAS, CPU solver libraries, or MathDx, are documented in the notes; the operator is still marked ✅ when it is supported for that executor.
CUDAJITExecutor support means the operator can participate in a fused JIT expression; non-JIT CUDA execution through cudaExecutor remains available for the broader CUDA library paths.
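One common reason for small host/CUDA differences is that floating-point addition is not associative, so a sequential loop and a tree-style reduction can round differently on the same input. The sketch below illustrates this in plain C++ and does not use MatX. The helper names `sum_sequential`, `sum_pairwise`, `nearly_equal`, and `demo_values` are hypothetical, and the behavior described assumes IEEE 754 single precision with round-to-nearest-even and no extended intermediate precision.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Left-to-right accumulation, as a simple single-threaded loop might do.
float sum_sequential(const std::vector<float>& v) {
    float acc = 0.0f;
    for (float x : v) {
        acc += x;
    }
    return acc;
}

// Recursive halving over [lo, hi), similar in spirit to a tree reduction.
float sum_pairwise(const std::vector<float>& v, std::size_t lo, std::size_t hi) {
    assert(lo <= hi && hi <= v.size());
    if (hi - lo == 0) return 0.0f;
    if (hi - lo == 1) return v[lo];
    const std::size_t mid = lo + (hi - lo) / 2;
    return sum_pairwise(v, lo, mid) + sum_pairwise(v, mid, hi);
}

// Hypothetical tolerance check: |a - b| <= atol + rtol * |b|.
bool nearly_equal(float a, float b, float rtol = 1e-5f, float atol = 1e-8f) {
    return std::fabs(a - b) <= atol + rtol * std::fabs(b);
}

// Input chosen so the two orders round differently:
// 1 + eps/2 is a tie that rounds back to 1, but eps/2 + eps/2 = eps survives.
std::vector<float> demo_values() {
    const float half_eps = std::numeric_limits<float>::epsilon() / 2.0f;
    return {1.0f, half_eps, half_eps};
}
```

Under the stated assumptions, `sum_sequential(demo_values())` yields `1.0f` while `sum_pairwise` over the same values yields `1.0f + epsilon`, a one-ULP difference that `nearly_equal` accepts. When comparing outputs across executors, a tolerance-based check of this kind is usually a better fit than exact equality.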
| Operator | HostExecutor | CUDAExecutor | CUDAJITExecutor | Notes |
|---|---|---|---|---|
| abs | β | β | β | Element-wise expression. |
| abs2 | β | β | β | Element-wise expression. |
| acos | β | β | β | Element-wise expression. |
| acosh | β | β | β | Element-wise expression. |
| all | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| allclose | 🚧 | β | β | Reduction transform with tolerance comparison; host execution is available but reductions are not generally parallelized across host threads. |
| alternate | β | β | β | Generator expression. |
| ambgfun | β | β | β | CUDA-only radar transform. |
| angle | β | β | β | Element-wise expression. |
| any | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| apply | β | β | 🚧 | User callable must be valid for the selected executor; JIT expressions require device/JIT-compatible callable code. |
| apply_idx | β | β | 🚧 | User callable receives indices and must be valid for the selected executor; JIT expressions require device/JIT-compatible callable code. |
| argmax | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| argmin | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| argminmax | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| argsort | β | β | β | Sort transform; CUDA path uses device sort support. |
| as_complex_double | β | β | β | Cast expression. |
| as_complex_float | β | β | β | Cast expression. |
| as_double | β | β | β | Cast expression. |
| as_float | β | β | β | Cast expression. |
| as_int16 | β | β | β | Cast expression. |
| as_int32 | β | β | β | Cast expression. |
| as_int64 | β | β | β | Cast expression. |
| as_int8 | β | β | β | Cast expression. |
| as_type | β | β | β | Cast expression. |
| as_uint16 | β | β | β | Cast expression. |
| as_uint32 | β | β | β | Cast expression. |
| as_uint64 | β | β | β | Cast expression. |
| as_uint8 | β | β | β | Cast expression. |
| asin | β | β | β | Element-wise expression. |
| asinh | β | β | β | Element-wise expression. |
| atan | β | β | β | Element-wise expression. |
| atan2 | β | β | β | Element-wise expression. |
| atanh | β | β | β | Element-wise expression. |
| at | β | β | β | Indexing/view expression. |
| bartlett | β | β | β | Window generator expression. |
| blackman | β | β | β | Window generator expression. |
| cart2sph | β | β | β | Element-wise coordinate conversion expression. |
| ceil | β | β | β | Element-wise expression. |
| cgsolve | β | β | β | CUDA iterative solver path. |
| channelize_poly | β | β | β | Polyphase channelizer; host path directly computes the per-branch FIR and DFT stages. |
| chirp | β | β | β | Generator expression. |
| chol | β | β | β | Host support requires the CPU solver backend. CUDAJITExecutor support uses cuSolverDx through MathDx for supported rank 2-4 square float, double, complex-float, and complex-double matrices. |
| clone | β | β | β | View expression. |
| concat | β | β | β | View/expression composition. |
| conj | β | β | β | Element-wise expression. |
| conv1d | β | β | β | Direct convolution supports host and CUDA executors. FFT convolution can use the CPU FFT backend when enabled. |
| conv2d | β | β | β | Direct convolution supports host and CUDA executors. |
| copy | β | β | β | Executor-dispatched assignment/copy transform; CUDAJITExecutor fuses expression evaluation instead of using this transform directly. |
| corr | β | β | β | Direct correlation supports host and CUDA executors. FFT correlation can use the CPU FFT backend when enabled. |
| cos | β | β | β | Element-wise expression. |
| cosh | β | β | β | Element-wise expression. |
| cov | β | β | β | CUDA-only covariance transform. |
| cross | β | β | β | Small vector expression. |
| cumsum | 🚧 | β | β | Scan transform; host execution is available but does not generally use multithreaded host executor paths. |
| dct | β | β | β | CUDA-only DCT transform. |
| det | β | β | β | Built from solver functionality; host support requires the CPU solver backend. JIT support follows cuSolverDx matrix type and shape limits. |
| diag | β | β | β | Generator/view expression. |
| downsample | β | β | β | View/reindex expression. |
| eig | β | β | β | Host support requires the CPU solver backend. CUDAJITExecutor supports the cuSolverDx-backed projection path for supported types and shapes. |
| einsum | β | β | 🚧 | CUDA transform. JIT support is available when the expression lowers to supported fused element-wise or matmul-style work. |
| erf | β | β | β | Element-wise expression. |
| exp | β | β | β | Element-wise expression. |
| expj | β | β | β | Element-wise expression. |
| eye | β | β | β | Generator expression. |
| fft | β | β | β | Host support requires the CPU FFT backend. CUDAJITExecutor support uses cuFFTDx through MathDx for supported runtime shapes, precisions, and layouts. |
| fft2 | β | β | β | Host support requires the CPU FFT backend. CUDAJITExecutor support uses cuFFTDx through MathDx for supported 2D runtime shapes, precisions, and layouts. |
| fftfreq | β | β | β | Generator expression. |
| fftshift1D | β | β | β | View/reindex expression. |
| fftshift2D | β | β | β | View/reindex expression. |
| fill | β | β | β | Generator expression. |
| filter | β | β | β | CUDA-only filter transform. |
| find | β | β | β | Search/compaction transform. |
| find_idx | β | β | β | Search/compaction transform. |
| find_peaks | β | β | β | Search transform. |
| flattop | β | β | β | Window generator expression. |
| flatten | β | β | β | View expression. |
| fliplr | β | β | β | View/reindex expression. |
| flipud | β | β | β | View/reindex expression. |
| floor | β | β | β | Element-wise expression. |
| fmod | β | β | β | Element-wise expression. |
| frexp | β | β | β | Element-wise mantissa/exponent expression. |
| frexpc | β | β | β | Element-wise mantissa/exponent expression for complex values. |
| hamming | β | β | β | Window generator expression. |
| hanning | β | β | β | Window generator expression. |
| hermitianT | β | β | β | View/expression composition. |
| hist | β | β | β | CUDA-only histogram transform. |
| ifft | β | β | β | Host support requires the CPU FFT backend. CUDAJITExecutor support uses cuFFTDx through MathDx for supported runtime shapes, precisions, and layouts. |
| ifft2 | β | β | β | Host support requires the CPU FFT backend. CUDAJITExecutor support uses cuFFTDx through MathDx for supported 2D runtime shapes, precisions, and layouts. |
| ifftshift1D | β | β | β | View/reindex expression. |
| ifftshift2D | β | β | β | View/reindex expression. |
| IF | β | β | β | Conditional expression. |
| IFELSE | β | β | β | Conditional expression. |
| imag | β | β | β | Element-wise expression. |
| index | β | β | β | Index generator expression. |
| interp1 | 🚧 | β | 🚧 | Linear interpolation is expression-friendly. Spline interpolation uses CUDA-only transform support. |
| inv | β | β | β | CUDA solver transform. CUDAJITExecutor support uses cuSolverDx through MathDx for supported rank 2-4 square float, double, complex-float, and complex-double matrices. |
| isclose | β | β | β | Element-wise comparison expression. |
| isinf | β | β | β | Element-wise predicate expression. |
| isnan | β | β | β | Element-wise predicate expression. |
| kron | β | β | β | Kronecker product uses matmul-style support where possible; host requires CPU BLAS support for library-backed paths and JIT follows MathDx BLAS limits. |
| lcollapse | β | β | β | View expression. |
| legendre | β | β | β | Element-wise polynomial expression. |
| linspace | β | β | β | Generator expression. |
| log | β | β | β | Element-wise expression. |
| log10 | β | β | β | Element-wise expression. |
| log2 | β | β | β | Element-wise expression. |
| logspace | β | β | β | Generator expression. |
| lu | β | β | β | Host support requires the CPU solver backend. CUDAJITExecutor supports cuSolverDx-backed lazy projections for supported types and shapes. |
| matmul | β | β | β | Host support requires the CPU BLAS backend and supported floating or complex types. CUDAJITExecutor support uses cuBLASDx through MathDx for supported runtime shapes, precisions, layouts, and block-size intersections. |
| matrix_norm | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| matvec | β | β | β | Host support requires the CPU BLAS backend and supported floating or complex types. CUDAJITExecutor support follows cuBLASDx matmul constraints. |
| max | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. Element-wise maximum through binary operators remains JIT-compatible. |
| mean | 🚧 | β | β | Reduction/statistics transform; host execution is available but reductions are not generally parallelized across host threads. |
| median | β | β | β | Sort/statistics transform. |
| meshgrid | β | β | β | Generator/view expression. |
| min | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. Element-wise minimum through binary operators remains JIT-compatible. |
| mvdr | 🚧 | β | 🚧 | Built from solver, BLAS, and expression components; accelerated paths inherit backend and JIT restrictions from those components. |
| normalize | 🚧 | β | β | Reduction-based expression; host execution is available but reductions are not generally parallelized across host threads. |
| ones | β | β | β | Generator expression. |
| outer | β | β | β | Host support requires the CPU BLAS backend for library-backed paths. CUDAJITExecutor support follows cuBLASDx matmul constraints. |
| overlap | β | β | β | View expression. |
| pad | β | β | β | View/expression composition. |
| percentile | β | β | β | Sort/statistics transform. |
| permute | β | β | β | View/reindex expression. |
| pinv | β | β | β | SVD-backed solver transform; host support requires the CPU solver backend. cuSolverDx JIT SVD projection is not currently enabled. |
| polyval | β | β | β | Element-wise polynomial evaluation expression. |
| pow | β | β | β | Element-wise expression. |
| prod | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| pwelch | β | β | β | CUDA-only spectral estimation transform. |
| qr | β | β | β | CUDA solver transform. CUDAJITExecutor supports cuSolverDx-backed lazy projections for supported types and shapes. |
| qr_econ | β | β | β | CUDA solver transform. CUDAJITExecutor supports cuSolverDx-backed lazy projections for supported types and shapes; the Q projection is limited to non-wide matrices where m >= n. |
| qr_solver | β | β | β | Host support requires the CPU solver backend. CUDAJITExecutor supports cuSolverDx-backed lazy projections for supported types and shapes. |
| r2c | β | β | β | Real-to-complex view/cast expression. |
| random | β | β | β | Random generator state is not JIT-fused. |
| randomi | β | β | β | Random integer generator state is not JIT-fused. |
| range | β | β | β | Generator expression. |
| rcollapse | β | β | β | View expression. |
| real | β | β | β | Element-wise expression. |
| reduce | β | β | β | Generic custom reduction currently uses CUDA reduction support. |
| remap | β | β | β | View/reindex expression. |
| repmat | β | β | β | View/expression composition. |
| resample_poly | β | β | β | Polyphase resampling transform for host and CUDA executors. |
| reshape | β | β | β | View expression. |
| reverse | β | β | β | View/reindex expression. |
| round | β | β | β | Element-wise expression. |
| rsqrt | β | β | β | Element-wise expression. |
| sar_bp | β | β | β | CUDA-only SAR backprojection transform. |
| select | β | β | β | Selection expression. |
| shift | β | β | β | View/reindex expression. |
| sign | β | β | β | Element-wise expression. |
| sin | β | β | β | Element-wise expression. |
| sincos | β | β | β | Element-wise expression. |
| sinh | β | β | β | Element-wise expression. |
| slice | β | β | β | View expression. |
| softmax | β | β | β | CUDA-only reduction-style transform. |
| solve | β | β | β | Solver transform; host support requires the CPU solver backend. JIT support follows the underlying cuSolverDx factorization limits when it lowers to a supported solver projection. |
| sort | 🚧 | β | β | Sort transform; host execution is available but does not generally use multithreaded host executor paths. |
| sph2cart | β | β | β | Element-wise coordinate conversion expression. |
| sqrt | β | β | β | Element-wise expression. |
| stack | β | β | β | View/expression composition. |
| stdd | 🚧 | β | β | Reduction/statistics transform; host execution is available but reductions are not generally parallelized across host threads. |
| sum | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| svd | β | β | β | Host support requires the CPU solver backend. CUDAJITExecutor SVD projection is not currently enabled. |
| svdbpi | β | β | β | CUDA-only batched power-iteration SVD transform. |
| svdpi | β | β | β | CUDA-only power-iteration SVD transform. |
| tan | β | β | β | Element-wise expression. |
| tanh | β | β | β | Element-wise expression. |
| toeplitz | β | β | β | Generator/expression composition. |
| trace | 🚧 | β | β | Reduction-style matrix transform; host execution is available but reductions are not generally parallelized across host threads. |
| transpose | β | β | β | View/reindex expression. |
| transpose_matrix | β | β | β | View/reindex expression. |
| unique | β | β | β | Sort/compaction transform. |
| unwrap | β | β | β | Element-wise/reindex expression. |
| upsample | β | β | β | View/reindex expression. |
| var | 🚧 | β | β | Reduction/statistics transform; host execution is available but reductions are not generally parallelized across host threads. |
| vector_norm | 🚧 | β | β | Reduction transform; host execution is available but reductions are not generally parallelized across host threads. |
| zeros | β | β | β | Generator expression. |
| zipvec | β | β | β | Vector packing expression. |
| arithmetic operators | β | β | β | Includes unary minus and binary |
| comparison operators | β | β | β | Includes |
| logical operators | β | β | β | Includes element-wise |