Initialize Cos Sin Cache

void trt_edgellm::kernel::initializeNormalRopeCosSin(
float *cosSinCache,
float rotaryBaseFrequency,
float rotaryScale,
int32_t rotaryDim,
int32_t rotaryEmbeddingMaxPositions,
cudaStream_t stream
)

Initialize normal RoPE cos/sin cache.

Precomputes cos/sin values for standard rotary position encoding.

Parameters:
  • cosSinCache – Output cos/sin cache

  • rotaryBaseFrequency – Base frequency for RoPE

  • rotaryScale – Scaling factor

  • rotaryDim – Rotary embedding dimension

  • rotaryEmbeddingMaxPositions – Maximum positions

  • stream – CUDA stream
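
As an illustration of what such a cache typically contains, here is a minimal CPU reference sketch. It is not the library's kernel. The function name `referenceRopeCosSin`, the row layout (half cosines followed by half sines), and the way the scale enters the frequency are all assumptions made for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of trt_edgellm).
// Assumed layout: row-major [maxPositions, rotaryDim], where each row holds
// rotaryDim / 2 cosines followed by rotaryDim / 2 sines.
std::vector<float> referenceRopeCosSin(float base, float scale, int32_t rotaryDim, int32_t maxPositions)
{
    int32_t const half = rotaryDim / 2;
    std::vector<float> cache(static_cast<std::size_t>(maxPositions) * rotaryDim);
    for (int32_t pos = 0; pos < maxPositions; ++pos)
    {
        std::size_t const row = static_cast<std::size_t>(pos) * rotaryDim;
        for (int32_t i = 0; i < half; ++i)
        {
            // Assumption: the scale multiplies the inverse frequency base^(-2i / rotaryDim).
            float const invFreq = scale / std::pow(base, 2.0f * i / rotaryDim);
            float const angle = pos * invFreq;
            cache[row + i] = std::cos(angle);
            cache[row + half + i] = std::sin(angle);
        }
    }
    return cache;
}
```

A host-side reference like this can be useful for comparing against a copy of the device cache in a unit test, provided the real layout is confirmed first.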

void trt_edgellm::kernel::initializeLongRopeCosSin(
float *shortCosSinCache,
float *longCosSinCache,
float *shortFactor,
float *longFactor,
float rotaryBaseFrequency,
int32_t rotaryDim,
int32_t rotaryEmbeddingMaxPositions,
int32_t maxPositionEmbeddings,
int32_t originalMaxPositionEmbeddings,
cudaStream_t stream
)

Initialize long RoPE cos/sin cache with interpolation.

Precomputes cos/sin values for long-context RoPE with position interpolation. Use it to extend the context length beyond the original training range.

Parameters:
  • shortCosSinCache – Short-range cos/sin cache

  • longCosSinCache – Long-range cos/sin cache

  • shortFactor – Short interpolation factors

  • longFactor – Long interpolation factors

  • rotaryBaseFrequency – Base frequency

  • rotaryDim – Rotary dimension

  • rotaryEmbeddingMaxPositions – Maximum positions

  • maxPositionEmbeddings – Maximum position embeddings

  • originalMaxPositionEmbeddings – Original max positions from training

  • stream – CUDA stream
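
The sketch below shows one common way per-frequency factors are applied in LongRoPE-style schemes: each inverse frequency is divided by its factor, and cos/sin are multiplied by a magnitude correction derived from the ratio of the extended to the original context length. Whether this kernel follows exactly that convention is an assumption. The names `longRopeMagnitude` and `referenceLongRopeCosSin` and the cache layout are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: magnitude correction commonly used with LongRoPE-style
// scaling. Returns 1 when the context is not extended.
float longRopeMagnitude(int32_t maxPos, int32_t originalMaxPos)
{
    float const ratio = static_cast<float>(maxPos) / originalMaxPos;
    if (ratio <= 1.0f)
    {
        return 1.0f;
    }
    return std::sqrt(1.0f + std::log(ratio) / std::log(static_cast<float>(originalMaxPos)));
}

// CPU reference sketch. `factor` must hold rotaryDim / 2 entries.
// Assumed layout: [numPositions, rotaryDim], cosines first, then sines.
std::vector<float> referenceLongRopeCosSin(
    std::vector<float> const& factor, float base, int32_t rotaryDim, int32_t numPositions, float magnitude)
{
    int32_t const half = rotaryDim / 2;
    std::vector<float> cache(static_cast<std::size_t>(numPositions) * rotaryDim);
    for (int32_t pos = 0; pos < numPositions; ++pos)
    {
        std::size_t const row = static_cast<std::size_t>(pos) * rotaryDim;
        for (int32_t i = 0; i < half; ++i)
        {
            // Each frequency is stretched by its own interpolation factor.
            float const invFreq = 1.0f / (factor[i] * std::pow(base, 2.0f * i / rotaryDim));
            float const angle = pos * invFreq;
            cache[row + i] = magnitude * std::cos(angle);
            cache[row + half + i] = magnitude * std::sin(angle);
        }
    }
    return cache;
}
```

Under these assumptions, calling the helper once with the short factors and once with the long factors would produce the two caches that the function writes to `shortCosSinCache` and `longCosSinCache`.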

void trt_edgellm::kernel::initializeMRopeCosSin(
float *cosSinCache,
int64_t *mropePositionIds,
float rotaryBaseFrequency,
int64_t rotaryDim,
int64_t rotaryEmbeddingMaxPositions,
int64_t batchSize,
bool interleaved,
cudaStream_t stream
)

Initialize multi-dimensional RoPE (MRoPE) cos/sin cache.

Precomputes cos/sin values for multi-dimensional rotary encoding (e.g., Qwen2-VL). Supports separate position encodings for different dimensions (temporal, spatial).

Parameters:
  • cosSinCache – Output cos/sin cache

  • mropePositionIds – Multi-dimensional position IDs

  • rotaryBaseFrequency – Base frequency

  • rotaryDim – Rotary dimension

  • rotaryEmbeddingMaxPositions – Maximum positions

  • batchSize – Batch size

  • interleaved – Whether to use interleaved MRoPE

  • stream – CUDA stream
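
To make the idea of separate position encodings per dimension concrete, the sketch below computes one cache row for a single token whose position is a (temporal, height, width) triple. Each frequency index takes its position from one of the three axes. The `sections` parameter, both helper names, the two assignment schemes, and the row layout are assumptions for illustration; the function documented above takes no section argument, so the real assignment is decided inside the library.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: choose which axis (0 = temporal, 1 = height, 2 = width)
// drives frequency index i. `sections` gives the number of frequencies per axis.
int axisForFrequency(int64_t i, std::array<int64_t, 3> const& sections, bool interleaved)
{
    if (interleaved)
    {
        // Assumed pattern: T, H, W, T, H, W, ... while the H and W sections
        // last; any remaining frequencies fall back to the temporal axis.
        int64_t const axis = i % 3;
        if (axis != 0 && i < 3 * sections[axis])
        {
            return static_cast<int>(axis);
        }
        return 0;
    }
    // Chunked pattern: all T frequencies, then all H, then all W.
    if (i < sections[0])
    {
        return 0;
    }
    if (i < sections[0] + sections[1])
    {
        return 1;
    }
    return 2;
}

// One cache row for a token at position pos = {t, h, w}.
// Assumed layout: rotaryDim / 2 cosines followed by rotaryDim / 2 sines.
std::vector<float> referenceMRopeRow(std::array<int64_t, 3> const& pos, std::array<int64_t, 3> const& sections,
    float base, int64_t rotaryDim, bool interleaved)
{
    int64_t const half = rotaryDim / 2;
    std::vector<float> row(static_cast<std::size_t>(rotaryDim));
    for (int64_t i = 0; i < half; ++i)
    {
        float const invFreq = 1.0f / std::pow(base, 2.0f * i / rotaryDim);
        float const angle = static_cast<float>(pos[axisForFrequency(i, sections, interleaved)]) * invFreq;
        row[i] = std::cos(angle);
        row[half + i] = std::sin(angle);
    }
    return row;
}
```

When all three positions are equal, both schemes produce the same row, which is why text-only inputs do not depend on the assignment.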

void trt_edgellm::kernel::initializeTextOnlyMRopeCosSin(
float *cosSinCache,
float rotaryBaseFrequency,
int64_t rotaryDim,
int64_t maxPositions,
cudaStream_t stream
)

Initialize MRoPE cos/sin cache for text-only inputs with sequential positions.

Initializes the MRoPE cache using sequential position IDs (pos[i] = i) for all three MRoPE sections. This is the correct default for text-only and audio-only modes, where no spatial position information is available.

Parameters:
  • cosSinCache – Output cos/sin cache, shape [1, maxPositions, rotaryDim]

  • rotaryBaseFrequency – Base frequency

  • rotaryDim – Rotary dimension

  • maxPositions – Maximum number of positions

  • stream – CUDA stream
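
The following CPU sketch illustrates the sequential-position case described above. It fills a buffer of shape [1, maxPositions, rotaryDim] where the temporal, height, and width positions of token i are all i, so the choice of section for each frequency has no effect on the result. The function name and the split of each row into cosines followed by sines are assumptions, not the library's documented layout.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference sketch (hypothetical helper, not part of trt_edgellm).
// Output shape: [1, maxPositions, rotaryDim], flattened row-major.
// Assumed row layout: rotaryDim / 2 cosines followed by rotaryDim / 2 sines.
std::vector<float> referenceTextOnlyMRopeCosSin(float base, int64_t rotaryDim, int64_t maxPositions)
{
    int64_t const half = rotaryDim / 2;
    std::vector<float> cache(static_cast<std::size_t>(maxPositions * rotaryDim));
    for (int64_t pos = 0; pos < maxPositions; ++pos)
    {
        // Sequential IDs: temporal, height, and width positions all equal pos.
        int64_t const positionIds[3] = {pos, pos, pos};
        for (int64_t i = 0; i < half; ++i)
        {
            // Any axis assignment yields the same value here, so i % 3 is
            // used only to show that the three sections are read.
            float const p = static_cast<float>(positionIds[i % 3]);
            float const angle = p / std::pow(base, 2.0f * i / rotaryDim);
            cache[pos * rotaryDim + i] = std::cos(angle);
            cache[pos * rotaryDim + half + i] = std::sin(angle);
        }
    }
    return cache;
}
```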