Initialize Cos Sin Cache

void trt_edgellm::kernel::initializeNormalRopeCosSin(
float *cosSinCache,
float rotaryBaseFrequency,
float rotaryScale,
int32_t rotaryDim,
int32_t rotaryEmbeddingMaxPositions,
cudaStream_t stream
)

Initialize normal RoPE cos/sin cache.

Precomputes cos/sin values for standard rotary position encoding.

Parameters:
  • cosSinCache – Output cos/sin cache

  • rotaryBaseFrequency – Base frequency for RoPE

  • rotaryScale – Scaling factor

  • rotaryDim – Rotary embedding dimension

  • rotaryEmbeddingMaxPositions – Maximum positions

  • stream – CUDA stream
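
As a rough illustration of what this precomputation involves, the following host-side sketch computes standard RoPE cos/sin values on the CPU. It is not the kernel's implementation: `RopeTable` and `referenceRopeCosSin` are hypothetical names, the `[position][frequency]` layout with separate cos and sin arrays is an assumption, and so is the way `rotaryScale` is applied (here it multiplies the position).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side reference table; not part of the trt_edgellm API.
// Assumed layout: index = position * (rotaryDim / 2) + frequencyIndex.
struct RopeTable
{
    std::vector<float> cos;
    std::vector<float> sin;
};

RopeTable referenceRopeCosSin(float base, float scale, int32_t rotaryDim, int32_t maxPositions)
{
    assert(rotaryDim > 0 && rotaryDim % 2 == 0);
    assert(maxPositions > 0);

    int32_t const half = rotaryDim / 2;
    RopeTable table;
    table.cos.resize(static_cast<std::size_t>(maxPositions) * half);
    table.sin.resize(table.cos.size());

    for (int32_t pos = 0; pos < maxPositions; ++pos)
    {
        for (int32_t i = 0; i < half; ++i)
        {
            // Standard RoPE inverse frequency: base^(-2i / rotaryDim).
            double const invFreq = std::pow(static_cast<double>(base), -2.0 * i / rotaryDim);
            // Assumption: the scaling factor multiplies the position index.
            double const angle = pos * static_cast<double>(scale) * invFreq;

            std::size_t const idx = static_cast<std::size_t>(pos) * half + i;
            table.cos[idx] = static_cast<float>(std::cos(angle));
            table.sin[idx] = static_cast<float>(std::sin(angle));
        }
    }
    return table;
}
```

A table like this can serve as a CPU reference when checking values copied back from `cosSinCache`, provided the actual cache layout is confirmed first.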

void trt_edgellm::kernel::initializeLongRopeCosSin(
float *shortCosSinCache,
float *longCosSinCache,
float *shortFactor,
float *longFactor,
float rotaryBaseFrequency,
int32_t rotaryDim,
int32_t rotaryEmbeddingMaxPositions,
int32_t maxPositionEmbeddings,
int32_t originalMaxPositionEmbeddings,
cudaStream_t stream
)

Initialize long RoPE cos/sin cache with interpolation.

Precomputes cos/sin values for long-context RoPE with position interpolation. Used to extend the context length beyond the original training range.

Parameters:
  • shortCosSinCache – Short-range cos/sin cache

  • longCosSinCache – Long-range cos/sin cache

  • shortFactor – Short interpolation factors

  • longFactor – Long interpolation factors

  • rotaryBaseFrequency – Base frequency

  • rotaryDim – Rotary dimension

  • rotaryEmbeddingMaxPositions – Maximum positions

  • maxPositionEmbeddings – Maximum position embeddings

  • originalMaxPositionEmbeddings – Original max positions from training

  • stream – CUDA stream
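
The sketch below shows one common LongRoPE formulation, in which each rotary frequency is divided by a per-dimension factor and the cos/sin values are multiplied by a magnitude correction when the context is extended. Whether this kernel follows exactly that formulation is an assumption. `longRopeMagnitudeScale` and `longRopeAngle` are hypothetical helper names, and `factors` stands in for either `shortFactor` or `longFactor`, assumed to hold `rotaryDim / 2` entries.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: magnitude correction for an extended context window.
// Assumed formula: sqrt(1 + ln(max / original) / ln(original)), or 1 when
// the context is not extended.
float longRopeMagnitudeScale(int32_t maxPositionEmbeddings, int32_t originalMaxPositionEmbeddings)
{
    assert(originalMaxPositionEmbeddings > 1);
    double const ratio = static_cast<double>(maxPositionEmbeddings) / originalMaxPositionEmbeddings;
    if (ratio <= 1.0)
    {
        return 1.0f;
    }
    double const logOriginal = std::log(static_cast<double>(originalMaxPositionEmbeddings));
    return static_cast<float>(std::sqrt(1.0 + std::log(ratio) / logOriginal));
}

// Hypothetical helper: rotation angle for one (position, frequency index) pair.
// Assumption: each frequency is divided by its per-dimension factor.
double longRopeAngle(int32_t position, int32_t freqIndex, std::vector<float> const& factors, float base,
    int32_t rotaryDim)
{
    assert(rotaryDim > 0 && rotaryDim % 2 == 0);
    assert(factors.size() == static_cast<std::size_t>(rotaryDim / 2));
    assert(freqIndex >= 0 && freqIndex < rotaryDim / 2);

    double const freqPower = std::pow(static_cast<double>(base), 2.0 * freqIndex / rotaryDim);
    double const invFreq = 1.0 / (static_cast<double>(factors[freqIndex]) * freqPower);
    return position * invFreq;
}
```

Under these assumptions, a cache entry would be `cos(angle) * scale` and `sin(angle) * scale`, computed once with the short factors and once with the long factors.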

void trt_edgellm::kernel::initializeMRopeCosSin(
float *cosSinCache,
int64_t *mropePositionIds,
float rotaryBaseFrequency,
int64_t rotaryDim,
int64_t rotaryEmbeddingMaxPositions,
int64_t batchSize,
bool interleaved,
cudaStream_t stream
)

Initialize multi-dimensional RoPE cos/sin cache (MRoPE).

Precomputes cos/sin values for multi-dimensional rotary encoding (e.g., Qwen2-VL). Supports separate position encodings for different dimensions (temporal, spatial).

Parameters:
  • cosSinCache – Output cos/sin cache

  • mropePositionIds – Multi-dimensional position IDs

  • rotaryBaseFrequency – Base frequency

  • rotaryDim – Rotary dimension

  • rotaryEmbeddingMaxPositions – Maximum positions

  • batchSize – Batch size

  • interleaved – Whether to use interleaved MRoPE

  • stream – CUDA stream
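
To make the idea of per-dimension position encodings concrete, the sketch below shows how each rotary frequency could take its position from one of three axes (temporal, height, width). Everything here is illustrative: `mropeAxisForFrequency` and `mropeAngle` are hypothetical names, the `sections` array (how many frequencies each axis owns) is not a parameter of the documented function, and the chunked and interleaved assignment rules are assumptions about what the `interleaved` flag selects.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: pick the position axis (0 = temporal, 1 = height,
// 2 = width) that drives a given rotary frequency index.
int32_t mropeAxisForFrequency(int32_t freqIndex, std::array<int32_t, 3> const& sections, bool interleaved)
{
    assert(freqIndex >= 0 && freqIndex < sections[0] + sections[1] + sections[2]);
    if (interleaved)
    {
        // Assumed interleaved rule: height and width claim every third index
        // (offsets 1 and 2) until their section is used up; the rest is temporal.
        for (int32_t axis = 1; axis < 3; ++axis)
        {
            if (freqIndex % 3 == axis && freqIndex < 3 * sections[axis])
            {
                return axis;
            }
        }
        return 0;
    }
    // Assumed chunked rule: contiguous blocks in temporal, height, width order.
    if (freqIndex < sections[0])
    {
        return 0;
    }
    if (freqIndex < sections[0] + sections[1])
    {
        return 1;
    }
    return 2;
}

// Hypothetical helper: rotation angle for one token and one frequency index.
double mropeAngle(std::array<int64_t, 3> const& positionIds, int32_t freqIndex,
    std::array<int32_t, 3> const& sections, bool interleaved, float base, int64_t rotaryDim)
{
    assert(rotaryDim > 0 && rotaryDim % 2 == 0);
    int32_t const axis = mropeAxisForFrequency(freqIndex, sections, interleaved);
    double const invFreq = std::pow(static_cast<double>(base), -2.0 * freqIndex / static_cast<double>(rotaryDim));
    return static_cast<double>(positionIds[axis]) * invFreq;
}
```

With this reading, text-only tokens that share the same ID on all three axes reduce to standard RoPE, since every frequency sees the same position regardless of the axis chosen.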