Initialize Cos Sin Cache
- void trt_edgellm::kernel::initializeNormalRopeCosSin(
- float *cosSinCache,
- float rotaryBaseFrequency,
- float rotaryScale,
- int32_t rotaryDim,
- int32_t rotaryEmbeddingMaxPositions,
- cudaStream_t stream
- )
Initialize normal RoPE cos/sin cache.
Precomputes cos/sin values for standard rotary position encoding.
- Parameters:
cosSinCache – Output cos/sin cache
rotaryBaseFrequency – Base frequency for RoPE
rotaryScale – Scaling factor
rotaryDim – Rotary embedding dimension
rotaryEmbeddingMaxPositions – Maximum number of positions covered by the cache
stream – CUDA stream
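To make the precomputation concrete, the following is a minimal host-side sketch of a standard RoPE cos/sin table. It is not the kernel's implementation: the function name `referenceNormalRopeCosSin`, the cache layout (cosines followed by sines per position), and the way `rotaryScale` enters the angle are all assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side reference for a standard RoPE cos/sin table (illustrative only).
// ASSUMPTION: each position stores rotaryDim/2 cosines followed by rotaryDim/2
// sines; the real layout of cosSinCache is not specified in this reference.
std::vector<float> referenceNormalRopeCosSin(
    float rotaryBaseFrequency, float rotaryScale, int32_t rotaryDim, int32_t maxPositions)
{
    int32_t const half = rotaryDim / 2;
    std::vector<float> cache(static_cast<std::size_t>(maxPositions) * rotaryDim, 0.0f);
    for (int32_t pos = 0; pos < maxPositions; ++pos)
    {
        std::size_t const row = static_cast<std::size_t>(pos) * rotaryDim;
        for (int32_t i = 0; i < half; ++i)
        {
            // Inverse frequency of rotary pair i: base^(-2i / rotaryDim).
            double const invFreq
                = std::pow(static_cast<double>(rotaryBaseFrequency), -2.0 * i / rotaryDim);
            // ASSUMPTION: rotaryScale multiplies the rotation angle.
            double const angle = static_cast<double>(pos) * rotaryScale * invFreq;
            cache[row + i] = static_cast<float>(std::cos(angle));
            cache[row + half + i] = static_cast<float>(std::sin(angle));
        }
    }
    return cache;
}
```

A table like this can serve as a CPU baseline when checking the device cache, provided the comparison accounts for whatever layout the kernel actually uses.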
- void trt_edgellm::kernel::initializeLongRopeCosSin(
- float *shortCosSinCache,
- float *longCosSinCache,
- float *shortFactor,
- float *longFactor,
- float rotaryBaseFrequency,
- int32_t rotaryDim,
- int32_t rotaryEmbeddingMaxPositions,
- int32_t maxPositionEmbeddings,
- int32_t originalMaxPositionEmbeddings,
- cudaStream_t stream
- )
Initialize long RoPE cos/sin cache with interpolation.
Precomputes cos/sin values for long-context RoPE with position interpolation. Used to extend the context length beyond the original training range.
- Parameters:
shortCosSinCache – Short-range cos/sin cache
longCosSinCache – Long-range cos/sin cache
shortFactor – Short interpolation factors
longFactor – Long interpolation factors
rotaryBaseFrequency – Base frequency
rotaryDim – Rotary dimension
rotaryEmbeddingMaxPositions – Maximum number of positions covered by the cache
maxPositionEmbeddings – Maximum position embeddings
originalMaxPositionEmbeddings – Original maximum position embeddings used during training
stream – CUDA stream
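As a rough illustration of how per-dimension factors can reshape the frequencies, the sketch below follows the LongRoPE-style scheme found in publicly available model configurations: each inverse frequency is divided by its factor, and the cos/sin values are multiplied by a magnitude correction derived from the ratio of extended to original context length. Whether this kernel applies the same correction is an assumption, and the names `longRopeMagnitude` and `referenceLongRopeCosSin`, as well as the cache layout, are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// ASSUMPTION: magnitude correction sqrt(1 + ln(ratio) / ln(originalMax)),
// applied only when the context is actually extended (ratio > 1).
float longRopeMagnitude(int32_t maxPositionEmbeddings, int32_t originalMaxPositionEmbeddings)
{
    double const ratio = static_cast<double>(maxPositionEmbeddings) / originalMaxPositionEmbeddings;
    if (ratio <= 1.0)
    {
        return 1.0f;
    }
    double const logOriginal = std::log(static_cast<double>(originalMaxPositionEmbeddings));
    return static_cast<float>(std::sqrt(1.0 + std::log(ratio) / logOriginal));
}

// Builds one table (short or long) from the matching factor array.
// ASSUMPTION: factor holds rotaryDim/2 entries; the layout per position is
// rotaryDim/2 cosines followed by rotaryDim/2 sines.
std::vector<float> referenceLongRopeCosSin(std::vector<float> const& factor,
    float rotaryBaseFrequency, int32_t rotaryDim, int32_t maxPositions, float magnitude)
{
    int32_t const half = rotaryDim / 2;
    std::vector<float> cache(static_cast<std::size_t>(maxPositions) * rotaryDim, 0.0f);
    for (int32_t pos = 0; pos < maxPositions; ++pos)
    {
        std::size_t const row = static_cast<std::size_t>(pos) * rotaryDim;
        for (int32_t i = 0; i < half; ++i)
        {
            // The factor stretches the wavelength of rotary pair i.
            double const invFreq = 1.0
                / (factor[i] * std::pow(static_cast<double>(rotaryBaseFrequency), 2.0 * i / rotaryDim));
            double const angle = static_cast<double>(pos) * invFreq;
            cache[row + i] = static_cast<float>(std::cos(angle)) * magnitude;
            cache[row + half + i] = static_cast<float>(std::sin(angle)) * magnitude;
        }
    }
    return cache;
}
```

Under these assumptions, calling the helper once with the short factors and once with the long factors would yield the two tables. At run time a model would typically read the short table for sequences within the original training range and the long table otherwise.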
- void trt_edgellm::kernel::initializeMRopeCosSin(
- float *cosSinCache,
- int64_t *mropePositionIds,
- float rotaryBaseFrequency,
- int64_t rotaryDim,
- int64_t rotaryEmbeddingMaxPositions,
- int64_t batchSize,
- bool interleaved,
- cudaStream_t stream
- )
Initialize multi-dimensional RoPE (MRoPE) cos/sin cache.
Precomputes cos/sin values for multi-dimensional rotary encoding (e.g., Qwen2-VL). Supports separate position encodings for different dimensions (temporal, spatial).
- Parameters:
cosSinCache – Output cos/sin cache
mropePositionIds – Multi-dimensional position IDs
rotaryBaseFrequency – Base frequency
rotaryDim – Rotary dimension
rotaryEmbeddingMaxPositions – Maximum number of positions covered by the cache
batchSize – Batch size
interleaved – Whether to use interleaved MRoPE
stream – CUDA stream
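The idea behind multi-dimensional RoPE can be sketched on the host as follows: each token carries three position IDs (temporal, height, width), and every rotary frequency index takes its angle from one of those axes. The split of frequency indices across axes (`sections`), the helper names `mropeAxis` and `referenceMropeCosSin`, and the two assignment schemes shown here are assumptions modeled on publicly available Qwen-style implementations; this reference does not state how the kernel partitions the dimensions.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// HYPOTHETICAL helper: choose the position axis (0 = temporal, 1 = height,
// 2 = width) that drives frequency index j. sections[a] is the number of
// frequency indices assigned to axis a.
int32_t mropeAxis(int32_t j, std::array<int32_t, 3> const& sections, bool interleaved)
{
    if (interleaved)
    {
        // Axes 1 and 2 take every third index while inside their span;
        // all remaining indices fall back to the temporal axis.
        int32_t const axis = j % 3;
        return (axis != 0 && j < 3 * sections[axis]) ? axis : 0;
    }
    // Chunked: contiguous blocks of indices per axis.
    if (j < sections[0])
    {
        return 0;
    }
    return (j < sections[0] + sections[1]) ? 1 : 2;
}

// cos/sin row for one token. ASSUMPTION: rotaryDim/2 cosines, then rotaryDim/2 sines.
std::vector<float> referenceMropeCosSin(std::array<int64_t, 3> const& positionId,
    std::array<int32_t, 3> const& sections, float rotaryBaseFrequency, int32_t rotaryDim,
    bool interleaved)
{
    int32_t const half = rotaryDim / 2;
    std::vector<float> row(static_cast<std::size_t>(rotaryDim), 0.0f);
    for (int32_t j = 0; j < half; ++j)
    {
        double const invFreq
            = std::pow(static_cast<double>(rotaryBaseFrequency), -2.0 * j / rotaryDim);
        int64_t const pos = positionId[static_cast<std::size_t>(mropeAxis(j, sections, interleaved))];
        double const angle = static_cast<double>(pos) * invFreq;
        row[static_cast<std::size_t>(j)] = static_cast<float>(std::cos(angle));
        row[static_cast<std::size_t>(half + j)] = static_cast<float>(std::sin(angle));
    }
    return row;
}
```

When all three position IDs of a token are equal, as for plain text tokens, the axis choice no longer matters and the row reduces to the standard RoPE values for that position.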