Embedding Kernels

void trt_edgellm::kernel::embeddingLookup(
rt::Tensor const &inputIds,
rt::Tensor const &embeddingTable,
rt::Tensor &output,
cudaStream_t stream = 0
)

Standard embedding lookup kernel. For each token ID in inputIds, the corresponding row of embeddingTable is written to output.

Parameters:
  • inputIds[in] Input token IDs with shape [batchSize, seqLen]

  • embeddingTable[in] Embedding table with shape [vocabSize, hiddenSize]

  • output[out] Hidden states with shape [batchSize, seqLen, hiddenSize]

  • stream[in] CUDA stream for execution
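The gather implied by the shapes above can be sketched as a host-side reference, which may be useful for checking kernel output in a unit test. This is only an illustrative sketch: referenceEmbeddingLookup is a hypothetical helper, not part of trt_edgellm, it uses flattened std::vector buffers instead of rt::Tensor, it assumes float elements (the real kernel may use another data type), and its handling of out-of-range IDs is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference, NOT the library implementation.
// inputIds:       flattened [batchSize, seqLen]
// embeddingTable: flattened row-major [vocabSize, hiddenSize]
// returns:        flattened [batchSize, seqLen, hiddenSize]
std::vector<float> referenceEmbeddingLookup(std::vector<int32_t> const& inputIds,
                                            std::vector<float> const& embeddingTable,
                                            std::size_t vocabSize,
                                            std::size_t hiddenSize)
{
    assert(embeddingTable.size() == vocabSize * hiddenSize);
    std::vector<float> output(inputIds.size() * hiddenSize, 0.0f);

    for (std::size_t t = 0; t < inputIds.size(); ++t)
    {
        int32_t const id = inputIds[t];
        // Assumption: an out-of-range ID leaves its output row zeroed.
        // The documented kernel does not state what it does in this case.
        if (id < 0 || static_cast<std::size_t>(id) >= vocabSize)
        {
            continue;
        }
        std::size_t const row = static_cast<std::size_t>(id);
        // Copy one table row of hiddenSize elements into the token's output slot.
        for (std::size_t h = 0; h < hiddenSize; ++h)
        {
            output[t * hiddenSize + h] = embeddingTable[row * hiddenSize + h];
        }
    }
    return output;
}
```

Because batchSize and seqLen only determine how many tokens are processed, the sketch treats inputIds as one flat sequence of batchSize * seqLen entries.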

void trt_edgellm::kernel::embeddingLookupWithImageInsertion(
rt::Tensor const &inputIds,
rt::Tensor const &embeddingTable,
rt::Tensor const &imageEmbeds,
rt::Tensor &output,
cudaStream_t stream = 0
)

Embedding lookup that inserts image embeddings into the output, following PromptTuningEmbedding logic.

Parameters:
  • inputIds[in] Input token IDs with shape [batchSize, seqLen]

  • embeddingTable[in] Embedding table with shape [vocabSize, hiddenSize]

  • imageEmbeds[in] Image embeddings with shape [imageTokenLen, hiddenSize]

  • output[out] Hidden states with shape [batchSize, seqLen, hiddenSize]

  • stream[in] CUDA stream for execution
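One plausible reading of the PromptTuningEmbedding logic is sketched below as a host-side reference. It assumes the common prompt-tuning convention in which a token ID below vocabSize selects a row of embeddingTable, while an ID at or above vocabSize selects row (id - vocabSize) of imageEmbeds. That mapping is an assumption and is not confirmed by this page. referenceLookupWithImageInsertion is a hypothetical helper, it uses flattened float buffers rather than rt::Tensor, and its zero-fill for invalid IDs is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference, NOT the library implementation.
// inputIds:       flattened [batchSize, seqLen]
// embeddingTable: flattened row-major [vocabSize, hiddenSize]
// imageEmbeds:    flattened row-major [imageTokenLen, hiddenSize]
// returns:        flattened [batchSize, seqLen, hiddenSize]
std::vector<float> referenceLookupWithImageInsertion(std::vector<int32_t> const& inputIds,
                                                     std::vector<float> const& embeddingTable,
                                                     std::vector<float> const& imageEmbeds,
                                                     std::size_t vocabSize,
                                                     std::size_t imageTokenLen,
                                                     std::size_t hiddenSize)
{
    assert(embeddingTable.size() == vocabSize * hiddenSize);
    assert(imageEmbeds.size() == imageTokenLen * hiddenSize);
    std::vector<float> output(inputIds.size() * hiddenSize, 0.0f);

    for (std::size_t t = 0; t < inputIds.size(); ++t)
    {
        int32_t const id = inputIds[t];
        if (id < 0)
        {
            continue; // Assumption: negative IDs leave a zeroed row.
        }
        std::size_t const index = static_cast<std::size_t>(id);
        float const* src = nullptr;
        if (index < vocabSize)
        {
            // Regular text token: read from the embedding table.
            src = embeddingTable.data() + index * hiddenSize;
        }
        else if (index - vocabSize < imageTokenLen)
        {
            // Assumed convention: IDs >= vocabSize index into imageEmbeds.
            src = imageEmbeds.data() + (index - vocabSize) * hiddenSize;
        }
        else
        {
            continue; // Assumption: IDs past the image range leave a zeroed row.
        }
        for (std::size_t h = 0; h < hiddenSize; ++h)
        {
            output[t * hiddenSize + h] = src[h];
        }
    }
    return output;
}
```

Under this assumed convention, a prompt with vocabSize = 2 and inputIds {0, 2, 3, 1} would take its second and third output rows from imageEmbeds rows 0 and 1, and the remaining rows from embeddingTable.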