Embedding Kernels

void trt_edgellm::kernel::embeddingLookup(rt::Tensor const &inputIds, rt::Tensor const &embeddingTable, rt::Tensor &output, cudaStream_t stream = 0)

Standard embedding lookup kernel. For each token ID in inputIds, the corresponding row of embeddingTable is copied into output.

Parameters:

inputIds – [in] Input token IDs with shape [batchSize, seqLen]
embeddingTable – [in] Embedding table with shape [vocabSize, hiddenSize]
output – [out] Hidden states with shape [batchSize, seqLen, hiddenSize]
stream – [in] CUDA stream for execution
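
The shape contract above can be illustrated with a small CPU reference. This is only a sketch, not the library's implementation: the function name embeddingLookupReference is hypothetical, flat row-major std::vector buffers stand in for rt::Tensor, float is assumed as the element type, and rejecting out-of-range IDs with an exception is an assumption (the actual kernel's behavior for such IDs is not stated here).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical CPU reference, not part of trt_edgellm.
// inputIds:       flattened [batchSize, seqLen]
// embeddingTable: flattened row-major [vocabSize, hiddenSize]
// returns:        flattened [batchSize, seqLen, hiddenSize]
std::vector<float> embeddingLookupReference(std::vector<int32_t> const& inputIds,
    std::vector<float> const& embeddingTable, std::size_t vocabSize, std::size_t hiddenSize)
{
    assert(embeddingTable.size() == vocabSize * hiddenSize);
    std::vector<float> output(inputIds.size() * hiddenSize);
    for (std::size_t t = 0; t < inputIds.size(); ++t)
    {
        int32_t const id = inputIds[t];
        // Assumption: out-of-range IDs are an error in this sketch.
        if (id < 0 || static_cast<std::size_t>(id) >= vocabSize)
        {
            throw std::out_of_range("token ID outside [0, vocabSize)");
        }
        std::size_t const row = static_cast<std::size_t>(id) * hiddenSize;
        // Copy one embedding row into the output slot for token t.
        for (std::size_t h = 0; h < hiddenSize; ++h)
        {
            output[t * hiddenSize + h] = embeddingTable[row + h];
        }
    }
    return output;
}
```

Because batchSize and seqLen only affect how the output is viewed, the sketch iterates over the flattened token list; a GPU kernel would typically parallelize the same copy across tokens and hidden elements.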

void trt_edgellm::kernel::embeddingLookupWithImageInsertion(rt::Tensor const &inputIds, rt::Tensor const &embeddingTable, rt::Tensor const &imageEmbeds, rt::Tensor &output, cudaStream_t stream = 0)

Embedding lookup with image embedding insertion following PromptTuningEmbedding logic.

Parameters:

inputIds – [in] Input token IDs with shape [batchSize, seqLen]
embeddingTable – [in] Embedding table with shape [vocabSize, hiddenSize]
imageEmbeds – [in] Image embeddings with shape [imageTokenLen, hiddenSize]
output – [out] Hidden states with shape [batchSize, seqLen, hiddenSize]
stream – [in] CUDA stream for execution
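
A CPU sketch of one plausible reading of the insertion rule follows. It assumes the convention commonly used by prompt-tuning embeddings: a token ID below vocabSize selects a row of embeddingTable, while an ID of vocabSize or above selects row (id - vocabSize) of imageEmbeds. That mapping is an assumption, not something stated in this reference, as are the hypothetical function name, the float element type, the flat std::vector buffers standing in for rt::Tensor, and the choice to throw on out-of-range IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical CPU reference, not part of trt_edgellm.
// inputIds:       flattened [batchSize, seqLen]
// embeddingTable: flattened row-major [vocabSize, hiddenSize]
// imageEmbeds:    flattened row-major [imageTokenLen, hiddenSize]
// returns:        flattened [batchSize, seqLen, hiddenSize]
std::vector<float> embeddingLookupWithImageInsertionReference(std::vector<int32_t> const& inputIds,
    std::vector<float> const& embeddingTable, std::vector<float> const& imageEmbeds,
    std::size_t vocabSize, std::size_t hiddenSize)
{
    assert(hiddenSize > 0);
    assert(embeddingTable.size() == vocabSize * hiddenSize);
    assert(imageEmbeds.size() % hiddenSize == 0);
    std::size_t const imageTokenLen = imageEmbeds.size() / hiddenSize;

    std::vector<float> output(inputIds.size() * hiddenSize);
    for (std::size_t t = 0; t < inputIds.size(); ++t)
    {
        if (inputIds[t] < 0)
        {
            throw std::out_of_range("negative token ID");
        }
        std::size_t const id = static_cast<std::size_t>(inputIds[t]);
        float const* src = nullptr;
        if (id < vocabSize)
        {
            // Regular text token: read from the embedding table.
            src = embeddingTable.data() + id * hiddenSize;
        }
        else if (id - vocabSize < imageTokenLen)
        {
            // Assumed rule: IDs at or above vocabSize index into imageEmbeds.
            src = imageEmbeds.data() + (id - vocabSize) * hiddenSize;
        }
        else
        {
            throw std::out_of_range("token ID beyond vocabSize + imageTokenLen");
        }
        for (std::size_t h = 0; h < hiddenSize; ++h)
        {
            output[t * hiddenSize + h] = src[h];
        }
    }
    return output;
}
```

Under this assumed convention, the caller is responsible for rewriting image placeholder tokens to IDs in [vocabSize, vocabSize + imageTokenLen) before the lookup, so the kernel itself needs no separate position mask.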