Embedding Kernels
- void trt_edgellm::kernel::embeddingLookup(
      rt::Tensor const &inputIds,
      rt::Tensor const &embeddingTable,
      rt::Tensor &output,
      cudaStream_t stream = 0
  )
Standard embedding lookup kernel.
- Parameters:
inputIds – [in] Input token IDs with shape [batchSize, seqLen]
embeddingTable – [in] Embedding table with shape [vocabSize, hiddenSize]
output – [out] Hidden states with shape [batchSize, seqLen, hiddenSize]
stream – [in] CUDA stream for execution
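
The shapes above imply a row gather: each token ID selects one row of the embedding table. The following is a minimal CPU sketch of that behavior, not the library's implementation. The name `embeddingLookupReference`, the use of `float` elements, flattened row-major `std::vector` storage, and the choice to write a zero row for out-of-range IDs are all assumptions made for illustration; the actual kernel operates on `rt::Tensor` objects on a CUDA stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference for illustration only; not part of trt_edgellm.
// inputIds:       flattened [batchSize, seqLen]
// embeddingTable: flattened row-major [vocabSize, hiddenSize]
// returns:        flattened [batchSize, seqLen, hiddenSize]
std::vector<float> embeddingLookupReference(std::vector<int32_t> const& inputIds,
    std::vector<float> const& embeddingTable, std::size_t vocabSize, std::size_t hiddenSize)
{
    assert(embeddingTable.size() == vocabSize * hiddenSize);
    std::vector<float> output(inputIds.size() * hiddenSize, 0.0f);
    for (std::size_t t = 0; t < inputIds.size(); ++t)
    {
        int32_t const id = inputIds[t];
        // Assumption: out-of-range IDs leave a zero row. The real kernel's
        // handling of invalid IDs is not stated in this reference.
        if (id < 0 || static_cast<std::size_t>(id) >= vocabSize)
        {
            continue;
        }
        std::size_t const row = static_cast<std::size_t>(id) * hiddenSize;
        for (std::size_t h = 0; h < hiddenSize; ++h)
        {
            output[t * hiddenSize + h] = embeddingTable[row + h];
        }
    }
    return output;
}
```

A reference like this can be useful for checking GPU output on small inputs, since the batch and sequence dimensions collapse into a single loop over tokens.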
- void trt_edgellm::kernel::embeddingLookupWithImageInsertion(
      rt::Tensor const &inputIds,
      rt::Tensor const &embeddingTable,
      rt::Tensor const &imageEmbeds,
      rt::Tensor &output,
      cudaStream_t stream = 0
  )
Embedding lookup with image embedding insertion following PromptTuningEmbedding logic.
- Parameters:
inputIds – [in] Input token IDs with shape [batchSize, seqLen]
embeddingTable – [in] Embedding table with shape [vocabSize, hiddenSize]
imageEmbeds – [in] Image embeddings with shape [imageTokenLen, hiddenSize]
output – [out] Hidden states with shape [batchSize, seqLen, hiddenSize]
stream – [in] CUDA stream for execution
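
The reference does not spell out what "PromptTuningEmbedding logic" means here. The sketch below assumes the convention used by TensorRT-LLM's `PromptTuningEmbedding` layer: a token ID below `vocabSize` reads from the embedding table, while an ID at or above `vocabSize` reads row `id - vocabSize` of the image embeddings. That mapping, the name `embeddingLookupWithImageInsertionReference`, the `float` element type, the flattened `std::vector` storage, and the zero row for IDs outside both ranges are all assumptions made for illustration, not confirmed behavior of the kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference for illustration only; not part of trt_edgellm.
// inputIds:       flattened [batchSize, seqLen]
// embeddingTable: flattened row-major [vocabSize, hiddenSize]
// imageEmbeds:    flattened row-major [imageTokenLen, hiddenSize]
// returns:        flattened [batchSize, seqLen, hiddenSize]
std::vector<float> embeddingLookupWithImageInsertionReference(std::vector<int32_t> const& inputIds,
    std::vector<float> const& embeddingTable, std::vector<float> const& imageEmbeds,
    std::size_t vocabSize, std::size_t hiddenSize)
{
    assert(hiddenSize > 0);
    assert(embeddingTable.size() == vocabSize * hiddenSize);
    assert(imageEmbeds.size() % hiddenSize == 0);
    std::size_t const imageTokenLen = imageEmbeds.size() / hiddenSize;

    std::vector<float> output(inputIds.size() * hiddenSize, 0.0f);
    for (std::size_t t = 0; t < inputIds.size(); ++t)
    {
        int32_t const id = inputIds[t];
        if (id < 0)
        {
            continue; // Assumption: invalid IDs leave a zero row.
        }
        std::size_t const uid = static_cast<std::size_t>(id);
        float const* src = nullptr;
        if (uid < vocabSize)
        {
            // Regular text token: read from the vocabulary table.
            src = embeddingTable.data() + uid * hiddenSize;
        }
        else if (uid - vocabSize < imageTokenLen)
        {
            // Assumed convention: IDs >= vocabSize index the image embeddings.
            src = imageEmbeds.data() + (uid - vocabSize) * hiddenSize;
        }
        else
        {
            continue; // Assumption: IDs past both tables leave a zero row.
        }
        for (std::size_t h = 0; h < hiddenSize; ++h)
        {
            output[t * hiddenSize + h] = src[h];
        }
    }
    return output;
}
```

Under this assumed convention, the caller prepares `inputIds` so that each image placeholder position holds `vocabSize + k` for the k-th image embedding row, and a single pass produces the merged text and image hidden states.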