
Troubleshoot NeMo Retriever Extraction

Use this documentation to troubleshoot issues that arise when you use NeMo Retriever extraction.

Note

NeMo Retriever extraction is also known as NVIDIA Ingest and nv-ingest.

Can't process long, non-language text strings

NeMo Retriever extraction is designed to process natural-language text with typical word and sentence lengths. If you submit a document that contains extremely long or non-language text strings, such as a DNA sequence, errors or unexpected results can occur.
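If you are not sure whether a document contains such strings, you can screen its text before you submit the job. The following is a minimal sketch; the whitespace-token heuristic and the 100-character threshold are illustrative assumptions, not part of NeMo Retriever extraction.

# Screen text for abnormally long tokens (for example, DNA sequences)
# before submitting a document. The threshold is an illustrative assumption.
MAX_TOKEN_LENGTH = 100  # natural-language words are almost always far shorter

def has_non_language_strings(text: str, max_len: int = MAX_TOKEN_LENGTH) -> bool:
    """Return True if any whitespace-delimited token exceeds max_len characters."""
    return any(len(token) > max_len for token in text.split())

with open("document.txt", encoding="utf-8") as f:
    if has_non_language_strings(f.read()):
        print("Document contains very long tokens; extraction results may be unreliable.")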

Can't process malformed input files

When you run a job you might see errors similar to the following:

  • Failed to process the message
  • Failed to extract image
  • File may be malformed
  • Failed to format paragraph

These errors can occur when your input file is malformed. Verify or fix the format of your input file, and try resubmitting your job.
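For PDF inputs, a quick local parse can often confirm whether a file is malformed before you resubmit the job. The following is a minimal sketch that uses the third-party pypdf package (pip install pypdf), which is not part of NeMo Retriever extraction.

# Quick sanity check for a PDF before resubmitting a failed job.
from pypdf import PdfReader
from pypdf.errors import PdfReadError

def pdf_is_readable(path: str) -> bool:
    """Return True if the PDF opens and every page parses."""
    try:
        reader = PdfReader(path)
        for page in reader.pages:
            page.extract_text()  # force each page to be parsed
        return True
    except (PdfReadError, ValueError, OSError):
        return False

if not pdf_is_readable("input.pdf"):
    print("input.pdf appears malformed; repair or regenerate it before resubmitting.")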

Can't start new thread error

In rare cases, when you run a job you might see an error similar to can't start new thread. This error occurs when the maximum number of processes available to a single user is too low. To resolve the issue, set or raise the maximum number of processes (-u) by using the ulimit command. Before you change the -u setting, consider the following:

  • Apply the -u setting directly to the user (or the Docker container environment) that runs your ingest service.
  • For -u we recommend 10,000 as a baseline, but you might need to raise or lower it based on your actual usage and system configuration.
ulimit -u 10000
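If the service runs in Docker, a ulimit set in your host shell doesn't automatically apply inside the container. You can set the limit on the container itself; the image name below is a placeholder for your deployment.

docker run --ulimit nproc=10000 <your-ingest-image>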

Out-of-Memory (OOM) Error when Processing Large Datasets

When you process a very large dataset with thousands of documents, you might encounter an Out-of-Memory (OOM) error. This happens because, by default, NeMo Retriever extraction stores the results from every document in system memory (RAM). If the total size of the results exceeds the available memory, the process fails.

To resolve this issue, use the save_to_disk method. For details, refer to Working with Large Datasets: Saving to Disk.
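The following is a minimal sketch of that pattern using the nv-ingest Python client. The module path, chained-method names, and parameter names are assumptions that can differ from your installed version; refer to Working with Large Datasets: Saving to Disk for the authoritative example.

# Sketch: write per-document results to disk instead of holding them in RAM.
# Module path, method names, and parameters are assumptions; confirm them
# against "Working with Large Datasets: Saving to Disk" for your version.
from nv_ingest_client.client import Ingestor

ingestor = (
    Ingestor(message_client_hostname="localhost")
    .files("data/*.pdf")
    .extract()
    .save_to_disk(output_directory="./ingest_results")
)

results = ingestor.ingest()  # results now reference files on disk, not large in-memory objects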

Embedding service fails to start with an unsupported batch size error

On certain hardware, for example RTX 6000, the embedding service might fail to start and you might see an error similar to the following.

ValueError: Configured max_batch_size (30) is larger than the model's supported max_batch_size (3).

If you are using hardware where the embedding NIM uses the ONNX model profile, you must set EMBEDDER_BATCH_SIZE=3. You can set the variable in your .env file or export it directly in your shell environment.
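For example, add the following line to your .env file:

EMBEDDER_BATCH_SIZE=3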

Extract method nemoretriever-parse doesn't support image files

Currently, extraction with nemoretriever-parse supports only scanned PDFs, not image files. To work around this limitation, convert image files to PDF before you use extract_method="nemoretriever_parse".
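As a sketch of the workaround, you can convert an image to a PDF with the third-party Pillow package (pip install pillow), which is not part of NeMo Retriever extraction.

# Convert an image to a single-page PDF so nemoretriever-parse can process it.
from PIL import Image

image = Image.open("scan.png")
image.convert("RGB").save("scan.pdf")  # PDF output can't store an alpha channel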

Too many open files error

In rare cases, when you run a job you might see an error similar to too many open files or max open file descriptor. This error occurs when the open file descriptor limit for your service user account is too low. To resolve the issue, set or raise the maximum number of open file descriptors (-n) by using the ulimit command. Before you change the -n setting, consider the following:

  • Apply the -n setting directly to the user (or the Docker container environment) that runs your ingest service.
  • For -n we recommend 10,000 as a baseline, but you might need to raise or lower it based on your actual usage and system configuration.
ulimit -n 10000
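As with the process limit, a ulimit set in your host shell doesn't apply inside a Docker container. You can set the open-file limit on the container itself; the image name below is a placeholder for your deployment.

docker run --ulimit nofile=10000:10000 <your-ingest-image>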

Triton server INFO messages incorrectly logged as errors

Sometimes informational messages are incorrectly logged as errors. When this happens, you can safely ignore the ERROR prefix and treat the messages as informational. A payload that begins with I followed by a date, such as I0424, is a Triton INFO message in glog format, even though the wrapper logs it at the ERROR level. For example, you might see log messages similar to the following.

ERROR 2025-04-24 22:49:44.266 nimutils.py:68] tritonserver: /usr/local/lib/libcurl.so.4: ...
ERROR 2025-04-24 22:49:44.268 nimutils.py:68] I0424 22:49:44.265292 98 cache_manager.cc:480] "Create CacheManager with cache_dir: '/opt/tritonserver/caches'"
ERROR 2025-04-24 22:49:44.431 nimutils.py:68] I0424 22:49:44.431796 98 pinned_memory_manager.cc:277] "Pinned memory pool is created at '0x7f8e4a000000' with size 268435456"
ERROR 2025-04-24 22:49:44.432 nimutils.py:68] I0424 22:49:44.432036 98 cuda_memory_manager.cc:107] "CUDA memory pool is created on device 0 with size 67108864"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] I0424 22:49:44.433448 98 model_config_utils.cc:753] "Server side auto-completed config: "
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] name: "yolox"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] platform: "tensorrt_plan"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] max_batch_size: 32
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] input {
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] name: "input"
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] data_type: TYPE_FP32
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] dims: 3
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] dims: 1024
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] dims: 1024
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] }
ERROR 2025-04-24 22:49:44.433 nimutils.py:68] output {
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] name: "output"
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] data_type: TYPE_FP32
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] dims: 21504
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] dims: 9
ERROR 2025-04-24 22:49:44.434 nimutils.py:68] }
...