bionemo-recipeutils
Shared, framework-agnostic utilities for BioNeMo recipes.
Constraint
This package must not depend on megatron-core, megatron-bridge, or NeMo.
Recipe-specific framework adapters (e.g., DatasetProvider wrappers) belong
in each recipe, not here.
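One way to guard this constraint is a small import check run in CI. The sketch below is illustrative and not part of the package: `forbidden_imports` and `FORBIDDEN_ROOTS` are hypothetical names, and it assumes megatron-core and megatron-bridge are imported under the `megatron` namespace and NeMo under `nemo`.

```python
import ast

# Assumption: top-level import names used by megatron-core,
# megatron-bridge (both under "megatron") and NeMo ("nemo").
FORBIDDEN_ROOTS = {"megatron", "nemo"}


def forbidden_imports(source: str) -> list[str]:
    """Return the sorted forbidden modules imported by Python source text."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Absolute "from x.y import z"; relative imports are skipped.
            names = [node.module]
        else:
            continue
        found.extend(n for n in names if n.split(".")[0] in FORBIDDEN_ROOTS)
    return sorted(set(found))
```

A test could read every `.py` file in the package and assert that `forbidden_imports` returns an empty list for each one.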
Contents
bionemo.recipeutils.data.basecamp: High-performance SQLite-backed genomic dataset (ShardedEdenDataset) and window pre-computation CLI, contributed by BaseCamp Research.
bionemo.recipeutils.io: File format conversion utilities (FASTA to JSONL).
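To make the FASTA to JSONL conversion concrete, here is a minimal standalone sketch of the idea. It does not use or reproduce the `bionemo.recipeutils.io` API; the function names and the `id` and `sequence` field names are assumptions chosen for illustration, and the package's actual output schema may differ.

```python
import json
from typing import Iterable, Iterator


def fasta_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per FASTA record, joining wrapped sequence lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield {"id": header, "sequence": "".join(chunks)}
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield {"id": header, "sequence": "".join(chunks)}


def fasta_to_jsonl(lines: Iterable[str]) -> str:
    """Serialize FASTA records as JSON Lines (one JSON object per line)."""
    return "".join(json.dumps(rec) + "\n" for rec in fasta_records(lines))
```

Passing an open file handle as `lines` streams the input, so the whole FASTA file never has to be held in memory by the parser.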
Installation
pip install bionemo-recipeutils # core
pip install "bionemo-recipeutils[basecamp]" # core plus polars for window pre-computation
CLI tools
bionemo_fasta_to_jsonl input.fasta output.jsonl
bionemo_precompute_windows precompute split.parquet output.sqlite --window-size 8192
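As a rough illustration of what pre-computing windows can mean, the sketch below lists the start offsets of fixed-size windows over a sequence. It is not the logic of `bionemo_precompute_windows`: the `window_starts` helper, its `stride` parameter, and the choice to drop a trailing partial window are all assumptions made for this example.

```python
from typing import Optional


def window_starts(seq_len: int, window_size: int, stride: Optional[int] = None) -> list[int]:
    """Return start offsets of full windows; stride defaults to window_size."""
    stride = window_size if stride is None else stride
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if seq_len < window_size:
        return []  # assumption: sequences shorter than one window yield nothing
    return list(range(0, seq_len - window_size + 1, stride))


# A 20,000-base sequence with 8,192-base non-overlapping windows.
starts = window_starts(20_000, 8192)
```

Storing offsets like these ahead of time lets a dataset map an integer index straight to a (sequence, start) pair without rescanning sequence lengths at training time.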