How to use FlashDreams as a developer
Pick the path that matches your goal. For the full runtime map, see Inference pipeline overview.
Run existing models
Clone the repo, install runner extras, then jump to a model page for the exact slug and flags.
Use as a Python library
Install FlashDreams and import runtime contracts from flashdreams.infra.
Add a standalone model
Ship configs and runners from your own package through flashdreams.runner_configs.
Build serving apps
Keep sessions alive and stream outputs through the serving path.
Run existing models from source
git clone https://github.com/NVIDIA/flashdreams.git
cd flashdreams
uv sync --extra dev --extra runners
uv run flashdreams-run --help
Then pick a model page for the actual slugs and flags.
Programmatic access
pip install flashdreams
from flashdreams.infra.pipeline import StreamInferencePipeline
from my_integration.config import MY_MODEL_RUNNER
runner = MY_MODEL_RUNNER.setup()
runner.run()
For lower-level experiments:
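from flashdreams.infra.pipeline import StreamInferencePipeline
from my_integration.config import MY_PIPELINE

pipeline: StreamInferencePipeline = MY_PIPELINE.setup()
cache = pipeline.initialize_cache(height=480, width=832)
for ar_idx in range(4):
    # One generation step per iteration; arguments are model-specific.
    output = pipeline.generate(ar_idx, cache)
    pipeline.finalize(ar_idx, cache)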
Arguments are model-specific; use the integration runner as the source of truth.
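The generate/finalize loop above can be wrapped in a small helper. The following is a minimal sketch, not FlashDreams code: `ChunkPipeline` is a hypothetical Protocol that mirrors only the three method names used in the snippet above, `run_steps` is an illustrative helper name, and `FakePipeline` is a stand-in so the example runs without a model or GPU. Real argument names and return types depend on the integration.

```python
from typing import Any, Protocol


class ChunkPipeline(Protocol):
    """Hypothetical interface: mirrors only the calls shown in the snippet above."""

    def initialize_cache(self, **kwargs: Any) -> Any: ...
    def generate(self, ar_idx: int, cache: Any) -> Any: ...
    def finalize(self, ar_idx: int, cache: Any) -> None: ...


def run_steps(pipeline: ChunkPipeline, num_steps: int, **cache_kwargs: Any) -> list[Any]:
    """Run num_steps generate/finalize iterations and collect the outputs."""
    cache = pipeline.initialize_cache(**cache_kwargs)
    outputs = []
    for ar_idx in range(num_steps):
        outputs.append(pipeline.generate(ar_idx, cache))
        pipeline.finalize(ar_idx, cache)
    return outputs


class FakePipeline:
    """Stand-in for demonstration only; it records calls instead of running a model."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    def initialize_cache(self, **kwargs: Any) -> dict:
        self.calls.append("init")
        return dict(kwargs)

    def generate(self, ar_idx: int, cache: dict) -> str:
        self.calls.append(f"generate:{ar_idx}")
        return f"chunk-{ar_idx}"

    def finalize(self, ar_idx: int, cache: dict) -> None:
        self.calls.append(f"finalize:{ar_idx}")


fake = FakePipeline()
outputs = run_steps(fake, num_steps=2, height=480, width=832)
print(outputs)  # ['chunk-0', 'chunk-1']
```

In a real integration, the object returned by `MY_PIPELINE.setup()` would take the place of `FakePipeline`, provided it accepts the same calls.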
Add a new model from a standalone repo
Minimal entry point:
[project]
name = "my-flashdreams-model"
dependencies = ["flashdreams"]
[project.entry-points."flashdreams.runner_configs"]
my-model-fast = "my_integration.config:MY_MODEL_FAST_RUNNER"
Then:
pip install -e .
flashdreams-run my-model-fast --help
See Add a new method for the complete authoring guide.
Next links
Models: the list of supported models with launch commands.
Add a new method: integration authoring.
Interactive serving: serving concepts.