AI Agent Skills

Earth2Studio includes AI agent skills that give AI coding assistants (such as Claude Code and Codex) specialized guidance when working with Earth2Studio. These skills help agents build inference pipelines, select models, and generate correct code.

Overview

Skills are located in the skills/ directory at the repository root. Each skill contains:

  • SKILL.md - Instructions and workflow for the AI agent

  • evals/evals.json - Evaluation tasks for testing skill effectiveness

  • evals/targets/ - Reference outputs for evaluation grading

  • references/ - Supporting documentation (optional)
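The layout listed above can be checked mechanically before running any evaluation. The following is a minimal sketch, not part of Earth2Studio: the helper names `missing_entries` and `check_skills` are hypothetical, and the only assumption is that every skill directory follows the structure described in this section (with `references/` being optional and therefore not checked).

```python
from pathlib import Path

# Entries every skill is expected to contain, per the list above.
# references/ is optional, so it is intentionally left out.
REQUIRED = ("SKILL.md", "evals/evals.json", "evals/targets")


def missing_entries(skill_dir):
    """Return the required entries that are absent from one skill directory."""
    skill_dir = Path(skill_dir)
    return [entry for entry in REQUIRED if not (skill_dir / entry).exists()]


def check_skills(root="skills"):
    """Map each skill directory name under root to its missing entries."""
    root = Path(root)
    if not root.is_dir():
        return {}
    return {
        p.name: missing_entries(p)
        for p in sorted(root.iterdir())
        if p.is_dir()
    }


if __name__ == "__main__":
    # Run from the repository root so that "skills" resolves correctly.
    for name, missing in check_skills().items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{name}: {status}")
```

An empty list for a skill means all required entries are present; this only checks that the paths exist, not that their contents are valid.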

Note: Skill development is currently internal to NVIDIA. External contributions to skills are not accepted at this time.

Skill Validation

Skills are validated using the harbor agent evaluation framework via nv-base. To run skill evaluations locally:

nv-base agent-eval <path-to-skill> -a claude-code,codex -r cli,html,json -o ./eval-results/

For example, to evaluate the deterministic forecast skill:

nv-base agent-eval skills/earth2studio-deterministic-forecast \
    -a claude-code,codex \
    -r cli,html,json \
    -o ./eval-results/

Each report format passed to -r produces one output:

  • cli - Console output with pass/fail summary

  • html - Interactive HTML report for detailed analysis

  • json - Machine-readable results for CI integration
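To evaluate every skill in one pass, the command shown above can be assembled once per skill directory. The sketch below only builds and prints the commands; it does not invoke nv-base. The helper names are hypothetical, the flags are copied from the example above, and writing each skill's results to its own subdirectory of the output folder is an assumption made here to keep reports separate, not documented nv-base behavior.

```python
import shlex
from pathlib import Path


def build_eval_command(skill_dir, out_dir="./eval-results/",
                       agents=("claude-code", "codex"),
                       reports=("cli", "html", "json")):
    """Assemble the nv-base invocation from this section as an argument list."""
    return [
        "nv-base", "agent-eval", str(skill_dir),
        "-a", ",".join(agents),
        "-r", ",".join(reports),
        "-o", str(out_dir),
    ]


def commands_for_all_skills(root="skills", out_root="eval-results"):
    """Build one command per skill directory under root.

    Assumption: each skill gets its own output subdirectory.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return [
        build_eval_command(p.as_posix(), (Path(out_root) / p.name).as_posix())
        for p in sorted(root.iterdir())
        if p.is_dir()
    ]


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for cmd in commands_for_all_skills():
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run(cmd, check=True)` on a machine where nv-base is installed.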

Evaluation Metrics

Skills are evaluated across five dimensions:

Dimension        Description
---------------  ------------------------------------------------------------
Security         Avoids unsafe operations, secret leakage, unauthorized access
Correctness      Agent follows expected workflow and produces correct output
Discoverability  Agent loads skill when relevant, avoids when irrelevant
Effectiveness    Agent performs better with skill than without
Efficiency       Agent uses fewer tokens and avoids redundant work