earth2studio.statistics.energy_score

class earth2studio.statistics.energy_score(ensemble_dimension, multivariate_dimensions, reduction_dimensions=None, weights=None, fair=False)

Compute the Energy Score for multivariate ensemble forecast verification.

The Energy Score is the multivariate generalization of CRPS. Given an ensemble forecast {x_1, …, x_M} and an observation y, the Energy Score is defined as:

ES = (1/M) * sum_m ||x_m - y|| - (1/(2*M^2)) * sum_m sum_m' ||x_m - x_m'||

where ||.|| denotes the Euclidean norm computed over the multivariate dimensions. The Energy Score is a proper scoring rule: its expected value is minimized when the forecast distribution matches the true distribution.
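The definition above can be sketched for a single forecast case in plain NumPy. This is a minimal illustration of the formula, not the earth2studio implementation; the function name `energy_score_sketch` and the (M, D) array layout are assumptions made for the example.

```python
import numpy as np

def energy_score_sketch(ens, obs):
    """Energy Score for one forecast case (illustrative helper, not the library API).

    ens: array of shape (M, D), M ensemble members with D features each.
    obs: array of shape (D,), the observation.
    """
    ens = np.asarray(ens, dtype=float)
    obs = np.asarray(obs, dtype=float)
    m = ens.shape[0]
    # First term: mean Euclidean distance from each member to the observation.
    term_obs = np.linalg.norm(ens - obs, axis=1).mean()
    # Second term: sum of all pairwise member distances, scaled by 1 / (2 * M^2).
    pairwise = np.linalg.norm(ens[:, None, :] - ens[None, :, :], axis=2)
    term_spread = pairwise.sum() / (2 * m**2)
    return term_obs - term_spread

# Two members in a 2-D feature space; the observation coincides with the first member.
ens = np.array([[0.0, 0.0], [3.0, 4.0]])
obs = np.array([0.0, 0.0])
score = energy_score_sketch(ens, obs)  # 2.5 - 10 / 8 = 1.25
```

Here the member-to-observation distances are 0 and 5 (mean 2.5), and the pairwise distances sum to 10, so the spread term is 10 / (2 * 2^2) = 1.25.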

Unlike CRPS, which evaluates each variable and grid point independently, the Energy Score captures whether the ensemble preserves correlations across variables and grid points.

Warning

Setting multivariate_dimensions to large spatial grids (e.g., ['lat', 'lon'] with 721 x 1440 = ~1M elements) produces a feature vector of that size per ensemble member. For M=50 members, a single tensor of that shape takes ~200 MB in float32 (50 x ~1M x 4 bytes). Prefer selecting a subset of dimensions unless full-field verification is explicitly needed.
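The memory estimate in the warning can be reproduced with a few lines of arithmetic. The variable names below are illustrative, and the figure covers only one tensor of shape (members, lat, lon); any intermediate buffers a particular implementation allocates are not included.

```python
# Footprint of one float32 tensor holding a 50-member ensemble on a 721 x 1440 grid.
n_lat, n_lon = 721, 1440
members = 50
bytes_per_float32 = 4

features = n_lat * n_lon                               # 1_038_240 elements per member
tensor_bytes = features * members * bytes_per_float32  # 207_648_000 bytes
tensor_mb = tensor_bytes / 1e6                         # about 207.6 MB
```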

Parameters:
  • ensemble_dimension (str) – Name of the dimension over which the ensemble reduction is performed. Example: 'ensemble'

  • multivariate_dimensions (list[str]) – Dimensions over which to compute the Euclidean norm. Example: ['variable', 'lat', 'lon'] for full spatial ES, or ['variable'] for per-grid-point multivariate ES.

  • reduction_dimensions (list[str], optional) – Dimensions over which to average the energy score after computation. By default None (no additional reduction).

  • weights (torch.Tensor, optional) – Weights for the reduction dimensions. Must have as many dimensions as there are entries in reduction_dimensions. By default None.

  • fair (bool, optional) – If True, use the fair (unbiased) Energy Score estimator, which replaces the 1/(2*M^2) denominator with 1/(2*M*(M-1)), excluding the zero self-distance diagonal from the denominator count. Requires at least 2 ensemble members. By default False.
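The effect of the fair option can be illustrated with a small NumPy sketch that switches only the normalization of the pairwise term, as described above. The helper `energy_score_sketch` and its (M, D) input layout are assumptions for illustration, not the library's code.

```python
import numpy as np

def energy_score_sketch(ens, obs, fair=False):
    """Standard or fair Energy Score for one case (illustrative helper)."""
    ens = np.asarray(ens, dtype=float)  # shape (M, D)
    obs = np.asarray(obs, dtype=float)  # shape (D,)
    m = ens.shape[0]
    if fair and m < 2:
        raise ValueError("the fair estimator needs at least 2 ensemble members")
    # Mean distance from each member to the observation.
    term_obs = np.linalg.norm(ens - obs, axis=1).mean()
    # All pairwise member distances; the diagonal (self-distances) is zero.
    pairwise = np.linalg.norm(ens[:, None, :] - ens[None, :, :], axis=2)
    # Fair: divide by 2*M*(M-1), i.e. count only the off-diagonal pairs.
    denom = 2 * m * (m - 1) if fair else 2 * m**2
    return term_obs - pairwise.sum() / denom

ens = np.array([[0.0], [2.0], [4.0]])  # three members, one feature
obs = np.array([1.0])
es_standard = energy_score_sketch(ens, obs)         # 5/3 - 16/18 = 7/9
es_fair = energy_score_sketch(ens, obs, fair=True)  # 5/3 - 16/12 = 1/3
```

Because the fair estimator divides the same pairwise sum by a smaller number, its spread term is larger and the resulting score is lower than the standard one for any ensemble with distinct members.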

References

Gneiting, T. and Raftery, A. E. (2007), “Strictly Proper Scoring Rules, Prediction, and Estimation”, Journal of the American Statistical Association, 102(477), 359-378.

__call__(x, x_coords, y, y_coords)

Apply the Energy Score metric to ensemble forecast x and observation y.

Parameters:
  • x (torch.Tensor) – Ensemble forecast tensor. Must contain the ensemble dimension.

  • x_coords (CoordSystem) – Coordinate system describing the x tensor. Must contain ensemble_dimension and all multivariate_dimensions.

  • y (torch.Tensor) – Observation tensor. Must not contain the ensemble dimension.

  • y_coords (CoordSystem) – Coordinate system describing the y tensor. Must contain all multivariate_dimensions.

Returns:

Energy Score tensor and its corresponding coordinate system, with the reduced dimensions removed.

Return type:

tuple[torch.Tensor, CoordSystem]
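To make the shape semantics concrete, the following NumPy sketch mimics what a per-grid-point multivariate ES (multivariate_dimensions=['variable']) followed by an unweighted average over 'lat' and 'lon' would compute. It does not call earth2studio; the axis order (ensemble, variable, lat, lon) and the random data are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 2, 5))  # forecast: (ensemble, variable, lat, lon)
y = rng.normal(size=(3, 2, 5))     # observation: (variable, lat, lon), no ensemble axis

m = x.shape[0]
# Euclidean norm over the 'variable' axis: member-to-observation distances.
dist_obs = np.linalg.norm(x - y[None], axis=1)               # (ensemble, lat, lon)
# Euclidean norm over the 'variable' axis: member-to-member distances.
dist_pair = np.linalg.norm(x[:, None] - x[None, :], axis=2)  # (ensemble, ensemble, lat, lon)

# Reduce over the ensemble axis: one score per grid point.
es = dist_obs.mean(axis=0) - dist_pair.sum(axis=(0, 1)) / (2 * m**2)  # (lat, lon)

# Optional extra reduction, analogous to reduction_dimensions=['lat', 'lon'] with no weights.
es_mean = es.mean()
```

The ensemble and multivariate axes disappear from the result, leaving one score per (lat, lon) point; the final mean collapses those as well.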