earth2studio.data.RandomDataFrame

class earth2studio.data.RandomDataFrame(n_obs=10, tolerance=numpy.timedelta64(0), schema=None, field_generators=None)

A randomly generated DataFrame source. Primarily useful for testing.

Generates random observations at random locations for specified times and variables. Each observation is a point in space-time with a random value.

Parameters:
  • n_obs (int, optional) – Number of random observations to generate per time step, by default 10

  • tolerance (timedelta | np.timedelta64, optional) – Time tolerance; observations will be randomly sampled within +/- tolerance of each requested time, by default np.timedelta64(0)

  • schema (pa.Schema | None, optional) – PyArrow schema to use for data generation. If None, the default SCHEMA is used. Values are generated dynamically from the schema's field types, by default None

  • field_generators (dict[str, Callable[[], Any]] | None, optional) – Dictionary mapping field names to generator functions. These are merged with the default generators (time, lat, lon, observation, variable), and user-provided generators override the defaults, by default None
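The tolerance and field_generators semantics described above can be sketched in plain Python. This is an illustration only, not earth2studio's implementation: the helper names `sample_time` and `merge_generators`, the uniform sampling distribution, the value ranges, and the `station_id` field are all assumptions made for the example.

```python
import random
from datetime import datetime, timedelta


def sample_time(requested, tolerance, rng):
    """Draw a timestamp uniformly from [requested - tolerance, requested + tolerance]."""
    half_width = tolerance.total_seconds()
    offset = rng.uniform(-half_width, half_width)
    return requested + timedelta(seconds=offset)


def merge_generators(defaults, overrides=None):
    """Return a copy of the defaults with user-provided generators applied on top."""
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged


rng = random.Random(0)  # seeded only to make the sketch repeatable

# Hypothetical stand-ins for the default generators; value ranges are assumed.
default_generators = {
    "lat": lambda: rng.uniform(-90.0, 90.0),
    "lon": lambda: rng.uniform(0.0, 360.0),
    "observation": lambda: rng.gauss(0.0, 1.0),
}
# A user override for an existing field plus one extra, hypothetical field.
user_generators = {
    "observation": lambda: 273.15,
    "station_id": lambda: "demo",
}
generators = merge_generators(default_generators, user_generators)

requested = datetime(2024, 1, 1, 12)
sampled = sample_time(requested, timedelta(hours=1), rng)
exact = sample_time(requested, timedelta(0), rng)  # zero tolerance: no jitter
row = {name: generate() for name, generate in generators.items()}
```

With a zero tolerance the sampled time equals the requested time, which matches the documented default of np.timedelta64(0). Each generator takes no arguments, mirroring the Callable[[], Any] type in the signature.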

__call__(time, variable, fields=None)

Retrieve a DataFrame of random observations.

Parameters:
  • time (datetime | list[datetime] | TimeArray) – Timestamps to return data for.

  • variable (str | list[str] | VariableArray) – String or list of strings naming the variables to return.

  • fields (str | list[str] | pa.Schema | FieldArray | None, optional) – Fields to include in output, by default None (all fields).

Returns:

Random observation DataFrame

Return type:

pd.DataFrame
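As a rough picture of the call contract, the following stand-in produces n_obs rows per requested time and optionally restricts the output to selected fields. It is a sketch under stated assumptions, not the library's code: `random_observations` is a hypothetical function, rows are plain dicts rather than a pd.DataFrame, each observation is assigned a variable by random choice from the requested list, and the variable names are examples only.

```python
import random
from datetime import datetime


def random_observations(time, variable, fields=None, n_obs=10, seed=None):
    """Hypothetical stand-in: return n_obs random rows per requested time."""
    rng = random.Random(seed)
    # Accept either a single value or a list, as the documented signature does.
    times = [time] if isinstance(time, datetime) else list(time)
    variables = [variable] if isinstance(variable, str) else list(variable)
    if isinstance(fields, str):
        fields = [fields]

    rows = []
    for t in times:
        for _ in range(n_obs):
            row = {
                "time": t,
                "lat": rng.uniform(-90.0, 90.0),      # assumed range
                "lon": rng.uniform(0.0, 360.0),       # assumed range
                "observation": rng.gauss(0.0, 1.0),   # assumed distribution
                "variable": rng.choice(variables),    # assumed assignment rule
            }
            if fields is not None:
                # Keep only the requested fields, in the requested order.
                row = {name: row[name] for name in fields}
            rows.append(row)
    return rows


times = [datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 6)]
all_fields = random_observations(times, ["t2m", "u10m"], n_obs=3, seed=0)
subset = random_observations(
    times[0], "t2m", fields=["time", "observation"], n_obs=3, seed=0
)
```

Here `all_fields` holds three rows for each of the two times, and `subset` keeps only the time and observation fields, which is the role the fields argument plays in the documented signature.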