data#

Data layer wrapping PipelineStore for the dashboard.

DashboardStore is a param.Parameterized adapter that queries the SQLite database and exposes results as pandas DataFrames suitable for Panel reactive updates.

Classes#

DashboardStore

Reactive wrapper around PipelineStore.

Module Contents#

class physicsnemo_curator.dashboard.data.DashboardStore(db_path: str, **kwargs: Any)#

Bases: param.Parameterized

Reactive wrapper around PipelineStore.

Provides pandas DataFrame views of pipeline metrics. When the refresh event fires, cached results are invalidated, so the next property access re-queries the database.

Initialize the dashboard store.

Parameters:
  • db_path (str) – Path to an existing PipelineStore SQLite database.

  • **kwargs (Any) – Additional param keyword arguments.
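The invalidate-on-refresh behaviour described above can be illustrated with a minimal stand-in. This is a sketch of the general pattern, not the actual DashboardStore implementation: it uses only the standard library, the class name `CachedStore`, the `results` table, and its columns are hypothetical, and a plain `refresh()` method stands in for the param event.

```python
import sqlite3


class CachedStore:
    """Hypothetical stand-in illustrating invalidate-on-refresh caching."""

    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)
        self._cache = None  # None means "stale, re-query on next access"
        self.query_count = 0  # only here to make the caching observable

    def refresh(self) -> None:
        """Invalidate the cache; the database is NOT queried here."""
        self._cache = None

    @property
    def rows(self) -> list:
        """Return cached rows, querying the database only when stale."""
        if self._cache is None:
            self.query_count += 1
            self._cache = self._conn.execute(
                "SELECT idx, status FROM results ORDER BY idx"
            ).fetchall()
        return self._cache


# Build a throwaway in-memory database (assumed schema, for illustration only).
store = CachedStore(":memory:")
store._conn.execute("CREATE TABLE results (idx INTEGER, status TEXT)")
store._conn.execute("INSERT INTO results VALUES (0, 'completed')")

first = store.rows   # first access queries the database
second = store.rows  # second access is served from the cache
store._conn.execute("INSERT INTO results VALUES (1, 'failed')")
stale = store.rows   # still cached: the new row is not visible yet
store.refresh()      # invalidate the cache
fresh = store.rows   # next access re-queries and sees the new row
```

The point of the pattern is that firing a refresh is cheap: the cost of the query is paid only when a view is next read.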

all_artifacts() → dict[str, list[str]]#

Return all filter artifacts across all indices, resolved to absolute paths.

Returns:

Mapping of filter name to list of all resolved artifact paths.

Return type:

dict[str, list[str]]

artifacts(index: int) → dict[str, list[str]]#

Return filter artifacts for a given index, resolved to absolute paths.

Parameters:

index (int) – Pipeline source index.

Returns:

Mapping of filter name to list of resolved artifact paths.

Return type:

dict[str, list[str]]

log_worker_ids() → list[str]#

Return unique worker IDs from logs.

Returns:

Sorted list of unique worker IDs (including “Main”).

Return type:

list[str]

logs_df(limit: int = 500, min_level: int = 0) → pandas.DataFrame#

DataFrame of log entries from the pipeline run.

Parameters:
  • limit (int) – Maximum number of log entries to retrieve (default: 500).

  • min_level (int) – Minimum log level to include (0=NOTSET, i.e. no filtering, 10=DEBUG, 20=INFO, 30=WARNING, 40=ERROR).

Returns:

Log entries with columns: timestamp, level_name, worker_id, idx, message.

Return type:

pd.DataFrame
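As a rough illustration of how `limit` and `min_level` might interact, the sketch below filters a few hand-made records using the numeric levels from Python's standard `logging` module. The records and the `filter_logs` helper are hypothetical. It is an assumption that `logs_df` follows the standard `logging` numbering and keeps entries at or above `min_level`; which entries survive the `limit` cap (oldest or newest) is not stated in the documentation.

```python
import logging

# Hand-made records (hypothetical data), mirroring some documented columns.
records = [
    {"level": logging.DEBUG, "worker_id": "Main", "message": "starting"},
    {"level": logging.INFO, "worker_id": "w1", "message": "index 0 done"},
    {"level": logging.WARNING, "worker_id": "w1", "message": "slow stage"},
    {"level": logging.ERROR, "worker_id": "w2", "message": "index 3 failed"},
]


def filter_logs(rows, limit=500, min_level=0):
    """Keep rows at or above min_level, then cap the result at limit rows."""
    kept = [r for r in rows if r["level"] >= min_level]
    return kept[:limit]


warnings_up = filter_logs(records, min_level=logging.WARNING)
level_names = [logging.getLevelName(r["level"]) for r in warnings_up]
everything = filter_logs(records)       # min_level=0 filters nothing
capped = filter_logs(records, limit=2)  # at most two rows are returned
```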

output_paths(index: int) → list[str]#

Return output file paths for a given index.

Parameters:

index (int) – Pipeline source index.

Returns:

Ordered list of output file paths.

Return type:

list[str]

property index_df: pandas.DataFrame#

DataFrame of per-index results.

Columns: index, status, wall_time_s, peak_memory_mb, gpu_memory_mb, error.

Returns:

One row per processed index.

Return type:

pd.DataFrame

property pipeline_config: dict#

Return the pipeline configuration dictionary.

Returns:

Pipeline configuration as stored in the database.

Return type:

dict

refresh#

selected_index#

property stage_df: pandas.DataFrame#

DataFrame of per-stage timing for all indices.

Columns: index, stage_name, stage_order, wall_time_s.

Returns:

One row per (index, stage) combination.

Return type:

pd.DataFrame

property summary: dict[str, Any]#

Summary of the pipeline run state.

Returns:

Keys: total, completed, failed, remaining, elapsed_s, config_hash, db_path, workers.

Return type:

dict[str, Any]
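A consumer of this dictionary might turn it into a one-line progress message. The following is a minimal sketch under stated assumptions: the `format_progress` helper and the sample values are hypothetical, and only keys listed above are read.

```python
from typing import Any


def format_progress(summary: dict[str, Any]) -> str:
    """Render a short progress line from a summary-like dictionary."""
    total = summary["total"]
    done = summary["completed"] + summary["failed"]
    # Guard against an empty run to avoid division by zero.
    percent = 100.0 * done / total if total else 0.0
    return (
        f"{done}/{total} processed ({percent:.1f}%), "
        f"{summary['failed']} failed, {summary['remaining']} remaining"
    )


# Sample values are made up for illustration.
sample = {"total": 40, "completed": 28, "failed": 2, "remaining": 10}
line = format_progress(sample)
empty = format_progress({"total": 0, "completed": 0, "failed": 0, "remaining": 0})
```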

property workers_df: pandas.DataFrame#

DataFrame of registered workers.

Columns: worker_id, pid, hostname, started_at, last_heartbeat, current_index.

Returns:

One row per worker.

Return type:

pd.DataFrame