Data Collection

class mesa_frames.DataCollector(model: ModelDF, model_reporters: dict[str, Callable] | None = None, agent_reporters: dict[str, str | Callable] | None = None, trigger: Callable[[Any], bool] | None = None, reset_memory: bool = True, storage: Literal['memory', 'csv', 'parquet', 'S3-csv', 'S3-parquet', 'postgresql'] = 'memory', storage_uri: str | None = None, schema: str = 'public', max_worker: int = 4)

Methods:

__init__

Initialize the DataCollector with configuration options.

collect

Trigger data collection.

conditional_collect

Trigger data collection if the trigger condition is met.

flush

Persist all collected data to the configured backend.

Attributes:

data

Retrieve the collected data as eagerly evaluated Polars DataFrames.

seed

Get the model seed.

__init__(model: ModelDF, model_reporters: dict[str, Callable] | None = None, agent_reporters: dict[str, str | Callable] | None = None, trigger: Callable[[Any], bool] | None = None, reset_memory: bool = True, storage: Literal['memory', 'csv', 'parquet', 'S3-csv', 'S3-parquet', 'postgresql'] = 'memory', storage_uri: str | None = None, schema: str = 'public', max_worker: int = 4)

Initialize the DataCollector with configuration options.

Parameters:
  • model (ModelDF) – The model object from which data is collected.

  • model_reporters (dict[str, Callable] | None) – Functions to collect data at the model level.

  • agent_reporters (dict[str, str | Callable] | None) – Attributes or functions to collect data at the agent level.

  • trigger (Callable[[Any], bool] | None) – A function(model) -> bool that determines whether to collect data.

  • reset_memory (bool) – Whether to reset in-memory data after flushing. Default is True.

  • storage (Literal["memory", "csv", "parquet", "S3-csv", "S3-parquet", "postgresql"]) – Storage backend to use, given as one of the listed names. Default is "memory".

  • storage_uri (str | None) – URI or path corresponding to the selected storage backend.

  • schema (str) – Schema name used for PostgreSQL storage.

  • max_worker (int) – Maximum number of worker threads used to flush collected data asynchronously. Default is 4.
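
The reporter and trigger arguments described above can be sketched as plain Python objects. This is a minimal illustration, not library code: StubModel and its steps and wealth fields are hypothetical stand-ins for a real ModelDF, and the assumption that a model reporter is called with the model is inferred from the trigger's documented function(model) -> bool shape.

```python
from dataclasses import dataclass, field


@dataclass
class StubModel:
    """Hypothetical stand-in for a ModelDF; not part of mesa_frames."""

    steps: int = 0
    wealth: list = field(default_factory=lambda: [3, 1, 4])


# Model-level reporters: output name -> callable.
# Assumption: each callable receives the model, like the trigger does.
model_reporters = {
    "total_wealth": lambda m: sum(m.wealth),
    "n_agents": lambda m: len(m.wealth),
}

# Agent-level reporters: output name -> attribute name (a callable is also allowed).
agent_reporters = {"wealth": "wealth"}


def every_fifth_step(model) -> bool:
    """Trigger: returns True only on steps 0, 5, 10, ..."""
    return model.steps % 5 == 0


model = StubModel(steps=10)
# Evaluate every model reporter once against the stub model.
snapshot = {name: fn(model) for name, fn in model_reporters.items()}
should_collect = every_fifth_step(model)

# With mesa_frames installed and a real ModelDF, these objects would be passed as:
# DataCollector(model, model_reporters=model_reporters,
#               agent_reporters=agent_reporters, trigger=every_fifth_step)
```

Keeping reporters and the trigger as small named callables makes them easy to check against a stub before wiring them into a collector.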

property data: dict[str, DataFrame]

Retrieve the collected data as eagerly evaluated Polars DataFrames.

Returns:

A dictionary with keys “model” and “agent” mapping to concatenated DataFrames of collected data.

Return type:

dict[str, pl.DataFrame]

collect() → None

Trigger data collection.

This method calls _collect() to perform the actual data collection.

Example

>>> datacollector.collect()

conditional_collect() → None

Trigger data collection if the trigger condition is met.

This method calls _collect() to perform the actual data collection only if trigger returns True.

Example

>>> datacollector.conditional_collect()
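
The difference between collect() and conditional_collect() can be illustrated with a toy class that mirrors the documented control flow. This is a sketch under stated assumptions, not the mesa_frames implementation: ToyCollector, its rows list, and the dict used as a model are all hypothetical.

```python
from typing import Any, Callable


class ToyCollector:
    """Toy illustration of the documented control flow; not mesa_frames code."""

    def __init__(self, model: Any, trigger: Callable[[Any], bool]):
        self._model = model
        self._trigger = trigger
        self.rows = []

    def _collect(self) -> None:
        # Record one snapshot of the (hypothetical) model state.
        self.rows.append({"step": self._model["step"]})

    def collect(self) -> None:
        # Unconditional: always delegates to _collect().
        self._collect()

    def conditional_collect(self) -> None:
        # Delegates to _collect() only when the trigger returns True.
        if self._trigger(self._model):
            self._collect()


model = {"step": 0}  # hypothetical stand-in for a model
collector = ToyCollector(model, trigger=lambda m: m["step"] % 2 == 0)

for step in range(5):
    model["step"] = step
    collector.conditional_collect()

# Only the even steps passed the trigger.
collected_steps = [row["step"] for row in collector.rows]
```

Calling conditional_collect() on every step and letting the trigger decide keeps the sampling policy in one place instead of scattering if-statements through the model's step logic.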

flush() → None

Persist all collected data to the configured backend.

Use this method to save collected data. After flushing, the in-memory data buffer is cleared if reset_memory is True (the default).

Example

>>> datacollector.flush()
>>> # Data is saved externally and in-memory buffers are cleared if configured
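
The effect of reset_memory on flush() can be sketched with a toy buffer. This is only an illustration of the behavior described above: ToyBuffer and its in_memory and persisted lists are hypothetical, the external backend is replaced by a plain list, and the flush is synchronous for simplicity rather than using worker threads.

```python
class ToyBuffer:
    """Toy illustration of flush() with reset_memory; not mesa_frames code."""

    def __init__(self, reset_memory: bool = True):
        self.reset_memory = reset_memory
        self.in_memory = []
        self.persisted = []  # stands in for the external storage backend

    def collect(self, value: int) -> None:
        self.in_memory.append(value)

    def flush(self) -> None:
        # Persist everything collected so far.
        self.persisted.extend(self.in_memory)
        # Clear the in-memory buffer only if configured to do so.
        if self.reset_memory:
            self.in_memory.clear()


clearing = ToyBuffer(reset_memory=True)
keeping = ToyBuffer(reset_memory=False)

for buf in (clearing, keeping):
    buf.collect(1)
    buf.collect(2)
    buf.flush()  # flushed once per buffer
```

After the single flush, both buffers have persisted the same values, but only the one created with reset_memory=False still holds them in memory.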

property seed: int

Get the model seed.