Data Collection

class mesa_frames.DataCollector(model: Model, model_reporters: dict[str, Callable] | None = None, agent_reporters: dict[str, str | Callable] | None = None, trigger: Callable[[Any], bool] | None = None, reset_memory: bool = True, storage: Literal['memory', 'csv', 'parquet', 'S3-csv', 'S3-parquet', 'postgresql'] = 'memory', storage_uri: str | None = None, schema: str = 'public', max_worker: int = 4)

Methods:

`__init__`	Initialize the DataCollector with configuration options.
`collect`	Trigger data collection.
`conditional_collect`	Trigger data collection if the trigger condition is met.
`flush`	Persist all collected data to the configured backend.

Attributes:

`data`	Retrieve the collected data as eagerly evaluated Polars DataFrames.
`seed`	Function to get the model seed.

__init__(model: Model, model_reporters: dict[str, Callable] | None = None, agent_reporters: dict[str, str | Callable] | None = None, trigger: Callable[[Any], bool] | None = None, reset_memory: bool = True, storage: Literal['memory', 'csv', 'parquet', 'S3-csv', 'S3-parquet', 'postgresql'] = 'memory', storage_uri: str | None = None, schema: str = 'public', max_worker: int = 4)

Initialize the DataCollector with configuration options.

Parameters:

model (Model) – The model object from which data is collected.
model_reporters (dict[str, Callable] | None) – Functions to collect data at the model level.
agent_reporters (dict[str, str | Callable] | None) – Attributes or functions to collect data at the agent level.
trigger (Callable[[Any], bool] | None) – A function(model) -> bool that determines whether to collect data.
reset_memory (bool) – Whether to reset in-memory data after flushing. Default is True.
storage (Literal["memory", "csv", "parquet", "S3-csv", "S3-parquet", "postgresql"]) – Name of the storage backend to use, one of the listed values. Default is "memory".
storage_uri (str | None) – URI or path corresponding to the selected storage backend.
schema (str) – Schema name used for PostgreSQL storage.
max_worker (int) – Maximum number of worker threads used for flushing collected data asynchronously. Default is 4.
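
As a rough illustration of the argument shapes described above, the sketch below builds reporter dictionaries and a trigger against a stand-in model object. The `StubModel` class, its `steps` and `wealth` fields, and the reporter names are hypothetical; only the parameter names and types come from the signature above. It also assumes that model-level reporters are called with the model as their single argument. The commented-out call shows where the values would be passed.

```python
from dataclasses import dataclass, field


@dataclass
class StubModel:
    """Hypothetical stand-in for a model; not part of mesa_frames."""
    steps: int = 0
    wealth: list = field(default_factory=lambda: [1, 2, 3])


# Model-level reporters: name -> callable (assumed to receive the model).
model_reporters = {
    "total_wealth": lambda m: sum(m.wealth),
}

# Agent-level reporters: name -> attribute name (str) or callable.
agent_reporters = {
    "wealth": "wealth",
}


def every_tenth_step(model) -> bool:
    """Trigger with the documented shape: function(model) -> bool."""
    return model.steps % 10 == 0


# Assumed usage, following the documented signature:
# dc = DataCollector(
#     model=model,
#     model_reporters=model_reporters,
#     agent_reporters=agent_reporters,
#     trigger=every_tenth_step,
#     storage="memory",
# )

model = StubModel(steps=20)
print(model_reporters["total_wealth"](model))  # 6
print(every_tenth_step(model))                 # True
```

Keeping the trigger as a named function rather than an inline lambda makes it easy to test in isolation before wiring it into a collector.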

property data: dict[str, DataFrame]

Retrieve the collected data as eagerly evaluated Polars DataFrames.

Returns: A dictionary with keys “model” and “agent” mapping to concatenated DataFrames of collected data.
Return type: dict[str, pl.DataFrame]

collect() → None

Trigger data collection.

This method calls _collect() to perform actual data collection.

Example

>>> datacollector.collect()

conditional_collect() → None

Trigger data collection if the trigger condition is met.

This method calls _collect() to perform actual data collection only if trigger returns True.

Example

>>> datacollector.conditional_collect()
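
The difference between `collect` and `conditional_collect` can be sketched with a toy stand-in. `ToyCollector` below is hypothetical and is not the library's implementation; it only mirrors the documented behaviour that both methods delegate to `_collect()`, with `conditional_collect` doing so only when `trigger` returns True. How the real class behaves when no trigger is configured is not stated here, so the toy simply skips collection in that case.

```python
from typing import Any, Callable, Optional


class ToyCollector:
    """Toy illustration of the documented behaviour; not mesa_frames code."""

    def __init__(self, model: Any, trigger: Optional[Callable[[Any], bool]] = None):
        self.model = model
        self.trigger = trigger
        self.rows = []

    def _collect(self) -> None:
        # Record one snapshot of the (hypothetical) step counter.
        self.rows.append({"step": self.model["step"]})

    def collect(self) -> None:
        # Unconditional: always delegates to _collect().
        self._collect()

    def conditional_collect(self) -> None:
        # Delegates to _collect() only when the trigger returns True.
        # Assumption: with no trigger set, nothing is collected.
        if self.trigger is not None and self.trigger(self.model):
            self._collect()


model = {"step": 0}
toy = ToyCollector(model, trigger=lambda m: m["step"] % 2 == 0)
for step in range(5):
    model["step"] = step
    toy.conditional_collect()

print([row["step"] for row in toy.rows])  # [0, 2, 4]
```

Calling `conditional_collect()` on every step and letting the trigger decide keeps the sampling policy in one place instead of scattering `if` checks through the model's step logic.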

flush() → None

Persist all collected data to the configured backend.

After flushing, the in-memory data buffer is cleared if reset_memory is True (the default).

Use this method to save collected data.

Example

>>> datacollector.flush()
>>> # Data is saved externally and in-memory buffers are cleared if configured
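
The effect of `reset_memory` on flushing can be pictured with a toy buffer. `ToyBuffer` is a hypothetical stand-in, not the library's implementation, and its `backend` list merely stands in for a real storage target such as CSV, Parquet, or PostgreSQL.

```python
class ToyBuffer:
    """Toy illustration of flush/reset_memory; not mesa_frames code."""

    def __init__(self, reset_memory: bool = True):
        self.reset_memory = reset_memory
        self.memory = []   # in-memory buffer of collected values
        self.backend = []  # stands in for an external storage backend

    def collect(self, value: int) -> None:
        self.memory.append(value)

    def flush(self) -> None:
        # Persist everything currently buffered.
        self.backend.extend(self.memory)
        # Clear the buffer only when reset_memory is True.
        if self.reset_memory:
            self.memory.clear()


clearing = ToyBuffer(reset_memory=True)
keeping = ToyBuffer(reset_memory=False)
for buf in (clearing, keeping):
    buf.collect(1)
    buf.collect(2)
    buf.flush()

print(clearing.memory, clearing.backend)  # [] [1, 2]
print(keeping.memory, keeping.backend)    # [1, 2] [1, 2]
```

In this toy, keeping the buffer after a flush means a second flush would write the same values again; whether the real backends deduplicate is not covered here.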

property seed: int

Function to get the model seed.