SmartBuildSim

Data Generation

smartbuildsim.data produces deterministic, reproducible telemetry once a Building description is available.
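The determinism claim above can be illustrated without the library. The following is a minimal sketch using only the standard library; `synth_series`, its parameters, and the sine-plus-noise signal are hypothetical and are not part of smartbuildsim. It only shows the general idea that a fixed seed yields identical synthetic series across runs.

```python
import math
import random


def synth_series(seed: int, steps: int = 24) -> list[float]:
    """Hypothetical generator: a daily sine cycle plus seeded Gaussian noise."""
    # A private RNG instance keeps the result independent of global random state.
    rng = random.Random(seed)
    return [
        20.0 + 2.0 * math.sin(2 * math.pi * hour / 24) + rng.gauss(0.0, 0.3)
        for hour in range(steps)
    ]


first = synth_series(seed=42)
second = synth_series(seed=42)
other = synth_series(seed=7)

print(first == second)  # compares two runs with the same seed
print(first == other)   # compares runs with different seeds
```

Using a dedicated `random.Random` instance rather than the module-level functions is the design choice that matters here: other code that draws random numbers cannot disturb the sequence.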

Configuration

smartbuildsim.data.generator.DataGeneratorConfig controls the synthetic time series. Important fields include:

Core functions

CLI usage

The smartbuildsim data generate command combines BIM input, generator configuration, and output handling. With the example configuration, run:

smartbuildsim data generate examples/configs/default.yaml

This command is the second step of examples/scripts/run_example.py and writes outputs/dataset.csv in the project root.
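If the command needs to be driven from a Python script, a thin wrapper along these lines may help. This is a sketch under stated assumptions: `build_generate_command` and `generate_and_locate` are hypothetical helpers, the `smartbuildsim` executable is assumed to be on `PATH`, and the default output path is the one documented above for the example configuration.

```python
import subprocess
from pathlib import Path


def build_generate_command(config_path: str) -> list[str]:
    """Return the argument list for the documented `data generate` command."""
    return ["smartbuildsim", "data", "generate", config_path]


def generate_and_locate(config_path: str, output: str = "outputs/dataset.csv") -> Path:
    """Run the CLI and return the dataset path, failing loudly if it is missing.

    The default `output` is an assumption taken from the example configuration;
    other configurations may write elsewhere.
    """
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(build_generate_command(config_path), check=True)
    csv_path = Path(output)
    if not csv_path.is_file():
        raise FileNotFoundError(f"expected dataset at {csv_path}")
    return csv_path
```

Calling `generate_and_locate("examples/configs/default.yaml")` from the project root would run the same command as above and return the path to the CSV.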

Python example

import pandas as pd

from smartbuildsim.data.generator import DataGeneratorConfig, generate_dataset
from smartbuildsim.data.validation import compare_datasets
from smartbuildsim.scenarios.presets import get_scenario

# Load a preset scenario and build the generator config from its data settings.
scenario = get_scenario("office-small")
config = DataGeneratorConfig(**scenario.data.model_dump())

# Generate telemetry for the scenario's building.
dataset = generate_dataset(scenario.building, config)
print(dataset.attrs.get("normalization"))  # None if the key is absent

# Compare the generated data against a reference dataset, pairing sensors by name.
reference = pd.read_csv("docs/reference/datasets/ashrae_sample.csv")
report = compare_datasets(
    dataset,
    reference,
    sensor_mapping={"office_energy": "meter_0_energy"},
)
print(report.notes)
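To make the role of sensor_mapping more concrete, here is a minimal standalone sketch of a per-sensor comparison driven by a name mapping. It is not how compare_datasets is implemented, which this page does not describe. `compare_means` is a hypothetical helper, the values are made up, and the sketch assumes that mapping keys name generated sensors and mapping values name reference columns.

```python
from statistics import mean


def compare_means(generated, reference, sensor_mapping):
    """Hypothetical helper: absolute difference of means per mapped sensor pair."""
    return {
        gen_name: abs(mean(generated[gen_name]) - mean(reference[ref_name]))
        for gen_name, ref_name in sensor_mapping.items()
    }


# Made-up values for illustration only.
generated = {"office_energy": [10.0, 12.0, 14.0]}
reference = {"meter_0_energy": [11.0, 12.0, 16.0]}

diffs = compare_means(generated, reference, {"office_energy": "meter_0_energy"})
print(diffs)
```

The mapping is what lets two datasets with different naming schemes be compared without renaming either one.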

The resulting dataset flows into feature engineering and the modelling pipelines.