pyorps.core.cost_assumptions

PYORPS: An Open-Source Tool for Automated Power Line Routing

Reference: [1] Hofmann, M., Stetz, T., Kammer, F., Repo, S.: ‘PYORPS: An Open-Source Tool for Automated Power Line Routing’, CIRED 2025 - 28th Conference and Exhibition on Electricity Distribution, 16 - 19 June 2025, Geneva, Switzerland

Functions

calculate_column_statistics(gdf, columns[, ...])

Calculate statistical properties of columns for feature selection.

calculate_entropy_score(column_name, col_stats)

Calculate combined entropy score for a column, weighing area entropy more heavily.

calculate_geometry_area(geometries)

Calculate the sum of areas for a collection of geometries.

column_shows_relationship_to_main_feature(...)

Determine if a column adds meaningful information in relation to the main feature.

detect_feature_columns(gdf[, ...])

Analyze columns in a geodataframe to identify the best candidates for main_feature and side_features based on statistical metrics.

find_side_features(gdf, main_feature, col_stats)

Find suitable side feature columns that refine the main feature.

get_zero_cost_assumptions(gdf, main_feature, ...)

Generate cost assumptions with zero values for all feature combinations.

save_empty_cost_assumptions(geo_dataset, ...)

Generate and save empty cost assumptions with zero values for a geo dataset.

select_main_feature(col_stats)

Select the best main feature column based on statistics.

Classes

CostAssumptions([source])

A class for handling cost assumptions for rasterization.

class pyorps.core.cost_assumptions.CostAssumptions(source=None)[source]

Bases: object

A class for handling cost assumptions for rasterization.

This class handles:
  • Loading cost assumptions from files (CSV, Excel, JSON), or generating them from a dictionary or a GeoDataFrame

  • Mapping costs to features in a GeoDataFrame

  • Managing hierarchical cost structures

Initialize the CostAssumptions object.

Parameters:

source (str | dict | None) –

  1. Path to a cost assumptions file

  2. A dictionary of cost values

__init__(source=None)[source]

Initialize the CostAssumptions object.

Parameters:

source (str | dict | None) –

  1. Path to a cost assumptions file

  2. A dictionary of cost values

load(source)[source]

Load cost assumptions from a file or dictionary.

Parameters:

source (str | dict) – Path to a file or a dictionary containing cost assumptions

Returns:

dictionary of cost assumptions

Return type:

dict

convert_df_to_cost_dict(df)[source]

Convert a DataFrame to a nested dictionary for cost assumptions.

Parameters:

df (DataFrame) – DataFrame containing cost assumptions with hierarchical structure

Returns:

dictionary of cost assumptions with nested structure based on DataFrame columns

Return type:

dict

Uses one numeric column for costs, and all other columns as a hierarchical index:
  • The first column is the ‘main_feature’

  • All additional columns are ‘side_features’
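
The column convention of convert_df_to_cost_dict can be illustrated with a minimal, library-independent sketch. It uses plain Python rows instead of a DataFrame; the helper name `rows_to_cost_dict`, the sample column names, and the exact nesting of the result are assumptions made for illustration and may differ from what pyorps produces.

```python
def rows_to_cost_dict(rows, feature_columns, cost_column):
    """Nest costs by feature values; the first feature column is the main feature.

    Hypothetical helper, not part of pyorps.
    """
    cost_dict = {}
    for row in rows:
        level = cost_dict
        # Create one nesting level per feature column except the last one.
        for column in feature_columns[:-1]:
            level = level.setdefault(row[column], {})
        # The last feature column holds the leaf entries with the numeric cost.
        level[row[feature_columns[-1]]] = float(row[cost_column])
    return cost_dict


rows = [
    {"landuse": "forest", "protected": "yes", "cost": 500},
    {"landuse": "forest", "protected": "no", "cost": 120},
    {"landuse": "road", "protected": "no", "cost": 30},
]
nested = rows_to_cost_dict(rows, ["landuse", "protected"], "cost")
```

Here `landuse` plays the role of the main feature and `protected` the role of a single side feature.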

apply_to_geodataframe(gdf, main_feature=None, side_features=None)[source]

Apply cost assumptions to a GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to apply costs to

  • main_feature (str | None) – Main feature column name

  • side_features (list[str] | None) – list of side feature column names or single side feature name

Returns:

GeoDataFrame with ‘cost’ column added
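
As a rough illustration of what mapping costs onto features means, the sketch below adds a `cost` entry to plain dictionaries that stand in for GeoDataFrame rows. The function `apply_costs`, the tuple-keyed lookup table, and the `None` fallback for unmatched combinations are assumptions for this example, not the behaviour of apply_to_geodataframe.

```python
def apply_costs(records, costs, main_feature, side_features=None):
    """Return copies of the records with a 'cost' entry added (hypothetical helper)."""
    # Accept a single side feature name as well as a list of names.
    if side_features is None:
        side_features = []
    elif isinstance(side_features, str):
        side_features = [side_features]
    columns = [main_feature, *side_features]

    result = []
    for record in records:
        key = tuple(record[column] for column in columns)
        # Unmatched combinations get None here; that fallback is an assumption.
        result.append({**record, "cost": costs.get(key)})
    return result


records = [
    {"landuse": "forest", "protected": "yes"},
    {"landuse": "road", "protected": "no"},
]
costs = {("forest", "yes"): 500.0, ("road", "no"): 30.0}
with_costs = apply_costs(records, costs, "landuse", "protected")
```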

to_csv(filepath, separator=';', decimal='.', encoding='ISO-8859-1')[source]

Save the cost assumptions to a CSV file.

Parameters:
  • filepath (str) – Path where to save the CSV file

  • separator (str) – Column separator character (default is ‘;’)

  • decimal (str) – Decimal separator character (default is ‘.’)

  • encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None
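
The effect of the `separator`, `decimal`, and `encoding` parameters can be sketched with the standard library alone. The helper `rows_to_csv_text` and the sample data are hypothetical; the sketch only shows how such options typically shape the output and is not the to_csv implementation.

```python
import csv
import io


def rows_to_csv_text(header, rows, separator=";", decimal="."):
    """Render rows as CSV text with a custom column and decimal separator."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=separator, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Only float values get the custom decimal separator.
        writer.writerow(
            [str(v).replace(".", decimal) if isinstance(v, float) else v for v in row]
        )
    return buffer.getvalue()


csv_text = rows_to_csv_text(["landuse", "cost"], [("forest", 500.5)], decimal=",")
# The encoding is applied when the text is turned into bytes for the file.
encoded = csv_text.encode("ISO-8859-1")
```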

to_json(filepath, indent=2, encoding='ISO-8859-1')[source]

Save the cost assumptions to a JSON file.

Parameters:
  • filepath (str) – Path where to save the JSON file

  • indent (int) – Number of spaces for indentation (default is 2)

  • encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

to_excel(filepath, sheet_name='CostAssumptions', index=False)[source]

Save the cost assumptions to an Excel file.

Parameters:
  • filepath (str) – Path where to save the Excel file

  • sheet_name (str) – Name of the worksheet (default is ‘CostAssumptions’)

  • index (bool) – Whether to write row indices (default is False)

Return type:

None

cost_dict_to_df(cost_dict)[source]

Convert cost assumptions dictionary to DataFrame.

Parameters:

cost_dict (dict) – Dictionary of cost assumptions

Returns:

DataFrame representation of cost assumptions

Return type:

DataFrame
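
Flattening a nested cost dictionary into table rows might look like the following sketch. It returns plain tuples rather than a DataFrame, and both the helper name `cost_dict_to_rows` and the assumed input nesting (one level per feature, costs at the leaves) are illustrative assumptions.

```python
def cost_dict_to_rows(cost_dict, prefix=()):
    """Flatten a nested dict into (feature_value, ..., cost) tuples."""
    rows = []
    for key, value in cost_dict.items():
        if isinstance(value, dict):
            # Descend one level and remember the feature value seen so far.
            rows.extend(cost_dict_to_rows(value, prefix + (key,)))
        else:
            rows.append((*prefix, key, value))
    return rows


flat_rows = cost_dict_to_rows(
    {"forest": {"yes": 500.0, "no": 120.0}, "road": {"no": 30.0}}
)
```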

pyorps.core.cost_assumptions.save_empty_cost_assumptions(geo_dataset, save_path, main_feature=None, side_features=None, file_type='csv', **kwargs)[source]

Generate and save empty cost assumptions with zero values for a geo dataset.

This function analyzes the given dataset to detect appropriate feature columns, creates a CostAssumptions object with zero costs for all feature combinations, and saves it to the specified path in the requested format.

Parameters:
  • geo_dataset (Any) – GeoDataset object with a ‘data’ attribute containing a GeoDataFrame

  • save_path (str | Path) – File path where the cost assumptions should be saved

  • main_feature (str | None) – Column name for the primary feature

  • side_features (list[str] | None) – List containing a single column name for the secondary feature

  • file_type (str) – Output file format - one of ‘json’, ‘csv’, or ‘excel’ (default is ‘csv’)

Returns:

This function saves to a file and doesn’t return a value

Return type:

None

pyorps.core.cost_assumptions.detect_feature_columns(gdf, max_features_per_column=50)[source]

Analyze columns in a geodataframe to identify the best candidates for main_feature and side_features based on statistical metrics.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to analyze

  • max_features_per_column (int) – Maximum number of unique values allowed in a categorical column

Returns:

tuple of (main_feature, side_features)

Raises:

NoSuitableColumnsError – When no suitable columns are found for feature selection

Return type:

tuple[str, list[str]]
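
The role of `max_features_per_column` can be illustrated by a simple cardinality filter over plain records. The helper `candidate_columns` and the rule that a column must take more than one value are assumptions for this sketch; the statistical metrics pyorps actually uses are not reproduced here.

```python
def candidate_columns(records, columns, max_features_per_column=50):
    """Map each plausible categorical column to its number of unique values."""
    candidates = {}
    for column in columns:
        unique_values = {record[column] for record in records}
        # Keep columns that vary but stay within the cardinality limit.
        if 1 < len(unique_values) <= max_features_per_column:
            candidates[column] = len(unique_values)
    return candidates


sample = [
    {"id": 1, "landuse": "forest", "country": "CH"},
    {"id": 2, "landuse": "road", "country": "CH"},
    {"id": 3, "landuse": "forest", "country": "CH"},
]
candidates = candidate_columns(
    sample, ["id", "landuse", "country"], max_features_per_column=2
)
```

In this sample, `id` exceeds the limit and `country` never varies, so only `landuse` remains a candidate.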

pyorps.core.cost_assumptions.find_side_features(gdf, main_feature, col_stats)[source]

Find suitable side feature columns that refine the main feature.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to analyze

  • main_feature (str) – Selected main feature column name

  • col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Returns:

list of side feature column names

Return type:

list[str]

pyorps.core.cost_assumptions.column_shows_relationship_to_main_feature(gdf, main_feature, side_feature)[source]

Determine if a column adds meaningful information in relation to the main feature.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame containing the data

  • main_feature (str) – Name of the main feature column

  • side_feature (str) – Name of the potential side feature column

Returns:

True if the column shows a meaningful relationship, False otherwise

Return type:

bool
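
One possible reading of "adds meaningful information" is that the side feature splits at least one main feature value into several subgroups. The sketch below implements only that assumed criterion on plain records; the function name `refines_main_feature` is hypothetical and the actual test in pyorps may be different.

```python
from collections import defaultdict


def refines_main_feature(records, main_feature, side_feature):
    """True if some main feature value occurs with more than one side value."""
    side_values = defaultdict(set)
    for record in records:
        side_values[record[main_feature]].add(record[side_feature])
    return any(len(values) > 1 for values in side_values.values())


parcels = [
    {"landuse": "forest", "protected": "yes", "code": "F"},
    {"landuse": "forest", "protected": "no", "code": "F"},
    {"landuse": "road", "protected": "no", "code": "R"},
]
protected_refines = refines_main_feature(parcels, "landuse", "protected")
code_refines = refines_main_feature(parcels, "landuse", "code")
```

Under this criterion, `code` is fully determined by `landuse` and therefore adds nothing, while `protected` subdivides the forest parcels.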

pyorps.core.cost_assumptions.get_zero_cost_assumptions(gdf, main_feature, side_features)[source]

Generate cost assumptions with zero values for all feature combinations.

Creates structures matching the format expected by CostAssumptions:
  • Without side features: {main_feature: {val1: 0, val2: 0, …}}

  • With side features: {(main_feature, side_feature1, …): {(val1, val2, …): 0, …}}

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame with feature columns

  • main_feature (str) – Primary feature column name

  • side_features (list[str]) – List of secondary feature column names

Returns:

CostAssumptions instance with zero costs for all feature combinations

Return type:

CostAssumptions
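
The two dictionary shapes described for get_zero_cost_assumptions can be reproduced with a short sketch on plain records. The helper `zero_cost_structure` is hypothetical, it returns a bare dictionary rather than a CostAssumptions instance, and it assumes that "all feature combinations" means the combinations that actually occur in the data.

```python
def zero_cost_structure(records, main_feature, side_features=None):
    """Build a zero-valued cost dictionary in one of the two documented shapes."""
    if not side_features:
        values = sorted({record[main_feature] for record in records})
        return {main_feature: {value: 0 for value in values}}
    columns = (main_feature, *side_features)
    # Only combinations observed in the records are listed (an assumption).
    combos = sorted({tuple(record[c] for c in columns) for record in records})
    return {columns: {combo: 0 for combo in combos}}


features = [
    {"landuse": "forest", "protected": "yes"},
    {"landuse": "road", "protected": "no"},
    {"landuse": "forest", "protected": "yes"},
]
main_only = zero_cost_structure(features, "landuse")
with_side = zero_cost_structure(features, "landuse", ["protected"])
```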

pyorps.core.cost_assumptions.calculate_geometry_area(geometries)[source]

Calculate the sum of areas for a collection of geometries.

Parameters:

geometries (GeoSeries) – Collection of geometry objects

Returns:

Sum of areas of all geometries with area attribute

Return type:

float
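
Summing only those objects that expose an `area` attribute can be sketched with duck typing. The classes `Rect` and `Marker` are stand-ins invented for this example instead of real geometry objects, and `total_area` is a hypothetical name, not the pyorps function.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Stand-in geometry with an area attribute."""
    width: float
    height: float

    @property
    def area(self):
        return self.width * self.height


class Marker:
    """Stand-in object without an area attribute."""


def total_area(geometries):
    # Objects lacking an area attribute are skipped rather than raising.
    return sum(g.area for g in geometries if hasattr(g, "area"))


total = total_area([Rect(2.0, 3.0), Marker(), Rect(1.0, 4.0)])
```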

pyorps.core.cost_assumptions.calculate_column_statistics(gdf, columns, max_features_per_column=50)[source]

Calculate statistical properties of columns for feature selection.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to analyze

  • columns (list[str]) – list of column names to analyze

  • max_features_per_column (int) – Maximum number of unique values for a column to be considered categorical

Returns:

dictionary with column statistics

Raises:

ColumnAnalysisError – When column analysis fails unexpectedly

Return type:

dict[str, dict[str, Any]]

pyorps.core.cost_assumptions.calculate_entropy_score(column_name, col_stats)[source]

Calculate combined entropy score for a column, weighing area entropy more heavily.

Parameters:
  • column_name (str) – Name of the column to calculate score for

  • col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Returns:

Combined entropy score

Return type:

float
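
A combined score that weighs area entropy more heavily could, for example, be a weighted mean of two Shannon entropies: one over the number of rows per category and one over the area per category. The formula, the weight of 0.75, and the function names below are assumptions for illustration; the source does not state the weights or the exact definition used by pyorps.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a list of non-negative weights."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probabilities = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def combined_entropy_score(count_entropy, area_entropy, area_weight=0.75):
    """Weighted mean of both entropies; area_weight > 0.5 favours area."""
    return area_weight * area_entropy + (1.0 - area_weight) * count_entropy


# Two categories: 3 rows vs. 1 row, but equal total area per category.
count_entropy = shannon_entropy([3, 1])
area_entropy = shannon_entropy([10.0, 10.0])
score = combined_entropy_score(count_entropy, area_entropy)
```

In this example the row counts are unevenly distributed while the area is split evenly, so the area term pulls the combined score towards the higher of the two entropies.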

pyorps.core.cost_assumptions.select_main_feature(col_stats)[source]

Select the best main feature column based on statistics.

Parameters:

col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Returns:

Name of the best main feature column

Return type:

str