pyorps.core.cost_assumptions

PYORPS: An Open-Source Tool for Automated Power Line Routing

Reference: [1] Hofmann, M., Stetz, T., Kammer, F., Repo, S.: ‘PYORPS: An Open-Source Tool for Automated Power Line Routing’, CIRED 2025 - 28th Conference and Exhibition on Electricity Distribution, 16 - 19 June 2025, Geneva, Switzerland

Functions

`calculate_column_statistics`(gdf, columns[, ...])	Calculate statistical properties of columns for feature selection.
`calculate_entropy_score`(column_name, col_stats)	Calculate combined entropy score for a column, weighing area entropy more heavily.
`calculate_geometry_area`(geometries)	Calculate the sum of areas for a collection of geometries.
`column_shows_relationship_to_main_feature`(...)	Determine if a column adds meaningful information in relation to the main feature.
`detect_feature_columns`(gdf[, ...])	Analyze columns in a geodataframe to identify the best candidates for main_feature and side_features based on statistical metrics.
`find_side_features`(gdf, main_feature, col_stats)	Find suitable side feature columns that refine the main feature.
`get_zero_cost_assumptions`(gdf, main_feature, ...)	Generate cost assumptions with zero values for all feature combinations.
`save_empty_cost_assumptions`(geo_dataset, ...)	Generate and save empty cost assumptions with zero values for a geo dataset.
`select_main_feature`(col_stats)	Select the best main feature column based on statistics.

Classes

`CostAssumptions`([source])	A class for handling cost assumptions for rasterization.

class pyorps.core.cost_assumptions.CostAssumptions(source=None)

Bases: object

A class for handling cost assumptions for rasterization.

This class handles:

- Loading cost assumptions from files (CSV, Excel, JSON), or generating them from a dictionary or a GeoDataFrame
- Mapping costs to features in a GeoDataFrame
- Managing hierarchical cost structures

Initialize the CostAssumptions object.

Parameters:

source (str | dict | None) – Either a path to a cost assumptions file or a dictionary of cost values

load(source)

Load cost assumptions from a file or dictionary.

Parameters:

source (str | dict) – Path to a file or a dictionary containing cost assumptions

Returns:

dictionary of cost assumptions

Return type:

dict

convert_df_to_cost_dict(df)

Convert a DataFrame to a nested dictionary for cost assumptions.

Parameters:

df (DataFrame) – DataFrame containing cost assumptions with hierarchical structure

Returns:

dictionary of cost assumptions with nested structure based on DataFrame columns

Return type:

dict

Uses one numeric column for costs, and all other columns as a hierarchical index:

- The first column is the ‘main_feature’
- All additional columns are ‘side_features’
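The column convention above can be sketched without pandas. This is a minimal illustration, not the library's implementation: it treats each row as a plain dict, assumes the numeric column is named `cost` (a hypothetical name), and assumes the output follows the structures documented for get_zero_cost_assumptions. The helper `rows_to_cost_dict` is invented for this example.

```python
def rows_to_cost_dict(rows, cost_column="cost"):
    """Build {feature_columns: {feature_values: cost}} from flat rows.

    Assumption: every column except `cost_column` is a feature column,
    the first being the main feature and the rest side features.
    """
    feature_columns = [name for name in rows[0] if name != cost_column]
    single = len(feature_columns) == 1
    outer_key = feature_columns[0] if single else tuple(feature_columns)
    inner = {}
    for row in rows:
        values = tuple(row[name] for name in feature_columns)
        # With a single feature column the key is the bare value, not a tuple.
        inner[values[0] if single else values] = float(row[cost_column])
    return {outer_key: inner}


# Hypothetical sample data: 'landuse' is the main feature, 'protected' a side feature.
rows = [
    {"landuse": "forest", "protected": "yes", "cost": 12.5},
    {"landuse": "forest", "protected": "no", "cost": 4.0},
    {"landuse": "road", "protected": "no", "cost": 1.0},
]
cost_dict = rows_to_cost_dict(rows)
main_only = rows_to_cost_dict([{"landuse": "forest", "cost": 3}])
```

With side features present, both the outer key and the inner keys are tuples; with only a main feature, both are plain values.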

apply_to_geodataframe(gdf, main_feature=None, side_features=None)

Apply cost assumptions to a GeoDataFrame.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to apply costs to
main_feature (str | None) – Main feature column name
side_features (list[str] | None) – List of side feature column names, or a single side feature name

Returns:

GeoDataFrame with ‘cost’ column added
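Conceptually, applying cost assumptions is a lookup from each row's feature values to a cost. The sketch below shows that idea on plain dict records instead of a GeoDataFrame. The helper `apply_costs` is hypothetical, and returning `None` for combinations missing from the cost dictionary is an assumption; how the library treats unmatched rows is not documented here.

```python
def apply_costs(records, cost_dict, default=None):
    """Return copies of `records` with a 'cost' key looked up from `cost_dict`.

    Assumption: `cost_dict` has exactly one entry, keyed either by a single
    column name or by a tuple of column names.
    """
    ((columns, lookup),) = cost_dict.items()
    single = isinstance(columns, str)
    result = []
    for record in records:
        key = record[columns] if single else tuple(record[c] for c in columns)
        # Unmatched combinations fall back to `default` (an assumption).
        result.append({**record, "cost": lookup.get(key, default)})
    return result


# Hypothetical sample data.
records = [
    {"landuse": "forest", "protected": "yes"},
    {"landuse": "road", "protected": "no"},
]
costs = {("landuse", "protected"): {("forest", "yes"): 12.5}}
with_costs = apply_costs(records, costs)
```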

to_csv(filepath, separator=';', decimal='.', encoding='ISO-8859-1')

Save the cost assumptions to a CSV file.

Parameters:

filepath (str) – Path where to save the CSV file
separator (str) – Column separator character (default is ‘;’)
decimal (str) – Decimal separator character (default is ‘.’)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None
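To show what the default separator, decimal, and encoding values mean for the resulting file, here is a standard-library sketch that writes a flat cost table the same way. It does not call pyorps; the helper `write_cost_rows`, the column names, and the row layout are assumptions made for this example.

```python
import csv
import os
import tempfile


def write_cost_rows(rows, filepath, separator=";", decimal=".", encoding="ISO-8859-1"):
    """Write flat cost rows as CSV using the given separator, decimal mark and encoding."""
    fieldnames = list(rows[0].keys())
    with open(filepath, "w", newline="", encoding=encoding) as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter=separator)
        writer.writeheader()
        for row in rows:
            # Only float values get the custom decimal mark.
            formatted = {
                key: str(value).replace(".", decimal) if isinstance(value, float) else value
                for key, value in row.items()
            }
            writer.writerow(formatted)


# Hypothetical sample data.
rows = [
    {"landuse": "forest", "protected": "yes", "cost": 12.5},
    {"landuse": "road", "protected": "no", "cost": 1.0},
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "costs.csv")
    write_cost_rows(rows, path)
    with open(path, encoding="ISO-8859-1") as handle:
        lines = handle.read().splitlines()
```

A semicolon separator is a common choice when the file may be opened in spreadsheet software configured for locales that use a decimal comma.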

to_json(filepath, indent=2, encoding='ISO-8859-1')

Save the cost assumptions to a JSON file.

Parameters:

filepath (str) – Path where to save the JSON file
indent (int) – Number of spaces for indentation (default is 2)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

to_excel(filepath, sheet_name='CostAssumptions', index=False)

Save the cost assumptions to an Excel file.

Parameters:

filepath (str) – Path where to save the Excel file
sheet_name (str) – Name of the worksheet (default is ‘CostAssumptions’)
index (bool) – Whether to write row indices (default is False)

Return type:

None

cost_dict_to_df(cost_dict)

Convert cost assumptions dictionary to DataFrame.

Parameters:

cost_dict (dict) – Dictionary of cost assumptions

Returns:

DataFrame representation of cost assumptions

Return type:

DataFrame

pyorps.core.cost_assumptions.save_empty_cost_assumptions(geo_dataset, save_path, main_feature=None, side_features=None, file_type='csv', **kwargs)

Generate and save empty cost assumptions with zero values for a geo dataset.

This function analyzes the given dataset to detect appropriate feature columns, creates a CostAssumptions object with zero costs for all feature combinations, and saves it to the specified path in the requested format.

Parameters:

geo_dataset (Any) – GeoDataset object with a ‘data’ attribute containing a GeoDataFrame
save_path (str | Path) – File path where the cost assumptions should be saved
main_feature (str | None) – Column name for the primary feature
side_features (list[str] | None) – List containing a single column name for the secondary feature
file_type (str) – Output file format - one of ‘json’, ‘csv’, or ‘excel’ (default is ‘csv’)

Raises:

TypeError – If file_type is not one of the supported formats
NoSuitableColumnsError – If no suitable columns can be detected in the dataset

Returns:

This function saves to a file and doesn’t return a value

Return type:

None

pyorps.core.cost_assumptions.detect_feature_columns(gdf, max_features_per_column=50)

Analyze columns in a geodataframe to identify the best candidates for main_feature and side_features based on statistical metrics.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
max_features_per_column (int) – Maximum number of unique values allowed in a (categorical) column

Returns:

tuple of (main_feature, side_features)

Raises:

NoSuitableColumnsError – When no suitable columns are found for feature selection

Return type:

tuple[str, list[str]]

pyorps.core.cost_assumptions.find_side_features(gdf, main_feature, col_stats)

Find suitable side feature columns that refine the main feature.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
main_feature (str) – Selected main feature column name
col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Returns:

list of side feature column names

Return type:

list[str]

pyorps.core.cost_assumptions.column_shows_relationship_to_main_feature(gdf, main_feature, side_feature)

Determine if a column adds meaningful information in relation to the main feature.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame containing the data
main_feature (str) – Name of the main feature column
side_feature (str) – Name of the potential side feature column

Returns:

True if the column shows a meaningful relationship, False otherwise

Return type:

bool

pyorps.core.cost_assumptions.get_zero_cost_assumptions(gdf, main_feature, side_features)

Generate cost assumptions with zero values for all feature combinations.

Creates structures matching the format used by CostAssumptions:

- Without side features: {main_feature: {val1: 0, val2: 0, …}}
- With side features: {(main_feature, side_feature1, …): {(val1, val2, …): 0, …}}

Parameters:

gdf (GeoDataFrame) – GeoDataFrame with feature columns
main_feature (str) – Primary feature column name
side_features (list[str]) – List of secondary feature column names

Returns:

CostAssumptions instance with zero costs for all feature combinations

Return type:

CostAssumptions
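The two dictionary shapes described for this function can be reproduced with a short sketch on plain dict records. It builds only the dictionary, whereas the documented function returns a CostAssumptions instance. The helper `zero_cost_dict` and the sample column names are hypothetical.

```python
def zero_cost_dict(records, main_feature, side_features=()):
    """Build a zero-cost dictionary for every feature combination found in `records`."""
    if not side_features:
        values = sorted({record[main_feature] for record in records})
        return {main_feature: {value: 0 for value in values}}
    columns = (main_feature, *side_features)
    # Only combinations that actually occur in the data are listed.
    combos = sorted({tuple(record[c] for c in columns) for record in records})
    return {columns: {combo: 0 for combo in combos}}


# Hypothetical sample data.
records = [
    {"landuse": "forest", "protected": "yes"},
    {"landuse": "forest", "protected": "no"},
    {"landuse": "road", "protected": "no"},
    {"landuse": "road", "protected": "no"},
]
main_only = zero_cost_dict(records, "landuse")
with_side = zero_cost_dict(records, "landuse", ["protected"])
```

A template like this is typically filled in by hand afterwards, replacing each zero with the cost for that feature combination.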

pyorps.core.cost_assumptions.calculate_geometry_area(geometries)

Calculate the sum of areas for a collection of geometries.

Parameters:

geometries (GeoSeries) – Collection of geometry objects

Returns:

Sum of areas of all geometries with area attribute

Return type:

float
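The phrase "geometries with area attribute" suggests that objects lacking an `area` attribute are skipped rather than raising an error. A minimal sketch of that behaviour follows, using hypothetical stand-in classes (`Square`, `Marker`) instead of real geometry objects; the helper `total_area` is likewise invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Square:
    """Hypothetical stand-in for a polygon geometry."""
    side: float

    @property
    def area(self):
        return self.side ** 2


class Marker:
    """Hypothetical stand-in for an object without an `area` attribute."""


def total_area(geometries):
    """Sum the areas of all objects that expose an `area` attribute."""
    return float(sum(g.area for g in geometries if hasattr(g, "area")))


area = total_area([Square(2.0), Square(3.0), Marker()])
```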

pyorps.core.cost_assumptions.calculate_column_statistics(gdf, columns, max_features_per_column=50)

Calculate statistical properties of columns for feature selection.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
columns (list[str]) – list of column names to analyze
max_features_per_column (int) – Maximum number of unique values for a column to be considered categorical

Returns:

dictionary with column statistics

Raises:

ColumnAnalysisError – When column analysis fails unexpectedly

Return type:

dict[str, dict[str, Any]]

pyorps.core.cost_assumptions.calculate_entropy_score(column_name, col_stats)

Calculate combined entropy score for a column, weighing area entropy more heavily.

Parameters:

column_name (str) – Name of the column to calculate score for
col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Returns:

Combined entropy score

Return type:

float
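One way to read "combined entropy score, weighing area entropy more heavily" is as a weighted sum of two Shannon entropies: one over how often each value occurs, one over how much area each value covers. The sketch below illustrates that reading only. The weight of 0.7, the formula, and the helper names are assumptions, not the library's actual values, and the sketch works on raw values instead of the `col_stats` dictionary.

```python
import math
from collections import defaultdict


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a distribution given by non-negative weights."""
    weights = list(weights)
    total = sum(weights)
    if total <= 0:
        return 0.0
    probabilities = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def combined_entropy(values, areas, area_weight=0.7):
    """Weighted sum of area entropy and count entropy (the weight is an assumption)."""
    counts = defaultdict(int)
    area_sums = defaultdict(float)
    for value, area in zip(values, areas):
        counts[value] += 1
        area_sums[value] += area
    count_entropy = shannon_entropy(counts.values())
    area_entropy = shannon_entropy(area_sums.values())
    return area_weight * area_entropy + (1 - area_weight) * count_entropy


# Two values, each with the same count and the same total area.
balanced = combined_entropy(["a", "b", "a", "b"], [1.0, 1.0, 1.0, 1.0])
# A column with a single value carries no information.
constant = combined_entropy(["a", "a"], [1.0, 2.0])
```

Under this reading, a column whose values split the mapped area evenly scores higher than one dominated by a single value, which is what makes it a useful candidate for the main feature.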

pyorps.core.cost_assumptions.select_main_feature(col_stats)

Select the best main feature column based on statistics.

Parameters:

col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Returns:

Name of the best main feature column

Return type:

str