pyorps.core package

Submodules

pyorps.core.cost_assumptions module

class pyorps.core.cost_assumptions.Any(*args, **kwargs)[source]

Bases: object

Special type indicating an unconstrained type.

Any is compatible with every type.
Any is assumed to have all methods.
All values are assumed to be instances of Any.

Note that all the above statements are true from the point of view of static type checkers. At runtime, Any should not be used with instance checks.

class pyorps.core.cost_assumptions.CostAssumptions(source=None)[source]

Bases: object

A class for handling cost assumptions for rasterization.

This class handles:
- Loading cost assumptions from files (CSV, Excel, JSON) or generating them from a dictionary or a GeoDataFrame
- Mapping costs to features in a GeoDataFrame
- Managing hierarchical cost structures

_apply_nested_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on nested dictionary cost assumptions.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to update with cost values
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List containing a single column name for the secondary feature

Returns:

None (modifies gdf in-place)
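
As a rough illustration of what a nested lookup involves, the sketch below uses plain dictionaries in place of a GeoDataFrame. The layout `{main_value: {side_value: cost}}`, the `""` fallback entry, the column names `landuse` and `protection`, and the helper `nested_cost` are all assumptions made for this example; they are not taken from the pyorps implementation.

```python
from typing import Optional

# Hypothetical nested cost assumptions:
# main feature value -> side feature value -> cost
nested_costs = {
    "forest": {"protected": 500.0, "": 120.0},
    "road": {"": 20.0},
}


def nested_cost(row: dict, main_feature: str, side_feature: str) -> Optional[float]:
    """Look up the cost for one row; return None when no entry matches."""
    inner = nested_costs.get(row.get(main_feature))
    if inner is None:
        return None
    # Assumed convention: the "" entry is the default for unknown side values.
    return inner.get(row.get(side_feature, ""), inner.get(""))


rows = [
    {"landuse": "forest", "protection": "protected"},
    {"landuse": "forest", "protection": "none"},
    {"landuse": "lake", "protection": ""},
]
costs = [nested_cost(r, "landuse", "protection") for r in rows]
print(costs)
```

Rows without a matching main feature get no cost in this sketch; how the real method treats them is not stated in this reference.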

_apply_tuple_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on tuple keys in cost assumptions.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to update with cost values
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List of column names for secondary features

Returns:

None (modifies gdf in-place)
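
A tuple-keyed lookup can be pictured as follows. This is a minimal sketch with plain dictionaries, assuming the key is the main feature value followed by each side feature value in order; the data, the column names, and the helper `tuple_cost` are hypothetical.

```python
# Hypothetical tuple-keyed cost assumptions:
# (main value, side value 1, side value 2) -> cost
tuple_costs = {
    ("forest", "protected", "steep"): 900.0,
    ("forest", "protected", "flat"): 500.0,
    ("road", "none", "flat"): 20.0,
}


def tuple_cost(row, main_feature, side_features):
    """Build the key from the main column plus every side column, then look it up."""
    key = (row[main_feature], *(row[name] for name in side_features))
    return tuple_costs.get(key)


row = {"landuse": "forest", "protection": "protected", "slope": "steep"}
cost = tuple_cost(row, "landuse", ["protection", "slope"])
print(cost)
```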

static _convert_numeric_columns(df)[source]

Convert columns to numeric, handling different decimal separators.

Parameters:

df (DataFrame) – DataFrame with potential numeric columns that might use different decimal separators

Return type:

DataFrame

Returns:

DataFrame with properly converted numeric columns
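
To make the decimal-separator problem concrete, here is a small standalone sketch that parses individual strings. The heuristic (the separator that appears last is the decimal mark) and the helper name `parse_number` are assumptions for illustration; the method's actual conversion rules are not described in this reference.

```python
def parse_number(text: str) -> float:
    """Parse a numeric string that may use ',' or '.' as its decimal separator."""
    cleaned = text.strip().replace(" ", "")
    if "," in cleaned and "." in cleaned:
        # Both present: whichever separator appears last is taken as the decimal mark.
        if cleaned.rfind(",") > cleaned.rfind("."):
            cleaned = cleaned.replace(".", "").replace(",", ".")
        else:
            cleaned = cleaned.replace(",", "")
    else:
        # Only one kind present: a lone comma is treated as the decimal mark.
        cleaned = cleaned.replace(",", ".")
    return float(cleaned)


values = [parse_number(s) for s in ["12,5", "12.5", "1.234,5", "1,234.5"]]
print(values)
```

A string such as "1,234" is inherently ambiguous; this sketch reads it as 1.234, which may or may not match what a given data source intends.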

_load_csv_cost_assumptions(filepath)[source]

Load cost assumptions from a CSV file with auto-detection of encoding, delimiter, and decimal separator.

Parameters: filepath (str) – Path to the CSV file
Return type: dict
Returns: dictionary of cost assumptions
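
One way to auto-detect encoding and delimiter with only the standard library is sketched below. The candidate encodings, the candidate delimiters, and the helper `read_csv_guessing` are assumptions for this example and do not describe how pyorps performs its detection.

```python
import csv
import io


def read_csv_guessing(raw: bytes, encodings=("utf-8", "ISO-8859-1")):
    """Decode with the first encoding that works, then sniff the delimiter."""
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("none of the candidate encodings could decode the data")
    # Restrict the sniffer to plausible column separators.
    dialect = csv.Sniffer().sniff(text, delimiters=";,\t")
    rows = list(csv.reader(io.StringIO(text), dialect))
    return encoding, dialect.delimiter, rows


# "\u00df" is the letter sharp s, which is a single byte in ISO-8859-1
# and makes the bytes invalid as UTF-8.
raw = "landuse;cost\nStra\u00dfe;12,5\nWald;2,0\n".encode("ISO-8859-1")
encoding, delimiter, rows = read_csv_guessing(raw)
print(encoding, delimiter, rows)
```

The cost values are still strings with a comma decimal mark at this point, so a numeric conversion step would follow.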

_load_excel_cost_assumptions(filepath)[source]

Load cost assumptions from an Excel file, handling different decimal separators.

Parameters: filepath (str) – Path to the Excel file
Return type: dict
Returns: dictionary of cost assumptions

_load_json_cost_assumptions(filepath)[source]

Load cost assumptions from a JSON file with auto-detection of encoding.

Parameters: filepath (str) – Path to the JSON file
Return type: dict
Returns: dictionary of cost assumptions
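
For JSON, the standard library already detects UTF-8, UTF-16 and UTF-32 when `json.loads` is given bytes, so a fallback is only needed for legacy single-byte encodings. The sketch below assumes ISO-8859-1 as that fallback; the helper `load_json_bytes` is hypothetical and not part of pyorps.

```python
import json


def load_json_bytes(raw: bytes, fallback: str = "ISO-8859-1") -> dict:
    """Parse JSON bytes, retrying with a single-byte fallback encoding."""
    try:
        # json.loads accepts bytes and detects UTF-8/16/32 itself.
        return json.loads(raw)
    except UnicodeDecodeError:
        return json.loads(raw.decode(fallback))


# "\u00df" (sharp s) encoded as ISO-8859-1 is not valid UTF-8.
raw = '{"Stra\u00dfe": {"": 20.0}}'.encode("ISO-8859-1")
costs = load_json_bytes(raw)
print(costs)
```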

apply_to_geodataframe(gdf, main_feature=None, side_features=None)[source]

Apply cost assumptions to a GeoDataFrame.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to apply costs to
main_feature (Optional[str]) – Main feature column name
side_features (Optional[list[str]]) – list of side feature column names or single side feature name

Returns:

GeoDataFrame with ‘cost’ column added

convert_df_to_cost_dict(df)[source]

Convert a DataFrame to a nested dictionary for cost assumptions.

Parameters: df (DataFrame) – DataFrame containing cost assumptions with hierarchical structure
Return type: dict
Returns: dictionary of cost assumptions with nested structure based on DataFrame columns

Uses one numeric column for costs, and all other columns as a hierarchical index:
- The first column is the ‘main_feature’
- All additional columns are ‘side_features’
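
The column-to-hierarchy idea described above can be sketched without pandas by treating each table row as a dictionary. The column names, the sample values, and the helper `rows_to_nested` are assumptions for illustration only.

```python
def rows_to_nested(rows, feature_columns, cost_column):
    """Nest costs by the feature columns, with the first column outermost."""
    nested = {}
    for row in rows:
        level = nested
        # Walk (and create) one dictionary level per feature column except the last.
        for name in feature_columns[:-1]:
            level = level.setdefault(row[name], {})
        level[row[feature_columns[-1]]] = row[cost_column]
    return nested


rows = [
    {"landuse": "forest", "protection": "protected", "cost": 500.0},
    {"landuse": "forest", "protection": "none", "cost": 120.0},
    {"landuse": "road", "protection": "none", "cost": 20.0},
]
nested = rows_to_nested(rows, ["landuse", "protection"], "cost")
print(nested)
```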

cost_dict_to_df(cost_dict)[source]

Convert cost assumptions dictionary to DataFrame.

Parameters: cost_dict (dict) – Dictionary of cost assumptions
Return type: DataFrame
Returns: DataFrame representation of cost assumptions
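
Going the other way, a nested dictionary can be flattened into one record per cost value, which is the shape a table needs. This is a plain-Python sketch; the helper `flatten_costs` and the sample data are hypothetical, and the column layout the real method produces is not specified here.

```python
def flatten_costs(costs, prefix=()):
    """Yield (key path, cost) pairs from an arbitrarily nested cost dictionary."""
    for key, value in costs.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            yield from flatten_costs(value, path)
        else:
            yield path, value


nested = {"forest": {"protected": 500.0, "none": 120.0}, "road": {"none": 20.0}}
records = [(*path, cost) for path, cost in flatten_costs(nested)]
print(records)
```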

load(source)[source]

Load cost assumptions from a file or dictionary.

Parameters: source (Union[str, dict]) – Path to a file or a dictionary containing cost assumptions
Return type: dict
Returns: dictionary of cost assumptions
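
Since the source may be either a dictionary or a path to a CSV, Excel or JSON file, some form of dispatch is needed. The sketch below shows one plausible shape for it; the extension table and the helper `pick_loader` are assumptions and are not taken from the pyorps source.

```python
from pathlib import Path

# Assumed mapping from file extension to loader kind.
LOADERS = {".csv": "csv", ".xlsx": "excel", ".xls": "excel", ".json": "json"}


def pick_loader(source):
    """Return which kind of loader a source would need."""
    if isinstance(source, dict):
        return "dict"
    suffix = Path(source).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError(f"unsupported cost assumption source: {source!r}")
    return LOADERS[suffix]


print(pick_loader({"forest": 120.0}))
print(pick_loader("costs.CSV"))
```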

to_csv(filepath, separator=';', decimal='.', encoding='ISO-8859-1')[source]

Save the cost assumptions to a CSV file.

Parameters:

filepath (str) – Path where to save the CSV file
separator (str) – Column separator character (default is ‘;’)
decimal (str) – Decimal separator character (default is ‘.’)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None
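
The interplay of `separator` and `decimal` can be seen in a small standard-library sketch that writes records to text. The record layout and the helper `write_costs` are assumptions for this example, not the method's implementation.

```python
import csv
import io


def write_costs(records, separator=";", decimal="."):
    """Render (feature..., cost) records as delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=separator, lineterminator="\n")
    for *features, cost in records:
        # Swap the decimal point for the requested decimal mark.
        writer.writerow([*features, str(cost).replace(".", decimal)])
    return buffer.getvalue()


text = write_costs(
    [("forest", "protected", 500.0), ("road", "none", 20.5)],
    decimal=",",
)
print(text)
```

A semicolon separator keeps a comma decimal mark unambiguous, so no quoting is needed for the cost column.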

to_excel(filepath, sheet_name='CostAssumptions', index=False)[source]

Save the cost assumptions to an Excel file.

Parameters:

filepath (str) – Path where to save the Excel file
sheet_name (str) – Name of the worksheet (default is ‘CostAssumptions’)
index (bool) – Whether to write row indices (default is False)

Return type:

None

to_json(filepath, indent=2, encoding='ISO-8859-1')[source]

Save the cost assumptions to a JSON file.

Parameters:

filepath (str) – Path where to save the JSON file
indent (int) – Number of spaces for indentation (default is 2)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None
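
The `indent` and `encoding` parameters correspond to familiar standard-library options. The round trip below is a sketch using `json.dump` directly on a temporary file; the sample dictionary is hypothetical, and `ensure_ascii=False` is an assumption chosen so that the file encoding actually matters.

```python
import json
import os
import tempfile

# "\u00df" is the letter sharp s, representable in ISO-8859-1.
costs = {"Stra\u00dfe": {"": 20.0}}
path = os.path.join(tempfile.mkdtemp(), "costs.json")

with open(path, "w", encoding="ISO-8859-1") as handle:
    json.dump(costs, handle, indent=2, ensure_ascii=False)

# Reading back requires the same encoding that was used for writing.
with open(path, encoding="ISO-8859-1") as handle:
    restored = json.load(handle)
print(restored == costs)
```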

exception pyorps.core.cost_assumptions.FileLoadError[source]

Bases: CostAssumptionsError

Exception raised when loading files fails.

exception pyorps.core.cost_assumptions.FormatError[source]

Bases: CostAssumptionsError

Exception raised when data format is invalid.
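
Because both exceptions share the base class CostAssumptionsError, a caller can handle them together or separately. The classes below are local stand-ins that only mirror the hierarchy listed here (that the base derives from `Exception` is an assumption); in real code they would be imported from pyorps.core.cost_assumptions.

```python
class CostAssumptionsError(Exception):
    """Stand-in for the package's base exception (assumed to derive from Exception)."""


class FileLoadError(CostAssumptionsError):
    """Stand-in: raised when loading files fails."""


class FormatError(CostAssumptionsError):
    """Stand-in: raised when data format is invalid."""


def describe(error: Exception) -> str:
    """Show that catching the base class covers both subclasses."""
    try:
        raise error
    except CostAssumptionsError as exc:
        return f"{type(exc).__name__}: {exc}"


print(describe(FileLoadError("missing file")))
print(describe(FormatError("no numeric column")))
```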

class pyorps.core.cost_assumptions.GeoDataFrame(data=None, *args, geometry=None, crs=None, **kwargs)[source]

Bases: GeoPandasBase, DataFrame

A GeoDataFrame object is a pandas.DataFrame that has one or more columns containing geometry. In addition to the standard DataFrame constructor arguments, GeoDataFrame also accepts the following keyword arguments:

Parameters:

crs (value (optional)) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
geometry (str or array-like (optional)) –
Value to use as the active geometry column. If str, treated as column name to use. If array-like, it will be added as new column named ‘geometry’ on the GeoDataFrame and set as the active geometry column.

Note that if geometry is a (Geo)Series with a name, the name will not be used; a column named “geometry” will still be added. To preserve the name, you can use rename_geometry() to update the geometry column name.

Examples

Constructing GeoDataFrame from a dictionary.

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Notice that the inferred dtype of ‘geometry’ columns is geometry.

>>> gdf.dtypes
col1          object
geometry    geometry
dtype: object

Constructing GeoDataFrame from a pandas DataFrame with a column of WKT geometries:

>>> import pandas as pd
>>> d = {'col1': ['name1', 'name2'], 'wkt': ['POINT (1 2)', 'POINT (2 1)']}
>>> df = pd.DataFrame(d)
>>> gs = geopandas.GeoSeries.from_wkt(df['wkt'])
>>> gdf = geopandas.GeoDataFrame(df, geometry=gs, crs="EPSG:4326")
>>> gdf
    col1          wkt     geometry
0  name1  POINT (1 2)  POINT (1 2)
1  name2  POINT (2 1)  POINT (2 1)

See also

GeoSeries: Series object designed to store shapely geometry objects

property _constructor: Used when a manipulation result has the same dimensions as the original.

_constructor_from_mgr(mgr, axes)[source]

property _constructor_sliced

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:

data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.
index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.
dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.
name (Hashable, default None) – The name to give to the Series.
copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this, the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.

_constructor_sliced_from_mgr(mgr, axes)[source]

_geometry_column_name = None

_get_geometry()[source]

_internal_names: list[str] = ['_mgr', '_cacher', '_item_cache', '_cache', '_is_copy', '_name', '_metadata', '_flags', 'geometry']

_internal_names_set: set[str] = {'_cache', '_cacher', '_flags', '_is_copy', '_item_cache', '_metadata', '_mgr', '_name', 'geometry'}

_metadata: list[str] = ['_geometry_column_name']

_persist_old_default_geometry_colname()[source]: Internal util to temporarily persist the default geometry column name of ‘geometry’ for backwards compatibility.

_set_geometry(col)[source]

property active_geometry_name

Return the name of the active geometry column

Returns a string name if a GeoDataFrame has an active geometry column set. Otherwise returns None. You can also access the active geometry column using the .geometry property. You can set a GeoSeries to be an active geometry using the set_geometry() method.

Returns: name of an active geometry column or None
Return type: str

See also

GeoDataFrame.set_geometry: set the active geometry

apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)[source]

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

Parameters:

data (ndarray (structured or homogeneous), Iterable, dict, or DataFrame) –
Dict can contain Series, arrays, constants, dataclass or list-like objects. If data is a dict, column order follows insertion-order. If a dict contains Series which have an index defined, it is aligned by its index. This alignment also occurs if data is a Series or a DataFrame itself. Alignment is done on Series/DataFrame inputs.

If data is a list of dicts, column order follows insertion-order.
index (Index or array-like) – Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided.
columns (Index or array-like) – Column labels to use for resulting frame when data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels, will perform column selection instead.
dtype (dtype, default None) – Data type to force. Only a single dtype is allowed. If None, infer.
copy (bool or None, default None) –
Copy data from inputs. For dict data, the default of None behaves like copy=True. For DataFrame or 2d ndarray input, the default of None behaves like copy=False. If data is a dict containing one or more Series (possibly of different dtypes), copy=False will ensure that these inputs are not copied.

Changed in version 1.3.0.

See also

DataFrame.from_records: Constructor from tuples, also record arrays.
DataFrame.from_dict: From dicts of Series, arrays, or dicts.
read_csv: Read a comma-separated values (csv) file into DataFrame.
read_table: Read general delimited file into DataFrame.
read_clipboard: Read text from clipboard into DataFrame.

Notes

Please reference the User Guide for more information.

Examples

Constructing DataFrame from a dictionary.

>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> df
   col1  col2
0     1     3
1     2     4

Notice that the inferred dtype is int64.

>>> df.dtypes
col1    int64
col2    int64
dtype: object

To enforce a single dtype:

>>> df = pd.DataFrame(data=d, dtype=np.int8)
>>> df.dtypes
col1    int8
col2    int8
dtype: object

Constructing DataFrame from a dictionary including Series:

>>> d = {'col1': [0, 1, 2, 3], 'col2': pd.Series([2, 3], index=[2, 3])}
>>> pd.DataFrame(data=d, index=[0, 1, 2, 3])
   col1  col2
0     0   NaN
1     1   NaN
2     2   2.0
3     3   3.0

Constructing DataFrame from numpy ndarray:

>>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
...                    columns=['a', 'b', 'c'])
>>> df2
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

Constructing DataFrame from a numpy ndarray that has labeled columns:

>>> data = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
...                 dtype=[("a", "i4"), ("b", "i4"), ("c", "i4")])
>>> df3 = pd.DataFrame(data, columns=['c', 'a'])
...
>>> df3
   c  a
0  3  1
1  6  4
2  9  7

Constructing DataFrame from dataclass:

>>> from dataclasses import make_dataclass
>>> Point = make_dataclass("Point", [("x", int), ("y", int)])
>>> pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])
   x  y
0  0  0
1  0  3
2  2  3

Constructing DataFrame from Series/DataFrame:

>>> ser = pd.Series([1, 2, 3], index=["a", "b", "c"])
>>> df = pd.DataFrame(data=ser, index=["a", "c"])
>>> df
   0
a  1
c  3

>>> df1 = pd.DataFrame([1, 2, 3], index=["a", "b", "c"], columns=["x"])
>>> df2 = pd.DataFrame(data=df1, index=["a", "c"])
>>> df2
   x
a  1
c  3

astype(dtype, copy=None, errors='raise', **kwargs)[source]

Cast a pandas object to a specified dtype.

Returns a GeoDataFrame when the geometry column is kept as geometries, otherwise returns a pandas DataFrame. See the pandas.DataFrame.astype docstring for more details.

Return type:

GeoDataFrame or DataFrame

clip(mask, keep_geom_type=False, sort=False)[source]

Clip points, lines, or polygon geometries to the mask extent.

Both layers must be in the same Coordinate Reference System (CRS). The GeoDataFrame will be clipped to the full extent of the mask object.

If there are multiple polygons in mask, data from the GeoDataFrame will be clipped to the total boundary of all polygons in mask.

Parameters:

mask (GeoDataFrame, GeoSeries, (Multi)Polygon, list-like) – Polygon vector layer used to clip the GeoDataFrame. The mask’s geometry is dissolved into one geometric feature and intersected with GeoDataFrame. If the mask is list-like with four elements (minx, miny, maxx, maxy), clip will use a faster rectangle clipping (clip_by_rect()), possibly leading to slightly different results.
keep_geom_type (boolean, default False) – If True, return only geometries of original type in case of intersection resulting in multiple geometry types or GeometryCollections. If False, return all resulting geometries (potentially mixed types).
sort (boolean, default False) – If True, the order of rows in the clipped GeoDataFrame will be preserved at small performance cost. If False the order of rows in the clipped GeoDataFrame will be random.

Returns:

Vector data (points, lines, polygons) from the GeoDataFrame clipped to polygon boundary from mask.

Return type:

GeoDataFrame

See also

clip: equivalent top-level function

Examples

Clip points (grocery stores) with polygons (the Near West Side community):

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> near_west_side = chicago[chicago["community"] == "NEAR WEST SIDE"]
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)
>>> groceries.shape
(148, 8)

>>> nws_groceries = groceries.clip(near_west_side)
>>> nws_groceries.shape
(7, 8)

copy(deep=True)[source]

property crs

The Coordinate Reference System (CRS) represented as a pyproj.CRS object.

Returns a pyproj.CRS, or None if the CRS is not set. When setting, the value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. “EPSG:4326”) or a WKT string.

Examples

>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame.set_crs: assign CRS
GeoDataFrame.to_crs: re-project to another CRS

dissolve(by=None, aggfunc='first', as_index=True, level=None, sort=True, observed=False, dropna=True, method='unary', **kwargs)[source]

Dissolve geometries within groupby into single observation. This is accomplished by applying the union_all method to all geometries within a group.

Observations associated with each groupby group will be aggregated using the aggfunc.

Parameters:

by (str or list-like, default None) – Column(s) whose values define the groups to be dissolved. If None, the entire GeoDataFrame is considered as a single group. If a list-like object is provided, the values in the list are treated as categorical labels, and polygons will be combined based on the equality of these categorical labels.
aggfunc (function or string, default "first") –
Aggregation function for manipulation of data associated with each group. Passed to pandas groupby.agg method. Accepted combinations are:
- function
- string function name
- list of functions and/or function names, e.g. [np.sum, ‘mean’]
- dict of axis labels -> functions, function names or list of such.
as_index (boolean, default True) – If true, groupby columns become index of result.
level (int or str or sequence of int or sequence of str, default None) – If the axis is a MultiIndex (hierarchical), group by a particular level or levels.
sort (bool, default True) – Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. Groupby preserves the order of rows within each group.
observed (bool, default False) – This only applies if any of the groupers are Categoricals. If True: only show observed values for categorical groupers. If False: show all values for categorical groupers.
dropna (bool, default True) – If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.
method (str (default "unary")) –
The method to use for the union. Options are:
- "unary": use the unary union algorithm. This option is the most robust but can be slow for large numbers of geometries (default).
- "coverage": use the coverage union algorithm. This option is optimized for non-overlapping polygons and can be significantly faster than the unary union algorithm. However, it can produce invalid geometries if the polygons overlap.
**kwargs –
Keyword arguments to be passed to the pandas DataFrameGroupby.agg method which is used by dissolve. In particular, numeric_only may be supplied, which will be required in pandas 2.0 for certain aggfuncs.

Added in version 0.13.0.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Point
>>> d = {
...     "col1": ["name1", "name2", "name1"],
...     "geometry": [Point(1, 2), Point(2, 1), Point(0, 1)],
... }
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
2  name1  POINT (0 1)

>>> dissolved = gdf.dissolve('col1')
>>> dissolved
                        geometry
col1
name1  MULTIPOINT ((0 1), (1 2))
name2                POINT (2 1)

See also

GeoDataFrame.explode: explode multi-part geometries into single geometries

estimate_utm_crs(datum_name='WGS 84')[source]

Returns the estimated UTM CRS based on the bounds of the dataset.

Added in version 0.9.

Parameters: datum_name (str, optional) – The name of the datum to use in the query. Default is WGS 84.
Return type: pyproj.CRS

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.estimate_utm_crs()
<Derived Projected CRS: EPSG:32616>
Name: WGS 84 / UTM zone 16N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 90°W and 84°W, northern hemisphere between equator and 84°N...
- bounds: (-90.0, 0.0, -84.0, 84.0)
Coordinate Operation:
- name: UTM zone 16N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

explode(column=None, ignore_index=False, index_parts=False, **kwargs)[source]

Explode multi-part geometries into multiple single geometries.

Each row containing a multi-part geometry will be split into multiple rows with single geometries, thereby increasing the vertical size of the GeoDataFrame.

Parameters:

column (string, default None) – Column to explode. In the case of a geometry column, multi-part geometries are converted to single-part. If None, the active geometry column is used.
ignore_index (bool, default False) – If True, the resulting index will be labelled 0, 1, …, n - 1, ignoring index_parts.
index_parts (boolean, default False) – If True, the resulting index will be a multi-index (original index with an additional level indicating the multiple geometries: a new zero-based index for each single part geometry per multi-part geometry).

Returns:

Exploded geodataframe with each single geometry as a separate entry in the geodataframe.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import MultiPoint
>>> d = {
...     "col1": ["name1", "name2"],
...     "geometry": [
...         MultiPoint([(1, 2), (3, 4)]),
...         MultiPoint([(2, 1), (0, 0)]),
...     ],
... }
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1                   geometry
0  name1  MULTIPOINT ((1 2), (3 4))
1  name2  MULTIPOINT ((2 1), (0 0))

>>> exploded = gdf.explode(index_parts=True)
>>> exploded
      col1     geometry
0 0  name1  POINT (1 2)
  1  name1  POINT (3 4)
1 0  name2  POINT (2 1)
  1  name2  POINT (0 0)

>>> exploded = gdf.explode(index_parts=False)
>>> exploded
    col1     geometry
0  name1  POINT (1 2)
0  name1  POINT (3 4)
1  name2  POINT (2 1)
1  name2  POINT (0 0)

>>> exploded = gdf.explode(ignore_index=True)
>>> exploded
    col1     geometry
0  name1  POINT (1 2)
1  name1  POINT (3 4)
2  name2  POINT (2 1)
3  name2  POINT (0 0)

See also

GeoDataFrame.dissolve: dissolve geometries into a single observation.

explore(*args, **kwargs)[source]

Interactive map based on GeoPandas and folium/leaflet.js

Generate an interactive leaflet map based on GeoDataFrame

Parameters:

column (str, np.array, pd.Series (default None)) – The name of the dataframe column, numpy.array, or pandas.Series to be plotted. If numpy.array or pandas.Series are used then it must have same length as dataframe.
cmap (str, matplotlib.Colormap, branca.colormap or function (default None)) –
The name of a colormap recognized by matplotlib, a list-like of colors, matplotlib.colors.Colormap, a branca.colormap.ColorMap or function that returns a named color or hex based on the column value, e.g.:
```
def my_colormap(value):  # scalar value defined in 'column'
    if value > 1:
        return "green"
    return "red"
```
color (str, array-like (default None)) – Named color or a list-like of colors (named or hex).
m (folium.Map (default None)) – Existing map instance on which to draw the plot.
tiles (str, xyzservices.TileProvider (default 'OpenStreetMap Mapnik')) –
Map tileset to use. Can choose from the list supported by folium, query a xyzservices.TileProvider by a name from xyzservices.providers, pass xyzservices.TileProvider object or pass custom XYZ URL. The current list of built-in providers (when xyzservices is not available):

["OpenStreetMap", "CartoDB positron", "CartoDB dark_matter"]

You can pass a custom tileset to Folium by passing a Leaflet-style URL to the tiles parameter: http://{s}.yourtiles.com/{z}/{x}/{y}.png. Be sure to check their terms and conditions and to provide attribution with the attr keyword.
attr (str (default None)) – Map tile attribution; only required if passing custom tile URL.
tooltip (bool, str, int, list (default True)) – Display GeoDataFrame attributes when hovering over the object. True includes all columns. False removes tooltip. Pass string or list of strings to specify a column(s). Integer specifies first n columns to be included. Defaults to True.
popup (bool, str, int, list (default False)) – Input GeoDataFrame attributes for object displayed when clicking. True includes all columns. False removes popup. Pass string or list of strings to specify a column(s). Integer specifies first n columns to be included. Defaults to False.
highlight (bool (default True)) – Enable highlight functionality when hovering over a geometry.
categorical (bool (default False)) – If False, cmap will reflect numerical values of the column being plotted. For non-numerical columns, this will be set to True.
legend (bool (default True)) – Plot a legend in choropleth plots. Ignored if no column is given.
scheme (str (default None)) – Name of a choropleth classification scheme (requires mapclassify >= 2.4.0). A mapclassify.classify() will be used under the hood. Supported are all schemes provided by mapclassify (e.g. 'BoxPlot', 'EqualInterval', 'FisherJenks', 'FisherJenksSampled', 'HeadTailBreaks', 'JenksCaspall', 'JenksCaspallForced', 'JenksCaspallSampled', 'MaxP', 'MaximumBreaks', 'NaturalBreaks', 'Quantiles', 'Percentiles', 'StdMean', 'UserDefined'). Arguments can be passed in classification_kwds.
k (int (default 5)) – Number of classes
vmin (None or float (default None)) – Minimum value of cmap. If None, the minimum data value in the column to be plotted is used.
vmax (None or float (default None)) – Maximum value of cmap. If None, the maximum data value in the column to be plotted is used.
width (pixel int or percentage string (default: '100%')) – Width of the folium Map. If the argument m is given explicitly, width is ignored.
height (pixel int or percentage string (default: '100%')) – Height of the folium Map. If the argument m is given explicitly, height is ignored.
categories (list-like) – Ordered list-like object of categories to be used for categorical plot.
classification_kwds (dict (default None)) – Keyword arguments to pass to mapclassify
control_scale (bool, (default True)) – Whether to add a control scale on the map.
marker_type (str, folium.Circle, folium.CircleMarker, folium.Marker (default None)) – Allowed string options are (‘marker’, ‘circle’, ‘circle_marker’). Defaults to folium.CircleMarker.
marker_kwds (dict (default {})) –
Additional keywords to be passed to the selected marker_type, e.g.:

radius (float (default 2 for circle_marker and 50 for circle))
Radius of the circle, in meters (for circle) or pixels (for circle_marker).

fill (bool (default True))
Whether to fill the circle or circle_marker with color.

icon (folium.map.Icon)
The folium.map.Icon object to use to render the marker.

draggable (bool (default False))
Set to True to be able to drag the marker around the map.
style_kwds (dict (default {})) –
Additional style to be passed to folium style_function:

stroke (bool (default True))
Whether to draw stroke along the path. Set it to False to disable borders on polygons or circles.

color (str)
Stroke color

weight (int)
Stroke width in pixels

opacity (float (default 1.0))
Stroke opacity

fill (boolean (default True))
Whether to fill the path with color. Set it to False to disable filling on polygons or circles.

fillColor (str)
Fill color. Defaults to the value of the color option

fillOpacity (float (default 0.5))
Fill opacity.

style_function (callable)
Function mapping a GeoJson Feature to a style dict.
- Style properties: folium.vector_layers.path_options()
- GeoJson features: GeoDataFrame.__geo_interface__
e.g.:
lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6 else "blue"}
Plus all supported by folium.vector_layers.path_options(). See the documentation of folium.features.GeoJson for details.
highlight_kwds (dict (default {})) – Style to be passed to folium highlight_function. Uses the same keywords as style_kwds. When empty, defaults to {"fillOpacity": 0.75}.
tooltip_kwds (dict (default {})) – Additional keywords to be passed to folium.features.GeoJsonTooltip, e.g. aliases, labels, or sticky.
popup_kwds (dict (default {})) – Additional keywords to be passed to folium.features.GeoJsonPopup, e.g. aliases or labels.
legend_kwds (dict (default {})) –
Additional keywords to be passed to the legend.

Currently supported customisation:

caption (string)
Custom caption of the legend. Defaults to the column name.

Additional accepted keywords when scheme is specified:

colorbar (bool (default True))
An option to control the style of the legend. If True, continuous colorbar will be used. If False, categorical legend will be used for bins.

scale (bool (default True))
Scale bins along the colorbar axis according to the bin edges (True) or use the equal length for each bin (False)

fmt (string (default "{:.2f}"))
A formatting specification for the bin edges of the classes in the legend. For example, to have no decimals: {"fmt": "{:.0f}"}. Applies if colorbar=False.

labels (list-like)
A list of legend labels to override the auto-generated labels. Needs to have the same number of elements as the number of classes (k). Applies if colorbar=False.

interval (boolean (default False))
An option to control brackets from mapclassify legend. If True, open/closed interval brackets are shown in the legend. Applies if colorbar=False.

max_labels (int, default 10)
Maximum number of colorbar tick labels (requires branca>=0.5.0)
map_kwds (dict (default {})) – Additional keywords to be passed to folium Map, e.g. dragging, or scrollWheelZoom.

**kwargs (dict) – Additional options to be passed on to the folium object.

Returns:

m – folium Map instance

Return type:

folium.folium.Map

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.head(2)
   ComAreaID  ...                                           geometry
0         35  ...  POLYGON ((-87.60914 41.84469, -87.60915 41.844...
1         36  ...  POLYGON ((-87.59215 41.81693, -87.59231 41.816...

[2 rows x 87 columns]

>>> df.explore("Pop2012", cmap="Blues")
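The style_function entry of style_kwds receives one GeoJSON feature at a time, so it can be checked without rendering a map. The following is a minimal sketch under stated assumptions: the population column, its values, and the threshold of 1000 are invented for illustration, and the final explore() call is left as a comment because it needs folium installed.

```python
import geopandas
from shapely.geometry import Point

# Hypothetical data: a "population" column invented for this illustration.
gdf = geopandas.GeoDataFrame(
    {"population": [500, 5000]},
    geometry=[Point(1, 2), Point(2, 1)],
    crs="EPSG:4326",
)


def style_function(feature):
    """Map one GeoJSON feature to a style dict (threshold is an assumption)."""
    big = feature["properties"]["population"] >= 1000
    return {"color": "red" if big else "blue", "weight": 2}


# GeoDataFrame.__geo_interface__ exposes the same feature dicts that the
# style function would receive.
styles = [style_function(f) for f in gdf.__geo_interface__["features"]]
print(styles)

# With folium available, the callable could then be passed as:
# gdf.explore(style_kwds={"style_function": style_function})
```

Checking the callable against __geo_interface__ first makes it easier to catch a misspelled property name before the map is built.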

classmethod from_arrow(table, geometry=None)[source]

Construct a GeoDataFrame from an Arrow table object based on GeoArrow extension types.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function accepts any tabular Arrow object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ or __arrow_c_stream__ method).

Added in version 1.0.

Parameters:

table (pyarrow.Table or Arrow-compatible table) – Any tabular object implementing the Arrow PyCapsule Protocol (i.e. has an __arrow_c_array__ or __arrow_c_stream__ method). This table should have at least one column with a geoarrow geometry type.
geometry (str, default None) – The name of the geometry column to set as the active geometry column. If None, the first geometry column found will be used.

Return type:

GeoDataFrame

classmethod from_dict(data, geometry=None, crs=None, **kwargs)[source]

Construct GeoDataFrame from dict of array-like or dicts by overriding DataFrame.from_dict method with geometry and crs

Parameters:

data (dict) – Of the form {field : array-like} or {field : dict}.
geometry (str or array (optional)) – If str, column to use as geometry. If array, will be set as ‘geometry’ column on GeoDataFrame.
crs (str or dict (optional)) – Coordinate reference system to set on the resulting frame.
kwargs (key-word arguments) – These arguments are passed to DataFrame.from_dict

Return type:

GeoDataFrame
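A short sketch of from_dict with the geometry passed as an array and an explicit crs. The field name and point coordinates are sample values chosen for illustration.

```python
import geopandas
from shapely.geometry import Point

data = {"col1": ["name1", "name2"]}

# The array passed as geometry becomes the "geometry" column.
gdf = geopandas.GeoDataFrame.from_dict(
    data,
    geometry=[Point(1, 2), Point(2, 1)],
    crs="EPSG:4326",
)

print(gdf.geometry.name)
print(gdf.crs.to_epsg())
```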

classmethod from_features(features, crs=None, columns=None)[source]

Alternate constructor to create GeoDataFrame from an iterable of features or a feature collection.

Parameters:

features –
- Iterable of features, where each element must be a feature dictionary or implement the __geo_interface__.
- Feature collection, where the ‘features’ key contains an iterable of features.
- Object holding a feature collection that implements the __geo_interface__.
crs (str or dict (optional)) – Coordinate reference system to set on the resulting frame.
columns (list of column names, optional) – Optionally specify the column names to include in the output frame. This does not overwrite the property names of the input, but can ensure a consistent output format.

Return type:

GeoDataFrame

Notes

For more information about the __geo_interface__, see https://gist.github.com/sgillies/2217756

Examples

>>> feature_coll = {
...     "type": "FeatureCollection",
...     "features": [
...         {
...             "id": "0",
...             "type": "Feature",
...             "properties": {"col1": "name1"},
...             "geometry": {"type": "Point", "coordinates": (1.0, 2.0)},
...             "bbox": (1.0, 2.0, 1.0, 2.0),
...         },
...         {
...             "id": "1",
...             "type": "Feature",
...             "properties": {"col1": "name2"},
...             "geometry": {"type": "Point", "coordinates": (2.0, 1.0)},
...             "bbox": (2.0, 1.0, 2.0, 1.0),
...         },
...     ],
...     "bbox": (1.0, 1.0, 2.0, 2.0),
... }
>>> df = geopandas.GeoDataFrame.from_features(feature_coll)
>>> df
      geometry   col1
0  POINT (1 2)  name1
1  POINT (2 1)  name2
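The columns argument is easiest to see when features carry different properties. The sketch below assumes two hand-written features, one of which lacks the hypothetical height property; listing the columns explicitly gives both rows the same layout.

```python
import geopandas

# Hypothetical features: the second one has no "height" property.
features = [
    {
        "type": "Feature",
        "properties": {"name": "a", "height": 3.5},
        "geometry": {"type": "Point", "coordinates": (1.0, 2.0)},
    },
    {
        "type": "Feature",
        "properties": {"name": "b"},
        "geometry": {"type": "Point", "coordinates": (2.0, 1.0)},
    },
]

gdf = geopandas.GeoDataFrame.from_features(
    features,
    crs="EPSG:4326",
    columns=["geometry", "name", "height"],
)

print(list(gdf.columns))
print(gdf["height"].isna().tolist())
```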

classmethod from_file(filename, **kwargs)[source]

Alternate constructor to create a GeoDataFrame from a file.

It is recommended to use geopandas.read_file() instead.

Can load a GeoDataFrame from a file in any format recognized by pyogrio. See http://pyogrio.readthedocs.io/ for details.

Parameters:

filename (str) – File path or file handle to read from. Depending on which kwargs are included, the content of filename may vary. See pyogrio.read_dataframe() for usage details.
kwargs (key-word arguments) – These arguments are passed to pyogrio.read_dataframe(), and can be used to access multi-layer data, data stored within archives (zip files), etc.

Examples

>>> import geodatasets
>>> path = geodatasets.get_path('nybb')
>>> gdf = geopandas.GeoDataFrame.from_file(path)
>>> gdf
   BoroCode       BoroName     Shape_Leng    Shape_Area                                           geometry
0         5  Staten Island  330470.010332  1.623820e+09  MULTIPOLYGON (((970217.022 145643.332, 970227....
1         4         Queens  896344.047763  3.045213e+09  MULTIPOLYGON (((1029606.077 156073.814, 102957...
2         3       Brooklyn  741080.523166  1.937479e+09  MULTIPOLYGON (((1021176.479 151374.797, 102100...
3         1      Manhattan  359299.096471  6.364715e+08  MULTIPOLYGON (((981219.056 188655.316, 980940....
4         2          Bronx  464392.991824  1.186925e+09  MULTIPOLYGON (((1012821.806 229228.265, 101278...

The recommended method of reading files is geopandas.read_file():

>>> gdf = geopandas.read_file(path)

See also

read_file: read file to GeoDataFrame
GeoDataFrame.to_file: write GeoDataFrame to file

classmethod from_postgis(sql, con, geom_col='geom', crs=None, index_col=None, coerce_float=True, parse_dates=None, params=None, chunksize=None)[source]

Alternate constructor to create a GeoDataFrame from a SQL query containing a geometry column in WKB representation.

Parameters:

sql (string)
con (sqlalchemy.engine.Connection or sqlalchemy.engine.Engine)
geom_col (string, default 'geom') – column name to convert to shapely geometries
crs (optional) – Coordinate reference system to use for the returned GeoDataFrame
index_col (string or list of strings, optional, default: None) – Column(s) to set as index (MultiIndex)
coerce_float (boolean, default True) – Attempt to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets
parse_dates (list or dict, default None) –
- List of column names to parse as dates.
- Dict of {column_name: format string} where format string is strftime compatible in case of parsing string times, or is one of (D, s, ns, ms, us) in case of parsing integer timestamps.
- Dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). Especially useful with databases without native Datetime support, such as SQLite.
params (list, tuple or dict, optional, default None) – List of parameters to pass to execute method.
chunksize (int, default None) – If specified, return an iterator where chunksize is the number of rows to include in each chunk.

Examples

PostGIS

>>> from sqlalchemy import create_engine
>>> db_connection_url = "postgresql://myusername:mypassword@myhost:5432/mydb"
>>> con = create_engine(db_connection_url)
>>> sql = "SELECT geom, highway FROM roads"
>>> df = geopandas.GeoDataFrame.from_postgis(sql, con)

SpatiaLite

>>> sql = "SELECT ST_Binary(geom) AS geom, highway FROM roads"
>>> df = geopandas.GeoDataFrame.from_postgis(sql, con)

The recommended method of reading from PostGIS is geopandas.read_postgis():

>>> df = geopandas.read_postgis(sql, con)

See also

geopandas.read_postgis: read PostGIS database to GeoDataFrame

property geometry: Geometry data for GeoDataFrame

iterfeatures(na='null', show_bbox=False, drop_id=False)[source]

Returns an iterator that yields feature dictionaries that comply with __geo_interface__

Parameters:

na (str, optional) –
Options are {‘null’, ‘drop’, ‘keep’}, default ‘null’. Indicates how to output missing (NaN) values in the GeoDataFrame
- null: output the missing entries as JSON null
- drop: remove the property from the feature. This applies to each feature individually so that features may have different properties
- keep: output the missing entries as NaN
show_bbox (bool, optional) – Include bbox (bounds) in the geojson. Default False.
drop_id (bool, default: False) – Whether to omit the index of the GeoDataFrame from the generated GeoJSON instead of writing it as the id property. The default of False keeps the id; True may be preferable if the index is just arbitrary row numbers.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> feature = next(gdf.iterfeatures())
>>> feature
{'id': '0', 'type': 'Feature', 'properties': {'col1': 'name1'}, 'geometry': {'type': 'Point', 'coordinates': (1.0, 2.0)}}
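To make the na and drop_id options concrete, here is a small sketch with one missing attribute value. The data are sample values; the second row deliberately has no col1 entry.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1", None], "geometry": [Point(1, 2), Point(2, 1)]},
    crs="EPSG:4326",
)

# na="null" keeps the property and writes the missing value as None.
with_null = list(gdf.iterfeatures(na="null"))

# na="drop" removes the missing property; drop_id=True omits the "id" key.
dropped = list(gdf.iterfeatures(na="drop", drop_id=True))

print(with_null[1]["properties"])
print(dropped[1]["properties"])
print("id" in dropped[0])
```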

overlay(right, how='intersection', keep_geom_type=None, make_valid=True)[source]

Perform spatial overlay between GeoDataFrames.

Currently only supports GeoDataFrames with uniform geometry types, i.e. containing only (Multi)Polygons, or only (Multi)Points, or a combination of (Multi)LineString and LinearRing shapes. Implements several methods that are all effectively subsets of the union.

See the User Guide page ../../user_guide/set_operations for details.

Parameters:

right (GeoDataFrame)
how (string) – Method of spatial overlay: ‘intersection’, ‘union’, ‘identity’, ‘symmetric_difference’ or ‘difference’.
keep_geom_type (bool) – If True, return only geometries of the same geometry type as the GeoDataFrame; if False, return all resulting geometries. Default is None, which will set keep_geom_type to True but warn upon dropping geometries.
make_valid (bool, default True) – If True, any invalid input geometries are corrected with a call to make_valid(); if False, a ValueError is raised if any input geometries are invalid.

Returns:

df – GeoDataFrame with new set of polygons and attributes resulting from the overlay

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Polygon
>>> polys1 = geopandas.GeoSeries([Polygon([(0,0), (2,0), (2,2), (0,2)]),
...                               Polygon([(2,2), (4,2), (4,4), (2,4)])])
>>> polys2 = geopandas.GeoSeries([Polygon([(1,1), (3,1), (3,3), (1,3)]),
...                               Polygon([(3,3), (5,3), (5,5), (3,5)])])
>>> df1 = geopandas.GeoDataFrame({'geometry': polys1, 'df1_data':[1,2]})
>>> df2 = geopandas.GeoDataFrame({'geometry': polys2, 'df2_data':[1,2]})

>>> df1.overlay(df2, how='union')
   df1_data  df2_data                                           geometry
     1.0       1.0                POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
     2.0       1.0                POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
     2.0       2.0                POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
     1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
     2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...
     NaN       1.0  MULTIPOLYGON (((2 3, 2 2, 1 2, 1 3, 2 3)), ((3...
     NaN       2.0      POLYGON ((3 5, 5 5, 5 3, 4 3, 4 4, 3 4, 3 5))

>>> df1.overlay(df2, how='intersection')
   df1_data  df2_data                             geometry
0         1         1  POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1         2         1  POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2         2         2  POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))

>>> df1.overlay(df2, how='symmetric_difference')
   df1_data  df2_data                                           geometry
0       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
1       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...
2       NaN       1.0  MULTIPOLYGON (((2 3, 2 2, 1 2, 1 3, 2 3)), ((3...
3       NaN       2.0      POLYGON ((3 5, 5 5, 5 3, 4 3, 4 4, 3 4, 3 5))

>>> df1.overlay(df2, how='difference')
                                            geometry  df1_data
0      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))         1
1  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...         2

>>> df1.overlay(df2, how='identity')
   df1_data  df2_data                                           geometry
     1.0       1.0                POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
     2.0       1.0                POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
     2.0       2.0                POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
     1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
     2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...

See also

GeoDataFrame.sjoin: spatial join
overlay: equivalent top-level function

Notes

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.

plot: alias of GeoplotAccessor

rename_geometry(col, inplace=False)[source]

Renames the GeoDataFrame geometry column to the specified name. By default yields a new object.

The original geometry column is replaced with the input.

Parameters:

col (new geometry column label)
inplace (boolean, default False) – Modify the GeoDataFrame in place (do not create a new object)

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> df = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> df1 = df.rename_geometry('geom1')
>>> df1.geometry.name
'geom1'
>>> df.rename_geometry('geom1', inplace=True)
>>> df.geometry.name
'geom1'

Returns:

geodataframe

Return type:

GeoDataFrame

See also

GeoDataFrame.set_geometry: set the active geometry

set_crs(crs=None, epsg=None, inplace=False, allow_override=False)[source]

Set the Coordinate Reference System (CRS) of the GeoDataFrame.

If there are multiple geometry columns within the GeoDataFrame, only the CRS of the active geometry column is set.

Pass None to remove CRS from the active geometry column.

Notes

The underlying geometries are not transformed to this CRS. To transform the geometries to a new CRS, use the to_crs method.

Parameters:

crs (pyproj.CRS | None, optional) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
epsg (int, optional) – EPSG code specifying the projection.
inplace (bool, default False) – If True, the CRS of the GeoDataFrame will be changed in place (while still returning the result) instead of making a copy of the GeoDataFrame.
allow_override (bool, default False) – If the GeoDataFrame already has a CRS, allow replacing the existing CRS, even when the two are not equal.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Setting CRS to a GeoDataFrame without one:

>>> gdf.crs is None
True

>>> gdf = gdf.set_crs('epsg:3857')
>>> gdf.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Overriding existing CRS:

>>> gdf = gdf.set_crs(4326, allow_override=True)

Without allow_override=True, set_crs raises an error if you try to override an existing CRS.
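The difference between relabelling with set_crs and re-projecting with to_crs can be sketched as follows. The single sample point is arbitrary; the sketch assumes that the error raised on a conflicting CRS is a ValueError.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1"]}, geometry=[Point(1, 2)], crs="EPSG:4326"
)

# Assigning a different CRS without allow_override is rejected.
try:
    gdf.set_crs(3857)
    raised = False
except ValueError:
    raised = True

# set_crs only changes the CRS label; to_crs transforms the coordinates.
relabelled = gdf.set_crs(3857, allow_override=True)
reprojected = gdf.to_crs(3857)

print(raised)
print(relabelled.geometry.iloc[0])
print(reprojected.geometry.iloc[0])
```

Overriding is therefore only appropriate when the stored CRS label is known to be wrong; otherwise to_crs is the safer call.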

See also

GeoDataFrame.to_crs: re-project to another CRS

set_geometry(col, drop=None, inplace=False, crs=None)[source]

Set the GeoDataFrame geometry using either an existing column or the specified input. By default yields a new object.

The original geometry column is replaced with the input.

Parameters:

col (column label or array-like) – An existing column name or values to set as the new geometry column. If values (array-like, (Geo)Series) are passed, then if they are named (Series) the new geometry column will have the corresponding name, otherwise the existing geometry column will be replaced. If there is no existing geometry column, the new geometry column will use the default name “geometry”.
drop (boolean, default False) –
When specifying a named Series or an existing column name for col, controls if the previous geometry column should be dropped from the result. The default of False keeps both the old and new geometry column.

Deprecated since version 1.0.0.
inplace (boolean, default False) – Modify the GeoDataFrame in place (do not create a new object)
crs (pyproj.CRS, optional) – Coordinate system to use. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. If passed, overrides both DataFrame and col’s crs. Otherwise, tries to get crs from passed col values or DataFrame.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Passing an array:

>>> df1 = gdf.set_geometry([Point(0,0), Point(1,1)])
>>> df1
    col1     geometry
0  name1  POINT (0 0)
1  name2  POINT (1 1)

Using existing column:

>>> gdf["buffered"] = gdf.buffer(2)
>>> df2 = gdf.set_geometry("buffered")
>>> df2.geometry
0    POLYGON ((3 2, 2.99037 1.80397, 2.96157 1.6098...
1    POLYGON ((4 1, 3.99037 0.80397, 3.96157 0.6098...
Name: buffered, dtype: geometry

Return type:

GeoDataFrame

See also

GeoDataFrame.rename_geometry: rename an active geometry column

sjoin(df, *args, **kwargs)[source]

Spatial join of two GeoDataFrames.

See the User Guide page ../../user_guide/mergingdata for details.

Parameters:

df (GeoDataFrame)
how (string, default 'inner') –
The type of join:
- ’left’: use keys from left_df; retain only left_df geometry column
- ’right’: use keys from right_df; retain only right_df geometry column
- ’inner’: use intersection of keys from both dfs; retain only left_df geometry column
predicate (string, default 'intersects') – Binary predicate. Valid values are determined by the spatial index used. You can check the valid values in left_df or right_df as left_df.sindex.valid_query_predicates or right_df.sindex.valid_query_predicates
lsuffix (string, default 'left') – Suffix to apply to overlapping column names (left GeoDataFrame).
rsuffix (string, default 'right') – Suffix to apply to overlapping column names (right GeoDataFrame).
distance (number or array_like, optional) – Distance(s) around each input geometry within which to query the tree for the ‘dwithin’ predicate. If array_like, must be one-dimensional with length equal to length of left GeoDataFrame. Required if predicate='dwithin'.
on_attribute (string, list or tuple) – Column name(s) to join on as an additional join restriction on top of the spatial predicate. These must be found in both DataFrames. If set, observations are joined only if the predicate applies and values in specified columns match.

Examples

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_commpop")
... )
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)

>>> chicago.head()
         community  ...                                           geometry
        DOUGLAS  ...  MULTIPOLYGON (((-87.60914 41.84469, -87.60915 ...
        OAKLAND  ...  MULTIPOLYGON (((-87.59215 41.81693, -87.59231 ...
    FULLER PARK  ...  MULTIPOLYGON (((-87.62880 41.80189, -87.62879 ...
GRAND BOULEVARD  ...  MULTIPOLYGON (((-87.60671 41.81681, -87.60670 ...
        KENWOOD  ...  MULTIPOLYGON (((-87.59215 41.81693, -87.59215 ...

[5 rows x 9 columns]

>>> groceries.head()
   OBJECTID     Ycoord  ...  Category                           geometry
0        16  41.973266  ...       NaN  MULTIPOINT ((-87.65661 41.97321))
1        18  41.696367  ...       NaN  MULTIPOINT ((-87.68136 41.69713))
2        22  41.868634  ...       NaN  MULTIPOINT ((-87.63918 41.86847))
3        23  41.877590  ...       new  MULTIPOINT ((-87.65495 41.87783))
4        27  41.737696  ...       NaN  MULTIPOINT ((-87.62715 41.73623))
[5 rows x 8 columns]

>>> groceries_w_communities = groceries.sjoin(chicago)
>>> groceries_w_communities[["OBJECTID", "community", "geometry"]].head()
   OBJECTID       community                           geometry
0        16          UPTOWN  MULTIPOINT ((-87.65661 41.97321))
1        18     MORGAN PARK  MULTIPOINT ((-87.68136 41.69713))
2        22  NEAR WEST SIDE  MULTIPOINT ((-87.63918 41.86847))
3        23  NEAR WEST SIDE  MULTIPOINT ((-87.65495 41.87783))
4        27         CHATHAM  MULTIPOINT ((-87.62715 41.73623))
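For a version that runs without downloading a dataset, the sketch below joins three made-up points to two made-up square zones and contrasts the default inner join with how="left". All names and coordinates are assumptions chosen for illustration.

```python
import geopandas
from shapely.geometry import Point, Polygon

# Hypothetical zones: two adjacent squares.
zones = geopandas.GeoDataFrame(
    {"zone": ["A", "B"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(2, 0), (4, 0), (4, 2), (2, 2)]),
    ],
    crs="EPSG:3857",
)

# Hypothetical points: p3 lies outside both zones.
points = geopandas.GeoDataFrame(
    {"name": ["p1", "p2", "p3"]},
    geometry=[Point(1, 1), Point(3, 1), Point(5, 5)],
    crs="EPSG:3857",
)

# Inner join keeps only points that fall within a zone.
inner = points.sjoin(zones, predicate="within")

# Left join keeps every point; unmatched rows get missing zone values.
left = points.sjoin(zones, how="left", predicate="within")

print(inner[["name", "zone"]])
print(left[["name", "zone"]])
```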

Notes

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.

See also

GeoDataFrame.sjoin_nearest: nearest neighbor join
sjoin: equivalent top-level function

sjoin_nearest(right, how='inner', max_distance=None, lsuffix='left', rsuffix='right', distance_col=None, exclusive=False)[source]

Spatial join of two GeoDataFrames based on the distance between their geometries.

Results will include multiple output records for a single input record where there are multiple equidistant nearest or intersected neighbors.

See the User Guide page https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html for more details.

Parameters:

right (GeoDataFrame)
how (string, default 'inner') –
The type of join:
- ’left’: use keys from left_df; retain only left_df geometry column
- ’right’: use keys from right_df; retain only right_df geometry column
- ’inner’: use intersection of keys from both dfs; retain only left_df geometry column
max_distance (float, default None) – Maximum distance within which to query for nearest geometry. Must be greater than 0. The max_distance used to search for nearest items in the tree may have a significant impact on performance by reducing the number of input geometries that are evaluated for nearest items in the tree.
lsuffix (string, default 'left') – Suffix to apply to overlapping column names (left GeoDataFrame).
rsuffix (string, default 'right') – Suffix to apply to overlapping column names (right GeoDataFrame).
distance_col (string, default None) – If set, save the distances computed between matching geometries under a column of this name in the joined GeoDataFrame.
exclusive (bool, optional, default False) – If True, the nearest geometries that are equal to the input geometry will not be returned, default False. Requires Shapely >= 2.0

Examples

>>> import geodatasets
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... )
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... ).to_crs(groceries.crs)

>>> chicago.head()
   ComAreaID  ...                                           geometry
0         35  ...  POLYGON ((-87.60914 41.84469, -87.60915 41.844...
1         36  ...  POLYGON ((-87.59215 41.81693, -87.59231 41.816...
2         37  ...  POLYGON ((-87.62880 41.80189, -87.62879 41.801...
3         38  ...  POLYGON ((-87.60671 41.81681, -87.60670 41.816...
4         39  ...  POLYGON ((-87.59215 41.81693, -87.59215 41.816...
[5 rows x 87 columns]

>>> groceries.head()
   OBJECTID     Ycoord  ...  Category                           geometry
0        16  41.973266  ...       NaN  MULTIPOINT ((-87.65661 41.97321))
1        18  41.696367  ...       NaN  MULTIPOINT ((-87.68136 41.69713))
2        22  41.868634  ...       NaN  MULTIPOINT ((-87.63918 41.86847))
3        23  41.877590  ...       new  MULTIPOINT ((-87.65495 41.87783))
4        27  41.737696  ...       NaN  MULTIPOINT ((-87.62715 41.73623))
[5 rows x 8 columns]

>>> groceries_w_communities = groceries.sjoin_nearest(chicago)
>>> groceries_w_communities[["Chain", "community", "geometry"]].head(2)
               Chain    community                                geometry
0     VIET HOA PLAZA       UPTOWN   MULTIPOINT ((1168268.672 1933554.35))
1  COUNTY FAIR FOODS  MORGAN PARK  MULTIPOINT ((1162302.618 1832900.224))

To include the distances:

>>> groceries_w_communities = groceries.sjoin_nearest(chicago, distance_col="distances")
>>> groceries_w_communities[["Chain", "community", "distances"]].head(2)
               Chain    community  distances
0     VIET HOA PLAZA       UPTOWN        0.0
1  COUNTY FAIR FOODS  MORGAN PARK        0.0

In the following example, we get multiple groceries for Uptown because all results are equidistant (in this case zero because they intersect). In fact, we get 4 results in total:

>>> chicago_w_groceries = groceries.sjoin_nearest(chicago, distance_col="distances", how="right")
>>> uptown_results = chicago_w_groceries[chicago_w_groceries["community"] == "UPTOWN"]
>>> uptown_results[["Chain", "community"]]
            Chain community
30  VIET HOA PLAZA    UPTOWN
30      JEWEL OSCO    UPTOWN
30          TARGET    UPTOWN
30       Mariano's    UPTOWN

See also

GeoDataFrame.sjoin: binary predicate joins
sjoin_nearest: equivalent top-level function

Notes

Since this join relies on distances, results will be inaccurate if your geometries are in a geographic CRS.

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.
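The effect of max_distance and distance_col can be sketched with a few made-up points in a projected CRS, so that the distances are planar. The shop and stop names and all coordinates are assumptions for illustration.

```python
import geopandas
from shapely.geometry import Point

shops = geopandas.GeoDataFrame(
    {"shop": ["s1", "s2"]},
    geometry=[Point(0, 0), Point(10, 0)],
    crs="EPSG:3857",
)
stops = geopandas.GeoDataFrame(
    {"stop": ["near", "far"]},
    geometry=[Point(0, 3), Point(100, 0)],
    crs="EPSG:3857",
)

# Every shop is matched to its nearest stop, with the distance recorded.
nearest = shops.sjoin_nearest(stops, distance_col="dist")

# With max_distance=5, shops without a stop within 5 units are dropped
# from the (default) inner join.
limited = shops.sjoin_nearest(stops, max_distance=5, distance_col="dist")

print(nearest[["shop", "stop", "dist"]])
print(limited[["shop", "stop", "dist"]])
```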

to_arrow(*, index=None, geometry_encoding='WKB', interleaved=True, include_z=None)[source]

Encode a GeoDataFrame to GeoArrow format.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function returns a generic Arrow data object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_stream__ method). This object can then be consumed by your Arrow implementation of choice that supports this protocol.

Added in version 1.0.

Parameters:

index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.
geometry_encoding ({'WKB', 'geoarrow' }, default 'WKB') – The GeoArrow encoding to use for the data conversion.
interleaved (bool, default True) – Only relevant for ‘geoarrow’ encoding. If True, the geometries’ coordinates are interleaved in a single fixed size list array. If False, the coordinates are stored as separate arrays in a struct type.
include_z (bool, default None) – Only relevant for ‘geoarrow’ encoding (for WKB, the dimensionality of the individual geometries is preserved). If False, return 2D geometries. If True, include the third dimension in the output (if a geometry has no third dimension, the z-coordinates will be NaN). By default, will infer the dimensionality from the input geometries. Note that this inference can be unreliable with empty geometries (for a guaranteed result, it is recommended to specify the keyword).

Returns:

A generic Arrow table object with geometry columns encoded to GeoArrow.

Return type:

ArrowTable

Examples

>>> from shapely.geometry import Point
>>> data = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(data)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> arrow_table = gdf.to_arrow()
>>> arrow_table
<geopandas.io._geoarrow.ArrowTable object at ...>

The returned data object needs to be consumed by a library implementing the Arrow PyCapsule Protocol. For example, wrapping the data as a pyarrow.Table (requires pyarrow >= 14.0):

>>> import pyarrow as pa
>>> table = pa.table(arrow_table)
>>> table
pyarrow.Table
col1: string
geometry: binary
----
col1: [["name1","name2"]]
geometry: [[0101000000000000000000F03F0000000000000040,01010000000000000000000040000000000000F03F]]

to_crs(crs=None, epsg=None, inplace=False)[source]

Transform geometries to a new coordinate reference system.

Transform all geometries in an active geometry column to a different coordinate reference system. The crs attribute on the current GeoSeries must be set. Either crs or epsg may be specified for output.

This method will transform all points in all objects. It has no notion of projecting entire geometries. All segments joining points are assumed to be lines in the current projection, not geodesics. Objects crossing the dateline (or other projection boundary) will have undesirable behavior.

Parameters:

crs (pyproj.CRS, optional if epsg is specified) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
epsg (int, optional if crs is specified) – EPSG code specifying output projection.
inplace (bool, optional, default: False) – Whether to return a new GeoDataFrame or do the transformation in place.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

>>> gdf = gdf.to_crs(3857)
>>> gdf
    col1                       geometry
0  name1  POINT (111319.491 222684.209)
1  name2  POINT (222638.982 111325.143)
>>> gdf.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
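
For this particular pair of systems the transformation has a closed form, so the coordinates printed above can be checked by hand. The sketch below is not how GeoPandas performs the conversion (that is delegated to pyproj); it only applies the spherical Web Mercator formula with the WGS 84 semi-major axis as the radius. The helper name lonlat_to_web_mercator is illustrative and not part of any library.

```python
import math

# WGS 84 semi-major axis in metres; EPSG:3857 treats it as a sphere radius.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project one longitude/latitude pair (degrees) to EPSG:3857 metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Same two points as in the example: Point(1, 2) and Point(2, 1).
for lon, lat in [(1, 2), (2, 1)]:
    x, y = lonlat_to_web_mercator(lon, lat)
    print(f"POINT ({x:.3f} {y:.3f})")
```

Only the vertices are converted this way, which is why segments between them are not treated as geodesics.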

See also

GeoDataFrame.set_crs: assign CRS without re-projection

to_feather(path, index=None, compression=None, schema_version=None, **kwargs)[source]

Write a GeoDataFrame to the Feather format.

Any geometry columns present are serialized to WKB format in the file.

Requires ‘pyarrow’ >= 0.17.

Added in version 0.8.

Parameters:

path (str, path object)
index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.
compression ({'zstd', 'lz4', 'uncompressed'}, optional) – Name of the compression to use. Use "uncompressed" for no compression. By default uses LZ4 if available, otherwise uncompressed.
schema_version ({'0.1.0', '0.4.0', '1.0.0', None}) – GeoParquet specification version; if not provided will default to latest supported version.
kwargs – Additional keyword arguments passed to pyarrow.feather.write_feather().

Examples

>>> gdf.to_feather('data.feather')

See also

GeoDataFrame.to_parquet: write GeoDataFrame to parquet
GeoDataFrame.to_file: write GeoDataFrame to file

to_file(filename, driver=None, schema=None, index=None, **kwargs)[source]

Write the GeoDataFrame to a file.

By default, an ESRI shapefile is written, but any OGR data source supported by Pyogrio or Fiona can be written. A dictionary of supported OGR providers is available via:

>>> import pyogrio
>>> pyogrio.list_drivers()

Parameters:

filename (string) – File path or file handle to write to. The path may specify a GDAL VSI scheme.
driver (string, default None) – The OGR format driver used to write the vector file. If not specified, it attempts to infer it from the file extension. If no extension is specified, it saves ESRI Shapefile to a folder.
schema (dict, default None) – If specified, the schema dictionary is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the schema based on each column’s dtype. Not supported for the “pyogrio” engine.
index (bool, default None) –
If True, write index into one or more columns (for MultiIndex). Default None writes the index into one or more columns only if the index is named, is a MultiIndex, or has a non-integer data type. If False, no index is written.

Added in version 0.7: Previously the index was not written.
mode (string, default 'w') – The write mode, ‘w’ to overwrite the existing file and ‘a’ to append. Not all drivers support appending. The drivers that support appending are listed in fiona.supported_drivers or https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py
crs (pyproj.CRS, default None) – If specified, the CRS is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the crs based on crs df attribute. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. The keyword is not supported for the “pyogrio” engine.
engine (str, "pyogrio" or "fiona") – The underlying library that is used to write the file. Currently, the supported options are “pyogrio” and “fiona”. Defaults to “pyogrio” if installed, otherwise tries “fiona”.
metadata (dict[str, str], default None) – Optional metadata to be stored in the file. Keys and values must be strings. Supported only for “GPKG” driver.
**kwargs – Keyword args to be passed to the engine, and can be used to write to multi-layer data, store data within archives (zip files), etc. In case of the “pyogrio” engine, the keyword arguments are passed to pyogrio.write_dataframe. In case of the “fiona” engine, the keyword arguments are passed to fiona.open. For more information on possible keywords, type: import pyogrio; help(pyogrio.write_dataframe).

Notes

The format drivers will attempt to detect the encoding of your data, but may fail. In this case, the proper encoding can be specified explicitly by using the encoding keyword parameter, e.g. encoding='utf-8'.

See also

GeoSeries.to_file

GeoDataFrame.to_postgis: write GeoDataFrame to PostGIS database
GeoDataFrame.to_parquet: write GeoDataFrame to parquet
GeoDataFrame.to_feather: write GeoDataFrame to feather

Examples

>>> gdf.to_file('dataframe.shp')

>>> gdf.to_file('dataframe.gpkg', driver='GPKG', layer='name')

>>> gdf.to_file('dataframe.geojson', driver='GeoJSON')

With selected drivers you can also append to a file with mode="a":

>>> gdf.to_file('dataframe.shp', mode="a")

Using the engine-specific keyword arguments it is possible to e.g. create a spatialite file with a custom layer name:

>>> gdf.to_file(
...     'dataframe.sqlite', driver='SQLite', spatialite=True, layer='test'
... )

to_geo_dict(na='null', show_bbox=False, drop_id=False)[source]

Returns a Python feature collection representation of the GeoDataFrame: a dictionary with a list of features, following the __geo_interface__ GeoJSON-like specification.

Parameters:

na (str, optional) –
Options are {‘null’, ‘drop’, ‘keep’}, default ‘null’. Indicates how to output missing (NaN) values in the GeoDataFrame
- null: output the missing entries as JSON null
- drop: remove the property from the feature. This applies to each feature individually so that features may have different properties
- keep: output the missing entries as NaN
show_bbox (bool, optional) – Include bbox (bounds) in the geojson. Default False.
drop_id (bool, default: False) – If True, the index of the GeoDataFrame is omitted from the output; if False (the default), it is written as the id property of each feature in the generated dictionary. True is useful when the index is just arbitrary row numbers.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> gdf.to_geo_dict()
{'type': 'FeatureCollection', 'features': [{'id': '0', 'type': 'Feature', 'properties': {'col1': 'name1'}, 'geometry': {'type': 'Point', 'coordinates': (1.0, 2.0)}}, {'id': '1', 'type': 'Feature', 'properties': {'col1': 'name2'}, 'geometry': {'type': 'Point', 'coordinates': (2.0, 1.0)}}]}
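
Because the result is a plain dictionary, it can be processed with ordinary Python. The following sketch starts from a literal copy of the output above and derives the per-feature coordinates and an overall (minx, miny, maxx, maxy) tuple. It assumes every feature is a Point, which holds for this example only.

```python
# Literal copy of the dictionary shown in the example output.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "id": "0",
            "type": "Feature",
            "properties": {"col1": "name1"},
            "geometry": {"type": "Point", "coordinates": (1.0, 2.0)},
        },
        {
            "id": "1",
            "type": "Feature",
            "properties": {"col1": "name2"},
            "geometry": {"type": "Point", "coordinates": (2.0, 1.0)},
        },
    ],
}

# Map each feature id to its coordinate tuple.
coords_by_id = {
    feature["id"]: feature["geometry"]["coordinates"]
    for feature in feature_collection["features"]
}

# Bounds of all points, in (minx, miny, maxx, maxy) order.
xs, ys = zip(*coords_by_id.values())
bounds = (min(xs), min(ys), max(xs), max(ys))

print(coords_by_id)  # {'0': (1.0, 2.0), '1': (2.0, 1.0)}
print(bounds)        # (1.0, 1.0, 2.0, 2.0)
```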

See also

GeoDataFrame.to_json: return a GeoDataFrame as a GeoJSON string

to_json(na='null', show_bbox=False, drop_id=False, to_wgs84=False, **kwargs)[source]

Returns a GeoJSON representation of the GeoDataFrame as a string.

Parameters:

na ({'null', 'drop', 'keep'}, default 'null') – Indicates how to output missing (NaN) values in the GeoDataFrame. See below.
show_bbox (bool, optional, default: False) – Include bbox (bounds) in the geojson
drop_id (bool, default: False) – If True, the index of the GeoDataFrame is omitted from the output; if False (the default), it is written as the id property of each feature in the generated GeoJSON. True is useful when the index is just arbitrary row numbers.
to_wgs84 (bool, optional, default: False) – If the CRS is set on the active geometry column it is exported as WGS84 (EPSG:4326) to meet the 2016 GeoJSON specification. Set to True to force re-projection and set to False to ignore CRS. False by default.

Notes

The remaining kwargs are passed to json.dumps().

Missing (NaN) values in the GeoDataFrame can be represented as follows:

null: output the missing entries as JSON null.
drop: remove the property from the feature. This applies to each feature individually so that features may have different properties.
keep: output the missing entries as NaN.
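
The three modes can be pictured with the standard json module alone. This is a sketch of the documented behaviour applied to one feature's properties, not the GeoPandas implementation; apply_na and the height property are hypothetical names introduced for illustration.

```python
import json
import math


def apply_na(properties, na="null"):
    """Mimic the documented na modes on a single properties dict."""
    if na not in {"null", "drop", "keep"}:
        raise ValueError(f"unknown na mode: {na!r}")
    result = {}
    for key, value in properties.items():
        missing = isinstance(value, float) and math.isnan(value)
        if missing and na == "drop":
            continue  # leave the property out of this feature
        if missing and na == "null":
            result[key] = None  # serialised as JSON null
        else:
            result[key] = value  # 'keep' passes NaN through unchanged
    return result


props = {"col1": "name1", "height": float("nan")}
print(json.dumps(apply_na(props, "null")))  # {"col1": "name1", "height": null}
print(json.dumps(apply_na(props, "drop")))  # {"col1": "name1"}
print(json.dumps(apply_na(props, "keep")))  # {"col1": "name1", "height": NaN}
```

Note that the bare NaN token produced by keep is accepted by Python's json module but is not valid in strict JSON, so other parsers may reject it.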

If the GeoDataFrame has a defined CRS, its definition will be included in the output unless it is equal to WGS84 (default GeoJSON CRS) or not possible to represent in the URN OGC format, or unless to_wgs84=True is specified.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:3857")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> gdf.to_json()
'{"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"col1": "name1"}, "geometry": {"type": "Point", "coordinates": [1.0, 2.0]}}, {"id": "1", "type": "Feature", "properties": {"col1": "name2"}, "geometry": {"type": "Point", "coordinates": [2.0, 1.0]}}], "crs": {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG::3857"}}}'

Alternatively, you can write GeoJSON to file:

>>> gdf.to_file(path, driver="GeoJSON")
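
The string returned by to_json() is ordinary JSON, so it can be read back with the standard library. The sketch below parses a literal copy of the output shown above and extracts the CRS member, which is present here because the frame's CRS (EPSG:3857) differs from WGS84.

```python
import json

# Literal copy of the to_json() output from the example above.
geojson_text = (
    '{"type": "FeatureCollection", "features": ['
    '{"id": "0", "type": "Feature", "properties": {"col1": "name1"}, '
    '"geometry": {"type": "Point", "coordinates": [1.0, 2.0]}}, '
    '{"id": "1", "type": "Feature", "properties": {"col1": "name2"}, '
    '"geometry": {"type": "Point", "coordinates": [2.0, 1.0]}}], '
    '"crs": {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG::3857"}}}'
)

collection = json.loads(geojson_text)

# The "crs" member is optional, so read it defensively.
crs_name = collection.get("crs", {}).get("properties", {}).get("name")

print(len(collection["features"]))  # 2
print(crs_name)                     # urn:ogc:def:crs:EPSG::3857
```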

See also

GeoDataFrame.to_file: write GeoDataFrame to file

to_parquet(path, index=None, compression='snappy', geometry_encoding='WKB', write_covering_bbox=False, schema_version=None, **kwargs)[source]

Write a GeoDataFrame to the Parquet format.

By default, all geometry columns present are serialized to WKB format in the file.

Requires ‘pyarrow’.

Added in version 0.8.

Parameters:

path (str, path object)
index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.
compression ({'snappy', 'gzip', 'brotli', None}, default 'snappy') – Name of the compression to use. Use None for no compression.
geometry_encoding ({'WKB', 'geoarrow'}, default 'WKB') – The encoding to use for the geometry columns. Defaults to “WKB” for maximum interoperability. Specify “geoarrow” to use one of the native GeoArrow-based single-geometry type encodings. Note: the “geoarrow” option is part of the newer GeoParquet 1.1 specification, should be considered as experimental, and may not be supported by all readers.
write_covering_bbox (bool, default False) – Writes the bounding box column for each row entry with column name ‘bbox’. Writing a bbox column can be computationally expensive, but allows you to specify a bbox in read_parquet() for filtered reading. Note: this bbox column is part of the newer GeoParquet 1.1 specification and should be considered as experimental. While writing the column is backwards compatible, using it for filtering may not be supported by all readers.
schema_version ({'0.1.0', '0.4.0', '1.0.0', '1.1.0', None}) – GeoParquet specification version; if not provided, will default to latest supported stable version (1.0.0).
kwargs – Additional keyword arguments passed to pyarrow.parquet.write_table().

Examples

>>> gdf.to_parquet('data.parquet')

See also

GeoDataFrame.to_feather: write GeoDataFrame to feather
GeoDataFrame.to_file: write GeoDataFrame to file

to_postgis(name, con, schema=None, if_exists='fail', index=False, index_label=None, chunksize=None, dtype=None)[source]

Upload GeoDataFrame into PostGIS database.

This method requires SQLAlchemy and GeoAlchemy2, and a PostgreSQL Python driver (psycopg or psycopg2) to be installed.

It is also possible to use to_file() to write to a database. Especially for file geodatabases like GeoPackage or SpatiaLite this can be easier.

Parameters:

name (str) – Name of the target table.
con (sqlalchemy.engine.Connection or sqlalchemy.engine.Engine) – Active connection to the PostGIS database.
if_exists ({'fail', 'replace', 'append'}, default 'fail') –
How to behave if the table already exists:
- fail: Raise a ValueError.
- replace: Drop the table before inserting new values.
- append: Insert new values to the existing table.
schema (string, optional) – Specify the schema. If None, use default schema: ‘public’.
index (bool, default False) – Write DataFrame index as a column. Uses index_label as the column name in the table.
index_label (string or sequence, default None) – Column label for index column(s). If None is given (default) and index is True, then the index names are used.
chunksize (int, optional) – Rows will be written in batches of this size at a time. By default, all rows will be written at once.
dtype (dict of column name to SQL type, default None) – Specifying the datatype for columns. The keys should be the column names and the values should be the SQLAlchemy types.

Examples

>>> from sqlalchemy import create_engine
>>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432/mydatabase")
>>> gdf.to_postgis("my_table", engine)

See also

GeoDataFrame.to_file: write GeoDataFrame to file
read_postgis: read PostGIS database to GeoDataFrame

to_wkb(hex=False, **kwargs)[source]

Encode all geometry columns in the GeoDataFrame to WKB.

Parameters:

hex (bool) – If true, export the WKB as a hexadecimal string. The default is to return a binary bytes object.
kwargs – Additional keyword args will be passed to shapely.to_wkb().

Returns:

geometry columns are encoded to WKB

Return type:

DataFrame
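
To see what a hexadecimal WKB value encodes, the sketch below decodes the WKB of Point(1, 2) with the standard struct module. The hex string is the first geometry value from the to_arrow() example output in this reference. The helper decode_wkb_point is hypothetical and deliberately handles only little-endian 2D points; real code would normally use shapely to parse WKB.

```python
import struct

# WKB for Point(1, 2), copied from the to_arrow() example output.
POINT_WKB_HEX = "0101000000000000000000F03F0000000000000040"


def decode_wkb_point(wkb_hex):
    """Decode a little-endian 2D WKB point into an (x, y) tuple."""
    raw = bytes.fromhex(wkb_hex)
    # Byte 0: byte order flag (1 = little-endian).
    if raw[0] != 1:
        raise ValueError("this sketch only handles little-endian WKB")
    # Bytes 1-4: geometry type as an unsigned 32-bit integer (1 = Point).
    (geometry_type,) = struct.unpack_from("<I", raw, 1)
    if geometry_type != 1:
        raise ValueError("this sketch only handles 2D points")
    # Bytes 5-20: x and y as 64-bit floats.
    return struct.unpack_from("<dd", raw, 5)


print(decode_wkb_point(POINT_WKB_HEX))  # (1.0, 2.0)
```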

to_wkt(**kwargs)[source]

Encode all geometry columns in the GeoDataFrame to WKT.

Parameters:

kwargs – Keyword args will be passed to shapely.to_wkt().

Returns:

geometry columns are encoded to WKT

Return type:

DataFrame

class pyorps.core.cost_assumptions.GeoSeries(data=None, index=None, crs=None, **kwargs)[source]

Bases: GeoPandasBase, Series

A Series object designed to store shapely geometry objects.

Parameters:

data (array-like, dict, scalar value) – The geometries to store in the GeoSeries.
index (array-like or Index) – The index for the GeoSeries.
crs (value (optional)) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
kwargs –

Additional arguments passed to the Series constructor,
e.g. name.

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

>>> s = geopandas.GeoSeries(
...     [Point(1, 1), Point(2, 2), Point(3, 3)], crs="EPSG:3857"
... )
>>> s.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

>>> s = geopandas.GeoSeries(
...    [Point(1, 1), Point(2, 2), Point(3, 3)], index=["a", "b", "c"], crs=4326
... )
>>> s
a    POINT (1 1)
b    POINT (2 2)
c    POINT (3 3)
dtype: geometry

>>> s.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame, pandas.Series

property _constructor: Used when a manipulation result has the same dimensions as the original.

property _constructor_expanddim: Used when a manipulation result has one higher dimension than the original, such as Series.to_frame()

_constructor_expanddim_from_mgr(mgr, axes)[source]

_constructor_from_mgr(mgr, axes)[source]

classmethod _from_wkb_or_wkt(from_wkb_or_wkt_function, data, index=None, crs=None, on_invalid='raise', **kwargs)[source]

Create a GeoSeries from either WKT or WKB values

Return type:

GeoSeries

_wrapped_pandas_method(mtd, *args, **kwargs)[source]: Wrap a generic pandas method to ensure it returns a GeoSeries

append(*args, **kwargs)[source]

Return type:

GeoSeries

apply(func, convert_dtype=None, args=(), **kwargs)[source]

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:

data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.
index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.
dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.
name (Hashable, default None) – The name to give to the Series.
copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.

clip(mask, keep_geom_type=False, sort=False)[source]

Clip points, lines, or polygon geometries to the mask extent.

Both layers must be in the same Coordinate Reference System (CRS). The GeoSeries will be clipped to the full extent of the mask object.

If there are multiple polygons in mask, data from the GeoSeries will be clipped to the total boundary of all polygons in mask.

Parameters:

mask (GeoDataFrame, GeoSeries, (Multi)Polygon, list-like) – Polygon vector layer used to clip gdf. The mask’s geometry is dissolved into one geometric feature and intersected with GeoSeries. If the mask is list-like with four elements (minx, miny, maxx, maxy), clip will use a faster rectangle clipping (clip_by_rect()), possibly leading to slightly different results.
keep_geom_type (boolean, default False) – If True, return only geometries of original type in case of intersection resulting in multiple geometry types or GeometryCollections. If False, return all resulting geometries (potentially mixed-types).
sort (boolean, default False) – If True, the order of rows in the clipped GeoSeries will be preserved at small performance cost. If False the order of rows in the clipped GeoSeries will be random.

Returns:

Vector data (points, lines, polygons) from gdf clipped to polygon boundary from mask.

Return type:

GeoSeries

See also

clip: top-level function for clip

Examples

Clip points (grocery stores) with polygons (the Near West Side community):

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> near_west_side = chicago[chicago["community"] == "NEAR WEST SIDE"]
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)
>>> groceries.shape
(148, 8)

>>> nws_groceries = groceries.geometry.clip(near_west_side)
>>> nws_groceries.shape
(7,)

property crs

The Coordinate Reference System (CRS) represented as a pyproj.CRS object.

Returns a pyproj.CRS, or None if the CRS is not set. When setting, the value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.

Examples

>>> s.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoSeries.set_crs: assign CRS
GeoSeries.to_crs: re-project to another CRS

estimate_utm_crs(datum_name='WGS 84')[source]

Returns the estimated UTM CRS based on the bounds of the dataset.

Added in version 0.9.

Parameters:

datum_name (str, optional) – The name of the datum to use in the query. Default is WGS 84.

Return type:

pyproj.CRS

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.geometry.estimate_utm_crs()
<Derived Projected CRS: EPSG:32616>
Name: WGS 84 / UTM zone 16N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 90°W and 84°W, northern hemisphere between equator and 84°N, ...
- bounds: (-90.0, 0.0, -84.0, 84.0)
Coordinate Operation:
- name: UTM zone 16N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
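
The zone in the result above follows from the standard UTM zone arithmetic. The sketch below is a simplification: GeoPandas queries the pyproj database using the bounds of the dataset, whereas this hypothetical helper takes a single longitude/latitude pair, assumes the WGS 84 datum, and ignores the special zones around Norway and Svalbard. The coordinates used for Chicago are approximate values chosen for illustration.

```python
import math


def utm_epsg_from_lonlat(lon_deg, lat_deg):
    """Return (zone, EPSG code) of the WGS 84 UTM zone containing a point."""
    # Zones are 6 degrees wide and numbered 1..60 starting at 180°W.
    zone = int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    base = 32600 if lat_deg >= 0 else 32700
    return zone, base + zone


# Approximate centre of Chicago (assumed values).
print(utm_epsg_from_lonlat(-87.6, 41.8))  # (16, 32616)
```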

explode(ignore_index=False, index_parts=False)[source]

Explode multi-part geometries into multiple single geometries.

Single rows can become multiple rows. This is analogous to PostGIS’s ST_Dump(). The ‘path’ index is the second level of the returned MultiIndex.

Parameters:

ignore_index (bool, default False) – If True, the resulting index will be labelled 0, 1, …, n - 1, ignoring index_parts.
index_parts (boolean, default False) – If True, the resulting index will be a multi-index (original index with an additional level indicating the multiple geometries: a new zero-based index for each single part geometry per multi-part geometry).

Return type:

GeoSeries

Returns:

A GeoSeries with a MultiIndex. The levels of the MultiIndex are the
original index and a zero-based integer index that counts the
number of single geometries within a multi-part geometry.

Examples

>>> from shapely.geometry import MultiPoint
>>> s = geopandas.GeoSeries(
...     [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
... )
>>> s
0           MULTIPOINT ((0 0), (1 1))
1    MULTIPOINT ((2 2), (3 3), (4 4))
dtype: geometry

>>> s.explode(index_parts=True)
0  0    POINT (0 0)
   1    POINT (1 1)
1  0    POINT (2 2)
   1    POINT (3 3)
   2    POINT (4 4)
dtype: geometry
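
With index_parts=True the result carries a two-level index: the original row label and a zero-based counter of the parts within that row. The sketch below builds just those index labels for the example above; explode_index is a hypothetical helper and not part of GeoPandas.

```python
def explode_index(parts_per_row):
    """Return (original_label, part_number) pairs for each single geometry.

    parts_per_row lists how many single geometries each row contains,
    with rows assumed to be labelled 0, 1, 2, ...
    """
    return [
        (outer, inner)
        for outer, count in enumerate(parts_per_row)
        for inner in range(count)
    ]


# The example series holds a 2-point and a 3-point MultiPoint.
print(explode_index([2, 3]))
# [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]
```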

See also

GeoDataFrame.explode

explore(*args, **kwargs)[source]

Interactive map based on GeoPandas and folium/leaflet.js

Generate an interactive leaflet map based on GeoSeries

Parameters:

color (str, array-like (default None)) – Named color or a list-like of colors (named or hex).
m (folium.Map (default None)) – Existing map instance on which to draw the plot.
tiles (str, xyzservices.TileProvider (default 'OpenStreetMap Mapnik')) –
Map tileset to use. Can choose from the list supported by folium, query a xyzservices.TileProvider by a name from xyzservices.providers, pass xyzservices.TileProvider object or pass custom XYZ URL. The current list of built-in providers (when xyzservices is not available):

["OpenStreetMap", "CartoDB positron", "CartoDB dark_matter"]

You can pass a custom tileset to Folium by passing a Leaflet-style URL to the tiles parameter: http://{s}.yourtiles.com/{z}/{x}/{y}.png. Be sure to check their terms and conditions and to provide attribution with the attr keyword.
attr (str (default None)) – Map tile attribution; only required if passing custom tile URL.
highlight (bool (default True)) – Enable highlight functionality when hovering over a geometry.
width (pixel int or percentage string (default: '100%')) – Width of the folium Map. If the argument m is given explicitly, width is ignored.
height (pixel int or percentage string (default: '100%')) – Height of the folium Map. If the argument m is given explicitly, height is ignored.
control_scale (bool, (default True)) – Whether to add a control scale on the map.
marker_type (str, folium.Circle, folium.CircleMarker, folium.Marker (default None)) – Allowed string options are (‘marker’, ‘circle’, ‘circle_marker’). Defaults to folium.Marker.
marker_kwds (dict (default {})) –
Additional keywords to be passed to the selected marker_type, e.g.:
- radius (float): Radius of the circle, in meters (for 'circle') or pixels (for 'circle_marker').
- icon (folium.map.Icon): The folium.map.Icon object to use to render the marker.
- draggable (bool, default False): Set to True to be able to drag the marker around the map.
style_kwds –
Additional style to be passed to folium style_function:
- stroke (bool, default True): Whether to draw stroke along the path. Set it to False to disable borders on polygons or circles.
- color (str): Stroke color.
- weight (int): Stroke width in pixels.
- opacity (float, default 1.0): Stroke opacity.
- fill (boolean, default True): Whether to fill the path with color. Set it to False to disable filling on polygons or circles.
- fillColor (str): Fill color. Defaults to the value of the color option.
- fillOpacity (float, default 0.5): Fill opacity.
- style_function (callable): Function mapping a GeoJson Feature to a style dict. For the style properties see folium.vector_layers.path_options(); for the GeoJson features see GeoSeries.__geo_interface__. E.g.:
lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6 else "blue"}
highlight_kwds (dict (default {})) – Style to be passed to folium highlight_function. Uses the same keywords as style_kwds. When empty, defaults to {"fillOpacity": 0.75}.
map_kwds (dict (default {})) – Additional keywords to be passed to folium Map, e.g. dragging, or scrollWheelZoom.
**kwargs (dict) – Additional options to be passed on to folium.

Returns:

m – folium Map instance

Return type:

folium.folium.Map

fillna(value=None, inplace=False, limit=None, **kwargs)[source]

Fill NA values with geometry (or geometries).

Parameters:

value (shapely geometry or GeoSeries, default None) – If None is passed, NA values will be filled with GEOMETRYCOLLECTION EMPTY. If a shapely geometry object is passed, it will be used to fill all missing values. If a GeoSeries or GeometryArray are passed, missing values will be filled based on the corresponding index locations. If pd.NA or np.nan are passed, values will be filled with None (not GEOMETRYCOLLECTION EMPTY).
limit (int, default None) – This is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None.

Return type:

GeoSeries

Examples

>>> from shapely.geometry import Polygon
>>> s = geopandas.GeoSeries(
...     [
...         Polygon([(0, 0), (1, 1), (0, 1)]),
...         None,
...         Polygon([(0, 0), (-1, 1), (0, -1)]),
...     ]
... )
>>> s
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1                                None
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry

Filled with an empty geometry collection (the default when no value is passed).

>>> s.fillna()
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1            GEOMETRYCOLLECTION EMPTY
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry

Filled with a specific polygon.

>>> s.fillna(Polygon([(0, 1), (2, 1), (1, 2)]))
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1      POLYGON ((0 1, 2 1, 1 2, 0 1))
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry

Filled with another GeoSeries.

>>> from shapely.geometry import Point
>>> s_fill = geopandas.GeoSeries(
...     [
...         Point(0, 0),
...         Point(1, 1),
...         Point(2, 2),
...     ]
... )
>>> s.fillna(s_fill)
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1                         POINT (1 1)
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry
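
Filling from another GeoSeries is driven by index labels rather than by position. The sketch below models that alignment with plain dictionaries keyed by index label, using WKT strings in place of geometry objects; fill_by_index is a hypothetical helper, not the GeoPandas implementation.

```python
def fill_by_index(values, fill):
    """Replace None entries in values with the entry of fill at the same label.

    Labels missing from fill stay None.
    """
    return {
        label: (fill.get(label) if geom is None else geom)
        for label, geom in values.items()
    }


s = {
    0: "POLYGON ((0 0, 1 1, 0 1, 0 0))",
    1: None,
    2: "POLYGON ((0 0, -1 1, 0 -1, 0 0))",
}
s_fill = {0: "POINT (0 0)", 1: "POINT (1 1)", 2: "POINT (2 2)"}

filled = fill_by_index(s, s_fill)
print(filled[1])  # POINT (1 1)
```

Only the missing entry at label 1 is replaced; the values of s_fill at labels 0 and 2 are ignored because s already has geometries there.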

See also

GeoSeries.isna: detect missing values

classmethod from_arrow(arr, **kwargs)[source]

Construct a GeoSeries from an Arrow array object with a GeoArrow extension type.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function accepts any Arrow array object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ method).

Added in version 1.0.

Parameters:

arr (pyarrow.Array, Arrow array) – Any array object implementing the Arrow PyCapsule Protocol (i.e. has an __arrow_c_array__ or __arrow_c_stream__ method). The type of the array should be one of the geoarrow geometry types.
**kwargs – Other parameters passed to the GeoSeries constructor.

Return type:

GeoSeries

classmethod from_file(filename, **kwargs)[source]

Alternate constructor to create a GeoSeries from a file.

Can load a GeoSeries from a file in any format recognized by pyogrio. See http://pyogrio.readthedocs.io/ for details. If the file contains attribute columns, only the geometry column is kept. Note that to do that, GeoPandas first loads the whole GeoDataFrame.

Parameters:

filename (str) – File path or file handle to read from. Depending on which kwargs are included, the content of filename may vary. See pyogrio.read_dataframe() for usage details.
kwargs (key-word arguments) – These arguments are passed to pyogrio.read_dataframe(), and can be used to access multi-layer data, data stored within archives (zip files), etc.

Return type:

GeoSeries

Examples

>>> import geodatasets
>>> path = geodatasets.get_path('nybb')
>>> s = geopandas.GeoSeries.from_file(path)
>>> s
0    MULTIPOLYGON (((970217.022 145643.332, 970227....
1    MULTIPOLYGON (((1029606.077 156073.814, 102957...
2    MULTIPOLYGON (((1021176.479 151374.797, 102100...
3    MULTIPOLYGON (((981219.056 188655.316, 980940....
4    MULTIPOLYGON (((1012821.806 229228.265, 101278...
Name: geometry, dtype: geometry

See also

read_file: read file to GeoDataFrame

classmethod from_wkb(data, index=None, crs=None, on_invalid='raise', **kwargs)[source]

Alternate constructor to create a GeoSeries from a list or array of WKB objects

Parameters:

data (array-like or Series) – Series, list or array of WKB objects
index (array-like or Index) – The index for the GeoSeries.
crs (value, optional) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
on_invalid ({"raise", "warn", "ignore"}, default "raise") –
- raise: an exception will be raised if a WKB input geometry is invalid.
- warn: a warning will be raised and invalid WKB geometries will be returned as None.
- ignore: invalid WKB geometries will be returned as None without a warning.
kwargs – Additional arguments passed to the Series constructor, e.g. name.

Return type:

GeoSeries

See also

GeoSeries.from_wkt

classmethod from_wkt(data, index=None, crs=None, on_invalid='raise', **kwargs)[source]

Alternate constructor to create a GeoSeries from a list or array of WKT objects

Parameters:

data (array-like, Series) – Series, list, or array of WKT objects
index (array-like or Index) – The index for the GeoSeries.
crs (value, optional) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
on_invalid ({"raise", "warn", "ignore"}, default "raise") –
- raise: an exception will be raised if a WKT input geometry is invalid.
- warn: a warning will be raised and invalid WKT geometries will be returned as None.
- ignore: invalid WKT geometries will be returned as None without a warning.
kwargs – Additional arguments passed to the Series constructor, e.g. name.

Return type:

GeoSeries

See also

GeoSeries.from_wkb

Examples

>>> wkts = [
... 'POINT (1 1)',
... 'POINT (2 2)',
... 'POINT (3 3)',
... ]
>>> s = geopandas.GeoSeries.from_wkt(wkts)
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry
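
The three on_invalid modes can be pictured with a toy parser. This is only a sketch of the policy, not how GeoPandas or Shapely implement it: parse_points and its regular expression are hypothetical and understand nothing but 2D POINT strings.

```python
import re
import warnings

# Hypothetical, deliberately tiny WKT grammar: 2D points only.
_POINT = re.compile(r"^POINT \((-?\d+(?:\.\d+)?) (-?\d+(?:\.\d+)?)\)$")


def parse_points(wkts, on_invalid="raise"):
    """Parse POINT strings, handling bad input according to on_invalid."""
    if on_invalid not in {"raise", "warn", "ignore"}:
        raise ValueError(f"unknown on_invalid value: {on_invalid!r}")
    parsed = []
    for text in wkts:
        match = _POINT.match(text)
        if match:
            parsed.append((float(match.group(1)), float(match.group(2))))
            continue
        if on_invalid == "raise":
            raise ValueError(f"invalid WKT: {text!r}")
        if on_invalid == "warn":
            warnings.warn(f"invalid WKT: {text!r}")
        # "warn" and "ignore" both keep the position and store None.
        parsed.append(None)
    return parsed


print(parse_points(["POINT (1 1)", "POINT (oops)"], on_invalid="ignore"))
```

Keeping a None placeholder for each invalid entry preserves the alignment between the input and the resulting index, which is why "warn" and "ignore" differ only in whether a warning is emitted.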

classmethod from_xy(x, y, z=None, index=None, crs=None, **kwargs)[source]

Alternate constructor to create a GeoSeries of Point geometries from lists or arrays of x, y(, z) coordinates

In case of geographic coordinates, it is assumed that longitude is captured by x coordinates and latitude by y.

Parameters:

x (iterable)
y (iterable)
z (iterable)
index (array-like or Index, optional) – The index for the GeoSeries. If not given and all coordinate inputs are Series with an equal index, that index is used.
crs (value, optional) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
**kwargs – Additional arguments passed to the Series constructor, e.g. name.

Return type:

GeoSeries

See also

GeoSeries.from_wkt, points_from_xy

Examples

>>> x = [2.5, 5, -3.0]
>>> y = [0.5, 1, 1.5]
>>> s = geopandas.GeoSeries.from_xy(x, y, crs="EPSG:4326")
>>> s
0    POINT (2.5 0.5)
1    POINT (5 1)
2    POINT (-3 1.5)
dtype: geometry

property geometry: GeoSeries

isna()[source]

Detect missing values.

Historically, NA values in a GeoSeries could be represented by empty geometric objects, in addition to standard representations such as None and np.nan. This behaviour is changed in version 0.6.0, and now only actual missing values return True. To detect empty geometries, use GeoSeries.is_empty instead.

Return type:

Series

Returns:

A boolean pandas Series of the same size as the GeoSeries,
True where a value is NA.

Examples

>>> from shapely.geometry import Polygon
>>> s = geopandas.GeoSeries(
...     [Polygon([(0, 0), (1, 1), (0, 1)]), None, Polygon([])]
... )
>>> s
0    POLYGON ((0 0, 1 1, 0 1, 0 0))
1                              None
2                     POLYGON EMPTY
dtype: geometry

>>> s.isna()
0    False
1     True
2    False
dtype: bool

See also

GeoSeries.notna: inverse of isna
GeoSeries.is_empty: detect empty geometries

isnull()[source]

Alias for isna method. See isna for more detail.

Return type:

Series

notna()[source]

Detect non-missing values.

Historically, NA values in a GeoSeries could be represented by empty geometric objects, in addition to standard representations such as None and np.nan. This behaviour is changed in version 0.6.0, and now only actual missing values return False. To detect empty geometries, use GeoSeries.is_empty instead.

Return type:

Series

Returns:

A boolean pandas Series of the same size as the GeoSeries,
False where a value is NA.

Examples

>>> from shapely.geometry import Polygon
>>> s = geopandas.GeoSeries(
...     [Polygon([(0, 0), (1, 1), (0, 1)]), None, Polygon([])]
... )
>>> s
0    POLYGON ((0 0, 1 1, 0 1, 0 0))
1                              None
2                     POLYGON EMPTY
dtype: geometry

>>> s.notna()
0     True
1    False
2     True
dtype: bool

See also

GeoSeries.isna: inverse of notna
GeoSeries.is_empty: detect empty geometries

notnull()[source]

Alias for notna method. See notna for more detail.

Return type:

Series

plot(*args, **kwargs)[source]

Plot a GeoSeries.

Generate a plot of a GeoSeries geometry with matplotlib.

Parameters:

s (Series) – The GeoSeries to be plotted. Currently Polygon, MultiPolygon, LineString, MultiLineString, Point and MultiPoint geometries can be plotted.
cmap (str (default None)) –
The name of a colormap recognized by matplotlib. Any colormap will work, but categorical colormaps are generally recommended. Examples of useful discrete colormaps include:

tab10, tab20, Accent, Dark2, Paired, Pastel1, Set1, Set2
color (str, np.array, pd.Series, List (default None)) – If specified, all objects will be colored uniformly.
ax (matplotlib.pyplot.Artist (default None)) – axes on which to draw the plot
figsize (pair of floats (default None)) – Size of the resulting matplotlib.figure.Figure. If the argument ax is given explicitly, figsize is ignored.
aspect ('auto', 'equal', None or float (default 'auto')) – Set aspect of axis. If ‘auto’, the default aspect for map plots is ‘equal’; if however data are not projected (coordinates are long/lat), the aspect is by default set to 1/cos(s_y * pi/180) with s_y the y coordinate of the middle of the GeoSeries (the mean of the y range of bounding box) so that a long/lat square appears square in the middle of the plot. This implies an Equirectangular projection. If None, the aspect of ax won’t be changed. It can also be set manually (float) as the ratio of y-unit to x-unit.
autolim (bool (default True)) – Update axes data limits to contain the new geometries.
**style_kwds (dict) – Color options to be passed on to the actual plot function, such as edgecolor, facecolor, linewidth, markersize, alpha.

Returns:

ax

Return type:

matplotlib axes instance
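
The 'auto' aspect rule for unprojected data quoted above, 1/cos(s_y * pi/180), can be worked through by hand. The helper name lonlat_aspect is hypothetical; only the formula comes from the parameter description.

```python
import math


def lonlat_aspect(miny, maxy):
    """Aspect ratio for long/lat data: 1 / cos(s_y * pi / 180)."""
    # s_y is the middle of the bounding box's y range, in degrees.
    s_y = (miny + maxy) / 2.0
    return 1.0 / math.cos(s_y * math.pi / 180.0)


print(lonlat_aspect(-10.0, 10.0))  # bounding box centred on the equator
print(lonlat_aspect(55.0, 65.0))   # bounding box centred on 60 degrees north
```

At the equator the ratio is 1, and it grows towards the poles because a degree of longitude covers less ground there than a degree of latitude.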

select(*args, **kwargs)[source]

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:

data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.
index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.
dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.
name (Hashable, default None) – The name to give to the Series.
copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.

set_crs(**kwargs)

sort_index(*args, **kwargs)[source]

take(*args, **kwargs)[source]

to_arrow(geometry_encoding='WKB', interleaved=True, include_z=None)[source]

Encode a GeoSeries to GeoArrow format.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function returns a generic Arrow array object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ method). This object can then be consumed by your Arrow implementation of choice that supports this protocol.

Added in version 1.0.

Parameters:

geometry_encoding ({'WKB', 'geoarrow' }, default 'WKB') – The GeoArrow encoding to use for the data conversion.
interleaved (bool, default True) – Only relevant for ‘geoarrow’ encoding. If True, the geometries’ coordinates are interleaved in a single fixed size list array. If False, the coordinates are stored as separate arrays in a struct type.
include_z (bool, default None) – Only relevant for ‘geoarrow’ encoding (for WKB, the dimensionality of the individual geometries is preserved). If False, return 2D geometries. If True, include the third dimension in the output (if a geometry has no third dimension, the z-coordinates will be NaN). By default, will infer the dimensionality from the input geometries. Note that this inference can be unreliable with empty geometries (for a guaranteed result, it is recommended to specify the keyword).

Returns:

A generic Arrow array object with geometry data encoded to GeoArrow.

Return type:

GeoArrowArray

Examples

>>> from shapely.geometry import Point
>>> gser = geopandas.GeoSeries([Point(1, 2), Point(2, 1)])
>>> gser
0    POINT (1 2)
1    POINT (2 1)
dtype: geometry

>>> arrow_array = gser.to_arrow()
>>> arrow_array
<geopandas.io._geoarrow.GeoArrowArray object at ...>

The returned array object needs to be consumed by a library implementing the Arrow PyCapsule Protocol. For example, wrapping the data as a pyarrow.Array (requires pyarrow >= 14.0):

>>> import pyarrow as pa
>>> array = pa.array(arrow_array)
>>> array
<pyarrow.lib.BinaryArray object at ...>
[
  0101000000000000000000F03F0000000000000040,
  01010000000000000000000040000000000000F03F
]
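
The hex strings in this output are plain WKB and can be read with the standard library alone. The sketch below assumes a 2D Point (byte-order flag, 32-bit geometry type, two 64-bit floats); decode_wkb_point is a hypothetical helper, and the literal is the first entry shown above, split into header, x and y for readability.

```python
import struct


def decode_wkb_point(hex_string):
    """Decode the WKB of a 2D Point given as a hexadecimal string."""
    raw = bytes.fromhex(hex_string)
    # First byte: 1 means little-endian, 0 means big-endian.
    prefix = "<" if raw[0] == 1 else ">"
    (geom_type,) = struct.unpack(prefix + "I", raw[1:5])
    if geom_type != 1:
        raise ValueError("this sketch only handles 2D Point (type 1)")
    x, y = struct.unpack(prefix + "dd", raw[5:21])
    return x, y


first = "0101000000" + "000000000000F03F" + "0000000000000040"
print(decode_wkb_point(first))
```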

to_crs(crs=None, epsg=None)[source]

Returns a GeoSeries with all geometries transformed to a new coordinate reference system.

Transform all geometries in a GeoSeries to a different coordinate reference system. The crs attribute on the current GeoSeries must be set. Either crs or epsg may be specified for output.

This method will transform all points in all objects. It has no notion of projecting entire geometries. All segments joining points are assumed to be lines in the current projection, not geodesics. Objects crossing the dateline (or other projection boundary) will have undesirable behavior.

Parameters:

crs (pyproj.CRS, optional if epsg is specified) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
epsg (int, optional if crs is specified) – EPSG code specifying output projection.

Return type:

GeoSeries

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)], crs=4326)
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry
>>> s.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

>>> s = s.to_crs(3857)
>>> s
0    POINT (111319.491 111325.143)
1    POINT (222638.982 222684.209)
2    POINT (333958.472 334111.171)
dtype: geometry
>>> s.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoSeries.set_crs: assign CRS
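
For the specific EPSG:4326 to EPSG:3857 case in the example, the numbers can be reproduced with the standard spherical Mercator forward formula. This is a sketch for intuition only: GeoPandas delegates the transformation to pyproj, and lonlat_to_web_mercator is a hypothetical helper that is valid for this one CRS pair.

```python
import math

# WGS 84 semi-major axis in metres, used as the sphere radius by EPSG:3857.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project one longitude/latitude pair (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0)
    )
    return x, y


for lon, lat in [(1, 1), (2, 2), (3, 3)]:
    x, y = lonlat_to_web_mercator(lon, lat)
    print(f"POINT ({x:.3f} {y:.3f})")
```

Each point is projected independently, which matches the note that the method transforms vertices and has no notion of projecting whole geometries.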

to_file(filename, driver=None, index=None, **kwargs)[source]

Write the GeoSeries to a file.

By default, an ESRI shapefile is written, but any OGR data source supported by Pyogrio or Fiona can be written.

Parameters:

filename (string) – File path or file handle to write to. The path may specify a GDAL VSI scheme.
driver (string, default None) – The OGR format driver used to write the vector file. If not specified, it attempts to infer it from the file extension. If no extension is specified, it saves ESRI Shapefile to a folder.
index (bool, default None) –
If True, write index into one or more columns (for MultiIndex). Default None writes the index into one or more columns only if the index is named, is a MultiIndex, or has a non-integer data type. If False, no index is written.

Added in version 0.7: Previously the index was not written.
mode (string, default 'w') – The write mode, ‘w’ to overwrite the existing file and ‘a’ to append. Not all drivers support appending. The drivers that support appending are listed in fiona.supported_drivers or https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py
crs (pyproj.CRS, default None) – If specified, the CRS is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the crs based on crs df attribute. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. The keyword is not supported for the “pyogrio” engine.
engine (str, "pyogrio" or "fiona") – The underlying library that is used to write the file. Currently, the supported options are “pyogrio” and “fiona”. Defaults to “pyogrio” if installed, otherwise tries “fiona”.
**kwargs – Keyword args to be passed to the engine, and can be used to write to multi-layer data, store data within archives (zip files), etc. In case of the “pyogrio” engine, the keyword arguments are passed to pyogrio.write_dataframe. In case of the “fiona” engine, the keyword arguments are passed to fiona.open. For more information on possible keywords, type: import pyogrio; help(pyogrio.write_dataframe).

See also

GeoDataFrame.to_file: write GeoDataFrame to file
read_file: read file to GeoDataFrame

Examples

>>> s.to_file('series.shp')

>>> s.to_file('series.gpkg', driver='GPKG', layer='name1')

>>> s.to_file('series.geojson', driver='GeoJSON')

to_json(show_bbox=True, drop_id=False, to_wgs84=False, **kwargs)[source]

Returns a GeoJSON string representation of the GeoSeries.

Parameters:

show_bbox (bool, optional, default: True) – Include bbox (bounds) in the geojson
drop_id (bool, default: False) – If True, the index of the GeoSeries is not written as the id property in the generated GeoJSON. Default is False, but True may be preferable if the index is just arbitrary row numbers.
to_wgs84 (bool, optional, default: False) –
If the CRS is set on the active geometry column it is exported as WGS84 (EPSG:4326) to meet the 2016 GeoJSON specification. Set to True to force re-projection and set to False to ignore CRS. False by default.
**kwargs – Keyword arguments that will be passed to json.dumps().

Return type:

JSON string

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

>>> s.to_json()
'{"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [1.0, 1.0]}, "bbox": [1.0, 1.0, 1.0, 1.0]}, {"id": "1", "type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [2.0, 2.0]}, "bbox": [2.0, 2.0, 2.0, 2.0]}, {"id": "2", "type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [3.0, 3.0]}, "bbox": [3.0, 3.0, 3.0, 3.0]}], "bbox": [1.0, 1.0, 3.0, 3.0]}'

See also

GeoSeries.to_file: write GeoSeries to file
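
Because the return value is an ordinary JSON string, it can be inspected with the standard json module. The literal below is a hand-written, shortened string in the same shape as the output shown above (two features instead of three), so the snippet does not depend on GeoPandas being installed.

```python
import json

# Shortened stand-in for the string returned by to_json().
geojson_text = (
    '{"type": "FeatureCollection", "features": ['
    '{"id": "0", "type": "Feature", "properties": {}, '
    '"geometry": {"type": "Point", "coordinates": [1.0, 1.0]}, '
    '"bbox": [1.0, 1.0, 1.0, 1.0]}, '
    '{"id": "1", "type": "Feature", "properties": {}, '
    '"geometry": {"type": "Point", "coordinates": [2.0, 2.0]}, '
    '"bbox": [2.0, 2.0, 2.0, 2.0]}], '
    '"bbox": [1.0, 1.0, 2.0, 2.0]}'
)

collection = json.loads(geojson_text)

# Map each feature id (the stringified index label) to its coordinates.
points = {
    feature["id"]: tuple(feature["geometry"]["coordinates"])
    for feature in collection["features"]
}
print(points)
print(collection["bbox"])
```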

to_wkb(hex=False, **kwargs)[source]

Convert GeoSeries geometries to WKB

Parameters:

hex (bool) – If true, export the WKB as a hexadecimal string. The default is to return a binary bytes object.
kwargs – Additional keyword args will be passed to shapely.to_wkb().

Returns:

WKB representations of the geometries

Return type:

Series

See also

GeoSeries.to_wkt

to_wkt(**kwargs)[source]

Convert GeoSeries geometries to WKT

Parameters:

kwargs – Keyword args will be passed to shapely.to_wkt().

Returns:

WKT representations of the geometries

Return type:

Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

>>> s.to_wkt()
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: object

See also

GeoSeries.to_wkb

property x: Series

Return the x location of point geometries in a GeoSeries

Return type:

pandas.Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s.x
0    1.0
1    2.0
2    3.0
dtype: float64

See also

GeoSeries.y, GeoSeries.z

property y: Series

Return the y location of point geometries in a GeoSeries

Return type:

pandas.Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s.y
0    1.0
1    2.0
2    3.0
dtype: float64

See also

GeoSeries.x, GeoSeries.z

property z: Series

Return the z location of point geometries in a GeoSeries

Return type:

pandas.Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1, 1), Point(2, 2, 2), Point(3, 3, 3)])
>>> s.z
0    1.0
1    2.0
2    3.0
dtype: float64

See also

GeoSeries.x, GeoSeries.y

exception pyorps.core.cost_assumptions.InvalidSourceError[source]

Bases: CostAssumptionsError

Exception raised when the provided source is invalid.

exception pyorps.core.cost_assumptions.NoSuitableColumnsError[source]

Bases: FeatureColumnError

Exception raised when no suitable columns are found

class pyorps.core.cost_assumptions.Path(*args, **kwargs)[source]

Bases: PurePath

PurePath subclass that can make system calls.

Path represents a filesystem path but unlike PurePath, also offers methods to do system calls on path objects. Depending on your system, instantiating a Path will return either a PosixPath or a WindowsPath object. You can also instantiate a PosixPath or WindowsPath directly, but cannot instantiate a WindowsPath on a POSIX system or vice versa.

_make_child_relpath(part)[source]

_scandir()[source]

absolute()[source]

Return an absolute version of this path by prepending the current working directory. No normalization or symlink resolution is performed.

Use resolve() to get the canonical path to a file.
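
The difference between absolute() and resolve() is easiest to see on a path that contains "..". The file and directory names below are hypothetical and do not need to exist.

```python
from pathlib import Path

relative = Path("data") / ".." / "costs.csv"

absolute = relative.absolute()  # prepends the cwd, leaves ".." untouched
resolved = relative.resolve()   # also collapses ".." and follows symlinks

print(absolute)
print(resolved)
```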

chmod(mode, *, follow_symlinks=True)[source]: Change the permissions of the path, like os.chmod().

classmethod cwd()[source]: Return a new path pointing to the current working directory (as returned by os.getcwd()).

exists()[source]: Whether this path exists.

expanduser()[source]: Return a new path with expanded ~ and ~user constructs (as returned by os.path.expanduser)

glob(pattern)[source]: Iterate over this subtree and yield all existing files (of any kind, including directories) matching the given relative pattern.

group()[source]: Return the group name of the file gid.

hardlink_to(target)[source]

Make this path a hard link pointing to the same file as target.

Note the order of arguments (self, target) is the reverse of os.link’s.

classmethod home()[source]: Return a new path pointing to the user’s home directory (as returned by os.path.expanduser(‘~’)).

is_block_device()[source]: Whether this path is a block device.

is_char_device()[source]: Whether this path is a character device.

is_dir()[source]: Whether this path is a directory.

is_fifo()[source]: Whether this path is a FIFO.

is_file()[source]: Whether this path is a regular file (also True for symlinks pointing to regular files).

is_mount()[source]: Check if this path is a POSIX mount point

is_socket()[source]: Whether this path is a socket.

is_symlink()[source]: Whether this path is a symbolic link.

iterdir()[source]: Iterate over the files in this directory. Does not yield any result for the special paths ‘.’ and ‘..’.

lchmod(mode)[source]: Like chmod(), except if the path points to a symlink, the symlink’s permissions are changed, rather than its target’s.

link_to(target)[source]

Make the target path a hard link pointing to this path.

Note this function does not make this path a hard link to target, despite the implication of the function and argument names. The order of arguments (target, link) is the reverse of Path.symlink_to, but matches that of os.link.

Deprecated since Python 3.10 and scheduled for removal in Python 3.12. Use hardlink_to() instead.

lstat()[source]: Like stat(), except if the path points to a symlink, the symlink’s status information is returned, rather than its target’s.

mkdir(mode=511, parents=False, exist_ok=False)[source]: Create a new directory at this given path.

open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)[source]: Open the file pointed by this path and return a file object, as the built-in open() function does.

owner()[source]: Return the login name of the file owner.

read_bytes()[source]: Open the file in bytes mode, read it, and close the file.

read_text(encoding=None, errors=None)[source]: Open the file in text mode, read it, and close the file.

readlink()[source]: Return the path to which the symbolic link points.

rename(target)[source]

Rename this path to the target path.

The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.

Returns the new Path instance pointing to the target path.

replace(target)[source]

Rename this path to the target path, overwriting if that path exists.

The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.

Returns the new Path instance pointing to the target path.
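
A short sketch of replace() inside a temporary directory, with hypothetical file names. Passing an absolute target sidesteps the working-directory caveat described above.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    draft = Path(tmp) / "costs_draft.csv"
    final = Path(tmp) / "costs.csv"
    draft.write_text("feature,cost\nforest,5\n")
    final.write_text("stale contents\n")

    # replace() overwrites the existing target and returns a Path for it.
    moved = draft.replace(final)

    draft_still_exists = draft.exists()
    final_text = final.read_text()

print(moved.name, draft_still_exists)
```

rename() is called the same way; the description above only promises overwriting for replace(), so replace() is the safer choice when the target may already exist.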

resolve(strict=False)[source]: Make the path absolute, resolving all symlinks on the way and also normalizing it.

rglob(pattern)[source]: Recursively yield all existing files (of any kind, including directories) matching the given relative pattern, anywhere in this subtree.

rmdir()[source]: Remove this directory. The directory must be empty.

samefile(other_path)[source]: Return whether other_path is the same or not as this file (as returned by os.path.samefile()).

stat(*, follow_symlinks=True)[source]: Return the result of the stat() system call on this path, like os.stat() does.

symlink_to(target, target_is_directory=False)[source]: Make this path a symlink pointing to the target path. Note the order of arguments (link, target) is the reverse of os.symlink.

touch(mode=438, exist_ok=True)[source]: Create this file with the given access mode, if it doesn’t exist.

unlink(missing_ok=False)[source]: Remove this file or link. If the path is a directory, use rmdir() instead.

write_bytes(data)[source]: Open the file in bytes mode, write to it, and close the file.

write_text(data, encoding=None, errors=None, newline=None)[source]: Open the file in text mode, write to it, and close the file.
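
Several of the methods listed above are typically combined when preparing an output location. The sketch below uses a temporary directory and hypothetical names.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "assumptions" / "v1"
    # parents=True also creates the intermediate "assumptions" directory.
    out_dir.mkdir(parents=True, exist_ok=True)

    target = out_dir / "costs.json"
    target.write_text('{"forest": 5}', encoding="utf-8")

    # rglob() searches the whole subtree below the starting directory.
    found = sorted(p.name for p in Path(tmp).rglob("*.json"))
    text_back = target.read_text(encoding="utf-8")

print(found, text_back)
```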

pyorps.core.cost_assumptions.calculate_column_statistics(gdf, columns, max_features_per_column=50)[source]

Calculate statistical properties of columns for feature selection.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
columns (list[str]) – list of column names to analyze
max_features_per_column (int) – Maximum number of unique values for a column to be considered categorical

Return type:

dict[str, dict[str, Any]]

Returns:

dictionary with column statistics

Raises:

ColumnAnalysisError – When column analysis fails unexpectedly
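
The role of max_features_per_column can be illustrated without a GeoDataFrame. The sketch below is not the pyorps implementation: sketch_column_statistics, its plain-dict input and the keys of the returned dictionaries are all hypothetical, chosen only to show a unique-value threshold at work.

```python
from collections import Counter


def sketch_column_statistics(records, columns, max_features_per_column=50):
    """Count unique values per column and flag columns that look categorical."""
    stats = {}
    for column in columns:
        counts = Counter(
            row[column] for row in records if row.get(column) is not None
        )
        stats[column] = {
            "unique_count": len(counts),
            "value_counts": dict(counts),
            # Categorical here means: at least one value, at most the limit.
            "is_categorical": 0 < len(counts) <= max_features_per_column,
        }
    return stats


records = [
    {"landuse": "forest", "id": 1},
    {"landuse": "forest", "id": 2},
    {"landuse": "road", "id": 3},
]
stats = sketch_column_statistics(records, ["landuse", "id"], max_features_per_column=2)
print(stats["landuse"]["is_categorical"], stats["id"]["is_categorical"])
```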

pyorps.core.cost_assumptions.calculate_entropy_score(column_name, col_stats)[source]

Calculate combined entropy score for a column, weighing area entropy more heavily.

Parameters:

column_name (str) – Name of the column to calculate score for
col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Return type:

float

Returns:

Combined entropy score
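
One way to picture a combined score that weighs area entropy more heavily is a weighted sum of two Shannon entropies. The sketch below is an assumption, not the formula used by pyorps: the function names, the inputs and the 0.7 weight are all hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a list of non-negative weights."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probabilities = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def sketch_entropy_score(counts, areas, area_weight=0.7):
    """Blend count entropy and area entropy; area_weight > 0.5 favours area."""
    count_entropy = shannon_entropy(counts)
    area_entropy = shannon_entropy(areas)
    return area_weight * area_entropy + (1.0 - area_weight) * count_entropy


# Two categories with equal feature counts but very unequal areas.
print(sketch_entropy_score(counts=[10, 10], areas=[99.0, 1.0]))
```

With such a weighting, a column whose categories cover similar shares of the total area scores higher than one dominated by a single category, even if the feature counts are balanced.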

pyorps.core.cost_assumptions.calculate_geometry_area(geometries)[source]

Calculate the sum of areas for a collection of geometries.

Parameters:

geometries (GeoSeries) – Collection of geometry objects

Return type:

float

Returns:

Sum of areas of all geometries with area attribute
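
The phrase "all geometries with area attribute" suggests duck typing. A minimal sketch of that idea follows; the Rectangle stand-in and sketch_total_area are hypothetical and are not the pyorps code.

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    """Stand-in for a geometry object that exposes an area attribute."""
    width: float
    height: float

    @property
    def area(self):
        return self.width * self.height


def sketch_total_area(geometries):
    """Sum the area of every object that has an area attribute."""
    # None entries and unrelated objects have no "area" and are skipped.
    return float(sum(g.area for g in geometries if hasattr(g, "area")))


shapes = [Rectangle(2.0, 3.0), Rectangle(1.0, 1.0), None, "not a geometry"]
total = sketch_total_area(shapes)
print(total)
```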

pyorps.core.cost_assumptions.column_shows_relationship_to_main_feature(gdf, main_feature, side_feature)[source]

Determine if a column adds meaningful information in relation to the main feature.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame containing the data
main_feature (str) – Name of the main feature column
side_feature (str) – Name of the potential side feature column

Return type:

bool

Returns:

True if the column shows a meaningful relationship, False otherwise

pyorps.core.cost_assumptions.detect_feature_columns(gdf, max_features_per_column=50)[source]

Analyze columns in a geodataframe to identify the best candidates for main_feature and side_features based on statistical metrics.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
max_features_per_column (int) – Maximum number of unique values allowed in a categorical column

Return type:

tuple[str, list[str]]

Returns:

tuple of (main_feature, side_features)

Raises:

NoSuitableColumnsError – When no suitable columns are found for feature selection

pyorps.core.cost_assumptions.find_side_features(gdf, main_feature, col_stats)[source]

Find suitable side feature columns that refine the main feature.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
main_feature (str) – Selected main feature column name
col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics

Return type:

list[str]

Returns:

list of side feature column names

pyorps.core.cost_assumptions.get_zero_cost_assumptions(gdf, main_feature, side_features)[source]

Generate cost assumptions with zero values for all feature combinations.

Creates structures matching the format expected by CostAssumptions:

- Without side features: {main_feature: {val1: 0, val2: 0, …}}
- With side features: {(main_feature, side_feature1, …): {(val1, val2, …): 0, …}}

Parameters:

gdf (GeoDataFrame) – GeoDataFrame with feature columns
main_feature (str) – Primary feature column name
side_features (list[str]) – List of secondary feature column names

Returns:

Instance of CostAssumptions with zero costs for all feature combinations

Return type:

CostAssumptions
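
The two dictionary layouts described above can be built from plain records as follows. This is a minimal sketch: the helper name and the list-of-dicts input are assumptions, and it returns a bare dictionary rather than a CostAssumptions instance.

```python
def zero_cost_dict(rows, main_feature, side_features=None):
    """Build a zero-valued cost dictionary in the nested or tuple-keyed layout."""
    side_features = side_features or []
    if not side_features:
        # Layout without side features: {main_feature: {value: 0, ...}}
        values = sorted({row[main_feature] for row in rows})
        return {main_feature: {value: 0 for value in values}}
    # Layout with side features: {(columns...): {(values...): 0, ...}}
    columns = (main_feature, *side_features)
    combos = sorted({tuple(row[c] for c in columns) for row in rows})
    return {columns: {combo: 0 for combo in combos}}


rows = [
    {"landuse": "forest", "protected": "yes"},
    {"landuse": "forest", "protected": "no"},
    {"landuse": "meadow", "protected": "no"},
]
simple = zero_cost_dict(rows, "landuse")
combined = zero_cost_dict(rows, "landuse", ["protected"])
```

Only combinations that actually occur in the data receive an entry, so the tuple-keyed layout stays small even when the columns have many values.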

pyorps.core.cost_assumptions.save_empty_cost_assumptions(geo_dataset, save_path, main_feature=None, side_features=None, file_type='csv', **kwargs)[source]

Generate and save empty cost assumptions with zero values for a geo dataset.

This function analyzes the given dataset to detect appropriate feature columns, creates a CostAssumptions object with zero costs for all feature combinations, and saves it to the specified path in the requested format.

Parameters:

geo_dataset (Any) – GeoDataset object with a ‘data’ attribute containing a GeoDataFrame
save_path (Union[str, Path]) – File path where the cost assumptions should be saved
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List containing a single column name for the secondary feature
file_type (str) – Output file format, one of ‘json’, ‘csv’, or ‘excel’ (default is ‘csv’)

Raises:

TypeError – If file_type is not one of the supported formats
NoSuitableColumnsError – If no suitable columns can be detected in the dataset

Returns:

This function saves to a file and doesn’t return a value

Return type:

None

pyorps.core.cost_assumptions.select_main_feature(col_stats)[source]

Select the best main feature column based on statistics.

Parameters:: col_stats (dict[str, dict[str, Any]]) – dictionary with column statistics
Return type:: str
Returns:: Name of the best main feature column

pyorps.core.exceptions module

Exceptions for CostAssumptions

exception pyorps.core.exceptions.AlgorthmNotImplementedError(algorithm, graph_library)[source]

Bases: Exception

Custom exception if a specific algorithm is not implemented in the API or the graph library

exception pyorps.core.exceptions.ColumnAnalysisError[source]

Bases: FeatureColumnError

Exception raised when column analysis fails

exception pyorps.core.exceptions.CostAssumptionsError[source]

Bases: Exception

Base exception for CostAssumptions class.

exception pyorps.core.exceptions.FeatureColumnError[source]

Bases: Exception

Base exception for feature column detection errors

exception pyorps.core.exceptions.FileLoadError[source]

Bases: CostAssumptionsError

Exception raised when loading files fails.

exception pyorps.core.exceptions.FormatError[source]

Bases: CostAssumptionsError

Exception raised when data format is invalid.

exception pyorps.core.exceptions.InvalidSourceError[source]

Bases: CostAssumptionsError

Exception raised when the provided source is invalid.

exception pyorps.core.exceptions.NoPathFoundError(source, target)[source]

Bases: Exception

Custom exception if no path can be found in the graph for source and target

exception pyorps.core.exceptions.NoSuitableColumnsError[source]

Bases: FeatureColumnError

Exception raised when no suitable columns are found

exception pyorps.core.exceptions.PairwiseError[source]

Bases: Exception

Custom exception if pairwise computation fails

exception pyorps.core.exceptions.RasterShapeError(raster_shape)[source]

Bases: Exception

Custom exception if the raster shape is not supported

exception pyorps.core.exceptions.WFSConnectionError[source]

Bases: WFSError

Exception raised for connection issues with WFS services.

exception pyorps.core.exceptions.WFSError[source]

Bases: Exception

Base exception for WFS-related errors.

exception pyorps.core.exceptions.WFSLayerNotFoundError[source]

Bases: WFSError

Exception raised when a requested layer cannot be found.

exception pyorps.core.exceptions.WFSResponseParsingError[source]

Bases: WFSError

Exception raised when parsing WFS responses fails.
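
Because the WFS exceptions share the WFSError base class, a caller can handle one specific failure separately and fall back to the base class for the rest. The sketch below redefines three of the classes as stand-ins so that it runs without pyorps installed; real code would import them from pyorps.core.exceptions. The fetch_layer function is hypothetical.

```python
# Stand-ins mirroring the documented inheritance; real code would import
# these names from pyorps.core.exceptions instead of redefining them.
class WFSError(Exception):
    """Base exception for WFS-related errors."""


class WFSConnectionError(WFSError):
    """Connection issues with WFS services."""


class WFSLayerNotFoundError(WFSError):
    """A requested layer cannot be found."""


def fetch_layer(name):
    """Hypothetical loader that always fails, used only to demonstrate handling."""
    raise WFSLayerNotFoundError(f"layer {name!r} not found")


def describe_failure(name):
    """Handle the connection case specifically and every other WFS error generically."""
    try:
        fetch_layer(name)
    except WFSConnectionError:
        return "retry later"
    except WFSError as exc:
        return f"WFS problem: {exc}"
    return "ok"


message = describe_failure("roads")
```

The order of the except clauses matters: the subclass must be listed before the base class, otherwise the base handler would catch connection errors as well.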

pyorps.core.path module

class pyorps.core.path.Any(*args, **kwargs)[source]

Bases: object

Special type indicating an unconstrained type.

Any is compatible with every type.
Any assumed to have all methods.
All values assumed to be instances of Any.

Note that all the above statements are true from the point of view of static type checkers. At runtime, Any should not be used with instance checks.

class pyorps.core.path.LineString(coordinates=None)[source]

Bases: BaseGeometry

A geometry type composed of one or more line segments.

A LineString is a one-dimensional feature and has a non-zero length but zero area. It may approximate a curve and need not be straight. A LineString may be closed.

Parameters:: coordinates (sequence) – A sequence of (x, y[, z]) numeric coordinate pairs or triples, or an array-like with shape (N, 2) or (N, 3). It can also be a sequence of Point objects, or a combination of both.

Examples

Create a LineString with two segments

>>> from shapely import LineString
>>> a = LineString([[0, 0], [1, 0], [1, 1]])
>>> a.length
2.0

offset_curve(distance, quad_segs=16, join_style=BufferJoinStyle.round, mitre_limit=5.0)[source]

Return a (Multi)LineString at a distance from the object.

The side, left or right, is determined by the sign of the distance parameter (negative for right side offset, positive for left side offset). The resolution of the buffer around each vertex of the object increases by increasing the quad_segs keyword parameter.

The join style is for outside corners between line segments. Accepted values are JOIN_STYLE.round (1), JOIN_STYLE.mitre (2), and JOIN_STYLE.bevel (3).

The mitre ratio limit is used for very sharp corners. It is the ratio of the distance from the corner to the end of the mitred offset corner. When two line segments meet at a sharp angle, a miter join will extend far beyond the original geometry. To prevent unreasonable geometry, the mitre limit allows controlling the maximum length of the join corner. Corners with a ratio which exceed the limit will be beveled.

Note: the behaviour regarding orientation of the resulting line depends on the GEOS version. With GEOS < 3.11, the line retains the same direction for a left offset (positive distance) or has reverse direction for a right offset (negative distance), and this behaviour was documented as such in previous Shapely versions. Starting with GEOS 3.11, the function tries to preserve the orientation of the original line.

parallel_offset(distance, side='right', resolution=16, join_style=BufferJoinStyle.round, mitre_limit=5.0)[source]

Alternative method to offset_curve() method.

Older alternative to the offset_curve() method that uses resolution instead of quad_segs and a side keyword (‘left’ or ‘right’) instead of the sign of the distance. This method is kept for backwards compatibility for now, but it is recommended to use offset_curve() instead.

svg(scale_factor=1.0, stroke_color=None, opacity=None)[source]

Return SVG polyline element for the LineString geometry.

Parameters:

scale_factor (float) – Multiplication factor for the SVG stroke-width. Default is 1.
stroke_color (str, optional) – Hex string for stroke color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.
opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.8

property xy

Separate arrays of X and Y coordinate values.

Examples

>>> from shapely import LineString
>>> x, y = LineString([(0, 0), (1, 1)]).xy
>>> list(x)
[0.0, 1.0]
>>> list(y)
[0.0, 1.0]

class pyorps.core.path.Path(source, target, algorithm, graph_api, path_indices, path_coords, path_geometry, euclidean_distance, runtimes, path_id, search_space_buffer_m, neighborhood, total_length=None, total_cost=None, length_by_category=None, length_by_category_percent=None)[source]

Bases: object

Dataclass representing a path in a raster graph. Used as container for all path metrics and information.

algorithm: str

euclidean_distance: float

graph_api: str

length_by_category: Optional[dict[float, float]] = None

length_by_category_percent: Optional[dict[float, float]] = None

neighborhood: str

path_coords: list[Union[tuple[float, float], list[float]]]

path_geometry: LineString

path_id: int

path_indices: Union[list[Union[int, int32, int64, uint32, uint64]], ndarray[int]]

runtimes: dict[str, float]

search_space_buffer_m: float

source: Union[tuple[float, float], list[float]]

target: Union[tuple[float, float], list[float]]

to_geodataframe_dict()[source]

Convert Path object to a dictionary suitable for GeoDataFrame creation.

Return type:: dict
Returns:: dictionary with path data formatted for GeoDataFrame

total_cost: Optional[float] = None

total_length: Optional[float] = None

class pyorps.core.path.PathCollection[source]

Bases: object

Container for Path objects with O(1) retrieval by path ID and O(n) lookup by source and target. A Path can be added under a new ID, or it can replace a Path with the same ID that already exists in the PathCollection.

_next_id: int

_paths: dict[int, Path]

add(path, replace=False)[source]

Add a path to the PathCollection. If the Path’s path_id is None or if replace is False, the path_id of the Path object is set to self._next_id and self._next_id is incremented. If the Path’s path_id is not None and replace is True, a Path with the same path_id (if present) is replaced with the new Path object.

Parameters:

path (Path) – A Path object which should be added to the PathCollection.
replace (bool) – Whether to replace an existing Path object with the same path_id (if present) or not.

Return type:

None

property all

Return all Path objects from the values of the PathCollection’s _paths dictionary as a list.

Returns:: A list of all Path objects in the PathCollection.

get(path_id=None, source=None, target=None)[source]

Retrieve a stored path by ID, or by source AND target.

Parameters:

path_id (int) – The ID of the Path object to retrieve (must be None if the path should be found by source and target)
source (Any) – The source of the Path object to retrieve (only used if path_id is None and target is set too; ignored otherwise)
target (Any) – The target of the Path object to retrieve (only used if path_id is None and source is set too; ignored otherwise)

Return type:

Optional[Path]

Returns:

The Path object with the specified ID or source/target pair. None if no such path exists.
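
The add and get semantics described above can be mirrored in a few lines. This is a simplified sketch rather than the pyorps implementation: MiniPath and MiniPathCollection are hypothetical stand-ins, the starting ID of 0 is an assumption, and so is the guard that keeps the ID counter ahead of replaced IDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniPath:
    """Hypothetical stand-in for Path with only the fields needed here."""
    source: tuple
    target: tuple
    path_id: Optional[int] = None


class MiniPathCollection:
    """Dictionary-backed container: O(1) lookup by ID, O(n) lookup by endpoints."""

    def __init__(self):
        self._paths = {}
        self._next_id = 0  # assumed starting value

    def add(self, path, replace=False):
        if path.path_id is None or not replace:
            # Assign a fresh ID, ignoring any ID the path already carries.
            path.path_id = self._next_id
            self._next_id += 1
        else:
            # Assumed guard: keep fresh IDs from colliding with a replaced ID.
            self._next_id = max(self._next_id, path.path_id + 1)
        self._paths[path.path_id] = path

    def get(self, path_id=None, source=None, target=None):
        if path_id is not None:
            return self._paths.get(path_id)
        if source is not None and target is not None:
            for path in self._paths.values():
                if path.source == source and path.target == target:
                    return path
        return None


paths = MiniPathCollection()
paths.add(MiniPath(source=(0, 0), target=(5, 5)))
paths.add(MiniPath(source=(1, 1), target=(9, 9)))
# Same ID with replace=True overwrites the first entry instead of adding a third.
paths.add(MiniPath(source=(2, 2), target=(5, 5), path_id=0), replace=True)
```

After these calls the collection still holds two paths, and the original path from (0, 0) to (5, 5) can no longer be found.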

to_geodataframe_records()[source]

Convert all paths to a list of dictionaries suitable for a GeoDataFrame.

Return type:: list
Returns:: List of dictionaries with path data formatted for a GeoDataFrame

pyorps.core.path.dataclass(cls=None, /, *, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False, match_args=True, kw_only=False, slots=False, weakref_slot=False)[source]

Add dunder methods based on the fields defined in the class.

Examines PEP 526 __annotations__ to determine fields.

If init is true, an __init__() method is added to the class. If repr is true, a __repr__() method is added. If order is true, rich comparison dunder methods are added. If unsafe_hash is true, a __hash__() method is added. If frozen is true, fields may not be assigned to after instance creation. If match_args is true, the __match_args__ tuple is added. If kw_only is true, then by default all fields are keyword-only. If slots is true, a new class with a __slots__ attribute is returned.
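
The effect of these flags is easiest to see on a small class. The Segment class below is a hypothetical example, not part of pyorps; like Path, it places fields with a None default after the required ones.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(order=True, frozen=True)
class Segment:
    length: float
    label: str
    cost: Optional[float] = None  # fields with defaults must follow required fields


short = Segment(1.5, "a")
long = Segment(3.0, "b", cost=12.0)

# frozen=True turns assignment after creation into an error.
frozen_blocked = False
try:
    short.length = 2.0
except FrozenInstanceError:
    frozen_blocked = True
```

The generated __init__ accepts the fields in declaration order, __repr__ and __eq__ come from the defaults, and order=True adds comparisons that treat the fields as a tuple.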

pyorps.core.types module

class pyorps.core.types.CostAssumptions(source=None)[source]

Bases: object

A class for handling cost assumptions for rasterization.

This class handles:

- Loading cost assumptions from files (CSV, Excel, JSON) or generating them from a dictionary or a GeoDataFrame
- Mapping costs to features in a GeoDataFrame
- Managing hierarchical cost structures

_apply_nested_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on nested dictionary cost assumptions.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to update with cost values
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List containing a single column name for the secondary feature

Returns:

None (modifies gdf in-place)

_apply_tuple_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on tuple keys in cost assumptions.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to update with cost values
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List of column names for secondary features

Returns:

None (modifies gdf in-place)

static _convert_numeric_columns(df)[source]

Convert columns to numeric, handling different decimal separators.

Parameters:

df (DataFrame) – DataFrame with potential numeric columns that might use different decimal separators

Return type:

DataFrame

Returns:

DataFrame with properly converted numeric columns
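
One way to handle mixed decimal separators is a per-value heuristic like the one sketched below. The helper name and the rules are assumptions for illustration only; the text above does not describe which rules the method itself applies.

```python
def parse_number(text):
    """Parse a numeric string that may use ',' or '.' as decimal separator.

    Assumed heuristic: when both characters occur, the one appearing last is
    the decimal separator and the other marks thousands; a lone comma is
    treated as a decimal separator. Returns None if the text is not numeric.
    """
    cleaned = text.strip().replace(" ", "")
    if "," in cleaned and "." in cleaned:
        if cleaned.rfind(",") > cleaned.rfind("."):
            # European style, e.g. 1.234,56
            cleaned = cleaned.replace(".", "").replace(",", ".")
        else:
            # English style, e.g. 1,234.56
            cleaned = cleaned.replace(",", "")
    elif "," in cleaned:
        cleaned = cleaned.replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


values = [parse_number(s) for s in ["1,5", "1.234,56", "1,234.56", "forest"]]
```

A column would then count as numeric only if every non-empty value parses, which keeps category columns such as land-use names untouched.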

_load_csv_cost_assumptions(filepath)[source]

Load cost assumptions from a CSV file with auto-detection of encoding, delimiter, and decimal separator.

Parameters:: filepath (str) – Path to the CSV file
Return type:: dict
Returns:: dictionary of cost assumptions

_load_excel_cost_assumptions(filepath)[source]

Load cost assumptions from an Excel file, handling different decimal separators.

Parameters:: filepath (str) – Path to the Excel file
Return type:: dict
Returns:: dictionary of cost assumptions

_load_json_cost_assumptions(filepath)[source]

Load cost assumptions from a JSON file with auto-detection of encoding.

Parameters:: filepath (str) – Path to the JSON file
Return type:: dict
Returns:: dictionary of cost assumptions

apply_to_geodataframe(gdf, main_feature=None, side_features=None)[source]

Apply cost assumptions to a GeoDataFrame.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to apply costs to
main_feature (Optional[str]) – Main feature column name
side_features (Optional[list[str]]) – list of side feature column names or single side feature name

Returns:

GeoDataFrame with ‘cost’ column added

convert_df_to_cost_dict(df)[source]

Convert a DataFrame to a nested dictionary for cost assumptions.

Parameters:: df (DataFrame) – DataFrame containing cost assumptions with hierarchical structure
Return type:: dict
Returns:: dictionary of cost assumptions with nested structure based on DataFrame columns

Uses one numeric column for costs, and all other columns as a hierarchical index:

- The first column is the ‘main_feature’
- All additional columns are ‘side_features’
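
The hierarchical indexing can be sketched with plain tuples in place of a DataFrame. The helper name and the convention that the cost is the last element of each row are assumptions for this example; the exact dictionary layout produced by the method is not spelled out above.

```python
def rows_to_cost_dict(rows):
    """Nest each row's leading values as keys and use its last value as the cost."""
    cost_dict = {}
    for *keys, cost in rows:
        level = cost_dict
        # Walk or create one dictionary level per key except the innermost one.
        for key in keys[:-1]:
            level = level.setdefault(key, {})
        level[keys[-1]] = cost
    return cost_dict


rows = [
    ("forest", "protected", 50.0),
    ("forest", "unprotected", 10.0),
    ("meadow", "protected", 20.0),
]
nested = rows_to_cost_dict(rows)
```

The first element of each row plays the role of the main feature and the second that of a side feature; rows with only a main feature and a cost produce a flat dictionary.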

cost_dict_to_df(cost_dict)[source]

Convert cost assumptions dictionary to DataFrame.

Parameters:: cost_dict (dict) – Dictionary of cost assumptions
Return type:: DataFrame
Returns:: DataFrame representation of cost assumptions

load(source)[source]

Load cost assumptions from a file or dictionary.

Parameters:: source (Union[str, dict]) – Path to a file or a dictionary containing cost assumptions
Return type:: dict
Returns:: dictionary of cost assumptions

to_csv(filepath, separator=';', decimal='.', encoding='ISO-8859-1')[source]

Save the cost assumptions to a CSV file.

Parameters:

filepath (str) – Path where to save the CSV file
separator (str) – Column separator character (default is ‘;’)
decimal (str) – Decimal separator character (default is ‘.’)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None
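
The separator and decimal parameters interact: a semicolon as column separator leaves the comma free to serve as decimal character. The sketch below shows that combination with the standard csv module. The helper name and the row layout are assumptions, and it writes to a string rather than calling to_csv.

```python
import csv
import io


def write_costs(rows, header, separator=";", decimal="."):
    """Return CSV text for the rows, formatting costs with the chosen decimal character."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=separator, lineterminator="\n")
    writer.writerow(header)
    for *keys, cost in rows:
        # The cost is assumed to be the last element of each row.
        writer.writerow([*keys, str(cost).replace(".", decimal)])
    return buffer.getvalue()


text = write_costs([("forest", 12.5), ("meadow", 3.0)], ["landuse", "cost"], decimal=",")
encoded = text.encode("ISO-8859-1")  # the default encoding documented for to_csv
```

Because the comma is not the column separator here, the decimal commas need no quoting and the file stays readable in spreadsheet tools that expect this layout.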

to_excel(filepath, sheet_name='CostAssumptions', index=False)[source]

Save the cost assumptions to an Excel file.

Parameters:

filepath (str) – Path where to save the Excel file
sheet_name (str) – Name of the worksheet (default is ‘CostAssumptions’)
index (bool) – Whether to write row indices (default is False)

Return type:

None

to_json(filepath, indent=2, encoding='ISO-8859-1')[source]

Save the cost assumptions to a JSON file.

Parameters:

filepath (str) – Path where to save the JSON file
indent (int) – Number of spaces for indentation (default is 2)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

class pyorps.core.types.GeoDataFrame(data=None, *args, geometry=None, crs=None, **kwargs)[source]

Bases: GeoPandasBase, DataFrame

A GeoDataFrame object is a pandas.DataFrame that has one or more columns containing geometry. In addition to the standard DataFrame constructor arguments, GeoDataFrame also accepts the following keyword arguments:

Parameters:

crs (value (optional)) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
geometry (str or array-like (optional)) –
Value to use as the active geometry column. If str, treated as column name to use. If array-like, it will be added as new column named ‘geometry’ on the GeoDataFrame and set as the active geometry column.

Note that if geometry is a (Geo)Series with a name, the name will not be used, a column named “geometry” will still be added. To preserve the name, you can use rename_geometry() to update the geometry column name.

Examples

Constructing GeoDataFrame from a dictionary.

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Notice that the inferred dtype of ‘geometry’ columns is geometry.

>>> gdf.dtypes
col1          object
geometry    geometry
dtype: object

Constructing GeoDataFrame from a pandas DataFrame with a column of WKT geometries:

>>> import pandas as pd
>>> d = {'col1': ['name1', 'name2'], 'wkt': ['POINT (1 2)', 'POINT (2 1)']}
>>> df = pd.DataFrame(d)
>>> gs = geopandas.GeoSeries.from_wkt(df['wkt'])
>>> gdf = geopandas.GeoDataFrame(df, geometry=gs, crs="EPSG:4326")
>>> gdf
    col1          wkt     geometry
0  name1  POINT (1 2)  POINT (1 2)
1  name2  POINT (2 1)  POINT (2 1)

See also

GeoSeries: Series object designed to store shapely geometry objects

_attrs: dict[Hashable, Any]

_cache: dict[str, Any]

property _constructor: Used when a manipulation result has the same dimensions as the original.

_constructor_from_mgr(mgr, axes)[source]

property _constructor_sliced

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:

data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.
index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.
dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.
name (Hashable, default None) – The name to give to the Series.
copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.

_constructor_sliced_from_mgr(mgr, axes)[source]

_geometry_column_name = None

_get_geometry()[source]

_internal_names: list[str] = ['_mgr', '_cacher', '_item_cache', '_cache', '_is_copy', '_name', '_metadata', '_flags', 'geometry']

_internal_names_set: set[str] = {'_cache', '_cacher', '_flags', '_is_copy', '_item_cache', '_metadata', '_mgr', '_name', 'geometry'}

_metadata: list[str] = ['_geometry_column_name']

_mgr: BlockManager | ArrayManager

_persist_old_default_geometry_colname()[source]: Internal util to temporarily persist the default geometry column name of ‘geometry’ for backwards compatibility.

_set_geometry(col)[source]

property active_geometry_name

Return the name of the active geometry column

Returns a string name if a GeoDataFrame has an active geometry column set. Otherwise returns None. You can also access the active geometry column using the .geometry property. You can set a GeoSeries to be an active geometry using the set_geometry() method.

Returns:: name of an active geometry column or None
Return type:: str

See also

GeoDataFrame.set_geometry: set the active geometry

apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)[source]

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

Parameters:

data (ndarray (structured or homogeneous), Iterable, dict, or DataFrame) –
Dict can contain Series, arrays, constants, dataclass or list-like objects. If data is a dict, column order follows insertion-order. If a dict contains Series which have an index defined, it is aligned by its index. This alignment also occurs if data is a Series or a DataFrame itself. Alignment is done on Series/DataFrame inputs.

If data is a list of dicts, column order follows insertion-order.
index (Index or array-like) – Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided.
columns (Index or array-like) – Column labels to use for resulting frame when data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels, will perform column selection instead.
dtype (dtype, default None) – Data type to force. Only a single dtype is allowed. If None, infer.
copy (bool or None, default None) –
Copy data from inputs. For dict data, the default of None behaves like copy=True. For DataFrame or 2d ndarray input, the default of None behaves like copy=False. If data is a dict containing one or more Series (possibly of different dtypes), copy=False will ensure that these inputs are not copied.

Changed in version 1.3.0.

See also

DataFrame.from_records: Constructor from tuples, also record arrays.
DataFrame.from_dict: From dicts of Series, arrays, or dicts.
read_csv: Read a comma-separated values (csv) file into DataFrame.
read_table: Read general delimited file into DataFrame.
read_clipboard: Read text from clipboard into DataFrame.

Notes

Please reference the User Guide for more information.

Examples

Constructing DataFrame from a dictionary.

>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> df
   col1  col2
0     1     3
1     2     4

Notice that the inferred dtype is int64.

>>> df.dtypes
col1    int64
col2    int64
dtype: object

To enforce a single dtype:

>>> df = pd.DataFrame(data=d, dtype=np.int8)
>>> df.dtypes
col1    int8
col2    int8
dtype: object

Constructing DataFrame from a dictionary including Series:

>>> d = {'col1': [0, 1, 2, 3], 'col2': pd.Series([2, 3], index=[2, 3])}
>>> pd.DataFrame(data=d, index=[0, 1, 2, 3])
   col1  col2
0     0   NaN
1     1   NaN
2     2   2.0
3     3   3.0

Constructing DataFrame from numpy ndarray:

>>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
...                    columns=['a', 'b', 'c'])
>>> df2
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

Constructing DataFrame from a numpy ndarray that has labeled columns:

>>> data = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
...                 dtype=[("a", "i4"), ("b", "i4"), ("c", "i4")])
>>> df3 = pd.DataFrame(data, columns=['c', 'a'])
...
>>> df3
   c  a
0  3  1
1  6  4
2  9  7

Constructing DataFrame from dataclass:

>>> from dataclasses import make_dataclass
>>> Point = make_dataclass("Point", [("x", int), ("y", int)])
>>> pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])
   x  y
0  0  0
1  0  3
2  2  3

Constructing DataFrame from Series/DataFrame:

>>> ser = pd.Series([1, 2, 3], index=["a", "b", "c"])
>>> df = pd.DataFrame(data=ser, index=["a", "c"])
>>> df
   0
a  1
c  3

>>> df1 = pd.DataFrame([1, 2, 3], index=["a", "b", "c"], columns=["x"])
>>> df2 = pd.DataFrame(data=df1, index=["a", "c"])
>>> df2
   x
a  1
c  3

astype(dtype, copy=None, errors='raise', **kwargs)[source]

Cast a pandas object to a specified dtype.

Returns a GeoDataFrame when the geometry column is kept as geometries, otherwise returns a pandas DataFrame. See the pandas.DataFrame.astype docstring for more details.

Return type:: GeoDataFrame or DataFrame

clip(mask, keep_geom_type=False, sort=False)[source]

Clip points, lines, or polygon geometries to the mask extent.

Both layers must be in the same Coordinate Reference System (CRS). The GeoDataFrame will be clipped to the full extent of the mask object.

If there are multiple polygons in mask, data from the GeoDataFrame will be clipped to the total boundary of all polygons in mask.

Parameters:

mask (GeoDataFrame, GeoSeries, (Multi)Polygon, list-like) – Polygon vector layer used to clip the GeoDataFrame. The mask’s geometry is dissolved into one geometric feature and intersected with GeoDataFrame. If the mask is list-like with four elements (minx, miny, maxx, maxy), clip will use a faster rectangle clipping (clip_by_rect()), possibly leading to slightly different results.
keep_geom_type (boolean, default False) – If True, return only geometries of original type in case of intersection resulting in multiple geometry types or GeometryCollections. If False, return all resulting geometries (potentially mixed types).
sort (boolean, default False) – If True, the order of rows in the clipped GeoDataFrame will be preserved at small performance cost. If False the order of rows in the clipped GeoDataFrame will be random.

Returns:

Vector data (points, lines, polygons) from the GeoDataFrame clipped to polygon boundary from mask.

Return type:

GeoDataFrame

See also

clip: equivalent top-level function

Examples

Clip points (grocery stores) with polygons (the Near West Side community):

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> near_west_side = chicago[chicago["community"] == "NEAR WEST SIDE"]
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)
>>> groceries.shape
(148, 8)

>>> nws_groceries = groceries.clip(near_west_side)
>>> nws_groceries.shape
(7, 8)

copy(deep=True)[source]

property crs

The Coordinate Reference System (CRS) represented as a pyproj.CRS object.

Returns a pyproj.CRS, or None if the CRS is not set. When setting, the value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. "EPSG:4326") or a WKT string.

Examples

>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame.set_crs: assign CRS
GeoDataFrame.to_crs: re-project to another CRS

dissolve(by=None, aggfunc='first', as_index=True, level=None, sort=True, observed=False, dropna=True, method='unary', **kwargs)[source]

Dissolve geometries within groupby into a single observation. This is accomplished by applying the union_all method to all geometries within a group.

Observations associated with each groupby group will be aggregated using the aggfunc.

Parameters:

by (str or list-like, default None) – Column(s) whose values define the groups to be dissolved. If None, the entire GeoDataFrame is considered as a single group. If a list-like object is provided, the values in the list are treated as categorical labels, and polygons will be combined based on the equality of these categorical labels.
aggfunc (function or string, default "first") –
Aggregation function for manipulation of data associated with each group. Passed to pandas groupby.agg method. Accepted combinations are:
- function
- string function name
- list of functions and/or function names, e.g. [np.sum, ‘mean’]
- dict of axis labels -> functions, function names or list of such.
as_index (boolean, default True) – If true, groupby columns become index of result.
level (int or str or sequence of int or sequence of str, default None) – If the axis is a MultiIndex (hierarchical), group by a particular level or levels.
sort (bool, default True) – Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. Groupby preserves the order of rows within each group.
observed (bool, default False) – This only applies if any of the groupers are Categoricals. If True: only show observed values for categorical groupers. If False: show all values for categorical groupers.
dropna (bool, default True) – If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.
method (str (default "unary")) –
The method to use for the union. Options are:
- "unary": use the unary union algorithm. This option is the most robust but can be slow for large numbers of geometries (default).
- "coverage": use the coverage union algorithm. This option is optimized for non-overlapping polygons and can be significantly faster than the unary union algorithm. However, it can produce invalid geometries if the polygons overlap.
**kwargs –
Keyword arguments to be passed to the pandas DataFrameGroupby.agg method which is used by dissolve. In particular, numeric_only may be supplied, which will be required in pandas 2.0 for certain aggfuncs.

Added in version 0.13.0.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Point
>>> d = {
...     "col1": ["name1", "name2", "name1"],
...     "geometry": [Point(1, 2), Point(2, 1), Point(0, 1)],
... }
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
2  name1  POINT (0 1)

>>> dissolved = gdf.dissolve('col1')
>>> dissolved
                        geometry
col1
name1  MULTIPOINT ((0 1), (1 2))
name2                POINT (2 1)
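
The aggfunc and as_index parameters described above can be combined. The following is a minimal sketch, assuming a hypothetical numeric column named "value" that is not part of the original example; it sums that column per group and keeps "col1" as a regular column instead of the index.

```python
import geopandas
from shapely.geometry import Point

# "value" is a hypothetical attribute column added for illustration.
gdf = geopandas.GeoDataFrame(
    {
        "col1": ["name1", "name2", "name1"],
        "value": [10, 20, 30],
        "geometry": [Point(1, 2), Point(2, 1), Point(0, 1)],
    },
    crs=4326,
)

# Merge geometries per group and sum "value" within each group.
# as_index=False keeps the grouping column as an ordinary column.
dissolved = gdf.dissolve(by="col1", aggfunc={"value": "sum"}, as_index=False)
print(dissolved[["col1", "value"]])
```

With a dict aggfunc, only the columns named in the dict are aggregated, which avoids applying a numeric aggregation to non-numeric columns.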

See also

GeoDataFrame.explode: explode multi-part geometries into single geometries

estimate_utm_crs(datum_name='WGS 84')[source]

Returns the estimated UTM CRS based on the bounds of the dataset.

Added in version 0.9.

Parameters:

datum_name (str, optional) – The name of the datum to use in the query. Default is WGS 84.

Return type:

pyproj.CRS

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.estimate_utm_crs()
<Derived Projected CRS: EPSG:32616>
Name: WGS 84 / UTM zone 16N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 90°W and 84°W, northern hemisphere between equator and 84°N...
- bounds: (-90.0, 0.0, -84.0, 84.0)
Coordinate Operation:
- name: UTM zone 16N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
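
A typical use of the estimated CRS is to re-project geographic coordinates into metres before measuring distances or areas. The sketch below assumes two hypothetical points near Chicago (they are not taken from the dataset above) and avoids downloading any data.

```python
import geopandas
from shapely.geometry import Point

# Two hypothetical points near Chicago, given as (longitude, latitude).
gdf = geopandas.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(-87.65, 41.85), Point(-87.60, 41.80)],
    crs="EPSG:4326",
)

# Estimate a suitable UTM CRS from the bounds, then re-project to it.
utm_crs = gdf.estimate_utm_crs()
projected = gdf.to_crs(utm_crs)

print(utm_crs.to_epsg())
print(projected.geometry.iloc[0].distance(projected.geometry.iloc[1]))
```

Because the points lie between 90°W and 84°W in the northern hemisphere, the estimate should fall in UTM zone 16N, as in the Chicago example above. The printed distance is then expressed in metres rather than degrees.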

explode(column=None, ignore_index=False, index_parts=False, **kwargs)[source]

Explode multi-part geometries into multiple single geometries.

Each row containing a multi-part geometry will be split into multiple rows with single geometries, thereby increasing the vertical size of the GeoDataFrame.

Parameters:

column (string, default None) – Column to explode. In the case of a geometry column, multi-part geometries are converted to single-part. If None, the active geometry column is used.
ignore_index (bool, default False) – If True, the resulting index will be labelled 0, 1, …, n - 1, ignoring index_parts.
index_parts (boolean, default False) – If True, the resulting index will be a multi-index (original index with an additional level indicating the multiple geometries: a new zero-based index for each single part geometry per multi-part geometry).

Returns:

Exploded geodataframe with each single geometry as a separate entry in the geodataframe.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import MultiPoint
>>> d = {
...     "col1": ["name1", "name2"],
...     "geometry": [
...         MultiPoint([(1, 2), (3, 4)]),
...         MultiPoint([(2, 1), (0, 0)]),
...     ],
... }
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1               geometry
0  name1  MULTIPOINT ((1 2), (3 4))
1  name2  MULTIPOINT ((2 1), (0 0))

>>> exploded = gdf.explode(index_parts=True)
>>> exploded
      col1     geometry
0 0  name1  POINT (1 2)
  1  name1  POINT (3 4)
1 0  name2  POINT (2 1)
  1  name2  POINT (0 0)

>>> exploded = gdf.explode(index_parts=False)
>>> exploded
    col1     geometry
0  name1  POINT (1 2)
0  name1  POINT (3 4)
1  name2  POINT (2 1)
1  name2  POINT (0 0)

>>> exploded = gdf.explode(ignore_index=True)
>>> exploded
    col1     geometry
0  name1  POINT (1 2)
1  name1  POINT (3 4)
2  name2  POINT (2 1)
3  name2  POINT (0 0)

See also

GeoDataFrame.dissolve: dissolve geometries into a single observation.

explore(*args, **kwargs)[source]

Interactive map based on GeoPandas and folium/leaflet.js

Generate an interactive leaflet map based on GeoDataFrame

Parameters:

column (str, np.array, pd.Series (default None)) – The name of the dataframe column, numpy.array, or pandas.Series to be plotted. If numpy.array or pandas.Series are used then it must have same length as dataframe.
cmap (str, matplotlib.Colormap, branca.colormap or function (default None)) –
The name of a colormap recognized by matplotlib, a list-like of colors, matplotlib.colors.Colormap, a branca.colormap.ColorMap or function that returns a named color or hex based on the column value, e.g.:
```
def my_colormap(value):  # scalar value defined in 'column'
    if value > 1:
        return "green"
    return "red"
```
color (str, array-like (default None)) – Named color or a list-like of colors (named or hex).
m (folium.Map (default None)) – Existing map instance on which to draw the plot.
tiles (str, xyzservices.TileProvider (default 'OpenStreetMap Mapnik')) –
Map tileset to use. Can choose from the list supported by folium, query a xyzservices.TileProvider by a name from xyzservices.providers, pass xyzservices.TileProvider object or pass custom XYZ URL. The current list of built-in providers (when xyzservices is not available):

["OpenStreetMap", "CartoDB positron", "CartoDB dark_matter"]

You can pass a custom tileset to Folium by passing a Leaflet-style URL to the tiles parameter: http://{s}.yourtiles.com/{z}/{x}/{y}.png. Be sure to check their terms and conditions and to provide attribution with the attr keyword.
attr (str (default None)) – Map tile attribution; only required if passing custom tile URL.
tooltip (bool, str, int, list (default True)) – Display GeoDataFrame attributes when hovering over the object. True includes all columns. False removes tooltip. Pass string or list of strings to specify a column(s). Integer specifies first n columns to be included. Defaults to True.
popup (bool, str, int, list (default False)) – Input GeoDataFrame attributes for object displayed when clicking. True includes all columns. False removes popup. Pass string or list of strings to specify a column(s). Integer specifies first n columns to be included. Defaults to False.
highlight (bool (default True)) – Enable highlight functionality when hovering over a geometry.
categorical (bool (default False)) – If False, cmap will reflect numerical values of the column being plotted. For non-numerical columns, this will be set to True.
legend (bool (default True)) – Plot a legend in choropleth plots. Ignored if no column is given.
scheme (str (default None)) – Name of a choropleth classification scheme (requires mapclassify >= 2.4.0). A mapclassify.classify() will be used under the hood. Supported are all schemes provided by mapclassify (e.g. 'BoxPlot', 'EqualInterval', 'FisherJenks', 'FisherJenksSampled', 'HeadTailBreaks', 'JenksCaspall', 'JenksCaspallForced', 'JenksCaspallSampled', 'MaxP', 'MaximumBreaks', 'NaturalBreaks', 'Quantiles', 'Percentiles', 'StdMean', 'UserDefined'). Arguments can be passed in classification_kwds.
k (int (default 5)) – Number of classes
vmin (None or float (default None)) – Minimum value of cmap. If None, the minimum data value in the column to be plotted is used.
vmax (None or float (default None)) – Maximum value of cmap. If None, the maximum data value in the column to be plotted is used.
width (pixel int or percentage string (default: '100%')) – Width of the folium Map. If the argument m is given explicitly, width is ignored.
height (pixel int or percentage string (default: '100%')) – Height of the folium Map. If the argument m is given explicitly, height is ignored.
categories (list-like) – Ordered list-like object of categories to be used for categorical plot.
classification_kwds (dict (default None)) – Keyword arguments to pass to mapclassify
control_scale (bool, (default True)) – Whether to add a control scale on the map.
marker_type (str, folium.Circle, folium.CircleMarker, folium.Marker (default None)) – Allowed string options are (‘marker’, ‘circle’, ‘circle_marker’). Defaults to folium.CircleMarker.
marker_kwds (dict (default {})) –
Additional keywords to be passed to the selected marker_type, e.g.:

- radius (float, default 2 for circle_marker and 50 for circle) – Radius of the circle, in meters (for circle) or pixels (for circle_marker).
- fill (bool, default True) – Whether to fill the circle or circle_marker with color.
- icon (folium.map.Icon) – The folium.map.Icon object to use to render the marker.
- draggable (bool, default False) – Set to True to be able to drag the marker around the map.
style_kwds (dict (default {})) –
Additional style to be passed to folium style_function:
- stroke (bool, default True) – Whether to draw stroke along the path. Set it to False to disable borders on polygons or circles.
- color (str) – Stroke color.
- weight (int) – Stroke width in pixels.
- opacity (float, default 1.0) – Stroke opacity.
- fill (boolean, default True) – Whether to fill the path with color. Set it to False to disable filling on polygons or circles.
- fillColor (str) – Fill color. Defaults to the value of the color option.
- fillOpacity (float, default 0.5) – Fill opacity.
- style_function (callable) – Function mapping a GeoJson Feature to a style dict. Style properties: folium.vector_layers.path_options(). GeoJson features: GeoDataFrame.__geo_interface__. E.g.:
lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6 else "blue"}
Plus all supported by folium.vector_layers.path_options(). See the documentation of folium.features.GeoJson for details.
highlight_kwds (dict (default {})) – Style to be passed to folium highlight_function. Uses the same keywords as style_kwds. When empty, defaults to {"fillOpacity": 0.75}.
tooltip_kwds (dict (default {})) – Additional keywords to be passed to folium.features.GeoJsonTooltip, e.g. aliases, labels, or sticky.
popup_kwds (dict (default {})) – Additional keywords to be passed to folium.features.GeoJsonPopup, e.g. aliases or labels.
legend_kwds (dict (default {})) –
Additional keywords to be passed to the legend.

Currently supported customisation:

- caption (string) – Custom caption of the legend. Defaults to the column name.

Additional accepted keywords when scheme is specified:

- colorbar (bool, default True) – An option to control the style of the legend. If True, continuous colorbar will be used. If False, categorical legend will be used for bins.
- scale (bool, default True) – Scale bins along the colorbar axis according to the bin edges (True) or use the equal length for each bin (False).
- fmt (string, default "{:.2f}") – A formatting specification for the bin edges of the classes in the legend. For example, to have no decimals: {"fmt": "{:.0f}"}. Applies if colorbar=False.
- labels (list-like) – A list of legend labels to override the auto-generated labels. Needs to have the same number of elements as the number of classes (k). Applies if colorbar=False.
- interval (boolean, default False) – An option to control brackets from mapclassify legend. If True, open/closed interval brackets are shown in the legend. Applies if colorbar=False.
- max_labels (int, default 10) – Maximum number of colorbar tick labels (requires branca>=0.5.0).
map_kwds (dict (default {})) – Additional keywords to be passed to folium Map, e.g. dragging, or scrollWheelZoom.

**kwargs (dict) – Additional options to be passed on to the folium object.

Returns:

m – folium Map instance

Return type:

folium.folium.Map

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.head(2)
   ComAreaID  ...                                           geometry
0         35  ...  POLYGON ((-87.60914 41.84469, -87.60915 41.844...
1         36  ...  POLYGON ((-87.59215 41.81693, -87.59231 41.816...

[2 rows x 87 columns]

>>> df.explore("Pop2012", cmap="Blues")

classmethod from_arrow(table, geometry=None)[source]

Construct a GeoDataFrame from an Arrow table object based on GeoArrow extension types.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function accepts any tabular Arrow object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ or __arrow_c_stream__ method).

Added in version 1.0.

Parameters:

table (pyarrow.Table or Arrow-compatible table) – Any tabular object implementing the Arrow PyCapsule Protocol (i.e. has an __arrow_c_array__ or __arrow_c_stream__ method). This table should have at least one column with a geoarrow geometry type.
geometry (str, default None) – The name of the geometry column to set as the active geometry column. If None, the first geometry column found will be used.

Return type:

GeoDataFrame

classmethod from_dict(data, geometry=None, crs=None, **kwargs)[source]

Construct GeoDataFrame from dict of array-like or dicts by overriding DataFrame.from_dict method with geometry and crs

Parameters:

data (dict) – Of the form {field : array-like} or {field : dict}.
geometry (str or array (optional)) – If str, column to use as geometry. If array, will be set as ‘geometry’ column on GeoDataFrame.
crs (str or dict (optional)) – Coordinate reference system to set on the resulting frame.
kwargs (key-word arguments) – These arguments are passed to DataFrame.from_dict

Return type:

GeoDataFrame
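
As a minimal sketch of the call described above, the following builds a GeoDataFrame from a dict of array-likes. The column names and coordinates are illustrative assumptions, not values taken from the library documentation.

```python
import geopandas
from shapely.geometry import Point

# Hypothetical input of the form {field: array-like}.
data = {
    "col1": ["name1", "name2"],
    "geometry": [Point(1, 2), Point(2, 1)],
}

# geometry names the column to use as the active geometry;
# crs is attached to the resulting frame without transforming coordinates.
gdf = geopandas.GeoDataFrame.from_dict(data, geometry="geometry", crs="EPSG:4326")
print(gdf)
```

Any extra keyword arguments (for example orient) are forwarded to DataFrame.from_dict before the geometry column and CRS are set.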

classmethod from_features(features, crs=None, columns=None)[source]

Alternate constructor to create GeoDataFrame from an iterable of features or a feature collection.

Parameters:

features –
- Iterable of features, where each element must be a feature dictionary or implement the __geo_interface__.
- Feature collection, where the ‘features’ key contains an iterable of features.
- Object holding a feature collection that implements the __geo_interface__.
crs (str or dict (optional)) – Coordinate reference system to set on the resulting frame.
columns (list of column names, optional) – Optionally specify the column names to include in the output frame. This does not overwrite the property names of the input, but can ensure a consistent output format.

Return type:

GeoDataFrame

Notes

For more information about the __geo_interface__, see https://gist.github.com/sgillies/2217756

Examples

>>> feature_coll = {
...     "type": "FeatureCollection",
...     "features": [
...         {
...             "id": "0",
...             "type": "Feature",
...             "properties": {"col1": "name1"},
...             "geometry": {"type": "Point", "coordinates": (1.0, 2.0)},
...             "bbox": (1.0, 2.0, 1.0, 2.0),
...         },
...         {
...             "id": "1",
...             "type": "Feature",
...             "properties": {"col1": "name2"},
...             "geometry": {"type": "Point", "coordinates": (2.0, 1.0)},
...             "bbox": (2.0, 1.0, 2.0, 1.0),
...         },
...     ],
...     "bbox": (1.0, 1.0, 2.0, 2.0),
... }
>>> df = geopandas.GeoDataFrame.from_features(feature_coll)
>>> df
      geometry   col1
0  POINT (1 2)  name1
1  POINT (2 1)  name2

classmethod from_file(filename, **kwargs)[source]

Alternate constructor to create a GeoDataFrame from a file.

It is recommended to use geopandas.read_file() instead.

Can load a GeoDataFrame from a file in any format recognized by pyogrio. See http://pyogrio.readthedocs.io/ for details.

Parameters:

filename (str) – File path or file handle to read from. Depending on which kwargs are included, the content of filename may vary. See pyogrio.read_dataframe() for usage details.
kwargs (key-word arguments) – These arguments are passed to pyogrio.read_dataframe(), and can be used to access multi-layer data, data stored within archives (zip files), etc.

Examples

>>> import geodatasets
>>> path = geodatasets.get_path('nybb')
>>> gdf = geopandas.GeoDataFrame.from_file(path)
>>> gdf
   BoroCode       BoroName     Shape_Leng    Shape_Area                                           geometry
0         5  Staten Island  330470.010332  1.623820e+09  MULTIPOLYGON (((970217.022 145643.332, 970227....
1         4         Queens  896344.047763  3.045213e+09  MULTIPOLYGON (((1029606.077 156073.814, 102957...
2         3       Brooklyn  741080.523166  1.937479e+09  MULTIPOLYGON (((1021176.479 151374.797, 102100...
3         1      Manhattan  359299.096471  6.364715e+08  MULTIPOLYGON (((981219.056 188655.316, 980940....
4         2          Bronx  464392.991824  1.186925e+09  MULTIPOLYGON (((1012821.806 229228.265, 101278...

The recommended method of reading files is geopandas.read_file():

>>> gdf = geopandas.read_file(path)

See also

read_file: read file to GeoDataFrame
GeoDataFrame.to_file: write GeoDataFrame to file

classmethod from_postgis(sql, con, geom_col='geom', crs=None, index_col=None, coerce_float=True, parse_dates=None, params=None, chunksize=None)[source]

Alternate constructor to create a GeoDataFrame from a sql query containing a geometry column in WKB representation.

Parameters:

sql (string)
con (sqlalchemy.engine.Connection or sqlalchemy.engine.Engine)
geom_col (string, default 'geom') – column name to convert to shapely geometries
crs (optional) – Coordinate reference system to use for the returned GeoDataFrame
index_col (string or list of strings, optional, default: None) – Column(s) to set as index(MultiIndex)
coerce_float (boolean, default True) – Attempt to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets
parse_dates (list or dict, default None) –
- List of column names to parse as dates.
- Dict of {column_name: format string} where format string is strftime compatible in case of parsing string times, or is one of (D, s, ns, ms, us) in case of parsing integer timestamps.
- Dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). Especially useful with databases without native Datetime support, such as SQLite.
params (list, tuple or dict, optional, default None) – List of parameters to pass to execute method.
chunksize (int, default None) – If specified, return an iterator where chunksize is the number of rows to include in each chunk.

Examples

PostGIS

>>> from sqlalchemy import create_engine
>>> db_connection_url = "postgresql://myusername:mypassword@myhost:5432/mydb"
>>> con = create_engine(db_connection_url)
>>> sql = "SELECT geom, highway FROM roads"
>>> df = geopandas.GeoDataFrame.from_postgis(sql, con)

SpatiaLite

>>> sql = "SELECT ST_Binary(geom) AS geom, highway FROM roads"
>>> df = geopandas.GeoDataFrame.from_postgis(sql, con)

The recommended method of reading from PostGIS is geopandas.read_postgis():

>>> df = geopandas.read_postgis(sql, con)

See also

geopandas.read_postgis: read PostGIS database to GeoDataFrame

property geometry: Geometry data for GeoDataFrame

iterfeatures(na='null', show_bbox=False, drop_id=False)[source]

Returns an iterator that yields feature dictionaries that comply with __geo_interface__

Parameters:

na (str, optional) –
Options are {‘null’, ‘drop’, ‘keep’}, default ‘null’. Indicates how to output missing (NaN) values in the GeoDataFrame
- null: output the missing entries as JSON null
- drop: remove the property from the feature. This applies to each feature individually so that features may have different properties
- keep: output the missing entries as NaN
show_bbox (bool, optional) – Include bbox (bounds) in the geojson. Default False.
drop_id (bool, default: False) – Whether to drop the index of the GeoDataFrame instead of writing it as the id property in the generated GeoJSON. Default is False, but you may want True if the index is just arbitrary row numbers.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> feature = next(gdf.iterfeatures())
>>> feature
{'id': '0', 'type': 'Feature', 'properties': {'col1': 'name1'}, 'geometry': {'type': 'Point', 'coordinates': (1.0, 2.0)}}
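
The effect of the na and drop_id options is easiest to see on a frame with a missing value. The sketch below assumes a hypothetical "value" column containing one NaN and compares na="null" with na="drop".

```python
import geopandas
import numpy as np
from shapely.geometry import Point

# "value" is a hypothetical column with one missing entry.
gdf = geopandas.GeoDataFrame(
    {"col1": ["name1", "name2"], "value": [1.5, np.nan]},
    geometry=[Point(1, 2), Point(2, 1)],
    crs="EPSG:4326",
)

# na="null": the missing entry is kept and emitted as None (JSON null).
with_null = list(gdf.iterfeatures(na="null"))

# na="drop": the missing property is omitted from that feature only.
# drop_id=True leaves the "id" key out of every feature.
dropped = list(gdf.iterfeatures(na="drop", drop_id=True))

print(with_null[1]["properties"])
print(dropped[1]["properties"])
```

With na="drop", features from the same frame can end up with different sets of properties, which matters for consumers that expect a fixed schema.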

overlay(right, how='intersection', keep_geom_type=None, make_valid=True)[source]

Perform spatial overlay between GeoDataFrames.

Currently only supports data GeoDataFrames with uniform geometry types, i.e. containing only (Multi)Polygons, or only (Multi)Points, or a combination of (Multi)LineString and LinearRing shapes. Implements several methods that are all effectively subsets of the union.

See the User Guide page on set operations for details.

Parameters:

right (GeoDataFrame)
how (string) – Method of spatial overlay: ‘intersection’, ‘union’, ‘identity’, ‘symmetric_difference’ or ‘difference’.
keep_geom_type (bool) – If True, return only geometries of the same geometry type the GeoDataFrame has, if False, return all resulting geometries. Default is None, which will set keep_geom_type to True but warn upon dropping geometries.
make_valid (bool, default True) – If True, any invalid input geometries are corrected with a call to make_valid(), if False, a ValueError is raised if any input geometries are invalid.

Returns:

df – GeoDataFrame with new set of polygons and attributes resulting from the overlay

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Polygon
>>> polys1 = geopandas.GeoSeries([Polygon([(0,0), (2,0), (2,2), (0,2)]),
...                               Polygon([(2,2), (4,2), (4,4), (2,4)])])
>>> polys2 = geopandas.GeoSeries([Polygon([(1,1), (3,1), (3,3), (1,3)]),
...                               Polygon([(3,3), (5,3), (5,5), (3,5)])])
>>> df1 = geopandas.GeoDataFrame({'geometry': polys1, 'df1_data':[1,2]})
>>> df2 = geopandas.GeoDataFrame({'geometry': polys2, 'df2_data':[1,2]})

>>> df1.overlay(df2, how='union')
   df1_data  df2_data                                           geometry
0       1.0       1.0                POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1       2.0       1.0                POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2       2.0       2.0                POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
3       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
4       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...
5       NaN       1.0  MULTIPOLYGON (((2 3, 2 2, 1 2, 1 3, 2 3)), ((3...
6       NaN       2.0      POLYGON ((3 5, 5 5, 5 3, 4 3, 4 4, 3 4, 3 5))

>>> df1.overlay(df2, how='intersection')
   df1_data  df2_data                             geometry
0         1         1  POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1         2         1  POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2         2         2  POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))

>>> df1.overlay(df2, how='symmetric_difference')
   df1_data  df2_data                                           geometry
0       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
1       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...
2       NaN       1.0  MULTIPOLYGON (((2 3, 2 2, 1 2, 1 3, 2 3)), ((3...
3       NaN       2.0      POLYGON ((3 5, 5 5, 5 3, 4 3, 4 4, 3 4, 3 5))

>>> df1.overlay(df2, how='difference')
                                            geometry  df1_data
0      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))         1
1  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...         2

>>> df1.overlay(df2, how='identity')
   df1_data  df2_data                                           geometry
0       1.0       1.0                POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1       2.0       1.0                POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2       2.0       2.0                POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
3       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
4       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...

See also

GeoDataFrame.sjoin: spatial join
overlay: equivalent top-level function

Notes

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.

plot: alias of GeoplotAccessor

rename_geometry(col, inplace=False)[source]

Renames the GeoDataFrame geometry column to the specified name. By default yields a new object.

The original geometry column is replaced with the input.

Parameters:

col (new geometry column label)
inplace (boolean, default False) – Modify the GeoDataFrame in place (do not create a new object)

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> df = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> df1 = df.rename_geometry('geom1')
>>> df1.geometry.name
'geom1'
>>> df.rename_geometry('geom1', inplace=True)
>>> df.geometry.name
'geom1'

Returns:

geodataframe

Return type:

GeoDataFrame

See also

GeoDataFrame.set_geometry: set the active geometry

set_crs(crs=None, epsg=None, inplace=False, allow_override=False)[source]

Set the Coordinate Reference System (CRS) of the GeoDataFrame.

If there are multiple geometry columns within the GeoDataFrame, only the CRS of the active geometry column is set.

Pass None to remove CRS from the active geometry column.

Notes

The underlying geometries are not transformed to this CRS. To transform the geometries to a new CRS, use the to_crs method.

Parameters:

crs (pyproj.CRS | None, optional) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. "EPSG:4326") or a WKT string.
epsg (int, optional) – EPSG code specifying the projection.
inplace (bool, default False) – If True, the CRS of the GeoDataFrame will be changed in place (while still returning the result) instead of making a copy of the GeoDataFrame.
allow_override (bool, default False) – If the GeoDataFrame already has a CRS, allow replacing the existing CRS, even when both are not equal.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Setting CRS to a GeoDataFrame without one:

>>> gdf.crs is None
True

>>> gdf = gdf.set_crs('epsg:3857')
>>> gdf.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Overriding existing CRS:

>>> gdf = gdf.set_crs(4326, allow_override=True)

Without allow_override=True, set_crs raises an error if you try to override an existing CRS.
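
The difference between set_crs and to_crs noted above can be checked directly on the coordinates. This is a minimal sketch with a single illustrative point; the variable names are assumptions made for this example.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame({"col1": ["name1"]}, geometry=[Point(1, 2)])

# set_crs only attaches CRS metadata: the coordinates stay (1, 2).
labelled = gdf.set_crs("EPSG:4326")

# to_crs transforms the coordinates into the target CRS.
reprojected = labelled.to_crs("EPSG:3857")

# Replacing an existing, different CRS requires allow_override=True.
try:
    labelled.set_crs("EPSG:3857")
    override_error = None
except ValueError as exc:
    override_error = exc

overridden = labelled.set_crs("EPSG:3857", allow_override=True)

print(labelled.geometry.iloc[0], reprojected.geometry.iloc[0])
print(type(override_error).__name__)
```

Note that overridden still holds the point (1, 2): overriding relabels the data as EPSG:3857 without moving it, which is only appropriate when the previous CRS was wrong.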

See also

GeoDataFrame.to_crs: re-project to another CRS

set_geometry(col, drop=None, inplace=False, crs=None)[source]

Set the GeoDataFrame geometry using either an existing column or the specified input. By default yields a new object.

The original geometry column is replaced with the input.

Parameters:

col (column label or array-like) – An existing column name or values to set as the new geometry column. If values (array-like, (Geo)Series) are passed, then if they are named (Series) the new geometry column will have the corresponding name, otherwise the existing geometry column will be replaced. If there is no existing geometry column, the new geometry column will use the default name “geometry”.
drop (boolean, default False) –
When specifying a named Series or an existing column name for col, controls if the previous geometry column should be dropped from the result. The default of False keeps both the old and new geometry column.

Deprecated since version 1.0.0.
inplace (boolean, default False) – Modify the GeoDataFrame in place (do not create a new object)
crs (pyproj.CRS, optional) – Coordinate system to use. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. "EPSG:4326") or a WKT string. If passed, overrides both DataFrame and col's crs. Otherwise, tries to get crs from passed col values or DataFrame.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Passing an array:

>>> df1 = gdf.set_geometry([Point(0,0), Point(1,1)])
>>> df1
    col1     geometry
0  name1  POINT (0 0)
1  name2  POINT (1 1)

Using existing column:

>>> gdf["buffered"] = gdf.buffer(2)
>>> df2 = gdf.set_geometry("buffered")
>>> df2.geometry
0    POLYGON ((3 2, 2.99037 1.80397, 2.96157 1.6098...
1    POLYGON ((4 1, 3.99037 0.80397, 3.96157 0.6098...
Name: buffered, dtype: geometry

Return type:

GeoDataFrame

See also

GeoDataFrame.rename_geometry: rename an active geometry column

sjoin(df, *args, **kwargs)[source]

Spatial join of two GeoDataFrames.

See the User Guide page https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html for details.

Parameters:

df (GeoDataFrame)
how (string, default 'inner') –
The type of join:
- ’left’: use keys from left_df; retain only left_df geometry column
- ’right’: use keys from right_df; retain only right_df geometry column
- ’inner’: use intersection of keys from both dfs; retain only left_df geometry column
predicate (string, default 'intersects') – Binary predicate. Valid values are determined by the spatial index used. You can check the valid values in left_df or right_df as left_df.sindex.valid_query_predicates or right_df.sindex.valid_query_predicates
lsuffix (string, default 'left') – Suffix to apply to overlapping column names (left GeoDataFrame).
rsuffix (string, default 'right') – Suffix to apply to overlapping column names (right GeoDataFrame).
distance (number or array_like, optional) – Distance(s) around each input geometry within which to query the tree for the ‘dwithin’ predicate. If array_like, must be one-dimensional with length equal to length of left GeoDataFrame. Required if predicate='dwithin'.
on_attribute (string, list or tuple) – Column name(s) to join on as an additional join restriction on top of the spatial predicate. These must be found in both DataFrames. If set, observations are joined only if the predicate applies and values in specified columns match.

Examples

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_commpop")
... )
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)

>>> chicago.head()
         community  ...                                           geometry
        DOUGLAS  ...  MULTIPOLYGON (((-87.60914 41.84469, -87.60915 ...
        OAKLAND  ...  MULTIPOLYGON (((-87.59215 41.81693, -87.59231 ...
    FULLER PARK  ...  MULTIPOLYGON (((-87.62880 41.80189, -87.62879 ...
GRAND BOULEVARD  ...  MULTIPOLYGON (((-87.60671 41.81681, -87.60670 ...
        KENWOOD  ...  MULTIPOLYGON (((-87.59215 41.81693, -87.59215 ...

[5 rows x 9 columns]

>>> groceries.head()
   OBJECTID     Ycoord  ...  Category                           geometry
0        16  41.973266  ...       NaN  MULTIPOINT ((-87.65661 41.97321))
1        18  41.696367  ...       NaN  MULTIPOINT ((-87.68136 41.69713))
2        22  41.868634  ...       NaN  MULTIPOINT ((-87.63918 41.86847))
3        23  41.877590  ...       new  MULTIPOINT ((-87.65495 41.87783))
4        27  41.737696  ...       NaN  MULTIPOINT ((-87.62715 41.73623))
[5 rows x 8 columns]

>>> groceries_w_communities = groceries.sjoin(chicago)
>>> groceries_w_communities[["OBJECTID", "community", "geometry"]].head()
   OBJECTID       community                           geometry
0        16          UPTOWN  MULTIPOINT ((-87.65661 41.97321))
1        18     MORGAN PARK  MULTIPOINT ((-87.68136 41.69713))
2        22  NEAR WEST SIDE  MULTIPOINT ((-87.63918 41.86847))
3        23  NEAR WEST SIDE  MULTIPOINT ((-87.65495 41.87783))
4        27         CHATHAM  MULTIPOINT ((-87.62715 41.73623))

Notes

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.

See also

GeoDataFrame.sjoin_nearest: nearest neighbor join
sjoin: equivalent top-level function
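
As a minimal, self-contained sketch of how the how and predicate parameters interact, the following uses two small synthetic layers instead of the Chicago datasets. The frames zones and sites, their column names, and the choice of EPSG:3857 are assumptions made purely for illustration.

```python
import geopandas
from shapely.geometry import Point, box

# Hypothetical polygon layer: two adjacent square zones.
zones = geopandas.GeoDataFrame(
    {"zone": ["west", "east"]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2)],
    crs="EPSG:3857",
)

# Hypothetical point layer: two sites inside a zone, one outside both.
sites = geopandas.GeoDataFrame(
    {"site": ["a", "b", "c"]},
    geometry=[Point(1, 1), Point(3, 1), Point(5, 5)],
    crs="EPSG:3857",
)

# 'inner' keeps only sites that fall within some zone.
inner = sites.sjoin(zones, how="inner", predicate="within")

# 'left' keeps every site; unmatched sites get missing values
# in the columns that come from the right-hand frame.
left = sites.sjoin(zones, how="left", predicate="within")

print(inner[["site", "zone"]])
print(left[["site", "zone"]])
```

With how='left', site "c" is retained with a missing zone value, whereas how='inner' drops it.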

sjoin_nearest(right, how='inner', max_distance=None, lsuffix='left', rsuffix='right', distance_col=None, exclusive=False)[source]

Spatial join of two GeoDataFrames based on the distance between their geometries.

Results will include multiple output records for a single input record where there are multiple equidistant nearest or intersected neighbors.

See the User Guide page https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html for more details.

Parameters:

right (GeoDataFrame)
how (string, default 'inner') –
The type of join:
- ’left’: use keys from left_df; retain only left_df geometry column
- ’right’: use keys from right_df; retain only right_df geometry column
- ’inner’: use intersection of keys from both dfs; retain only left_df geometry column
max_distance (float, default None) – Maximum distance within which to query for nearest geometry. Must be greater than 0. The max_distance used to search for nearest items in the tree may have a significant impact on performance by reducing the number of input geometries that are evaluated for nearest items in the tree.
lsuffix (string, default 'left') – Suffix to apply to overlapping column names (left GeoDataFrame).
rsuffix (string, default 'right') – Suffix to apply to overlapping column names (right GeoDataFrame).
distance_col (string, default None) – If set, save the distances computed between matching geometries under a column of this name in the joined GeoDataFrame.
exclusive (bool, optional, default False) – If True, the nearest geometries that are equal to the input geometry will not be returned, default False. Requires Shapely >= 2.0

Examples

>>> import geodatasets
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... )
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... ).to_crs(groceries.crs)

>>> chicago.head()
   ComAreaID  ...                                           geometry
0         35  ...  POLYGON ((-87.60914 41.84469, -87.60915 41.844...
1         36  ...  POLYGON ((-87.59215 41.81693, -87.59231 41.816...
2         37  ...  POLYGON ((-87.62880 41.80189, -87.62879 41.801...
3         38  ...  POLYGON ((-87.60671 41.81681, -87.60670 41.816...
4         39  ...  POLYGON ((-87.59215 41.81693, -87.59215 41.816...
[5 rows x 87 columns]

>>> groceries.head()
   OBJECTID     Ycoord  ...  Category                           geometry
0        16  41.973266  ...       NaN  MULTIPOINT ((-87.65661 41.97321))
1        18  41.696367  ...       NaN  MULTIPOINT ((-87.68136 41.69713))
2        22  41.868634  ...       NaN  MULTIPOINT ((-87.63918 41.86847))
3        23  41.877590  ...       new  MULTIPOINT ((-87.65495 41.87783))
4        27  41.737696  ...       NaN  MULTIPOINT ((-87.62715 41.73623))
[5 rows x 8 columns]

>>> groceries_w_communities = groceries.sjoin_nearest(chicago)
>>> groceries_w_communities[["Chain", "community", "geometry"]].head(2)
               Chain    community                                geometry
0     VIET HOA PLAZA       UPTOWN   MULTIPOINT ((1168268.672 1933554.35))
1  COUNTY FAIR FOODS  MORGAN PARK  MULTIPOINT ((1162302.618 1832900.224))

To include the distances:

>>> groceries_w_communities = groceries.sjoin_nearest(chicago, distance_col="distances")
>>> groceries_w_communities[["Chain", "community", "distances"]].head(2)
               Chain    community  distances
0     VIET HOA PLAZA       UPTOWN        0.0
1  COUNTY FAIR FOODS  MORGAN PARK        0.0

In the following example, we get multiple groceries for Uptown because all results are equidistant (in this case zero because they intersect). In fact, we get 4 results in total:

>>> chicago_w_groceries = groceries.sjoin_nearest(chicago, distance_col="distances", how="right")
>>> uptown_results = chicago_w_groceries[chicago_w_groceries["community"] == "UPTOWN"]
>>> uptown_results[["Chain", "community"]]
            Chain community
30  VIET HOA PLAZA    UPTOWN
30      JEWEL OSCO    UPTOWN
30          TARGET    UPTOWN
30       Mariano's    UPTOWN

See also

GeoDataFrame.sjoin: binary predicate joins
sjoin_nearest: equivalent top-level function

Notes

Since this join relies on distances, results will be inaccurate if your geometries are in a geographic CRS.

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.
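
The effect of distance_col and max_distance can be sketched with a small synthetic example. The frames stops and shops, their columns, and the projected CRS are hypothetical; a projected CRS is used so that the computed distances are meaningful, in line with the note above.

```python
import geopandas
from shapely.geometry import Point

# Hypothetical layers in a projected CRS (units: metres).
stops = geopandas.GeoDataFrame(
    {"stop": ["s1", "s2"]},
    geometry=[Point(0, 0), Point(10, 0)],
    crs="EPSG:3857",
)
shops = geopandas.GeoDataFrame(
    {"shop": ["p", "q"]},
    geometry=[Point(0, 3), Point(10, 4)],
    crs="EPSG:3857",
)

# Nearest shop for every stop, with the distance stored in "dist".
nearest = stops.sjoin_nearest(shops, distance_col="dist")

# Restrict the search radius: stops without a shop within 3.5 units
# are dropped from the (inner) join result.
limited = stops.sjoin_nearest(shops, max_distance=3.5, distance_col="dist")

print(nearest[["stop", "shop", "dist"]])
print(limited[["stop", "shop", "dist"]])
```

Here the second stop is 4 units from its nearest shop, so it only appears in the unrestricted join.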

to_arrow(*, index=None, geometry_encoding='WKB', interleaved=True, include_z=None)[source]

Encode a GeoDataFrame to GeoArrow format.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function returns a generic Arrow data object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_stream__ method). This object can then be consumed by your Arrow implementation of choice that supports this protocol.

Added in version 1.0.

Parameters:

index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.
geometry_encoding ({'WKB', 'geoarrow' }, default 'WKB') – The GeoArrow encoding to use for the data conversion.
interleaved (bool, default True) – Only relevant for ‘geoarrow’ encoding. If True, the geometries’ coordinates are interleaved in a single fixed size list array. If False, the coordinates are stored as separate arrays in a struct type.
include_z (bool, default None) – Only relevant for ‘geoarrow’ encoding (for WKB, the dimensionality of the individual geometries is preserved). If False, return 2D geometries. If True, include the third dimension in the output (if a geometry has no third dimension, the z-coordinates will be NaN). By default, will infer the dimensionality from the input geometries. Note that this inference can be unreliable with empty geometries (for a guaranteed result, it is recommended to specify the keyword).

Returns:

A generic Arrow table object with geometry columns encoded to GeoArrow.

Return type:

ArrowTable

Examples

>>> from shapely.geometry import Point
>>> data = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(data)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> arrow_table = gdf.to_arrow()
>>> arrow_table
<geopandas.io._geoarrow.ArrowTable object at ...>

The returned data object needs to be consumed by a library implementing the Arrow PyCapsule Protocol. For example, wrapping the data as a pyarrow.Table (requires pyarrow >= 14.0):

>>> import pyarrow as pa
>>> table = pa.table(arrow_table)
>>> table
pyarrow.Table
col1: string
geometry: binary
----
col1: [["name1","name2"]]
geometry: [[0101000000000000000000F03F0000000000000040,01010000000000000000000040000000000000F03F]]

to_crs(crs=None, epsg=None, inplace=False)[source]

Transform geometries to a new coordinate reference system.

Transform all geometries in an active geometry column to a different coordinate reference system. The crs attribute on the current GeoSeries must be set. Either crs or epsg may be specified for output.

This method will transform all points in all objects. It has no notion of projecting entire geometries. All segments joining points are assumed to be lines in the current projection, not geodesics. Objects crossing the dateline (or other projection boundary) will have undesirable behavior.

Parameters:

crs (pyproj.CRS, optional if epsg is specified) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
epsg (int, optional if crs is specified) – EPSG code specifying output projection.
inplace (bool, optional, default: False) – Whether to return a new GeoDataFrame or do the transformation in place.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

>>> gdf = gdf.to_crs(3857)
>>> gdf
    col1                       geometry
0  name1  POINT (111319.491 222684.209)
1  name2  POINT (222638.982 111325.143)
>>> gdf.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame.set_crs: assign CRS without re-projection
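
A short sketch of the two equivalent ways to name the target CRS (crs or epsg), and of the default non-in-place behaviour. The frame below mirrors the example data above; the variable names by_crs and by_epsg are illustrative only.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1", "name2"]},
    geometry=[Point(1, 2), Point(2, 1)],
    crs="EPSG:4326",
)

# Either keyword may be used to specify the output CRS.
by_crs = gdf.to_crs("EPSG:3857")
by_epsg = gdf.to_crs(epsg=3857)

# Both calls describe the same target CRS and yield the same geometries.
same_crs = by_crs.crs == by_epsg.crs
same_geometry = bool(by_crs.geom_equals(by_epsg).all())

# With inplace=False (the default) the original frame is left untouched.
original_epsg = gdf.crs.to_epsg()

print(same_crs, same_geometry, original_epsg)
```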

to_feather(path, index=None, compression=None, schema_version=None, **kwargs)[source]

Write a GeoDataFrame to the Feather format.

Any geometry columns present are serialized to WKB format in the file.

Requires ‘pyarrow’ >= 0.17.

Added in version 0.8.

Parameters:

path (str, path object)
index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.
compression ({'zstd', 'lz4', 'uncompressed'}, optional) – Name of the compression to use. Use "uncompressed" for no compression. By default uses LZ4 if available, otherwise uncompressed.
schema_version ({'0.1.0', '0.4.0', '1.0.0', None}) – GeoParquet specification version; if not provided will default to latest supported version.
kwargs – Additional keyword arguments passed to pyarrow.feather.write_feather().

Examples

>>> gdf.to_feather('data.feather')

See also

GeoDataFrame.to_parquet: write GeoDataFrame to parquet
GeoDataFrame.to_file: write GeoDataFrame to file

to_file(filename, driver=None, schema=None, index=None, **kwargs)[source]

Write the GeoDataFrame to a file.

By default, an ESRI shapefile is written, but any OGR data source supported by Pyogrio or Fiona can be written. A dictionary of supported OGR providers is available via:

>>> import pyogrio
>>> pyogrio.list_drivers()

Parameters:

filename (string) – File path or file handle to write to. The path may specify a GDAL VSI scheme.
driver (string, default None) – The OGR format driver used to write the vector file. If not specified, it attempts to infer it from the file extension. If no extension is specified, it saves ESRI Shapefile to a folder.
schema (dict, default None) – If specified, the schema dictionary is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the schema based on each column’s dtype. Not supported for the “pyogrio” engine.
index (bool, default None) –
If True, write index into one or more columns (for MultiIndex). Default None writes the index into one or more columns only if the index is named, is a MultiIndex, or has a non-integer data type. If False, no index is written.

Added in version 0.7: Previously the index was not written.
mode (string, default 'w') – The write mode, ‘w’ to overwrite the existing file and ‘a’ to append. Not all drivers support appending. The drivers that support appending are listed in fiona.supported_drivers or https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py
crs (pyproj.CRS, default None) – If specified, the CRS is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the crs based on crs df attribute. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. The keyword is not supported for the “pyogrio” engine.
engine (str, "pyogrio" or "fiona") – The underlying library that is used to write the file. Currently, the supported options are “pyogrio” and “fiona”. Defaults to “pyogrio” if installed, otherwise tries “fiona”.
metadata (dict[str, str], default None) – Optional metadata to be stored in the file. Keys and values must be strings. Supported only for “GPKG” driver.
**kwargs – Keyword args to be passed to the engine, and can be used to write to multi-layer data, store data within archives (zip files), etc. In case of the “pyogrio” engine, the keyword arguments are passed to pyogrio.write_dataframe. In case of the “fiona” engine, the keyword arguments are passed to fiona.open. For more information on possible keywords, type: import pyogrio; help(pyogrio.write_dataframe).

Notes

The format drivers will attempt to detect the encoding of your data, but may fail. In this case, the proper encoding can be specified explicitly by using the encoding keyword parameter, e.g. encoding='utf-8'.

See also

GeoSeries.to_file

GeoDataFrame.to_postgis: write GeoDataFrame to PostGIS database
GeoDataFrame.to_parquet: write GeoDataFrame to parquet
GeoDataFrame.to_feather: write GeoDataFrame to feather

Examples

>>> gdf.to_file('dataframe.shp')

>>> gdf.to_file('dataframe.gpkg', driver='GPKG', layer='name')

>>> gdf.to_file('dataframe.geojson', driver='GeoJSON')

With selected drivers you can also append to a file with mode="a":

>>> gdf.to_file('dataframe.shp', mode="a")

Using the engine-specific keyword arguments it is possible to e.g. create a spatialite file with a custom layer name:

>>> gdf.to_file(
...     'dataframe.sqlite', driver='SQLite', spatialite=True, layer='test'
... )

to_geo_dict(na='null', show_bbox=False, drop_id=False)[source]

Returns a python feature collection representation of the GeoDataFrame as a dictionary with a list of features based on the __geo_interface__ GeoJSON-like specification.

Parameters:

na (str, optional) –
Options are {‘null’, ‘drop’, ‘keep’}, default ‘null’. Indicates how to output missing (NaN) values in the GeoDataFrame
- null: output the missing entries as JSON null
- drop: remove the property from the feature. This applies to each feature individually so that features may have different properties
- keep: output the missing entries as NaN
show_bbox (bool, optional) – Include bbox (bounds) in the geojson. Default False.
drop_id (bool, default: False) – Whether to drop the index of the GeoDataFrame instead of writing it as the id property in the generated dictionary. The default of False keeps the index as id; True may be preferable if the index is just arbitrary row numbers.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> gdf.to_geo_dict()
{'type': 'FeatureCollection', 'features': [{'id': '0', 'type': 'Feature', 'properties': {'col1': 'name1'}, 'geometry': {'type': 'Point', 'coordinates': (1.0, 2.0)}}, {'id': '1', 'type': 'Feature', 'properties': {'col1': 'name2'}, 'geometry': {'type': 'Point', 'coordinates': (2.0, 1.0)}}]}

See also

GeoDataFrame.to_json: return a GeoDataFrame as a GeoJSON string

to_json(na='null', show_bbox=False, drop_id=False, to_wgs84=False, **kwargs)[source]

Returns a GeoJSON representation of the GeoDataFrame as a string.

Parameters:

na ({'null', 'drop', 'keep'}, default 'null') – Indicates how to output missing (NaN) values in the GeoDataFrame. See below.
show_bbox (bool, optional, default: False) – Include bbox (bounds) in the geojson
drop_id (bool, default: False) – Whether to drop the index of the GeoDataFrame instead of writing it as the id property in the generated GeoJSON. The default of False keeps the index as id; True may be preferable if the index is just arbitrary row numbers.
to_wgs84 (bool, optional, default: False) –
If the CRS is set on the active geometry column it is exported as WGS84 (EPSG:4326) to meet the 2016 GeoJSON specification. Set to True to force re-projection and set to False to ignore CRS. False by default.

Notes

The remaining kwargs are passed to json.dumps().

Missing (NaN) values in the GeoDataFrame can be represented as follows:

null: output the missing entries as JSON null.
drop: remove the property from the feature. This applies to each feature individually so that features may have different properties.
keep: output the missing entries as NaN.

If the GeoDataFrame has a defined CRS, its definition will be included in the output unless it is equal to WGS84 (default GeoJSON CRS) or not possible to represent in the URN OGC format, or unless to_wgs84=True is specified.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:3857")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

>>> gdf.to_json()
'{"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"col1": "name1"}, "geometry": {"type": "Point", "coordinates": [1.0, 2.0]}}, {"id": "1", "type": "Feature", "properties": {"col1": "name2"}, "geometry": {"type": "Point", "coordinates": [2.0, 1.0]}}], "crs": {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG::3857"}}}'

Alternatively, you can write GeoJSON to file:

>>> gdf.to_file(path, driver="GeoJSON")

See also

GeoDataFrame.to_file: write GeoDataFrame to file
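
The three na modes described in the notes can be compared with a small sketch. The height column with one missing value and the helper second_feature_properties are hypothetical additions for illustration.

```python
import json

import geopandas
from shapely.geometry import Point

# The second row has a missing value in the (hypothetical) "height" column.
gdf = geopandas.GeoDataFrame(
    {"col1": ["name1", "name2"], "height": [10.0, float("nan")]},
    geometry=[Point(1, 2), Point(2, 1)],
)


def second_feature_properties(na):
    """Return the properties of the second feature for a given na mode."""
    return json.loads(gdf.to_json(na=na))["features"][1]["properties"]


as_null = second_feature_properties("null")  # missing value becomes JSON null
as_drop = second_feature_properties("drop")  # property is omitted entirely
as_keep = second_feature_properties("keep")  # missing value stays NaN

print(as_null)
print(as_drop)
print(as_keep)
```

Note that na='keep' produces a bare NaN token, which Python's json module accepts but which is not valid under strict JSON parsers.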

to_parquet(path, index=None, compression='snappy', geometry_encoding='WKB', write_covering_bbox=False, schema_version=None, **kwargs)[source]

Write a GeoDataFrame to the Parquet format.

By default, all geometry columns present are serialized to WKB format in the file.

Requires ‘pyarrow’.

Added in version 0.8.

Parameters:

path (str, path object)
index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.
compression ({'snappy', 'gzip', 'brotli', None}, default 'snappy') – Name of the compression to use. Use None for no compression.
geometry_encoding ({'WKB', 'geoarrow'}, default 'WKB') – The encoding to use for the geometry columns. Defaults to “WKB” for maximum interoperability. Specify “geoarrow” to use one of the native GeoArrow-based single-geometry type encodings. Note: the “geoarrow” option is part of the newer GeoParquet 1.1 specification, should be considered as experimental, and may not be supported by all readers.
write_covering_bbox (bool, default False) – Writes the bounding box column for each row entry with column name ‘bbox’. Writing a bbox column can be computationally expensive, but allows you to specify a bbox in read_parquet() for filtered reading. Note: this bbox column is part of the newer GeoParquet 1.1 specification and should be considered as experimental. While writing the column is backwards compatible, using it for filtering may not be supported by all readers.
schema_version ({'0.1.0', '0.4.0', '1.0.0', '1.1.0', None}) – GeoParquet specification version; if not provided, will default to latest supported stable version (1.0.0).
kwargs – Additional keyword arguments passed to pyarrow.parquet.write_table().

Examples

>>> gdf.to_parquet('data.parquet')

See also

GeoDataFrame.to_feather: write GeoDataFrame to feather
GeoDataFrame.to_file: write GeoDataFrame to file

to_postgis(name, con, schema=None, if_exists='fail', index=False, index_label=None, chunksize=None, dtype=None)[source]

Upload GeoDataFrame into PostGIS database.

This method requires SQLAlchemy and GeoAlchemy2, and a PostgreSQL Python driver (psycopg or psycopg2) to be installed.

It is also possible to use to_file() to write to a database. Especially for file geodatabases like GeoPackage or SpatiaLite this can be easier.

Parameters:

name (str) – Name of the target table.
con (sqlalchemy.engine.Connection or sqlalchemy.engine.Engine) – Active connection to the PostGIS database.
if_exists ({'fail', 'replace', 'append'}, default 'fail') –
How to behave if the table already exists:
- fail: Raise a ValueError.
- replace: Drop the table before inserting new values.
- append: Insert new values to the existing table.
schema (string, optional) – Specify the schema. If None, use default schema: ‘public’.
index (bool, default False) – Write DataFrame index as a column. Uses index_label as the column name in the table.
index_label (string or sequence, default None) – Column label for index column(s). If None is given (default) and index is True, then the index names are used.
chunksize (int, optional) – Rows will be written in batches of this size at a time. By default, all rows will be written at once.
dtype (dict of column name to SQL type, default None) – Specifying the datatype for columns. The keys should be the column names and the values should be the SQLAlchemy types.

Examples

>>> from sqlalchemy import create_engine
>>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432/mydatabase")
>>> gdf.to_postgis("my_table", engine)

See also

GeoDataFrame.to_file: write GeoDataFrame to file
read_postgis: read PostGIS database to GeoDataFrame

to_wkb(hex=False, **kwargs)[source]

Encode all geometry columns in the GeoDataFrame to WKB.

Parameters:

hex (bool) – If true, export the WKB as a hexadecimal string. The default is to return a binary bytes object.
kwargs – Additional keyword args will be passed to shapely.to_wkb().

Returns:

geometry columns are encoded to WKB

Return type:

DataFrame

to_wkt(**kwargs)[source]

Encode all geometry columns in the GeoDataFrame to WKT.

Parameters:: kwargs – Keyword args will be passed to shapely.to_wkt().
Returns:: geometry columns are encoded to WKT
Return type:: DataFrame
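
A brief sketch of to_wkt() and to_wkb() on the small example frame used elsewhere in this reference. It illustrates that both return a plain pandas DataFrame whose geometry column holds encoded values; the variable names are illustrative.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1", "name2"]},
    geometry=[Point(1, 2), Point(2, 1)],
)

# Geometry column becomes WKT strings; other columns are unchanged.
wkt_df = gdf.to_wkt()

# hex=True yields hexadecimal strings instead of raw bytes objects.
wkb_hex_df = gdf.to_wkb(hex=True)
wkb_bytes_df = gdf.to_wkb()

print(type(wkt_df).__name__)
print(wkt_df["geometry"].tolist())
print(wkb_hex_df["geometry"].iloc[0])
```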

class pyorps.core.types.GeoSeries(data=None, index=None, crs=None, **kwargs)[source]

Bases: GeoPandasBase, Series

A Series object designed to store shapely geometry objects.

Parameters:

data (array-like, dict, scalar value) – The geometries to store in the GeoSeries.
index (array-like or Index) – The index for the GeoSeries.
crs (value (optional)) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
kwargs –

Additional arguments passed to the Series constructor,
e.g. name.

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

>>> s = geopandas.GeoSeries(
...     [Point(1, 1), Point(2, 2), Point(3, 3)], crs="EPSG:3857"
... )
>>> s.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

>>> s = geopandas.GeoSeries(
...    [Point(1, 1), Point(2, 2), Point(3, 3)], index=["a", "b", "c"], crs=4326
... )
>>> s
a    POINT (1 1)
b    POINT (2 2)
c    POINT (3 3)
dtype: geometry

>>> s.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame, pandas.Series

property _constructor: Used when a manipulation result has the same dimensions as the original.

property _constructor_expanddim: Used when a manipulation result has one higher dimension than the original, such as Series.to_frame()

_constructor_expanddim_from_mgr(mgr, axes)[source]

_constructor_from_mgr(mgr, axes)[source]

classmethod _from_wkb_or_wkt(from_wkb_or_wkt_function, data, index=None, crs=None, on_invalid='raise', **kwargs)[source]

Create a GeoSeries from either WKT or WKB values

Return type:: GeoSeries

_wrapped_pandas_method(mtd, *args, **kwargs)[source]: Wrap a generic pandas method to ensure it returns a GeoSeries

append(*args, **kwargs)[source]

Return type:: GeoSeries

apply(func, convert_dtype=None, args=(), **kwargs)[source]

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:

data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.
index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.
dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.
name (Hashable, default None) – The name to give to the Series.
copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.

clip(mask, keep_geom_type=False, sort=False)[source]

Clip points, lines, or polygon geometries to the mask extent.

Both layers must be in the same Coordinate Reference System (CRS). The GeoSeries will be clipped to the full extent of the mask object.

If there are multiple polygons in mask, data from the GeoSeries will be clipped to the total boundary of all polygons in mask.

Parameters:

mask (GeoDataFrame, GeoSeries, (Multi)Polygon, list-like) – Polygon vector layer used to clip gdf. The mask’s geometry is dissolved into one geometric feature and intersected with GeoSeries. If the mask is list-like with four elements (minx, miny, maxx, maxy), clip will use a faster rectangle clipping (clip_by_rect()), possibly leading to slightly different results.
keep_geom_type (boolean, default False) – If True, return only geometries of original type in case of intersection resulting in multiple geometry types or GeometryCollections. If False, return all resulting geometries (potentially mixed-types).
sort (boolean, default False) – If True, the order of rows in the clipped GeoSeries will be preserved at small performance cost. If False the order of rows in the clipped GeoSeries will be random.

Returns:

Vector data (points, lines, polygons) from gdf clipped to polygon boundary from mask.

Return type:

GeoSeries

See also

clip: top-level function for clip

Examples

Clip points (grocery stores) with polygons (the Near West Side community):

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> near_west_side = chicago[chicago["community"] == "NEAR WEST SIDE"]
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)
>>> groceries.shape
(148, 8)

>>> nws_groceries = groceries.geometry.clip(near_west_side)
>>> nws_groceries.shape
(7,)
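
As a variation that needs no downloaded dataset, the following sketch clips three synthetic points, first with a polygon mask and then with a four-element bounding box. The coordinates and variable names are illustrative assumptions rather than part of the original example.

```python
import geopandas
from shapely.geometry import Point, box

# Three synthetic points; only the middle one lies inside the mask.
points = geopandas.GeoSeries([Point(0, 0), Point(2, 2), Point(5, 5)])

# Polygon mask: the mask geometry is intersected with the GeoSeries.
clipped_poly = points.clip(box(1, 1, 3, 3))

# List-like mask (minx, miny, maxx, maxy): uses rectangle clipping.
clipped_rect = points.clip((1, 1, 3, 3))

print(clipped_poly.tolist())
print(clipped_rect.tolist())
```

For point data both masks should keep the same single point; differences between the two code paths are more likely to show up with lines and polygons that cross the mask boundary.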

property crs

The Coordinate Reference System (CRS) represented as a pyproj.CRS object.

Returns a pyproj.CRS, or None if the CRS is not set. When setting, the value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. “EPSG:4326”) or a WKT string.

Examples

>>> s.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
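
The getter and setter behaviour described above can be sketched as follows. This is a minimal illustration that assumes the GeoSeries starts without a CRS; the variable names are made up for the example.

```python
import geopandas
from shapely.geometry import Point

# Without a CRS, the property returns None.
s = geopandas.GeoSeries([Point(1, 1), Point(2, 2)])
unset = s.crs

# Assigning through the property accepts anything that
# pyproj.CRS.from_user_input() understands, e.g. an authority string.
s.crs = "EPSG:4326"
epsg_code = s.crs.to_epsg()

print(unset, epsg_code)
```

Assigning a CRS this way only labels the coordinates; it does not transform them. Use to_crs for re-projection.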

See also

GeoSeries.set_crs: assign CRS
GeoSeries.to_crs: re-project to another CRS

estimate_utm_crs(datum_name='WGS 84')[source]

Returns the estimated UTM CRS based on the bounds of the dataset.

Added in version 0.9.

Parameters:

datum_name (str, optional) – The name of the datum to use in the query. Default is WGS 84.

Return type:

pyproj.CRS

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.geometry.estimate_utm_crs()
<Derived Projected CRS: EPSG:32616>
Name: WGS 84 / UTM zone 16N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 90°W and 84°W, northern hemisphere between equator and 84°N, ...
- bounds: (-90.0, 0.0, -84.0, 84.0)
Coordinate Operation:
- name: UTM zone 16N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
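
A common reason to estimate a UTM CRS is to measure distances in metres. The sketch below uses two illustrative longitude/latitude points in the Chicago area (assumed values, not taken from the dataset above), re-projects them to the estimated CRS, and computes their planar distance.

```python
import geopandas
from shapely.geometry import Point

# Two illustrative lon/lat points in the Chicago area (EPSG:4326).
pts = geopandas.GeoSeries(
    [Point(-87.65, 41.85), Point(-87.60, 41.90)], crs="EPSG:4326"
)

utm = pts.estimate_utm_crs()   # estimated from the bounds of the data
pts_utm = pts.to_crs(utm)      # coordinates are now in metres

# Planar distance between the two points, in metres.
distance_m = pts_utm.iloc[0].distance(pts_utm.iloc[1])
print(utm.to_epsg(), round(distance_m))
```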

explode(ignore_index=False, index_parts=False)[source]

Explode multi-part geometries into multiple single geometries.

Single rows can become multiple rows. This is analogous to PostGIS’s ST_Dump(). When index_parts=True, the ‘path’ index is the second level of the returned MultiIndex.

Parameters:

ignore_index (bool, default False) – If True, the resulting index will be labelled 0, 1, …, n - 1, ignoring index_parts.
index_parts (boolean, default False) – If True, the resulting index will be a multi-index (original index with an additional level indicating the multiple geometries: a new zero-based index for each single part geometry per multi-part geometry).

Return type:

GeoSeries

Returns:

A GeoSeries with a MultiIndex. The levels of the MultiIndex are the
original index and a zero-based integer index that counts the
number of single geometries within a multi-part geometry.

Examples

>>> from shapely.geometry import MultiPoint
>>> s = geopandas.GeoSeries(
...     [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
... )
>>> s
0           MULTIPOINT ((0 0), (1 1))
1    MULTIPOINT ((2 2), (3 3), (4 4))
dtype: geometry

>>> s.explode(index_parts=True)
0  0    POINT (0 0)
   1    POINT (1 1)
1  0    POINT (2 2)
   1    POINT (3 3)
   2    POINT (4 4)
dtype: geometry
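
The effect of the two index options can be compared directly. The following sketch reuses the same multi-point data and only inspects the resulting index; the variable names are illustrative.

```python
import geopandas
from shapely.geometry import MultiPoint

s = geopandas.GeoSeries(
    [MultiPoint([(0, 0), (1, 1)]), MultiPoint([(2, 2), (3, 3), (4, 4)])]
)

# index_parts=False: each part repeats the index label of its parent row.
repeated_index = s.explode(index_parts=False).index.tolist()

# ignore_index=True relabels the parts 0, 1, ..., n - 1.
flat_index = s.explode(ignore_index=True).index.tolist()

# index_parts=True adds a second level counting the parts of each row.
parts_index = s.explode(index_parts=True).index.tolist()

print(repeated_index)
print(flat_index)
print(parts_index)
```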

See also

GeoDataFrame.explode

explore(*args, **kwargs)[source]

Interactive map based on GeoPandas and folium/leaflet.js

Generate an interactive leaflet map based on GeoSeries.

Parameters:

color (str, array-like (default None)) – Named color or a list-like of colors (named or hex).
m (folium.Map (default None)) – Existing map instance on which to draw the plot.
tiles (str, xyzservices.TileProvider (default 'OpenStreetMap Mapnik')) –
Map tileset to use. Can choose from the list supported by folium, query a xyzservices.TileProvider by a name from xyzservices.providers, pass xyzservices.TileProvider object or pass custom XYZ URL. The current list of built-in providers (when xyzservices is not available):

["OpenStreetMap", "CartoDB positron", "CartoDB dark_matter"]

You can pass a custom tileset to Folium by passing a Leaflet-style URL to the tiles parameter: http://{s}.yourtiles.com/{z}/{x}/{y}.png. Be sure to check their terms and conditions and to provide attribution with the attr keyword.
attr (str (default None)) – Map tile attribution; only required if passing custom tile URL.
highlight (bool (default True)) – Enable highlight functionality when hovering over a geometry.
width (pixel int or percentage string (default: '100%')) – Width of the folium Map. If the argument m is given explicitly, width is ignored.
height (pixel int or percentage string (default: '100%')) – Height of the folium Map. If the argument m is given explicitly, height is ignored.
control_scale (bool, (default True)) – Whether to add a control scale on the map.
marker_type (str, folium.Circle, folium.CircleMarker, folium.Marker (default None)) – Allowed string options are (‘marker’, ‘circle’, ‘circle_marker’). Defaults to folium.Marker.
marker_kwds (dict (default {})) –
Additional keywords to be passed to the selected marker_type, e.g.:

radius (float) – Radius of the circle, in meters (for 'circle') or pixels (for 'circle_marker').

icon (folium.map.Icon) – The folium.map.Icon object to use to render the marker.

draggable (bool (default False)) – Set to True to be able to drag the marker around the map.
style_kwds (dict (default {})) –
Additional style to be passed to folium style_function:

stroke (bool (default True)) – Whether to draw stroke along the path. Set it to False to disable borders on polygons or circles.

color (str) – Stroke color.

weight (int) – Stroke width in pixels.

opacity (float (default 1.0)) – Stroke opacity.

fill (boolean (default True)) – Whether to fill the path with color. Set it to False to disable filling on polygons or circles.

fillColor (str) – Fill color. Defaults to the value of the color option.

fillOpacity (float (default 0.5)) – Fill opacity.

style_function (callable) – Function mapping a GeoJson Feature to a style dict. Style properties are described in folium.vector_layers.path_options(); GeoJson features follow GeoSeries.__geo_interface__. For example:

lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6 else "blue"}

highlight_kwds (dict (default {})) – Style to be passed to folium highlight_function. Uses the same keywords as style_kwds. When empty, defaults to {"fillOpacity": 0.75}.
map_kwds (dict (default {})) – Additional keywords to be passed to folium Map, e.g. dragging, or scrollWheelZoom.
**kwargs (dict) – Additional options to be passed on to folium.

Returns:

m – folium Map instance

Return type:

folium.folium.Map

fillna(value=None, inplace=False, limit=None, **kwargs)[source]

Fill NA values with geometry (or geometries).

Parameters:

value (shapely geometry or GeoSeries, default None) – If None is passed, NA values will be filled with GEOMETRYCOLLECTION EMPTY. If a shapely geometry object is passed, it will be used to fill all missing values. If a GeoSeries or GeometryArray are passed, missing values will be filled based on the corresponding index locations. If pd.NA or np.nan are passed, values will be filled with None (not GEOMETRYCOLLECTION EMPTY).
limit (int, default None) – This is the maximum number of entries along the entire axis where NaNs will be filled. Must be greater than 0 if not None.

Return type:

GeoSeries

Examples

>>> from shapely.geometry import Polygon
>>> s = geopandas.GeoSeries(
...     [
...         Polygon([(0, 0), (1, 1), (0, 1)]),
...         None,
...         Polygon([(0, 0), (-1, 1), (0, -1)]),
...     ]
... )
>>> s
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1                                None
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry

Filled with an empty geometry (GEOMETRYCOLLECTION EMPTY, the default).

>>> s.fillna()
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1            GEOMETRYCOLLECTION EMPTY
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry

Filled with a specific polygon.

>>> s.fillna(Polygon([(0, 1), (2, 1), (1, 2)]))
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1      POLYGON ((0 1, 2 1, 1 2, 0 1))
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry

Filled with another GeoSeries.

>>> from shapely.geometry import Point
>>> s_fill = geopandas.GeoSeries(
...     [
...         Point(0, 0),
...         Point(1, 1),
...         Point(2, 2),
...     ]
... )
>>> s.fillna(s_fill)
0      POLYGON ((0 0, 1 1, 0 1, 0 0))
1                         POINT (1 1)
2    POLYGON ((0 0, -1 1, 0 -1, 0 0))
dtype: geometry
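
Because the default fill value is an empty geometry rather than a missing value, a filled entry no longer counts as NA but is reported as empty. A minimal sketch of that distinction, with illustrative variable names:

```python
import geopandas
from shapely.geometry import Point

s = geopandas.GeoSeries([Point(0, 0), None])

filled = s.fillna()  # default fill value is an empty geometry

# The filled entry is no longer missing, but it is empty.
still_missing = filled.isna().tolist()
now_empty = filled.is_empty.tolist()

print(still_missing, now_empty)
```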

See also

GeoSeries.isna: detect missing values

classmethod from_arrow(arr, **kwargs)[source]

Construct a GeoSeries from an Arrow array object with a GeoArrow extension type.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function accepts any Arrow array object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ method).

Added in version 1.0.

Parameters:

arr (pyarrow.Array, Arrow array) – Any array object implementing the Arrow PyCapsule Protocol (i.e. has an __arrow_c_array__ or __arrow_c_stream__ method). The type of the array should be one of the geoarrow geometry types.
**kwargs – Other parameters passed to the GeoSeries constructor.

Return type:

GeoSeries

classmethod from_file(filename, **kwargs)[source]

Alternate constructor to create a GeoSeries from a file.

Can load a GeoSeries from a file in any format recognized by pyogrio. See http://pyogrio.readthedocs.io/ for details. If the file has attribute columns, only the geometry column is kept. Note that to do that, GeoPandas first loads the whole GeoDataFrame.

Parameters:

filename (str) – File path or file handle to read from. Depending on which kwargs are included, the content of filename may vary. See pyogrio.read_dataframe() for usage details.
kwargs (key-word arguments) – These arguments are passed to pyogrio.read_dataframe(), and can be used to access multi-layer data, data stored within archives (zip files), etc.

Return type:

GeoSeries

Examples

>>> import geodatasets
>>> path = geodatasets.get_path('nybb')
>>> s = geopandas.GeoSeries.from_file(path)
>>> s
0    MULTIPOLYGON (((970217.022 145643.332, 970227....
1    MULTIPOLYGON (((1029606.077 156073.814, 102957...
2    MULTIPOLYGON (((1021176.479 151374.797, 102100...
3    MULTIPOLYGON (((981219.056 188655.316, 980940....
4    MULTIPOLYGON (((1012821.806 229228.265, 101278...
Name: geometry, dtype: geometry

See also

read_file: read file to GeoDataFrame

classmethod from_wkb(data, index=None, crs=None, on_invalid='raise', **kwargs)[source]

Alternate constructor to create a GeoSeries from a list or array of WKB objects.

Parameters:

data (array-like or Series) – Series, list or array of WKB objects
index (array-like or Index) – The index for the GeoSeries.
crs (value, optional) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
on_invalid ({"raise", "warn", "ignore"}, default "raise") –
- raise: an exception will be raised if a WKB input geometry is invalid.
- warn: a warning will be raised and invalid WKB geometries will be returned as None.
- ignore: invalid WKB geometries will be returned as None without a warning.
kwargs – Additional arguments passed to the Series constructor, e.g. name.

Return type:

GeoSeries

See also

GeoSeries.from_wkt
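
Examples

The following is a minimal round-trip sketch, assuming the WKB values come from GeoSeries.to_wkb(). The geometries and variable names are illustrative. The WKB produced here does not carry the CRS, so it is passed again when rebuilding the GeoSeries.

```python
import geopandas
from shapely.geometry import LineString, Point

original = geopandas.GeoSeries(
    [Point(1, 1), LineString([(0, 0), (2, 2)])], crs="EPSG:4326"
)

# to_wkb() yields a pandas Series of bytes objects.
wkb = original.to_wkb()

# Rebuild the GeoSeries, supplying the CRS again explicitly.
restored = geopandas.GeoSeries.from_wkb(wkb, crs="EPSG:4326")

print(restored)
```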

classmethod from_wkt(data, index=None, crs=None, on_invalid='raise', **kwargs)[source]

Alternate constructor to create a GeoSeries from a list or array of WKT objects.

Parameters:

data (array-like, Series) – Series, list, or array of WKT objects
index (array-like or Index) – The index for the GeoSeries.
crs (value, optional) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
on_invalid ({"raise", "warn", "ignore"}, default "raise") –
- raise: an exception will be raised if a WKT input geometry is invalid.
- warn: a warning will be raised and invalid WKT geometries will be returned as None.
- ignore: invalid WKT geometries will be returned as None without a warning.
kwargs – Additional arguments passed to the Series constructor, e.g. name.

Return type:

GeoSeries

See also

GeoSeries.from_wkb

Examples

>>> wkts = [
... 'POINT (1 1)',
... 'POINT (2 2)',
... 'POINT (3 3)',
... ]
>>> s = geopandas.GeoSeries.from_wkt(wkts)
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

classmethod from_xy(x, y, z=None, index=None, crs=None, **kwargs)[source]

Alternate constructor to create a GeoSeries of Point geometries from lists or arrays of x, y(, z) coordinates.

In case of geographic coordinates, it is assumed that longitude is captured by x coordinates and latitude by y.

Parameters:

x (iterable)
y (iterable)
z (iterable)
index (array-like or Index, optional) – The index for the GeoSeries. If not given and all coordinate inputs are Series with an equal index, that index is used.
crs (value, optional) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
**kwargs – Additional arguments passed to the Series constructor, e.g. name.

Return type:

GeoSeries

See also

GeoSeries.from_wkt, points_from_xy

Examples

>>> x = [2.5, 5, -3.0]
>>> y = [0.5, 1, 1.5]
>>> s = geopandas.GeoSeries.from_xy(x, y, crs="EPSG:4326")
>>> s
0    POINT (2.5 0.5)
1    POINT (5 1)
2    POINT (-3 1.5)
dtype: geometry
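
The index rule described for the index parameter can be sketched with pandas Series inputs that share one index. The column names, station labels, and elevation values below are made up for illustration.

```python
import geopandas
import pandas as pd

# Coordinate columns sharing one index (station names are hypothetical).
coords = pd.DataFrame(
    {"lon": [2.5, 5.0], "lat": [0.5, 1.0], "elev": [10.0, 20.0]},
    index=["station_a", "station_b"],
)

# No index is passed, so the shared index of the inputs is reused.
s = geopandas.GeoSeries.from_xy(
    coords["lon"], coords["lat"], z=coords["elev"], crs="EPSG:4326"
)

print(s.index.tolist())
print(s.has_z.tolist())
```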

property geometry: GeoSeries

isna()[source]

Detect missing values.

Historically, NA values in a GeoSeries could be represented by empty geometric objects, in addition to standard representations such as None and np.nan. This behaviour was changed in version 0.6.0, and now only actual missing values return True. To detect empty geometries, use GeoSeries.is_empty instead.

Return type:

Series

Returns:

A boolean pandas Series of the same size as the GeoSeries,
True where a value is NA.

Examples

>>> from shapely.geometry import Polygon
>>> s = geopandas.GeoSeries(
...     [Polygon([(0, 0), (1, 1), (0, 1)]), None, Polygon([])]
... )
>>> s
0    POLYGON ((0 0, 1 1, 0 1, 0 0))
1                              None
2                     POLYGON EMPTY
dtype: geometry

>>> s.isna()
0    False
1     True
2    False
dtype: bool
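
When both missing and empty geometries should be excluded, the two checks can be combined. A short sketch with illustrative variable names:

```python
import geopandas
from shapely.geometry import Polygon

s = geopandas.GeoSeries(
    [Polygon([(0, 0), (1, 1), (0, 1)]), None, Polygon([])]
)

missing = s.isna()      # True only for None
empty = s.is_empty      # True only for the empty polygon
unusable = missing | empty

# Keep only rows that hold a non-empty geometry.
valid = s[~unusable]

print(unusable.tolist(), len(valid))
```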

See also

GeoSeries.notna: inverse of isna
GeoSeries.is_empty: detect empty geometries

isnull()[source]

Alias for isna method. See isna for more detail.

Return type:

Series

notna()[source]

Detect non-missing values.

Historically, NA values in a GeoSeries could be represented by empty geometric objects, in addition to standard representations such as None and np.nan. This behaviour was changed in version 0.6.0, and now only actual missing values return False. To detect empty geometries, use GeoSeries.is_empty instead.

Return type:

Series

Returns:

A boolean pandas Series of the same size as the GeoSeries,
False where a value is NA.

Examples

>>> from shapely.geometry import Polygon
>>> s = geopandas.GeoSeries(
...     [Polygon([(0, 0), (1, 1), (0, 1)]), None, Polygon([])]
... )
>>> s
0    POLYGON ((0 0, 1 1, 0 1, 0 0))
1                              None
2                     POLYGON EMPTY
dtype: geometry

>>> s.notna()
0     True
1    False
2     True
dtype: bool

See also

GeoSeries.isna: inverse of notna
GeoSeries.is_empty: detect empty geometries

notnull()[source]

Alias for notna method. See notna for more detail.

Return type:

Series

plot(*args, **kwargs)[source]

Plot a GeoSeries.

Generate a plot of a GeoSeries geometry with matplotlib.

Parameters:

s (Series) – The GeoSeries to be plotted. Currently Polygon, MultiPolygon, LineString, MultiLineString, Point and MultiPoint geometries can be plotted.
cmap (str (default None)) –
The name of a colormap recognized by matplotlib. Any colormap will work, but categorical colormaps are generally recommended. Examples of useful discrete colormaps include:

tab10, tab20, Accent, Dark2, Paired, Pastel1, Set1, Set2
color (str, np.array, pd.Series, List (default None)) – If specified, all objects will be colored uniformly.
ax (matplotlib.pyplot.Artist (default None)) – axes on which to draw the plot
figsize (pair of floats (default None)) – Size of the resulting matplotlib.figure.Figure. If the argument ax is given explicitly, figsize is ignored.
aspect ('auto', 'equal', None or float (default 'auto')) – Set aspect of axis. If ‘auto’, the default aspect for map plots is ‘equal’; if however data are not projected (coordinates are long/lat), the aspect is by default set to 1/cos(s_y * pi/180) with s_y the y coordinate of the middle of the GeoSeries (the mean of the y range of bounding box) so that a long/lat square appears square in the middle of the plot. This implies an Equirectangular projection. If None, the aspect of ax won’t be changed. It can also be set manually (float) as the ratio of y-unit to x-unit.
autolim (bool (default True)) – Update axes data limits to contain the new geometries.
**style_kwds (dict) – Color options to be passed on to the actual plot function, such as edgecolor, facecolor, linewidth, markersize, alpha.

Returns:

ax

Return type:

matplotlib axes instance
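
Examples

A minimal plotting sketch, assuming matplotlib is installed. It draws two illustrative squares onto an existing axes object and renders the figure to an in-memory buffer, so no display or output file is needed.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

import geopandas
from shapely.geometry import Polygon

s = geopandas.GeoSeries(
    [
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),
    ]
)

fig, ax = plt.subplots(figsize=(4, 2))
# Passing ax draws onto the existing axes; extra keywords are style options.
returned_ax = s.plot(ax=ax, color="lightgrey", edgecolor="black")

# Render to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```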

select(*args, **kwargs)[source]

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values– they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:

data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.
index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.
dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.
name (Hashable, default None) – The name to give to the Series.
copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.

set_crs(**kwargs)

sort_index(*args, **kwargs)[source]

take(*args, **kwargs)[source]

to_arrow(geometry_encoding='WKB', interleaved=True, include_z=None)[source]

Encode a GeoSeries to GeoArrow format.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function returns a generic Arrow array object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ method). This object can then be consumed by your Arrow implementation of choice that supports this protocol.

Added in version 1.0.

Parameters:

geometry_encoding ({'WKB', 'geoarrow' }, default 'WKB') – The GeoArrow encoding to use for the data conversion.
interleaved (bool, default True) – Only relevant for ‘geoarrow’ encoding. If True, the geometries’ coordinates are interleaved in a single fixed size list array. If False, the coordinates are stored as separate arrays in a struct type.
include_z (bool, default None) – Only relevant for ‘geoarrow’ encoding (for WKB, the dimensionality of the individual geometries is preserved). If False, return 2D geometries. If True, include the third dimension in the output (if a geometry has no third dimension, the z-coordinates will be NaN). By default, will infer the dimensionality from the input geometries. Note that this inference can be unreliable with empty geometries (for a guaranteed result, it is recommended to specify the keyword).

Returns:

A generic Arrow array object with geometry data encoded to GeoArrow.

Return type:

GeoArrowArray

Examples

>>> from shapely.geometry import Point
>>> gser = geopandas.GeoSeries([Point(1, 2), Point(2, 1)])
>>> gser
0    POINT (1 2)
1    POINT (2 1)
dtype: geometry

>>> arrow_array = gser.to_arrow()
>>> arrow_array
<geopandas.io._geoarrow.GeoArrowArray object at ...>

The returned array object needs to be consumed by a library implementing the Arrow PyCapsule Protocol. For example, wrapping the data as a pyarrow.Array (requires pyarrow >= 14.0):

>>> import pyarrow as pa
>>> array = pa.array(arrow_array)
>>> array
<pyarrow.lib.BinaryArray object at ...>
[
  0101000000000000000000F03F0000000000000040,
  01010000000000000000000040000000000000F03F
]

to_crs(crs=None, epsg=None)[source]

Returns a GeoSeries with all geometries transformed to a new coordinate reference system.

Transform all geometries in a GeoSeries to a different coordinate reference system. The crs attribute on the current GeoSeries must be set. Either crs or epsg may be specified for output.

This method will transform all points in all objects. It has no notion of projecting entire geometries. All segments joining points are assumed to be lines in the current projection, not geodesics. Objects crossing the dateline (or other projection boundary) will have undesirable behavior.

Parameters:

crs (pyproj.CRS, optional if epsg is specified) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.
epsg (int, optional if crs is specified) – EPSG code specifying output projection.

Return type:

GeoSeries

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)], crs=4326)
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry
>>> s.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

>>> s = s.to_crs(3857)
>>> s
0    POINT (111319.491 111325.143)
1    POINT (222638.982 222684.209)
2    POINT (333958.472 334111.171)
dtype: geometry
>>> s.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
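
The epsg keyword can be used in place of crs. The sketch below re-projects a single illustrative point to EPSG:3857 and back, which should recover the original coordinates up to floating-point error.

```python
import geopandas
from shapely.geometry import Point

s = geopandas.GeoSeries([Point(1, 1)], crs="EPSG:4326")

# epsg= is an alternative to crs= for naming the target system.
projected = s.to_crs(epsg=3857)

# Transforming back should recover the original coordinates
# up to floating-point error.
round_trip = projected.to_crs(epsg=4326)

print(projected.iloc[0], round_trip.iloc[0])
```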

See also

GeoSeries.set_crs: assign CRS

to_file(filename, driver=None, index=None, **kwargs)[source]

Write the GeoSeries to a file.

By default, an ESRI shapefile is written, but any OGR data source supported by Pyogrio or Fiona can be written.

Parameters:

filename (string) – File path or file handle to write to. The path may specify a GDAL VSI scheme.
driver (string, default None) – The OGR format driver used to write the vector file. If not specified, it attempts to infer it from the file extension. If no extension is specified, it saves ESRI Shapefile to a folder.
index (bool, default None) –
If True, write index into one or more columns (for MultiIndex). Default None writes the index into one or more columns only if the index is named, is a MultiIndex, or has a non-integer data type. If False, no index is written.

Added in version 0.7: Previously the index was not written.
mode (string, default 'w') – The write mode, ‘w’ to overwrite the existing file and ‘a’ to append. Not all drivers support appending. The drivers that support appending are listed in fiona.supported_drivers or https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py
crs (pyproj.CRS, default None) – If specified, the CRS is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the crs based on crs df attribute. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. The keyword is not supported for the “pyogrio” engine.
engine (str, "pyogrio" or "fiona") – The underlying library that is used to write the file. Currently, the supported options are “pyogrio” and “fiona”. Defaults to “pyogrio” if installed, otherwise tries “fiona”.
**kwargs – Keyword args to be passed to the engine, and can be used to write to multi-layer data, store data within archives (zip files), etc. In case of the “pyogrio” engine, the keyword arguments are passed to pyogrio.write_dataframe. In case of the “fiona” engine, the keyword arguments are passed to fiona.open. For more information on possible keywords, type: import pyogrio; help(pyogrio.write_dataframe).

See also

GeoDataFrame.to_file: write GeoDataFrame to file
read_file: read file to GeoDataFrame

Examples

>>> s.to_file('series.shp')

>>> s.to_file('series.gpkg', driver='GPKG', layer='name1')

>>> s.to_file('series.geojson', driver='GeoJSON')

to_json(show_bbox=True, drop_id=False, to_wgs84=False, **kwargs)[source]

Returns a GeoJSON string representation of the GeoSeries.

Parameters:

show_bbox (bool, optional, default: True) – Include bbox (bounds) in the geojson
drop_id (bool, default: False) – If False (the default), the index of the GeoSeries is retained as the id property in the generated GeoJSON. Set to True to omit it, for example when the index is just arbitrary row numbers.
to_wgs84 (bool, optional, default: False) –
If the CRS is set on the active geometry column it is exported as WGS84 (EPSG:4326) to meet the 2016 GeoJSON specification. Set to True to force re-projection and set to False to ignore CRS. False by default.
**kwargs – Keyword arguments that will be passed to json.dumps().

Return type:

JSON string

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

>>> s.to_json()
'{"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [1.0, 1.0]}, "bbox": [1.0, 1.0, 1.0, 1.0]}, {"id": "1", "type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [2.0, 2.0]}, "bbox": [2.0, 2.0, 2.0, 2.0]}, {"id": "2", "type": "Feature", "properties": {}, "geometry": {"type": "Point", "coordinates": [3.0, 3.0]}, "bbox": [3.0, 3.0, 3.0, 3.0]}], "bbox": [1.0, 1.0, 3.0, 3.0]}'

See also

GeoSeries.to_file: write GeoSeries to file
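
Since extra keyword arguments are forwarded to json.dumps(), formatting options such as indent, sort_keys or separators should carry over to the output string. The sketch below uses only the standard library and assumes a hand-built dictionary (feature_collection is a hypothetical stand-in for the structure that to_json serializes), so it illustrates the effect of the options rather than the GeoSeries call itself:

```python
import json

# Hypothetical stand-in for the mapping that to_json() serializes;
# built by hand so the sketch does not need geopandas.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "id": "0",
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [1.0, 1.0]},
        }
    ],
}

# Options like these are what **kwargs would hand on to json.dumps().
compact = json.dumps(feature_collection, separators=(",", ":"))
pretty = json.dumps(feature_collection, indent=2, sort_keys=True)

# Both strings decode back to the same structure.
decoded_compact = json.loads(compact)
decoded_pretty = json.loads(pretty)
```

With a real GeoSeries the equivalent call would presumably look like s.to_json(indent=2), under the assumption that the keyword is passed through unchanged.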

to_wkb(hex=False, **kwargs)[source]

Convert GeoSeries geometries to WKB

Parameters:

hex (bool) – If true, export the WKB as a hexadecimal string. The default is to return a binary bytes object.
kwargs – Additional keyword args will be passed to shapely.to_wkb().

Returns:

WKB representations of the geometries

Return type:

Series

See also

GeoSeries.to_wkt
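
The hex flag only changes the container: the same WKB bytes are returned either raw or as a hexadecimal string. To make the byte layout concrete, the following standard-library sketch encodes a 2D point by hand. It is an illustration of the standard little-endian WKB layout for a point, not the code path that shapely.to_wkb() uses, and point_wkb is a hypothetical helper:

```python
import struct


def point_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (hypothetical helper)."""
    # 1 byte : byte order flag (1 = little-endian)
    # 4 bytes: geometry type   (1 = Point)
    # 16 bytes: x and y as IEEE 754 doubles
    return struct.pack("<BIdd", 1, 1, x, y)


wkb_bytes = point_wkb(1.0, 1.0)      # what hex=False would hold per geometry
wkb_hex = wkb_bytes.hex().upper()    # what hex=True would hold per geometry

# Decoding the bytes recovers the header fields and coordinates.
byte_order, geom_type, x, y = struct.unpack("<BIdd", wkb_bytes)
```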

to_wkt(**kwargs)[source]

Convert GeoSeries geometries to WKT

Parameters:: kwargs – Keyword args will be passed to shapely.to_wkt().
Returns:: WKT representations of the geometries
Return type:: Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: geometry

>>> s.to_wkt()
0    POINT (1 1)
1    POINT (2 2)
2    POINT (3 3)
dtype: object

See also

GeoSeries.to_wkb

property x: Series

Return the x location of point geometries in a GeoSeries

Return type:: pandas.Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s.x
0    1.0
1    2.0
2    3.0
dtype: float64

See also

GeoSeries.y, GeoSeries.z

property y: Series

Return the y location of point geometries in a GeoSeries

Return type:: pandas.Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1), Point(2, 2), Point(3, 3)])
>>> s.y
0    1.0
1    2.0
2    3.0
dtype: float64

See also

GeoSeries.x, GeoSeries.z

property z: Series

Return the z location of point geometries in a GeoSeries

Return type:: pandas.Series

Examples

>>> from shapely.geometry import Point
>>> s = geopandas.GeoSeries([Point(1, 1, 1), Point(2, 2, 2), Point(3, 3, 3)])
>>> s.z
0    1.0
1    2.0
2    3.0
dtype: float64

See also

GeoSeries.x, GeoSeries.y

class pyorps.core.types.MultiPoint(points=None)[source]

Bases: BaseMultipartGeometry

A collection of one or more Points.

A MultiPoint has zero area and zero length.

Parameters:: points (sequence) – A sequence of Points, or a sequence of (x, y [,z]) numeric coordinate pairs or triples, or an array-like of shape (N, 2) or (N, 3).

geoms

A sequence of Points

Type:: sequence

Examples

Construct a MultiPoint containing two Points

>>> from shapely import MultiPoint, Point
>>> ob = MultiPoint([[0.0, 0.0], [1.0, 2.0]])
>>> len(ob.geoms)
2
>>> type(ob.geoms[0]) == Point
True

svg(scale_factor=1.0, fill_color=None, opacity=None)[source]

Return a group of SVG circle elements for the MultiPoint geometry.

Parameters:

scale_factor (float) – Multiplication factor for the SVG circle diameters. Default is 1.
fill_color (str, optional) – Hex string for fill color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.
opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.6

class pyorps.core.types.Point(*args)[source]

Bases: BaseGeometry

A geometry type that represents a single coordinate.

Each coordinate has x, y and possibly z and/or m values.

A point is a zero-dimensional feature and has zero length and zero area.

Parameters:

args (float, or sequence of floats) –

The coordinates can either be passed as a single parameter, or as individual float values using multiple parameters:

1 parameter: a sequence or array-like with 2 or 3 values.
2 or 3 parameters (float): x, y, and possibly z.

x, y, z, m

Coordinate values

Type:: float

Examples

Constructing the Point using separate parameters for x and y:

>>> from shapely import Point
>>> p = Point(1.0, -1.0)

Constructing the Point using a list of x, y coordinates:

>>> p = Point([1.0, -1.0])
>>> print(p)
POINT (1 -1)
>>> p.y
-1.0
>>> p.x
1.0

property m: Return m coordinate.

Added in version 2.1.0: Also requires GEOS 3.12.0 or later.

svg(scale_factor=1.0, fill_color=None, opacity=None)[source]

Return SVG circle element for the Point geometry.

Parameters:

scale_factor (float) – Multiplication factor for the SVG circle diameter. Default is 1.
fill_color (str, optional) – Hex string for fill color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.
opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.6

property x: Return x coordinate.

property xy

Separate arrays of X and Y coordinate values.

Examples

>>> from shapely import Point
>>> x, y = Point(0, 0).xy
>>> list(x)
[0.0]
>>> list(y)
[0.0]

property y: Return y coordinate.

property z: Return z coordinate.

class pyorps.core.types.Polygon(shell=None, holes=None)[source]

Bases: BaseGeometry

A geometry type representing an area that is enclosed by a linear ring.

A polygon is a two-dimensional feature and has a non-zero area. It may have one or more negative-space “holes” which are also bounded by linear rings. If any rings cross each other, the feature is invalid and operations on it may fail.

Parameters:

shell (sequence) – A sequence of (x, y [,z]) numeric coordinate pairs or triples, or an array-like with shape (N, 2) or (N, 3). Also can be a sequence of Point objects.
holes (sequence) – A sequence of objects which satisfy the same requirements as the shell parameters above

exterior

The ring which bounds the positive space of the polygon.

Type:: LinearRing

interiors

A sequence of rings which bound all existing holes.

Type:: sequence

Examples

Create a square polygon with no holes

>>> from shapely import Polygon
>>> coords = ((0., 0.), (0., 1.), (1., 1.), (1., 0.), (0., 0.))
>>> polygon = Polygon(coords)
>>> polygon.area
1.0
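
The relationship between the shell, the holes and the resulting area can be checked by hand. The sketch below is plain Python using the shoelace formula; it is not how shapely computes area, and ring_area and polygon_area are hypothetical helpers introduced only for illustration:

```python
def ring_area(coords):
    """Unsigned area of a closed ring (first point repeated last)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def polygon_area(shell, holes=()):
    """Area of the shell minus the area of every hole."""
    return ring_area(shell) - sum(ring_area(hole) for hole in holes)


shell = ((0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0), (0.0, 0.0))
hole = ((0.25, 0.25), (0.25, 0.75), (0.75, 0.75), (0.75, 0.25), (0.25, 0.25))

area_without_hole = polygon_area(shell)        # 1.0, matching the example above
area_with_hole = polygon_area(shell, [hole])   # 1.0 - 0.25 = 0.75
```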

property coords: Not implemented for polygons.

property exterior: Return the exterior ring of the polygon.

classmethod from_bounds(xmin, ymin, xmax, ymax)[source]: Construct a Polygon() from spatial bounds.

property interiors: Return the sequence of interior rings of the polygon.

svg(scale_factor=1.0, fill_color=None, opacity=None)[source]

Return SVG path element for the Polygon geometry.

Parameters:

scale_factor (float) – Multiplication factor for the SVG stroke-width. Default is 1.
fill_color (str, optional) – Hex string for fill color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.
opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.6

class pyorps.core.types.int32

Bases: signedinteger

Signed integer type, compatible with C int.

Character code:: 'i'
Canonical name:: numpy.intc
Alias on this platform (Linux x86_64):: numpy.int32: 32-bit signed integer (-2_147_483_648 to 2_147_483_647).

bit_count() → int

Computes the number of 1-bits in the absolute value of the input. Analogous to the builtin int.bit_count or popcount in C++.

Examples

>>> np.int32(127).bit_count()
7
>>> np.int32(-127).bit_count()
7
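
Because the count is taken on the absolute value, a number and its negation give the same result. A small sketch comparing bit_count() with a pure-Python count (popcount_abs is a hypothetical helper), assuming a NumPy version that provides the method:

```python
import numpy as np


def popcount_abs(n: int) -> int:
    """Count the 1-bits in |n| using only built-ins (hypothetical helper)."""
    return bin(abs(n)).count("1")


values = (127, -127, 0, 2**31 - 1)
numpy_counts = [int(np.int32(v).bit_count()) for v in values]
python_counts = [popcount_abs(v) for v in values]

# Representable range of the type, as listed in the alias description.
info = np.iinfo(np.int32)
```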

class pyorps.core.types.int64

Bases: signedinteger

Default signed integer type, 64bit on 64bit systems and 32bit on 32bit systems.

Character code:: 'l'
Canonical name:: numpy.int_
Alias on this platform (Linux x86_64):: numpy.int64: 64-bit signed integer (-9_223_372_036_854_775_808 to 9_223_372_036_854_775_807).
Alias on this platform (Linux x86_64):: numpy.intp: Signed integer large enough to fit pointer, compatible with C intptr_t.

bit_count() → int

Computes the number of 1-bits in the absolute value of the input. Analogous to the builtin int.bit_count or popcount in C++.

Examples

>>> np.int64(127).bit_count()
7
>>> np.int64(-127).bit_count()
7

class pyorps.core.types.ndarray

Bases: object

ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)

An array object represents a multidimensional, homogeneous array of fixed-size items. An associated data-type object describes the format of each element in the array (its byte-order, how many bytes it occupies in memory, whether it is an integer, a floating point number, or something else, etc.)

Arrays should be constructed using array, zeros or empty (refer to the See Also section below). The parameters given here refer to a low-level method (ndarray(…)) for instantiating an array.

For more information, refer to the numpy module and examine the methods and attributes of an array.

Parameters:

(for the __new__ method; see Notes below)
shape (tuple of ints) – Shape of created array.
dtype (data-type, optional) – Any object that can be interpreted as a numpy data type.
buffer (object exposing buffer interface, optional) – Used to fill the array with data.
offset (int, optional) – Offset of array data in buffer.
strides (tuple of ints, optional) – Strides of data in memory.
order ({'C', 'F'}, optional) – Row-major (C-style) or column-major (Fortran-style) order.

T

Transpose of the array.

Type:: ndarray

data

The array’s elements, in memory.

Type:: buffer

dtype

Describes the format of the elements in the array.

Type:: dtype object

flags

Dictionary containing information related to memory use, e.g., ‘C_CONTIGUOUS’, ‘OWNDATA’, ‘WRITEABLE’, etc.

Type:: dict

flat

Flattened version of the array as an iterator. The iterator allows assignments, e.g., x.flat = 3 (see ndarray.flat for assignment examples).

Type:: numpy.flatiter object

imag

Imaginary part of the array.

Type:: ndarray

real

Real part of the array.

Type:: ndarray

size

Number of elements in the array.

Type:: int

itemsize

The memory use of each array element in bytes.

Type:: int

nbytes

The total number of bytes required to store the array data, i.e., itemsize * size.

Type:: int

ndim

The array’s number of dimensions.

Type:: int

shape

Shape of the array.

Type:: tuple of ints

strides

The step-size required to move from one element to the next in memory. For example, a contiguous (3, 4) array of type int16 in C-order has strides (8, 2). This implies that to move from element to element in memory requires jumps of 2 bytes. To move from row-to-row, one needs to jump 8 bytes at a time (2 * 4).

Type:: tuple of ints
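
The stride arithmetic described above can be reproduced directly. This is a minimal sketch; byte_offset is a hypothetical helper showing how an index maps to a position in the underlying buffer:

```python
import numpy as np

c_arr = np.zeros((3, 4), dtype=np.int16)             # C-order (row-major)
f_arr = np.zeros((3, 4), dtype=np.int16, order="F")  # Fortran-order

# C-order: next column is 2 bytes away, next row is 4 * 2 = 8 bytes away.
c_strides = c_arr.strides   # (8, 2)
# Fortran-order: next row is 2 bytes away, next column is 3 * 2 = 6 bytes away.
f_strides = f_arr.strides   # (2, 6)


def byte_offset(index, strides):
    """Offset in bytes of the element at `index` (hypothetical helper)."""
    return sum(i * s for i, s in zip(index, strides))


offset_c = byte_offset((1, 2), c_strides)   # 1 * 8 + 2 * 2 = 12
offset_f = byte_offset((1, 2), f_strides)   # 1 * 2 + 2 * 6 = 14
```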

ctypes

Class containing properties of the array needed for interaction with ctypes.

Type:: ctypes object

base

If the array is a view into another array, that array is its base (unless that array is also a view). The base array is where the array data is actually stored.

Type:: ndarray

See also

array: Construct an array.
zeros: Create an array, each element of which is zero.
empty: Create an array, but leave its allocated memory unchanged (i.e., it contains “garbage”).
dtype: Create a data-type.
numpy.typing.NDArray: An ndarray alias generic w.r.t. its dtype.type.

Notes

There are two modes of creating an array using __new__:

If buffer is None, then only shape, dtype, and order are used.
If buffer is an object exposing the buffer interface, then all keywords are interpreted.

No __init__ method is needed because the array is fully initialized after the __new__ method.

Examples

These examples illustrate the low-level ndarray constructor. Refer to the See Also section above for easier ways of constructing an ndarray.

First mode, buffer is None:

>>> import numpy as np
>>> np.ndarray(shape=(2,2), dtype=float, order='F')
array([[0.0e+000, 0.0e+000], # random
       [     nan, 2.5e-323]])

Second mode:

>>> np.ndarray((2,), buffer=np.array([1,2,3]),
...            offset=np.int_().itemsize,
...            dtype=int) # offset = 1*itemsize, i.e. skip first element
array([2, 3])

T

View of the transposed array.

Same as self.transpose().

Examples

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.T
array([[1, 3],
       [2, 4]])

>>> a = np.array([1, 2, 3, 4])
>>> a
array([1, 2, 3, 4])
>>> a.T
array([1, 2, 3, 4])

See also

transpose

all(axis=None, out=None, keepdims=False, *, where=True)

Returns True if all elements evaluate to True.

Refer to numpy.all for full documentation.

See also

numpy.all: equivalent function

any(axis=None, out=None, keepdims=False, *, where=True)

Returns True if any of the elements of a evaluate to True.

Refer to numpy.any for full documentation.

See also

numpy.any: equivalent function
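
A short sketch of how all and any behave with and without the axis and keepdims arguments (the array a is illustrative only):

```python
import numpy as np

a = np.array([[True, False],
              [True, True]])

all_total = a.all()                      # False: one element is False
any_total = a.any()                      # True: at least one element is True
all_cols = a.all(axis=0)                 # reduce over rows, one result per column
any_rows = a.any(axis=1, keepdims=True)  # reduced axis kept with length 1
```

With keepdims=True the result has shape (2, 1), so it broadcasts against the original array.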

argmax(axis=None, out=None, *, keepdims=False)

Return indices of the maximum values along the given axis.

Refer to numpy.argmax for full documentation.

See also

numpy.argmax: equivalent function

argmin(axis=None, out=None, *, keepdims=False)

Return indices of the minimum values along the given axis.

Refer to numpy.argmin for detailed documentation.

See also

numpy.argmin: equivalent function

argpartition(kth, axis=-1, kind='introselect', order=None)

Returns the indices that would partition this array.

Refer to numpy.argpartition for full documentation.

See also

numpy.argpartition: equivalent function
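
As a rough illustration of what "partition" means here: after partitioning around position k, the index at position k points to the value a full sort would put there, with smaller-or-equal values before it and larger-or-equal values after it, in no guaranteed order. The array and k below are arbitrary:

```python
import numpy as np

a = np.array([7, 1, 5, 3, 9, 2])
k = 2

idx = a.argpartition(k)

# The element a full sort would place at position k.
kth_value = a[idx[k]]

# The k + 1 smallest values; their internal order is unspecified,
# so they are sorted here only to make the result predictable.
smallest = np.sort(a[idx[: k + 1]])
```

This is a common way to find the k smallest elements without paying for a full sort.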

argsort(axis=-1, kind=None, order=None)

Returns the indices that would sort this array.

Refer to numpy.argsort for full documentation.

See also

numpy.argsort: equivalent function

astype(dtype, order='K', casting='unsafe', subok=True, copy=True)

Copy of the array, cast to a specified type.

Parameters:

dtype (str or dtype) – Typecode or data-type to which the array is cast.
order ({'C', 'F', 'A', 'K'}, optional) – Controls the memory layout order of the result. ‘C’ means C order, ‘F’ means Fortran order, ‘A’ means ‘F’ order if all the arrays are Fortran contiguous, ‘C’ order otherwise, and ‘K’ means as close to the order the array elements appear in memory as possible. Default is ‘K’.
casting ({'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional) –
Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility.
- ’no’ means the data types should not be cast at all.
- ’equiv’ means only byte-order changes are allowed.
- ’safe’ means only casts which can preserve values are allowed.
- ’same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.
- ’unsafe’ means any data conversions may be done.
subok (bool, optional) – If True, then sub-classes will be passed-through (default), otherwise the returned array will be forced to be a base-class array.
copy (bool, optional) – By default, astype always returns a newly allocated array. If this is set to false, and the dtype, order, and subok requirements are satisfied, the input array is returned instead of a copy.

Returns:

arr_t – Unless copy is False and the other conditions for returning the input array are satisfied (see description for copy input parameter), arr_t is a new array of the same shape as the input array, with dtype, order given by dtype, order.

Return type:

ndarray

Raises:

ComplexWarning – When casting from complex to float or int. To avoid this, one should use a.real.astype(t).

Examples

>>> import numpy as np
>>> x = np.array([1, 2, 2.5])
>>> x
array([1. ,  2. ,  2.5])

>>> x.astype(int)
array([1, 2, 2])
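
The casting and copy parameters described above can be exercised as follows. This is a minimal sketch; the variable names are illustrative:

```python
import numpy as np

x = np.array([1, 2, 2.5])

# Default casting='unsafe' silently truncates the fractional part.
truncated = x.astype(int)

# casting='safe' refuses float -> int because values could be lost.
try:
    x.astype(int, casting="safe")
except TypeError as exc:
    safe_error = exc
else:
    safe_error = None

# copy=False returns the input array itself when no conversion is needed.
same = x.astype(np.float64, copy=False)
```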

base

Base object if memory is from some other object.

Examples

The base of an array that owns its memory is None:

>>> import numpy as np
>>> x = np.array([1,2,3,4])
>>> x.base is None
True

Slicing creates a view, whose memory is shared with x:

>>> y = x[2:]
>>> y.base is x
True

byteswap(inplace=False)

Swap the bytes of the array elements

Toggle between little-endian and big-endian data representation by returning a byteswapped array, optionally swapped in-place. Arrays of byte-strings are not swapped. The real and imaginary parts of a complex number are swapped individually.

Parameters:: inplace (bool, optional) – If True, swap bytes in-place, default is False.
Returns:: out – The byteswapped array. If inplace is True, this is a view to self.
Return type:: ndarray

Examples

>>> import numpy as np
>>> A = np.array([1, 256, 8755], dtype=np.int16)
>>> list(map(hex, A))
['0x1', '0x100', '0x2233']
>>> A.byteswap(inplace=True)
array([  256,     1, 13090], dtype=int16)
>>> list(map(hex, A))
['0x100', '0x1', '0x3322']

Arrays of byte-strings are not swapped

>>> A = np.array([b'ceg', b'fac'])
>>> A.byteswap()
array([b'ceg', b'fac'], dtype='|S3')

A.view(A.dtype.newbyteorder()).byteswap() produces an array with the same values but different representation in memory

>>> A = np.array([1, 2, 3],dtype=np.int64)
>>> A.view(np.uint8)
array([1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0,
       0, 0], dtype=uint8)
>>> A.view(A.dtype.newbyteorder()).byteswap(inplace=True)
array([1, 2, 3], dtype='>i8')
>>> A.view(np.uint8)
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0,
       0, 3], dtype=uint8)

choose(choices, out=None, mode='raise')

Use an index array to construct a new array from a set of choices.

Refer to numpy.choose for full documentation.

See also

numpy.choose: equivalent function
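
In the simplest case, element i of the result is taken from choices[index[i]][i]. A small sketch with arbitrary values, including the mode='clip' behaviour for out-of-range indices:

```python
import numpy as np

choices = [
    np.array([10, 11, 12, 13]),
    np.array([20, 21, 22, 23]),
    np.array([30, 31, 32, 33]),
]

index = np.array([0, 2, 1, 0])
picked = index.choose(choices)            # [10, 31, 22, 13]

# With mode='clip', the out-of-range index 5 is clipped to the last choice (2).
out_of_range = np.array([0, 5, 1, 0])
clipped = out_of_range.choose(choices, mode="clip")
```

With the default mode='raise', the second call would raise an error instead of clipping.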

clip(min=None, max=None, out=None, **kwargs)

Return an array whose values are limited to [min, max]. One of max or min must be given.

Refer to numpy.clip for full documentation.

See also

numpy.clip: equivalent function
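
A brief sketch of clipping with both bounds and with only one bound given (values are arbitrary):

```python
import numpy as np

a = np.arange(10)

both = a.clip(2, 7)          # values below 2 become 2, above 7 become 7
lower_only = a.clip(min=3)   # only a lower bound
upper_only = a.clip(max=4)   # only an upper bound
```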

compress(condition, axis=None, out=None)

Return selected slices of this array along given axis.

Refer to numpy.compress for full documentation.

See also

numpy.compress: equivalent function

conj()

Complex-conjugate all elements.

Refer to numpy.conjugate for full documentation.

See also

numpy.conjugate: equivalent function

conjugate()

Return the complex conjugate, element-wise.

Refer to numpy.conjugate for full documentation.

See also

numpy.conjugate: equivalent function

copy(order='C')

Return a copy of the array.

Parameters:: order ({'C', 'F', 'A', 'K'}, optional) – Controls the memory layout of the copy. ‘C’ means C-order, ‘F’ means F-order, ‘A’ means ‘F’ if a is Fortran contiguous, ‘C’ otherwise. ‘K’ means match the layout of a as closely as possible. (Note that this function and numpy.copy() are very similar but have different default values for their order= arguments, and this function always passes sub-classes through.)

See also

numpy.copy: Similar function with different default behavior

numpy.copyto

Notes

This function is the preferred method for creating an array copy. The function numpy.copy() is similar, but it defaults to using order ‘K’, and will not pass sub-classes through by default.

Examples

>>> import numpy as np
>>> x = np.array([[1,2,3],[4,5,6]], order='F')

>>> y = x.copy()

>>> x.fill(0)

>>> x
array([[0, 0, 0],
       [0, 0, 0]])

>>> y
array([[1, 2, 3],
       [4, 5, 6]])

>>> y.flags['C_CONTIGUOUS']
True

For arrays containing Python objects (e.g. dtype=object), the copy is a shallow one. The new array will contain the same object which may lead to surprises if that object can be modified (is mutable):

>>> a = np.array([1, 'm', [2, 3, 4]], dtype=object)
>>> b = a.copy()
>>> b[2][0] = 10
>>> a
array([1, 'm', list([10, 3, 4])], dtype=object)

To ensure all elements within an object array are copied, use copy.deepcopy:

>>> import copy
>>> a = np.array([1, 'm', [2, 3, 4]], dtype=object)
>>> c = copy.deepcopy(a)
>>> c[2][0] = 10
>>> c
array([1, 'm', list([10, 3, 4])], dtype=object)
>>> a
array([1, 'm', list([2, 3, 4])], dtype=object)

ctypes

An object to simplify the interaction of the array with the ctypes module.

This attribute creates an object that makes it easier to use arrays when calling shared libraries with the ctypes module. The returned object has, among others, data, shape, and strides attributes (see Notes below) which themselves return ctypes objects that can be used as arguments to a shared library.

Parameters:: None
Returns:: c – Possessing attributes data, shape, strides, etc.
Return type:: Python object

See also

numpy.ctypeslib

Notes

Below are the public attributes of this object which were documented in “Guide to NumPy” (we have omitted undocumented public attributes, as well as documented private attributes):

_ctypes.data

A pointer to the memory area of the array as a Python integer. This memory area may contain data that is not aligned, or not in correct byte-order. The memory area may not even be writeable. The array flags and data-type of this array should be respected when passing this attribute to arbitrary C-code to avoid trouble that can include Python crashing. User Beware! The value of this attribute is exactly the same as: self.__array_interface__['data'][0].

Note that unlike data_as, a reference won’t be kept to the array: code like ctypes.c_void_p((a + b).ctypes.data) will result in a pointer to a deallocated array, and should be spelt (a + b).ctypes.data_as(ctypes.c_void_p)

_ctypes.shape

A ctypes array of length self.ndim where the basetype is the C-integer corresponding to dtype('p') on this platform (see numpy.ctypeslib.c_intp). This base-type could be ctypes.c_int, ctypes.c_long, or ctypes.c_longlong depending on the platform. The ctypes array contains the shape of the underlying array.

Type:: (c_intp*self.ndim)

_ctypes.strides

A ctypes array of length self.ndim where the basetype is the same as for the shape attribute. This ctypes array contains the strides information from the underlying array. This strides information is important for showing how many bytes must be jumped to get to the next element in the array.

Type:: (c_intp*self.ndim)

_ctypes.data_as(obj)

Return the data pointer cast to a particular c-types object. For example, calling self._as_parameter_ is equivalent to self.data_as(ctypes.c_void_p). Perhaps you want to use the data as a pointer to a ctypes array of floating-point data: self.data_as(ctypes.POINTER(ctypes.c_double)).

The returned pointer will keep a reference to the array.

_ctypes.shape_as(obj): Return the shape tuple as an array of some other c-types type. For example: self.shape_as(ctypes.c_short).

_ctypes.strides_as(obj): Return the strides tuple as an array of some other c-types type. For example: self.strides_as(ctypes.c_longlong).

If the ctypes module is not available, then the ctypes attribute of array objects still returns something useful, but ctypes objects are not returned and errors may be raised instead. In particular, the object will still have the _as_parameter_ attribute which will return an integer equal to the data attribute.

Examples

>>> import numpy as np
>>> import ctypes
>>> x = np.array([[0, 1], [2, 3]], dtype=np.int32)
>>> x
array([[0, 1],
       [2, 3]], dtype=int32)
>>> x.ctypes.data
31962608 # may vary
>>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_uint32))
<__main__.LP_c_uint object at 0x7ff2fc1fc200> # may vary
>>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_uint32)).contents
c_uint(0)
>>> x.ctypes.data_as(ctypes.POINTER(ctypes.c_uint64)).contents
c_ulong(4294967296)
>>> x.ctypes.shape
<numpy._core._internal.c_long_Array_2 object at 0x7ff2fc1fce60> # may vary
>>> x.ctypes.strides
<numpy._core._internal.c_long_Array_2 object at 0x7ff2fc1ff320> # may vary

cumprod(axis=None, dtype=None, out=None)

Return the cumulative product of the elements along the given axis.

Refer to numpy.cumprod for full documentation.

See also

numpy.cumprod: equivalent function

cumsum(axis=None, dtype=None, out=None)

Return the cumulative sum of the elements along the given axis.

Refer to numpy.cumsum for full documentation.

See also

numpy.cumsum: equivalent function
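
A minimal sketch of how the axis argument changes the result of cumsum and cumprod (the array is illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

flat_sum = a.cumsum()         # axis=None: running sum over the flattened array
col_sum = a.cumsum(axis=0)    # running sum down each column
row_prod = a.cumprod(axis=1)  # running product along each row
```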

data: Python buffer object pointing to the start of the array’s data.

device

diagonal(offset=0, axis1=0, axis2=1)

Return specified diagonals. In NumPy 1.9 the returned array is a read-only view instead of a copy as in previous NumPy versions. In a future version the read-only restriction will be removed.

Refer to numpy.diagonal() for full documentation.

See also

numpy.diagonal: equivalent function
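
The read-only behaviour mentioned above can be observed directly. In this sketch the write attempt is expected to fail on the NumPy versions we are aware of; taking an explicit copy gives a writeable array without touching the original:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

main = a.diagonal()             # main diagonal
upper = a.diagonal(offset=1)    # first diagonal above the main one

# The returned view is read-only, so in-place assignment is rejected.
try:
    main[0] = 99
except ValueError:
    write_blocked = True
else:
    write_blocked = False

# An explicit copy is writeable and leaves `a` unchanged.
writable = a.diagonal().copy()
writable[0] = 99
```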

dot()

dtype

Data-type of the array’s elements.

Warning

Setting arr.dtype is discouraged and may be deprecated in the future. Setting will replace the dtype without modifying the memory (see also ndarray.view and ndarray.astype).

Parameters:: None
Returns:: d
Return type:: numpy dtype object

See also

ndarray.astype: Cast the values contained in the array to a new data-type.
ndarray.view: Create a view of the same data but a different data-type.

numpy.dtype

Examples

>>> import numpy as np
>>> x = np.array([[0, 1], [2, 3]], dtype=np.int32)
>>> x
array([[0, 1],
       [2, 3]], dtype=int32)
>>> x.dtype
dtype('int32')
>>> isinstance(x.dtype, np.dtype)
True

dump(file)

Dump a pickle of the array to the specified file. The array can be read back with pickle.load or numpy.load.

Parameters:: file (str or Path) – A string naming the dump file.

dumps()

Returns the pickle of the array as a string. pickle.loads will convert the string back to an array.

Parameters:: None
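
A round trip through dumps and pickle.loads might look like the sketch below. On Python 3 the pickle is a bytes object rather than a text string. As with any pickle, only data from a trusted source should be loaded:

```python
import pickle

import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

payload = a.dumps()                # pickled representation of the array
restored = pickle.loads(payload)   # back to an ndarray
```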

fill(value)

Fill the array with a scalar value.

Parameters:: value (scalar) – All elements of a will be assigned this value.

Examples

>>> import numpy as np
>>> a = np.array([1, 2])
>>> a.fill(0)
>>> a
array([0, 0])
>>> a = np.empty(2)
>>> a.fill(1)
>>> a
array([1.,  1.])

Fill expects a scalar value and always behaves the same as assigning to a single array element. The following is a rare example where this distinction is important:

>>> a = np.array([None, None], dtype=object)
>>> a[0] = np.array(3)
>>> a
array([array(3), None], dtype=object)
>>> a.fill(np.array(3))
>>> a
array([array(3), array(3)], dtype=object)

Where other forms of assignments will unpack the array being assigned:

>>> a[...] = np.array(3)
>>> a
array([3, 3], dtype=object)

flags

Information about the memory layout of the array.

C_CONTIGUOUS(C): The data is in a single, C-style contiguous segment.

F_CONTIGUOUS(F): The data is in a single, Fortran-style contiguous segment.

OWNDATA(O): The array owns the memory it uses or borrows it from another object.

WRITEABLE(W): The data area can be written to. Setting this to False locks the data, making it read-only. A view (slice, etc.) inherits WRITEABLE from its base array at creation time, but a view of a writeable array may be subsequently locked while the base array remains writeable. (The opposite is not true, in that a view of a locked array may not be made writeable. However, currently, locking a base object does not lock any views that already reference it, so under that circumstance it is possible to alter the contents of a locked array via a previously created writeable view onto it.) Attempting to change a non-writeable array raises a RuntimeError exception.

ALIGNED(A): The data and all elements are aligned appropriately for the hardware.

WRITEBACKIFCOPY(X): This array is a copy of some other array. The C-API function PyArray_ResolveWritebackIfCopy must be called before deallocating to the base array will be updated with the contents of this array.

FNC: F_CONTIGUOUS and not C_CONTIGUOUS.

FORC: F_CONTIGUOUS or C_CONTIGUOUS (one-segment test).

BEHAVED(B): ALIGNED and WRITEABLE.

CARRAY(CA): BEHAVED and C_CONTIGUOUS.

FARRAY(FA): BEHAVED and F_CONTIGUOUS and not C_CONTIGUOUS.

Notes

The flags object can be accessed dictionary-like (as in a.flags['WRITEABLE']), or by using lowercased attribute names (as in a.flags.writeable). Short flag names are only supported in dictionary access.

Only the WRITEBACKIFCOPY, WRITEABLE, and ALIGNED flags can be changed by the user, via direct assignment to the attribute or dictionary entry, or by calling ndarray.setflags.

The array flags cannot be set arbitrarily:

WRITEBACKIFCOPY can only be set False.
ALIGNED can only be set True if the data is truly aligned.
WRITEABLE can only be set True if the array owns its own memory or the ultimate owner of the memory exposes a writeable buffer interface or is a string.

Arrays can be both C-style and Fortran-style contiguous simultaneously. This is clear for 1-dimensional arrays, but can also be true for higher dimensional arrays.

Even for contiguous arrays, the stride for a given dimension arr.strides[dim] may be arbitrary if arr.shape[dim] == 1 or the array has no elements. It therefore does not generally hold that self.strides[-1] == self.itemsize for C-style contiguous arrays, or that self.strides[0] == self.itemsize for Fortran-style contiguous arrays.
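
The notes above can be tried out in a few lines. This is a sketch with illustrative arrays; the except clause is deliberately broad, although recent NumPy versions are expected to raise ValueError for writes to a locked array:

```python
import numpy as np

a = np.zeros((2, 3))     # owns its data, C-contiguous
one_d = np.arange(4)

# A 1-D array is both C- and Fortran-contiguous at the same time.
both_contiguous = one_d.flags["C_CONTIGUOUS"] and one_d.flags["F_CONTIGUOUS"]

# The transpose of a C-contiguous 2-D array is Fortran-contiguous only.
t_flags = (a.T.flags.c_contiguous, a.T.flags.f_contiguous)

# Lock the array via attribute access, then try to write to it.
a.flags.writeable = False
try:
    a[0, 0] = 1.0
except (ValueError, RuntimeError):
    locked = True
else:
    locked = False

# Unlock again with setflags; allowed because `a` owns its memory.
a.setflags(write=True)
a[0, 0] = 1.0
```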

flat

A 1-D iterator over the array.

This is a numpy.flatiter instance, which acts similarly to, but is not a subclass of, Python’s built-in iterator object.

See also

flatten: Return a copy of the array collapsed into one dimension.

flatiter

Examples

>>> import numpy as np
>>> x = np.arange(1, 7).reshape(2, 3)
>>> x
array([[1, 2, 3],
       [4, 5, 6]])
>>> x.flat[3]
4
>>> x.T
array([[1, 4],
       [2, 5],
       [3, 6]])
>>> x.T.flat[3]
5
>>> type(x.flat)
<class 'numpy.flatiter'>

An assignment example:

>>> x.flat = 3; x
array([[3, 3, 3],
       [3, 3, 3]])
>>> x.flat[[1,4]] = 1; x
array([[3, 1, 3],
       [3, 1, 3]])

flatten(order='C')

Return a copy of the array collapsed into one dimension.

Parameters:: order ({'C', 'F', 'A', 'K'}, optional) – ‘C’ means to flatten in row-major (C-style) order. ‘F’ means to flatten in column-major (Fortran-style) order. ‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. ‘K’ means to flatten a in the order the elements occur in memory. The default is ‘C’.
Returns:: y – A copy of the input array, flattened to one dimension.
Return type:: ndarray

See also

ravel: Return a flattened array.
flat: A 1-D flat iterator over the array.

Examples

>>> import numpy as np
>>> a = np.array([[1,2], [3,4]])
>>> a.flatten()
array([1, 2, 3, 4])
>>> a.flatten('F')
array([1, 3, 2, 4])

getfield(dtype, offset=0)

Returns a field of the given array as a certain type.

A field is a view of the array data with a given data-type. The values in the view are determined by the given type and the offset into the current array in bytes. The offset needs to be such that the view dtype fits in the array dtype; for example an array of dtype complex128 has 16-byte elements. If taking a view with a 32-bit integer (4 bytes), the offset needs to be between 0 and 12 bytes.

Parameters:

dtype (str or dtype) – The data type of the view. The dtype size of the view can not be larger than that of the array itself.
offset (int) – Number of bytes to skip before beginning the element view.

Examples

>>> import numpy as np
>>> x = np.diag([1.+1.j]*2)
>>> x[1, 1] = 2 + 4.j
>>> x
array([[1.+1.j,  0.+0.j],
       [0.+0.j,  2.+4.j]])
>>> x.getfield(np.float64)
array([[1.,  0.],
       [0.,  2.]])

By choosing an offset of 8 bytes we can select the imaginary part of the array for our view:

>>> x.getfield(np.float64, offset=8)
array([[1.,  0.],
       [0.,  4.]])

imag

The imaginary part of the array.

Examples

>>> import numpy as np
>>> x = np.sqrt([1+0j, 0+1j])
>>> x.imag
array([ 0.        ,  0.70710678])
>>> x.imag.dtype
dtype('float64')

item(*args)

Copy an element of an array to a standard Python scalar and return it.

Parameters:

*args (Arguments (variable number and type)) –

none: in this case, the method only works for arrays with one element (a.size == 1); that element is copied into a standard Python scalar object and returned.
int_type: this argument is interpreted as a flat index into the array, specifying which element to copy and return.
tuple of int_types: functions as does a single int_type argument, except that the argument is interpreted as an nd-index into the array.

Returns:

z – A copy of the specified element of the array as a suitable Python scalar

Return type:

Standard Python scalar object

Notes

When the data type of a is longdouble or clongdouble, item() returns a scalar array object because there is no available Python scalar that would not lose information. Void arrays return a buffer object for item(), unless fields are defined, in which case a tuple is returned.

item is very similar to a[args], except, instead of an array scalar, a standard Python scalar is returned. This can be useful for speeding up access to elements of the array and doing arithmetic on elements of the array using Python’s optimized math.

Examples

>>> import numpy as np
>>> np.random.seed(123)
>>> x = np.random.randint(9, size=(3, 3))
>>> x
array([[2, 2, 6],
       [1, 3, 6],
       [1, 0, 1]])
>>> x.item(3)
1
>>> x.item(7)
0
>>> x.item((0, 1))
2
>>> x.item((2, 2))
1

For an array with object dtype, elements are returned as-is.

>>> a = np.array([np.int64(1)], dtype=object)
>>> a.item() #return np.int64
np.int64(1)

itemset

itemsize

Length of one array element in bytes.

Examples

>>> import numpy as np
>>> x = np.array([1,2,3], dtype=np.float64)
>>> x.itemsize
8
>>> x = np.array([1,2,3], dtype=np.complex128)
>>> x.itemsize
16

mT

View of the matrix transposed array.

The matrix transpose is the transpose of the last two dimensions, even if the array is of higher dimension.

Added in version 2.0.

Raises: ValueError – If the array is of dimension less than 2.

Examples

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.mT
array([[1, 3],
       [2, 4]])

>>> a = np.arange(8).reshape((2, 2, 2))
>>> a
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
>>> a.mT
array([[[0, 2],
        [1, 3]],

       [[4, 6],
        [5, 7]]])

max(axis=None, out=None, keepdims=False, initial=<no value>, where=True)

Return the maximum along a given axis.

Refer to numpy.amax for full documentation.

See also

numpy.amax: equivalent function
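As a brief illustration of the parameters in the signature above, the following sketch calls max on a small array. The array values are made up for the demonstration.

```python
import numpy as np

# Small illustrative array (values chosen arbitrarily).
a = np.array([[1, 5],
              [7, 3]])

overall = a.max()                        # maximum over all elements
per_column = a.max(axis=0)               # maximum of each column
per_row = a.max(axis=1, keepdims=True)   # reduced axis is kept with length 1

# `where` restricts the comparison to selected elements. Because maximum has
# no identity value, `initial` has to be given together with `where`.
below_six = a.max(where=a < 6, initial=-1)
```

With keepdims=True the result has shape (2, 1), so it broadcasts against the original array.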

mean(axis=None, dtype=None, out=None, keepdims=False, *, where=True)

Returns the average of the array elements along given axis.

Refer to numpy.mean for full documentation.

See also

numpy.mean: equivalent function
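A minimal sketch of how the axis argument changes the result, using an arbitrary 2x2 integer array:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

overall = a.mean()            # mean of all four elements
per_column = a.mean(axis=0)   # one value per column
per_row = a.mean(axis=1)      # one value per row

# For integer input the result is floating point by default.
result_dtype = per_row.dtype
```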

min(axis=None, out=None, keepdims=False, initial=<no value>, where=True)

Return the minimum along a given axis.

Refer to numpy.amin for full documentation.

See also

numpy.amin: equivalent function

nbytes

Total bytes consumed by the elements of the array.

Notes

Does not include memory consumed by non-element attributes of the array object.

See also

sys.getsizeof: Memory consumed by the object itself without parents in case view. This does include memory consumed by non-element attributes.

Examples

>>> import numpy as np
>>> x = np.zeros((3,5,2), dtype=np.complex128)
>>> x.nbytes
480
>>> np.prod(x.shape) * x.itemsize
480

ndim

Number of array dimensions.

Examples

>>> import numpy as np
>>> x = np.array([1, 2, 3])
>>> x.ndim
1
>>> y = np.zeros((2, 3, 4))
>>> y.ndim
3

newbyteorder

nonzero()

Return the indices of the elements that are non-zero.

Refer to numpy.nonzero for full documentation.

See also

numpy.nonzero: equivalent function
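The return value is a tuple with one index array per dimension. The sketch below, on an arbitrary example array, shows how that tuple can be unpacked or used directly as an index:

```python
import numpy as np

a = np.array([[3, 0, 0],
              [0, 4, 0],
              [5, 6, 0]])

# nonzero() returns one index array per dimension.
rows, cols = a.nonzero()

# Using the tuple as an index selects the non-zero values themselves.
values = a[a.nonzero()]
```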

partition(kth, axis=-1, kind='introselect', order=None)

Partially sorts the elements in the array in such a way that the value of the element in k-th position is in the position it would be in a sorted array. In the output array, all elements smaller than the k-th element are located to the left of this element and all equal or greater are located to its right. The ordering of the elements in the two partitions on either side of the k-th element in the output array is undefined.

Parameters:

kth (int or sequence of ints) –
Element index to partition by. The kth element value will be in its final sorted position and all smaller elements will be moved before it and all equal or greater elements behind it. The order of all elements in the partitions is undefined. If provided with a sequence of kth it will partition all elements indexed by kth of them into their sorted position at once.

Deprecated since version 1.22.0: Passing booleans as index is deprecated.
axis (int, optional) – Axis along which to sort. Default is -1, which means sort along the last axis.
kind ({'introselect'}, optional) – Selection algorithm. Default is ‘introselect’.
order (str or list of str, optional) – When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need to be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

See also

numpy.partition: Return a partitioned copy of an array.
argpartition: Indirect partition.
sort: Full sort.

Notes

See np.partition for notes on the different algorithms.

Examples

>>> import numpy as np
>>> a = np.array([3, 4, 2, 1])
>>> a.partition(3)
>>> a
array([2, 1, 3, 4]) # may vary

>>> a.partition((1, 3))
>>> a
array([1, 2, 3, 4])

prod(axis=None, dtype=None, out=None, keepdims=False, initial=1, where=True)

Return the product of the array elements over the given axis.

Refer to numpy.prod for full documentation.

See also

numpy.prod: equivalent function

ptp

put(indices, values, mode='raise')

Set a.flat[n] = values[n] for all n in indices.

Refer to numpy.put for full documentation.

See also

numpy.put: equivalent function
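A short sketch of the in-place assignment described above, including the 'clip' mode for an out-of-range index. The values are arbitrary.

```python
import numpy as np

a = np.arange(5)
a.put([0, 2], [-44, -55])     # a.flat[0] = -44, a.flat[2] = -55

b = np.arange(5)
b.put(22, -5, mode='clip')    # index 22 is out of range and is clipped to the last element
```

put modifies the array in place and returns None.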

ravel([order])

Return a flattened array.

Refer to numpy.ravel for full documentation.

See also

numpy.ravel: equivalent function
ndarray.flat: a flat iterator on the array.
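One practical difference from flatten is that ravel copies only when it has to. The following sketch checks this with np.shares_memory on a small example array:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

r = a.ravel()      # a view where possible
f = a.flatten()    # always a copy

ravel_shares = np.shares_memory(a, r)
flatten_shares = np.shares_memory(a, f)

# The transpose is not C-contiguous, so ravel has to copy here.
t = a.T.ravel()
transposed_shares = np.shares_memory(a, t)
```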

real

The real part of the array.

Examples

>>> import numpy as np
>>> x = np.sqrt([1+0j, 0+1j])
>>> x.real
array([ 1.        ,  0.70710678])
>>> x.real.dtype
dtype('float64')

See also

numpy.real: equivalent function

repeat(repeats, axis=None)

Repeat elements of an array.

Refer to numpy.repeat for full documentation.

See also

numpy.repeat: equivalent function
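The effect of axis and of per-element repeat counts can be sketched as follows (arbitrary example values):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

flat = a.repeat(2)                     # axis=None: works on the flattened array
rows = a.repeat(2, axis=0)             # every row repeated twice
uneven = a.repeat([1, 2], axis=0)      # first row once, second row twice
```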

reshape(shape, /, *, order='C', copy=None)

Returns an array containing the same data with a new shape.

Refer to numpy.reshape for full documentation.

See also

numpy.reshape: equivalent function

Notes

Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11)).
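The equivalence stated in the note can be checked with a small sketch. The use of -1 to infer one dimension is the same behaviour as in numpy.reshape.

```python
import numpy as np

a = np.arange(6)

as_tuple = a.reshape((2, 3))   # shape passed as a tuple
as_args = a.reshape(2, 3)      # shape passed as separate arguments
inferred = a.reshape(3, -1)    # -1 lets NumPy infer the remaining dimension
```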

resize(new_shape, refcheck=True)

Change shape and size of array in-place.

Parameters:

new_shape (tuple of ints, or n ints) – Shape of resized array.
refcheck (bool, optional) – If False, reference count will not be checked. Default is True.

Return type:

None

Raises:

ValueError – If a does not own its own data or references or views to it exist, and the data memory must be changed. PyPy only: will always raise if the data memory must be changed, since there is no reliable way to determine if references or views to it exist.
SystemError – If the order keyword argument is specified. This behaviour is a bug in NumPy.

See also

resize: Return a new array with the specified shape.

Notes

This reallocates space for the data area if necessary.

Only contiguous arrays (data elements consecutive in memory) can be resized.

The purpose of the reference count check is to make sure you do not use this array as a buffer for another Python object and then reallocate the memory. However, reference counts can increase in other ways so if you are sure that you have not shared the memory for this array with another Python object, then you may safely set refcheck to False.

Examples

Shrinking an array: array is flattened (in the order that the data are stored in memory), resized, and reshaped:

>>> import numpy as np

>>> a = np.array([[0, 1], [2, 3]], order='C')
>>> a.resize((2, 1))
>>> a
array([[0],
       [1]])

>>> a = np.array([[0, 1], [2, 3]], order='F')
>>> a.resize((2, 1))
>>> a
array([[0],
       [2]])

Enlarging an array: as above, but missing entries are filled with zeros:

>>> b = np.array([[0, 1], [2, 3]])
>>> b.resize(2, 3) # new_shape parameter doesn't have to be a tuple
>>> b
array([[0, 1, 2],
       [3, 0, 0]])

Referencing an array prevents resizing…

>>> c = a
>>> a.resize((1, 1))
Traceback (most recent call last):
...
ValueError: cannot resize an array that references or is referenced ...

Unless refcheck is False:

>>> a.resize((1, 1), refcheck=False)
>>> a
array([[0]])
>>> c
array([[0]])

round(decimals=0, out=None)

Return a with each element rounded to the given number of decimals.

Refer to numpy.around for full documentation.

See also

numpy.around: equivalent function
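Two details are easy to miss: values exactly halfway between two candidates are rounded to the nearest even value, and decimals may be negative. A minimal sketch with arbitrary inputs:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5]).round()       # ties go to the nearest even value
two_places = np.array([1.234, 5.678]).round(2)   # two digits after the decimal point
tens = np.array([123, 456]).round(-1)            # negative decimals round left of the point
```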

searchsorted(v, side='left', sorter=None)

Find indices where elements of v should be inserted in a to maintain order.

For full documentation, see numpy.searchsorted

See also

numpy.searchsorted: equivalent function
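The following sketch shows the difference between side='left' and side='right' on an already sorted example array:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])   # must already be sorted

left = a.searchsorted(3)                   # first position where 3 could be inserted
right = a.searchsorted(3, side='right')    # position after the existing 3
many = a.searchsorted([-10, 10, 2, 3])     # several values at once
```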

setfield(val, dtype, offset=0)

Put a value into a specified place in a field defined by a data-type.

Place val into a’s field defined by dtype and beginning offset bytes into the field.

Parameters:

val (object) – Value to be placed in field.
dtype (dtype object) – Data-type of the field in which to place val.
offset (int, optional) – The number of bytes into the field at which to place val.

Return type:

None

See also

getfield

Examples

>>> import numpy as np
>>> x = np.eye(3)
>>> x.getfield(np.float64)
array([[1.,  0.,  0.],
       [0.,  1.,  0.],
       [0.,  0.,  1.]])
>>> x.setfield(3, np.int32)
>>> x.getfield(np.int32)
array([[3, 3, 3],
       [3, 3, 3],
       [3, 3, 3]], dtype=int32)
>>> x
array([[1.0e+000, 1.5e-323, 1.5e-323],
       [1.5e-323, 1.0e+000, 1.5e-323],
       [1.5e-323, 1.5e-323, 1.0e+000]])
>>> x.setfield(np.eye(3), np.int32)
>>> x
array([[1.,  0.,  0.],
       [0.,  1.,  0.],
       [0.,  0.,  1.]])

setflags(write=None, align=None, uic=None)

Set array flags WRITEABLE, ALIGNED, WRITEBACKIFCOPY, respectively.

These Boolean-valued flags affect how numpy interprets the memory area used by a (see Notes below). The ALIGNED flag can only be set to True if the data is actually aligned according to the type. The WRITEBACKIFCOPY flag can never be set to True. The flag WRITEABLE can only be set to True if the array owns its own memory, or the ultimate owner of the memory exposes a writeable buffer interface, or is a string. (The exception for string is made so that unpickling can be done without copying memory.)

Parameters:

write (bool, optional) – Describes whether or not a can be written to.
align (bool, optional) – Describes whether or not a is aligned properly for its type.
uic (bool, optional) – Describes whether or not a is a copy of another “base” array.

Notes

Array flags provide information about how the memory area used for the array is to be interpreted. There are 7 Boolean flags in use, only three of which can be changed by the user: WRITEBACKIFCOPY, WRITEABLE, and ALIGNED.

WRITEABLE (W) the data area can be written to;

ALIGNED (A) the data and strides are aligned appropriately for the hardware (as determined by the compiler);

WRITEBACKIFCOPY (X) this array is a copy of some other array (referenced by .base). When the C-API function PyArray_ResolveWritebackIfCopy is called, the base array will be updated with the contents of this array.

All flags can be accessed using the single (upper case) letter as well as the full name.

Examples

>>> import numpy as np
>>> y = np.array([[3, 1, 7],
...               [2, 0, 0],
...               [8, 5, 9]])
>>> y
array([[3, 1, 7],
       [2, 0, 0],
       [8, 5, 9]])
>>> y.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False
>>> y.setflags(write=0, align=0)
>>> y.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : False
  ALIGNED : False
  WRITEBACKIFCOPY : False
>>> y.setflags(uic=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot set WRITEBACKIFCOPY flag to True

shape

Tuple of array dimensions.

The shape property is usually used to get the current shape of an array, but may also be used to reshape the array in-place by assigning a tuple of array dimensions to it. As with numpy.reshape, one of the new shape dimensions can be -1, in which case its value is inferred from the size of the array and the remaining dimensions. Reshaping an array in-place will fail if a copy is required.

Warning

Setting arr.shape is discouraged and may be deprecated in the future. Using ndarray.reshape is the preferred approach.

Examples

>>> import numpy as np
>>> x = np.array([1, 2, 3, 4])
>>> x.shape
(4,)
>>> y = np.zeros((2, 3, 4))
>>> y.shape
(2, 3, 4)
>>> y.shape = (3, 8)
>>> y
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])
>>> y.shape = (3, 6)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: total size of new array must be unchanged
>>> np.zeros((4,2))[::2].shape = (-1,)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Incompatible shape for in-place modification. Use
`.reshape()` to make a copy with the desired shape.

See also

numpy.shape: Equivalent getter function.
numpy.reshape: Function similar to setting shape.
ndarray.reshape: Method similar to setting shape.

size

Number of elements in the array.

Equal to np.prod(a.shape), i.e., the product of the array’s dimensions.

Notes

a.size returns a standard arbitrary precision Python integer. This may not be the case with other methods of obtaining the same value (like the suggested np.prod(a.shape), which returns an instance of np.int_), and may be relevant if the value is used further in calculations that may overflow a fixed size integer type.

Examples

>>> import numpy as np
>>> x = np.zeros((3, 5, 2), dtype=np.complex128)
>>> x.size
30
>>> np.prod(x.shape)
30

sort(axis=-1, kind=None, order=None)

Sort an array in-place. Refer to numpy.sort for full documentation.

Parameters:

axis (int, optional) – Axis along which to sort. Default is -1, which means sort along the last axis.
kind ({'quicksort', 'mergesort', 'heapsort', 'stable'}, optional) – Sorting algorithm. The default is ‘quicksort’. Note that both ‘stable’ and ‘mergesort’ use timsort under the covers and, in general, the actual implementation will vary with datatype. The ‘mergesort’ option is retained for backwards compatibility.
order (str or list of str, optional) – When a is an array with fields defined, this argument specifies which fields to compare first, second, etc. A single field can be specified as a string, and not all fields need be specified, but unspecified fields will still be used, in the order in which they come up in the dtype, to break ties.

See also

numpy.sort: Return a sorted copy of an array.
numpy.argsort: Indirect sort.
numpy.lexsort: Indirect stable sort on multiple keys.
numpy.searchsorted: Find elements in sorted array.
numpy.partition: Partial sort.

Notes

See numpy.sort for notes on the different sorting algorithms.

Examples

>>> import numpy as np
>>> a = np.array([[1,4], [3,1]])
>>> a.sort(axis=1)
>>> a
array([[1, 4],
       [1, 3]])
>>> a.sort(axis=0)
>>> a
array([[1, 3],
       [1, 4]])

Use the order keyword to specify a field to use when sorting a structured array:

>>> a = np.array([('a', 2), ('c', 1)], dtype=[('x', 'S1'), ('y', int)])
>>> a.sort(order='y')
>>> a
array([(b'c', 1), (b'a', 2)],
      dtype=[('x', 'S1'), ('y', '<i8')])

squeeze(axis=None)

Remove axes of length one from a.

Refer to numpy.squeeze for full documentation.

See also

numpy.squeeze: equivalent function
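A short sketch of squeeze with and without the axis argument, on an array of zeros chosen only for its shape:

```python
import numpy as np

x = np.zeros((1, 3, 1))

all_squeezed = x.squeeze().shape        # every length-one axis removed
first_only = x.squeeze(axis=0).shape    # only axis 0 removed

# Asking to squeeze an axis whose length is not one raises ValueError.
try:
    x.squeeze(axis=1)
    raised = False
except ValueError:
    raised = True
```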

std(axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)

Returns the standard deviation of the array elements along given axis.

Refer to numpy.std for full documentation.

See also

numpy.std: equivalent function
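The ddof parameter changes the divisor from N to N - ddof. The sketch below compares the default with ddof=1 on four arbitrary values:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

population = a.std()       # divides by N (ddof=0, the default)
sample = a.std(ddof=1)     # divides by N - 1
```

For these values the squared deviations from the mean sum to 5, so the two results are sqrt(5/4) and sqrt(5/3).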

strides

Tuple of bytes to step in each dimension when traversing an array.

The byte offset of element (i[0], i[1], ..., i[n]) in an array a is:

offset = sum(np.array(i) * a.strides)

A more detailed explanation of strides can be found in arrays.ndarray.

Warning

Setting arr.strides is discouraged and may be deprecated in the future. numpy.lib.stride_tricks.as_strided should be preferred to create a new view of the same data in a safer way.

Notes

Imagine an array of 32-bit integers (each 4 bytes):

x = np.array([[0, 1, 2, 3, 4],
              [5, 6, 7, 8, 9]], dtype=np.int32)

This array is stored in memory as 40 bytes, one after the other (known as a contiguous block of memory). The strides of an array tell us how many bytes we have to skip in memory to move to the next position along a certain axis. For example, we have to skip 4 bytes (1 value) to move to the next column, but 20 bytes (5 values) to get to the same position in the next row. As such, the strides for the array x will be (20, 4).

See also

numpy.lib.stride_tricks.as_strided

Examples

>>> import numpy as np
>>> y = np.reshape(np.arange(2*3*4), (2,3,4))
>>> y
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])
>>> y.strides
(48, 16, 4)
>>> y[1,1,1]
17
>>> offset=sum(y.strides * np.array((1,1,1)))
>>> offset/y.itemsize
17

>>> x = np.reshape(np.arange(5*6*7*8), (5,6,7,8)).transpose(2,3,1,0)
>>> x.strides
(32, 4, 224, 1344)
>>> i = np.array([3,5,2,2])
>>> offset = sum(i * x.strides)
>>> x[3,5,2,2]
813
>>> offset / x.itemsize
813

sum(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)

Return the sum of the array elements over the given axis.

Refer to numpy.sum for full documentation.

See also

numpy.sum: equivalent function
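A minimal sketch of the axis, keepdims, and initial parameters on an arbitrary 2x2 array:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

total = a.sum()                          # 1 + 2 + 3 + 4
per_column = a.sum(axis=0)               # one sum per column
per_row = a.sum(axis=1, keepdims=True)   # shape (2, 1)
offset_total = a.sum(initial=10)         # the sum starts from 10
```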

swapaxes(axis1, axis2)

Return a view of the array with axis1 and axis2 interchanged.

Refer to numpy.swapaxes for full documentation.

See also

numpy.swapaxes: equivalent function
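Because the result is a view, no data is copied. The following sketch checks the new shape and the shared memory on a small example array:

```python
import numpy as np

x = np.arange(6).reshape(1, 2, 3)
y = x.swapaxes(0, 2)

new_shape = y.shape
shares = np.shares_memory(x, y)   # the result is a view, not a copy
```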

take(indices, axis=None, out=None, mode='raise')

Return an array formed from the elements of a at the given indices.

Refer to numpy.take for full documentation.

See also

numpy.take: equivalent function
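The sketch below shows take on a flat array, with mode='wrap' for out-of-range indices, and along an axis of a 2-D array. All values are arbitrary.

```python
import numpy as np

a = np.array([4, 3, 5, 7, 6, 8])

picked = a.take([0, 1, 4])               # same result as a[[0, 1, 4]]
wrapped = a.take([6, 7], mode='wrap')    # indices wrap around: 6 -> 0, 7 -> 1

m = np.array([[1, 2, 3],
              [4, 5, 6]])
columns = m.take([0, 2], axis=1)         # columns 0 and 2 of every row
```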

to_device()

tobytes(order='C')

Construct Python bytes containing the raw data bytes in the array.

Constructs Python bytes showing a copy of the raw contents of data memory. The bytes object is produced in C-order by default. This behavior is controlled by the order parameter.

Parameters: order ({'C', 'F', 'A'}, optional) – Controls the memory layout of the bytes object. ‘C’ means C-order, ‘F’ means F-order, ‘A’ (short for Any) means ‘F’ if a is Fortran contiguous, ‘C’ otherwise. Default is ‘C’.
Returns: s – Python bytes exhibiting a copy of a’s raw data.
Return type: bytes

See also

frombuffer: Inverse of this operation, construct a 1-dimensional array from Python bytes.

Examples

>>> import numpy as np
>>> x = np.array([[0, 1], [2, 3]], dtype='<u2')
>>> x.tobytes()
b'\x00\x00\x01\x00\x02\x00\x03\x00'
>>> x.tobytes('C') == x.tobytes()
True
>>> x.tobytes('F')
b'\x00\x00\x02\x00\x01\x00\x03\x00'

tofile(fid, sep='', format='%s')

Write array to a file as text or binary (default).

Data is always written in ‘C’ order, independent of the order of a. The data produced by this method can be recovered using the function fromfile().

Parameters:

fid (file or str or Path) – An open file object, or a string containing a filename.
sep (str) – Separator between array items for text output. If “” (empty), a binary file is written, equivalent to file.write(a.tobytes()).
format (str) – Format string for text file output. Each entry in the array is formatted to text by first converting it to the closest Python type, and then using “format” % item.

Notes

This is a convenience function for quick storage of array data. Information on endianness and precision is lost, so this method is not a good choice for files intended to archive data or transport data between machines with different endianness. Some of these problems can be overcome by outputting the data as text files, at the expense of speed and file size.

When fid is a file object, array contents are directly written to the file, bypassing the file object’s write method. As a result, tofile cannot be used with file objects supporting compression (e.g., GzipFile) or file-like objects that do not support fileno() (e.g., BytesIO).
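A minimal round-trip sketch of the binary mode described above. The file name and temporary directory are placeholders; the point is that the file stores only raw bytes, so the reader has to supply the dtype and shape again.

```python
import os
import tempfile

import numpy as np

a = np.array([[1.5, 2.5], [3.5, 4.5]], dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.bin")   # hypothetical file name
    a.tofile(path)                      # sep="" (the default) writes raw binary

    # dtype and shape are not stored in the file and must be given explicitly.
    restored = np.fromfile(path, dtype=np.float32).reshape(a.shape)
    file_size = os.path.getsize(path)
```

Reading the same file with a different dtype would not raise an error; it would silently reinterpret the bytes, which is why this format is a poor fit for archiving.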

tolist()

Return the array as an a.ndim-levels deep nested list of Python scalars.

Return a copy of the array data as a (nested) Python list. Data items are converted to the nearest compatible builtin Python type, via the numpy.ndarray.item method.

If a.ndim is 0, then since the depth of the nested list is 0, it will not be a list at all, but a simple Python scalar.

Parameters: none
Returns: y – The possibly nested list of array elements.
Return type: object, or list of object, or list of list of object, or …

Notes

The array may be recreated via a = np.array(a.tolist()), although this may sometimes lose precision.

Examples

For a 1D array, a.tolist() is almost the same as list(a), except that tolist changes numpy scalars to Python scalars:

>>> import numpy as np
>>> a = np.uint32([1, 2])
>>> a_list = list(a)
>>> a_list
[np.uint32(1), np.uint32(2)]
>>> type(a_list[0])
<class 'numpy.uint32'>
>>> a_tolist = a.tolist()
>>> a_tolist
[1, 2]
>>> type(a_tolist[0])
<class 'int'>

Additionally, for a 2D array, tolist applies recursively:

>>> a = np.array([[1, 2], [3, 4]])
>>> list(a)
[array([1, 2]), array([3, 4])]
>>> a.tolist()
[[1, 2], [3, 4]]

The base case for this recursion is a 0D array:

>>> a = np.array(1)
>>> list(a)
Traceback (most recent call last):
  ...
TypeError: iteration over a 0-d array
>>> a.tolist()
1

tostring(order='C')

A compatibility alias for ndarray.tobytes, with exactly the same behavior.

Despite its name, it returns bytes, not str.

Deprecated since version 1.19.0.

trace(offset=0, axis1=0, axis2=1, dtype=None, out=None)

Return the sum along diagonals of the array.

Refer to numpy.trace for full documentation.

See also

numpy.trace: equivalent function
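The offset parameter selects which diagonal is summed. A short sketch on a 3x3 example array:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

main = a.trace()              # main diagonal: 0 + 4 + 8
upper = a.trace(offset=1)     # first diagonal above the main one: 1 + 5
lower = a.trace(offset=-1)    # first diagonal below the main one: 3 + 7
```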

transpose(*axes)

Returns a view of the array with axes transposed.

Refer to numpy.transpose for full documentation.

Parameters:

axes (None, tuple of ints, or n ints) –

None or no argument: reverses the order of the axes.
tuple of ints: i in the j-th place in the tuple means that the array’s i-th axis becomes the transposed array’s j-th axis.
n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form).

Returns:

p – View of the array with its axes suitably permuted.

Return type:

ndarray

See also

transpose: Equivalent function.
ndarray.T: Array property returning the array transposed.
ndarray.reshape: Give a new shape to an array without changing its data.

Examples

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> a
array([[1, 2],
       [3, 4]])
>>> a.transpose()
array([[1, 3],
       [2, 4]])
>>> a.transpose((1, 0))
array([[1, 3],
       [2, 4]])
>>> a.transpose(1, 0)
array([[1, 3],
       [2, 4]])

>>> a = np.array([1, 2, 3, 4])
>>> a
array([1, 2, 3, 4])
>>> a.transpose()
array([1, 2, 3, 4])

var(axis=None, dtype=None, out=None, ddof=0, keepdims=False, *, where=True)

Returns the variance of the array elements, along given axis.

Refer to numpy.var for full documentation.

See also

numpy.var: equivalent function

view([dtype][, type])

New view of array with the same data.

Note

Passing None for dtype is different from omitting the parameter, since the former invokes dtype(None) which is an alias for dtype('float64').

Parameters:

dtype (data-type or ndarray sub-class, optional) – Data-type descriptor of the returned view, e.g., float32 or int16. Omitting it results in the view having the same data-type as a. This argument can also be specified as an ndarray sub-class, which then specifies the type of the returned object (this is equivalent to setting the type parameter).
type (Python type, optional) – Type of the returned view, e.g., ndarray or matrix. Again, omission of the parameter results in type preservation.

Notes

a.view() is used two different ways:

a.view(some_dtype) or a.view(dtype=some_dtype) constructs a view of the array’s memory with a different data-type. This can cause a reinterpretation of the bytes of memory.

a.view(ndarray_subclass) or a.view(type=ndarray_subclass) just returns an instance of ndarray_subclass that looks at the same array (same shape, dtype, etc.) This does not cause a reinterpretation of the memory.

For a.view(some_dtype), if some_dtype has a different number of bytes per entry than the previous dtype (for example, converting a regular array to a structured array), then the last axis of a must be contiguous. This axis will be resized in the result.

Changed in version 1.23.0: Only the last axis needs to be contiguous. Previously, the entire array had to be C-contiguous.

Examples

>>> import numpy as np
>>> x = np.array([(-1, 2)], dtype=[('a', np.int8), ('b', np.int8)])

Viewing array data using a different type and dtype:

>>> nonneg = np.dtype([("a", np.uint8), ("b", np.uint8)])
>>> y = x.view(dtype=nonneg, type=np.recarray)
>>> x["a"]
array([-1], dtype=int8)
>>> y.a
array([255], dtype=uint8)

Creating a view on a structured array so it can be used in calculations:

>>> x = np.array([(1, 2),(3,4)], dtype=[('a', np.int8), ('b', np.int8)])
>>> xv = x.view(dtype=np.int8).reshape(-1,2)
>>> xv
array([[1, 2],
       [3, 4]], dtype=int8)
>>> xv.mean(0)
array([2.,  3.])

Making changes to the view changes the underlying array:

>>> xv[0,1] = 20
>>> x
array([(1, 20), (3,  4)], dtype=[('a', 'i1'), ('b', 'i1')])

Using a view to convert an array to a recarray:

>>> z = x.view(np.recarray)
>>> z.a
array([1, 3], dtype=int8)

Views share data:

>>> x[0] = (9, 10)
>>> z[0]
np.record((9, 10), dtype=[('a', 'i1'), ('b', 'i1')])

Views that change the dtype size (bytes per entry) should normally be avoided on arrays defined by slices, transposes, fortran-ordering, etc.:

>>> x = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)
>>> y = x[:, ::2]
>>> y
array([[1, 3],
       [4, 6]], dtype=int16)
>>> y.view(dtype=[('width', np.int16), ('length', np.int16)])
Traceback (most recent call last):
    ...
ValueError: To change to a dtype of a different size, the last axis must be contiguous
>>> z = y.copy()
>>> z.view(dtype=[('width', np.int16), ('length', np.int16)])
array([[(1, 3)],
       [(4, 6)]], dtype=[('width', '<i2'), ('length', '<i2')])

However, views that change dtype are totally fine for arrays with a contiguous last axis, even if the rest of the axes are not C-contiguous:

>>> x = np.arange(2 * 3 * 4, dtype=np.int8).reshape(2, 3, 4)
>>> x.transpose(1, 0, 2).view(np.int16)
array([[[ 256,  770],
        [3340, 3854]],

       [[1284, 1798],
        [4368, 4882]],

       [[2312, 2826],
        [5396, 5910]]], dtype=int16)

class pyorps.core.types.uint32

Bases: unsignedinteger

Unsigned integer type, compatible with C unsigned int.

Character code: 'I'
Canonical name: numpy.uintc
Alias on this platform (Linux x86_64): numpy.uint32: 32-bit unsigned integer (0 to 4_294_967_295).

bit_count() → int

Computes the number of 1-bits in the absolute value of the input. Analogous to the builtin int.bit_count or popcount in C++.

Examples

>>> np.uint32(127).bit_count()
7

class pyorps.core.types.uint64

Bases: unsignedinteger

Unsigned integer type, 64-bit on 64-bit systems and 32-bit on 32-bit systems.

Character code: 'L'
Canonical name: numpy.uint
Alias on this platform (Linux x86_64): numpy.uint64: 64-bit unsigned integer (0 to 18_446_744_073_709_551_615).
Alias on this platform (Linux x86_64): numpy.uintp: Unsigned integer large enough to fit pointer, compatible with C uintptr_t.

bit_count() → int

Computes the number of 1-bits in the absolute value of the input. Analogous to the builtin int.bit_count or popcount in C++.

Examples

>>> np.uint64(127).bit_count()
7

Module contents

Core types and base classes for geospatial data processing.

exception pyorps.core.AlgorthmNotImplementedError(algorithm, graph_library)[source]

Bases: Exception

Custom exception raised when a specific algorithm is not implemented in the API or the graph library.

exception pyorps.core.ColumnAnalysisError[source]

Bases: FeatureColumnError

Exception raised when column analysis fails.

class pyorps.core.CostAssumptions(source=None)[source]

Bases: object

A class for handling cost assumptions for rasterization.

This class handles:
- Loading cost assumptions from files (CSV, Excel, JSON) or generating cost assumptions from a dictionary or a GeoDataFrame
- Mapping costs to features in a GeoDataFrame
- Managing hierarchical cost structures

_apply_nested_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on nested dictionary cost assumptions.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to update with cost values
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List containing a single column name for the secondary feature

Returns:

None (modifies gdf in-place)

_apply_tuple_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on tuple keys in cost assumptions.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to update with cost values
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List of column names for secondary features

Returns:

None (modifies gdf in-place)
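The two private helpers above distinguish between nested-dictionary and tuple-key cost assumptions. This section does not show the exact layout pyorps expects, so the following is only a plain-Python sketch of the two general shapes. The feature names, cost values, dictionary layout, and helper functions are all hypothetical and are not part of pyorps.

```python
# Hypothetical data: names, values, and layout are assumptions for illustration.

# Nested form: main feature value -> side feature value -> cost.
nested_costs = {
    "forest": {"protected": 100.0, "managed": 20.0},
    "field": {"protected": 50.0, "managed": 5.0},
}

# Tuple-key form: (main feature value, side feature value) -> cost.
tuple_costs = {
    ("forest", "protected"): 100.0,
    ("forest", "managed"): 20.0,
    ("field", "protected"): 50.0,
    ("field", "managed"): 5.0,
}


def lookup_nested(costs, main, side):
    """Walk the nested dictionary one level per feature."""
    return costs[main][side]


def lookup_tuple(costs, main, side):
    """Use the combination of feature values as a single key."""
    return costs[(main, side)]


def flatten_nested(costs):
    """Turn the nested form into the tuple-key form."""
    return {(main, side): cost
            for main, sides in costs.items()
            for side, cost in sides.items()}
```

Both forms carry the same information. The nested form mirrors a hierarchy with one side feature, while tuple keys extend naturally to several side features.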

static _convert_numeric_columns(df)[source]

Convert columns to numeric, handling different decimal separators.

Parameters:

df (DataFrame) – DataFrame with potential numeric columns that might use different decimal separators

Return type:

DataFrame

Returns:

DataFrame with properly converted numeric columns

_load_csv_cost_assumptions(filepath)[source]

Load cost assumptions from a CSV file with auto-detection of encoding, delimiter, and decimal separator.

Parameters: filepath (str) – Path to the CSV file
Return type: dict
Returns: Dictionary of cost assumptions

_load_excel_cost_assumptions(filepath)[source]

Load cost assumptions from an Excel file, handling different decimal separators.

Parameters: filepath (str) – Path to the Excel file
Return type: dict
Returns: Dictionary of cost assumptions

_load_json_cost_assumptions(filepath)[source]

Load cost assumptions from a JSON file with auto-detection of encoding.

Parameters: filepath (str) – Path to the JSON file
Return type: dict
Returns: Dictionary of cost assumptions

apply_to_geodataframe(gdf, main_feature=None, side_features=None)[source]

Apply cost assumptions to a GeoDataFrame.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to apply costs to
main_feature (Optional[str]) – Main feature column name
side_features (Optional[list[str]]) – List of side feature column names, or a single side feature name

Returns:

GeoDataFrame with ‘cost’ column added

convert_df_to_cost_dict(df)[source]

Convert a DataFrame to a nested dictionary for cost assumptions.

Parameters: df (DataFrame) – DataFrame containing cost assumptions with hierarchical structure
Return type: dict
Returns: Dictionary of cost assumptions with nested structure based on DataFrame columns

Uses one numeric column for costs, and all other columns as a hierarchical index:

- The first column is the ‘main_feature’
- All additional columns are ‘side_features’
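The hierarchical layout can be pictured with plain Python rows instead of a DataFrame. This is only a sketch: the function name `rows_to_nested_costs` and the sample columns are hypothetical, and the exact dictionary shape pyorps produces may differ.

```python
def rows_to_nested_costs(rows: list[tuple]) -> dict:
    """Nest each row's leading values as keys; the last value is the cost."""
    nested: dict = {}
    for *keys, cost in rows:
        level = nested
        # Walk (or create) one dictionary level per leading key.
        for key in keys[:-1]:
            level = level.setdefault(key, {})
        level[keys[-1]] = cost
    return nested


# Assumed columns: main feature, one side feature, cost
rows = [
    ("forest", "deciduous", 5.0),
    ("forest", "coniferous", 7.5),
    ("road", "paved", 100.0),
]
cost_dict = rows_to_nested_costs(rows)
```

With these rows, the main feature values become the outer keys and the side feature values the inner keys.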

cost_dict_to_df(cost_dict)[source]

Convert cost assumptions dictionary to DataFrame.

Parameters: cost_dict (dict) – Dictionary of cost assumptions
Return type: DataFrame
Returns: DataFrame representation of cost assumptions

load(source)[source]

Load cost assumptions from a file or dictionary.

Parameters: source (Union[str, dict]) – Path to a file or a dictionary containing cost assumptions
Return type: dict
Returns: dictionary of cost assumptions

to_csv(filepath, separator=';', decimal='.', encoding='ISO-8859-1')[source]

Save the cost assumptions to a CSV file.

Parameters:

filepath (str) – Path where to save the CSV file
separator (str) – Column separator character (default is ‘;’)
decimal (str) – Decimal separator character (default is ‘.’)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

to_excel(filepath, sheet_name='CostAssumptions', index=False)[source]

Save the cost assumptions to an Excel file.

Parameters:

filepath (str) – Path where to save the Excel file
sheet_name (str) – Name of the worksheet (default is ‘CostAssumptions’)
index (bool) – Whether to write row indices (default is False)

Return type:

None

to_json(filepath, indent=2, encoding='ISO-8859-1')[source]

Save the cost assumptions to a JSON file.

Parameters:

filepath (str) – Path where to save the JSON file
indent (int) – Number of spaces for indentation (default is 2)
encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

exception pyorps.core.CostAssumptionsError[source]

Bases: Exception

Base exception for CostAssumptions class.

exception pyorps.core.FeatureColumnError[source]

Bases: Exception

Base exception for feature column detection errors

exception pyorps.core.FileLoadError[source]

Bases: CostAssumptionsError

Exception raised when loading files fails.

exception pyorps.core.FormatError[source]

Bases: CostAssumptionsError

Exception raised when data format is invalid.

exception pyorps.core.InvalidSourceError[source]

Bases: CostAssumptionsError

Exception raised when the provided source is invalid.
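Because FileLoadError, FormatError and InvalidSourceError all derive from CostAssumptionsError, a caller can handle a specific failure first and fall back to the base class for the rest. The sketch below uses stand-in classes with the same names so that it runs without pyorps installed; `load_costs` and `describe` are hypothetical helpers, and which exception pyorps raises in which situation is not specified here.

```python
class CostAssumptionsError(Exception):
    """Stand-in for the documented base exception."""


class FileLoadError(CostAssumptionsError):
    """Stand-in: loading a file failed."""


class FormatError(CostAssumptionsError):
    """Stand-in: the data format is invalid."""


def load_costs(source):
    """Hypothetical loader used only to trigger the exceptions."""
    if not isinstance(source, (str, dict)):
        raise FormatError(f"unsupported source type: {type(source).__name__}")
    if isinstance(source, str):
        # Pretend every file path is unreadable, to exercise the handler.
        raise FileLoadError(f"could not read {source!r}")
    return source


def describe(source) -> str:
    """Report which handler, if any, dealt with the source."""
    try:
        load_costs(source)
    except FileLoadError as exc:
        return f"file problem: {exc}"
    except CostAssumptionsError as exc:  # also catches FormatError
        return f"cost assumption problem: {exc}"
    return "ok"
```

Ordering the handlers from most to least specific matters: putting the base class first would swallow the subclass cases.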

exception pyorps.core.NoPathFoundError(source, target)[source]

Bases: Exception

Custom exception if no path can be found in the graph for source and target

exception pyorps.core.NoSuitableColumnsError[source]

Bases: FeatureColumnError

Exception raised when no suitable columns are found

class pyorps.core.Path(source, target, algorithm, graph_api, path_indices, path_coords, path_geometry, euclidean_distance, runtimes, path_id, search_space_buffer_m, neighborhood, total_length=None, total_cost=None, length_by_category=None, length_by_category_percent=None)[source]

Bases: object

Dataclass representing a path in a raster graph. Used as container for all path metrics and information.

algorithm: str

euclidean_distance: float

graph_api: str

length_by_category: Optional[dict[float, float]] = None

length_by_category_percent: Optional[dict[float, float]] = None

neighborhood: str

path_coords: list[Union[tuple[float, float], list[float]]]

path_geometry: LineString

path_id: int

path_indices: Union[list[Union[int, int32, int64, uint32, uint64]], ndarray[int]]

runtimes: dict[str, float]

search_space_buffer_m: float

source: Union[tuple[float, float], list[float]]

target: Union[tuple[float, float], list[float]]

to_geodataframe_dict()[source]

Convert Path object to a dictionary suitable for GeoDataFrame creation.

Return type: dict
Returns: dictionary with path data formatted for GeoDataFrame

total_cost: Optional[float] = None

total_length: Optional[float] = None

class pyorps.core.PathCollection[source]

Bases: object

Container for Path objects with O(1) retrieval by path ID and O(n) lookup by source and target. A Path can be added under a new ID, or it can replace a Path with the same ID that already exists in the PathCollection.

_next_id: int

_paths: dict[int, Path]

add(path, replace=False)[source]

Add a path to the PathCollection. If the Path’s path_id is None or if replace is False, the path_id of the Path object will be set to self._next_id and self._next_id will be incremented. If the Path’s path_id is not None and replace is True, a Path with the same path_id (if present) will be replaced with the new Path object.

Parameters:

path (Path) – A Path object which should be added to the PathCollection.
replace (bool) – Whether to replace an existing Path object with the same path_id (if present) or not.

Return type:

None
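The ID rules described for add can be traced with a small stand-in container. `MiniPath` and `MiniCollection` are hypothetical simplifications written for this example, and the starting value of the counter is an assumption; the real Path has many more required fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniPath:
    """Stand-in for Path with only the field needed here."""
    path_id: Optional[int] = None


class MiniCollection:
    """Stand-in container that mirrors the documented ID rules."""

    def __init__(self) -> None:
        self._paths: dict[int, MiniPath] = {}
        self._next_id = 0  # starting value is an assumption

    def add(self, path: MiniPath, replace: bool = False) -> None:
        if path.path_id is None or not replace:
            # Assign a fresh ID and advance the counter.
            path.path_id = self._next_id
            self._next_id += 1
        # With replace=True and an existing ID, this overwrites the old entry.
        self._paths[path.path_id] = path


collection = MiniCollection()
first = MiniPath()
collection.add(first)                  # no ID yet: receives the next free ID
second = MiniPath(path_id=0)
collection.add(second)                 # replace=False: its ID is reassigned
third = MiniPath(path_id=0)
collection.add(third, replace=True)    # keeps its ID and overwrites that entry
```

Note that with replace left at False, a path_id supplied by the caller is overwritten rather than kept.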

property all

Return all Path objects from the values of the PathCollection’s _paths dictionary as a list.

Returns: A list of all Path objects in the PathCollection.

get(path_id=None, source=None, target=None)[source]

Retrieve a stored path by ID, or by source AND target.

Parameters:

path_id (int) – The ID of the Path object to retrieve (must be None if the path should be found by source and target)
source (Any) – The source of the Path object to retrieve (only used if path_id is None and target is set too; ignored otherwise)
target (Any) – The target of the Path object to retrieve (only used if path_id is None and source is set too; ignored otherwise)

Return type:

Optional[Path]

Returns:

The Path object with the specified ID or source/target pair. None if no such path exists.
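The two lookup modes can be sketched as follows: a dictionary access when an ID is given, and a linear scan when both source and target are given. The module-level `paths` dictionary and the tuple-based entries are hypothetical stand-ins for the stored Path objects; how pyorps compares coordinates internally is not covered here.

```python
from typing import Optional

# Hypothetical stored paths: path_id -> (source, target)
paths: dict[int, tuple[tuple[float, float], tuple[float, float]]] = {
    0: ((0.0, 0.0), (10.0, 5.0)),
    1: ((2.0, 3.0), (8.0, 1.0)),
}


def get(path_id: Optional[int] = None, source=None, target=None):
    """Return the stored entry, or None if nothing matches."""
    if path_id is not None:
        return paths.get(path_id)            # O(1) dictionary lookup
    if source is not None and target is not None:
        wanted = (tuple(source), tuple(target))
        for entry in paths.values():         # O(n) scan over all entries
            if entry == wanted:
                return entry
    return None                              # incomplete query or no match


by_id = get(path_id=1)
by_endpoints = get(source=[0.0, 0.0], target=(10.0, 5.0))
incomplete = get(source=(0.0, 0.0))          # target missing
```

Passing only one of source or target yields None in this sketch, matching the requirement that both must be set.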

to_geodataframe_records()[source]

Convert all paths to a list of dictionaries suitable for a GeoDataFrame.

Return type: list
Returns: List of dictionaries with path data formatted for a GeoDataFrame

exception pyorps.core.RasterShapeError(raster_shape)[source]

Bases: Exception

Custom exception if the raster shape is not supported

exception pyorps.core.WFSConnectionError[source]

Bases: WFSError

Exception raised for connection issues with WFS services.

exception pyorps.core.WFSError[source]

Bases: Exception

Base exception for WFS-related errors.

exception pyorps.core.WFSLayerNotFoundError[source]

Bases: WFSError

Exception raised when a requested layer cannot be found.

exception pyorps.core.WFSResponseParsingError[source]

Bases: WFSError

Exception raised when parsing WFS responses fails.

pyorps.core.detect_feature_columns(gdf, max_features_per_column=50)[source]

Analyze columns in a geodataframe to identify the best candidates for main_feature and side_features based on statistical metrics.

Parameters:

gdf (GeoDataFrame) – GeoDataFrame to analyze
max_features_per_column (int) – Maximum number of unique values allowed in a categorical column

Return type:

tuple[str, list[str]]

Returns:

tuple of (main_feature, side_features)

Raises:

NoSuitableColumnsError – When no suitable columns are found for feature selection
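The statistical metrics used for the selection are not described here, so the following is only a toy heuristic showing how a limit like max_features_per_column might filter candidates. Everything in it is an assumption: the function name `pick_feature_columns`, the rule that the column with the most unique values becomes the main feature, and the use of ValueError in place of NoSuitableColumnsError.

```python
def pick_feature_columns(records, max_features_per_column=50):
    """Toy heuristic: keep columns with 2..max unique values.

    Assumption: the candidate with the most unique values becomes the
    main feature and the remaining candidates become side features.
    """
    columns = list(records[0].keys())
    unique_counts = {c: len({r[c] for r in records}) for c in columns}
    candidates = [c for c, n in unique_counts.items()
                  if 1 < n <= max_features_per_column]
    if not candidates:
        # Stand-in for the documented NoSuitableColumnsError.
        raise ValueError("no suitable columns found")
    candidates.sort(key=lambda c: unique_counts[c], reverse=True)
    return candidates[0], candidates[1:]


records = [
    {"id": 1, "landuse": "forest", "surface": "soft", "country": "DE"},
    {"id": 2, "landuse": "road", "surface": "hard", "country": "DE"},
    {"id": 3, "landuse": "water", "surface": "soft", "country": "DE"},
    {"id": 4, "landuse": "forest", "surface": "soft", "country": "DE"},
]
# "id" has too many unique values for the limit, "country" has only one.
main_feature, side_features = pick_feature_columns(
    records, max_features_per_column=3
)
```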

pyorps.core.get_zero_cost_assumptions(gdf, main_feature, side_features)[source]

Generate cost assumptions with zero values for all feature combinations.

Creates structures matching the format for CostAssumptions:

- Without side features: {main_feature: {val1: 0, val2: 0, …}}
- With side features: {(main_feature, side_feature1, …): {(val1, val2, …): 0, …}}

Parameters:

gdf (GeoDataFrame) – GeoDataFrame with feature columns
main_feature (str) – Primary feature column name
side_features (list[str]) – List of secondary feature column names

Returns:

Instance of zero-cost assumptions

Return type:

CostAssumptions
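The two dictionary shapes can be reproduced from plain records as a sketch. The helper `zero_costs` is hypothetical and returns a bare dictionary, whereas the documented function returns a CostAssumptions instance; the sample column names are also assumptions.

```python
def zero_costs(records: list[dict], main_feature: str,
               side_features: list[str]) -> dict:
    """Build a zero-cost mapping in the two documented shapes."""
    if not side_features:
        # Shape 1: {main_feature: {val1: 0, val2: 0, ...}}
        values = {record[main_feature] for record in records}
        return {main_feature: {value: 0 for value in sorted(values)}}
    # Shape 2: {(main, side1, ...): {(val1, val2, ...): 0, ...}}
    columns = (main_feature, *side_features)
    combos = {tuple(record[c] for c in columns) for record in records}
    return {columns: {combo: 0 for combo in sorted(combos)}}


records = [
    {"landuse": "forest", "type": "deciduous"},
    {"landuse": "forest", "type": "coniferous"},
    {"landuse": "road", "type": "paved"},
    {"landuse": "forest", "type": "deciduous"},  # duplicate combination
]
simple = zero_costs(records, "landuse", [])
combined = zero_costs(records, "landuse", ["type"])
```

Only combinations that actually occur in the data receive an entry, so duplicates collapse into a single key.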

pyorps.core.save_empty_cost_assumptions(geo_dataset, save_path, main_feature=None, side_features=None, file_type='csv', **kwargs)[source]

Generate and save empty cost assumptions with zero values for a geo dataset.

This function analyzes the given dataset to detect appropriate feature columns, creates a CostAssumptions object with zero costs for all feature combinations, and saves it to the specified path in the requested format.

Parameters:

geo_dataset (Any) – GeoDataset object with a ‘data’ attribute containing a GeoDataFrame
save_path (Union[str, Path]) – File path where the cost assumptions should be saved
main_feature (Optional[str]) – Column name for the primary feature
side_features (Optional[list[str]]) – List containing a single column name for the secondary feature
file_type (str) – Output file format - one of ‘json’, ‘csv’, or ‘excel’ (default is ‘csv’)

Raises:

TypeError – If file_type is not one of the supported formats
NoSuitableColumnsError – If no suitable columns can be detected in the dataset

Returns:

This function saves to a file and doesn’t return a value

Return type:

None