pyorps.raster package

Submodules

pyorps.raster.handler module

class pyorps.raster.handler.Affine(a: float, b: float, c: float, d: float, e: float, f: float, g: float = 0.0, h: float = 0.0, i: float = 1.0)[source]

Bases: Affine

Two dimensional affine transform for 2D linear mapping.

Parameters:
  • a, b, c, d, e, f (float) –

    Coefficients of an augmented affine transformation matrix

    | x' |   | a b c | | x |
    | y' | = | d e f | | y |
    | 1  |   | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

a, b, c, d, e, f, g, h, i

The coefficients of the 3x3 augmented affine transformation matrix

| x' |   | a b c | | x |
| y' | = | d e f | | y |
| 1  |   | g h i | | 1 |

g, h, and i are always 0, 0, and 1.

Type:

float

The Affine package is derived from Casey Duncan's Planar package. See the copyright statement below. Parallel lines are preserved by these transforms. Affine transforms can perform any combination of translations, scales/flips, shears, and rotations. Class methods are provided to conveniently compose transforms from these operations.

Internally the transform is stored as a 3x3 transformation matrix. The transform may be constructed directly by specifying the first two rows of matrix values as 6 floats. Since the matrix is an affine transform, the last row is always (0, 0, 1).

N.B.: multiplication of a transform and an (x, y) vector always returns the column vector that is the matrix multiplication product of the transform and (x, y) as a column vector, no matter which is on the left or right side. This is obviously not the case for matrices and vectors in general, but provides a convenience for users of this class.
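
As a minimal illustration of the matrix form described above, the sketch below applies the six coefficients to an (x, y) point in plain Python. The helper name apply_affine is hypothetical and is not part of pyorps or the affine package; it only spells out the arithmetic of the first two matrix rows.

```python
def apply_affine(coeffs, point):
    """Apply the 2D affine map given by (a, b, c, d, e, f) to (x, y)."""
    a, b, c, d, e, f = coeffs
    x, y = point
    # First row gives x', second row gives y'; the third row (0, 0, 1)
    # only carries the constant 1 and needs no computation.
    return (a * x + b * y + c, d * x + e * y + f)


# Example coefficients: 10-unit pixels, y axis flipped, origin at (1000, 2000).
result = apply_affine((10.0, 0.0, 1000.0, 0.0, -10.0, 2000.0), (3, 2))
print(result)  # (1030.0, 1980.0)
```
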
property _scaling

The absolute scaling factors of the transformation.

This tuple holds the absolute values of the scaling factors of the transformation, sorted from largest to smallest.
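
The absolute scaling factors correspond to the singular values of the 2x2 linear part of the matrix. The following is a sketch of one closed-form way to compute them; the function name scaling_factors is hypothetical, and the library's own implementation may differ in detail.

```python
import math


def scaling_factors(a, b, d, e):
    """Singular values of [[a, b], [d, e]], largest first."""
    trace = a * a + b * b + d * d + e * e   # sum of the squared singular values
    det_sq = (a * e - b * d) ** 2           # product of the squared singular values
    # Clamp to zero so rounding noise cannot make the radicand negative.
    root = math.sqrt(max(trace * trace / 4.0 - det_sq, 0.0))
    largest = math.sqrt(trace / 2.0 + root)
    smallest = math.sqrt(max(trace / 2.0 - root, 0.0))
    return (largest, smallest)
```
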

almost_equals(other, precision=1e-05)[source]

Compare transforms for approximate equality.

Parameters:

other (Affine) – Transform being compared.

Return type:

bool

Returns:

True if the absolute difference between each pair of corresponding elements of the two transform matrices is less than precision.
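
The comparison rule can be sketched in plain Python as an element-wise check on two coefficient tuples. The function name almost_equal is hypothetical and operates on plain tuples rather than Affine objects.

```python
def almost_equal(m1, m2, precision=1e-5):
    """True if all corresponding coefficients differ by less than precision."""
    return len(m1) == len(m2) and all(
        abs(x - y) < precision for x, y in zip(m1, m2)
    )
```
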

property column_vectors

The values of the transform as three 2D column vectors

property determinant

The determinant of the transform matrix.

This value is equal to the area scaling factor when the transform is applied to a shape.
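
Because the last matrix row is (0, 0, 1), the 3x3 determinant reduces to a*e - b*d. A minimal sketch with a hypothetical helper name:

```python
def determinant(a, b, d, e):
    """Determinant of the augmented matrix whose last row is (0, 0, 1)."""
    return a * e - b * d


# Scaling x by 2 and y by 3 multiplies areas by 6.
print(determinant(2.0, 0.0, 0.0, 3.0))  # 6.0
```

A negative value indicates that the transform includes a reflection; its absolute value is still the area scaling factor.
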

property eccentricity: float

The eccentricity of the affine transformation.

This value represents the eccentricity of an ellipse under this affine transformation.

Raises NotImplementedError for improper transformations.

classmethod from_gdal(c, a, b, f, d, e)[source]

Use same coefficient order as GDAL’s GetGeoTransform().

Parameters:

c, a, b, f, d, e (float) – 6 floats ordered by GDAL.

Return type:

Affine
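
The only difference between the GDAL order and the constructor order is the position of the coefficients. The sketch below shows the reordering on plain tuples; both helper names are hypothetical.

```python
def gdal_to_matrix_order(geotransform):
    """Reorder GDAL's (c, a, b, f, d, e) into row-major (a, b, c, d, e, f)."""
    c, a, b, f, d, e = geotransform
    return (a, b, c, d, e, f)


def matrix_to_gdal_order(coeffs):
    """Inverse reordering: (a, b, c, d, e, f) -> (c, a, b, f, d, e)."""
    a, b, c, d, e, f = coeffs
    return (c, a, b, f, d, e)
```
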

classmethod identity()[source]

Return the identity transform.

Return type:

Affine

property is_conformal: bool

True if the transform is conformal.

That is, angles between points are preserved after applying the transform, within rounding limits. This implies that the transform has no effective shear.

property is_degenerate

True if this transform is degenerate.

A degenerate transform collapses a shape to an effective area of zero. Degenerate transforms cannot be inverted.

property is_identity: bool

True if this transform equals the identity matrix, within rounding limits.

property is_orthonormal: bool

True if the transform is orthonormal.

An orthonormal transform represents a rigid motion, which has no effective scaling or shear. Mathematically, this means that the axis vectors of the transform matrix are perpendicular and unit-length. Applying an orthonormal transform to a shape always results in a congruent shape.
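
The mathematical condition stated above (perpendicular, unit-length axis vectors) can be checked directly on the coefficients. This is a sketch with hypothetical helper names and a tolerance borrowed from the class-level precision value; it is not the library's code.

```python
def axes_perpendicular(a, b, d, e, precision=1e-5):
    """True if the axis (column) vectors (a, d) and (b, e) are perpendicular."""
    return abs(a * b + d * e) < precision


def is_orthonormal(a, b, d, e, precision=1e-5):
    """True if the axis vectors are perpendicular and both have unit length."""
    return (
        axes_perpendicular(a, b, d, e, precision)
        and abs(1.0 - (a * a + d * d)) < precision
        and abs(1.0 - (b * b + e * e)) < precision
    )
```
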

property is_proper

True if this transform is proper.

A proper transform does not include reflection.

property is_rectilinear: bool

True if the transform is rectilinear.

i.e., whether a shape would remain axis-aligned, within rounding limits, after applying the transform.

itransform(seq)[source]

Transform a sequence of points or vectors in place.

Parameters:

seq – Mutable sequence of Vec2 to be transformed.

Return type:

None

Returns:

None, the input sequence is mutated in place.

classmethod permutation(*scaling)[source]

Create the permutation transform

For 2x2 matrices, there is only one permutation matrix that is not the identity.

Return type:

Affine

precision = 1e-05
classmethod rotation(angle, pivot=None)[source]

Create a rotation transform at the specified angle.

A pivot point other than the coordinate system origin may be optionally specified.

Parameters:
  • angle (float) – Rotation angle in degrees, counter-clockwise about the pivot point.

  • pivot (sequence) – Point to rotate about, if omitted the rotation is about the origin.

Return type:

Affine
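
For reference, the coefficients of a counter-clockwise rotation about an optional pivot can be written out by hand. The sketch below is plain Python with a hypothetical function name; it follows the standard "translate pivot to origin, rotate, translate back" construction rather than any specific library code.

```python
import math


def rotation_coeffs(angle, pivot=None):
    """Coefficients (a, b, c, d, e, f) of a counter-clockwise rotation in degrees."""
    ca = math.cos(math.radians(angle))
    sa = math.sin(math.radians(angle))
    if pivot is None:
        return (ca, -sa, 0.0, sa, ca, 0.0)
    px, py = pivot
    # Translate the pivot to the origin, rotate, then translate back.
    return (ca, -sa, px - px * ca + py * sa, sa, ca, py - px * sa - py * ca)
```

With a pivot of (1, 1) and an angle of 90 degrees, the point (2, 1) maps to approximately (1, 2).
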

property rotation_angle: float

The rotation angle in degrees of the affine transformation.

This is the rotation angle in degrees of the affine transformation, assuming it is in the form M = R S, where R is a rotation and S is a scaling.

Raises UndefinedRotationError for improper and degenerate transformations.
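
Under the stated form M = R S with positive scaling factors, the first matrix column is a positive multiple of (cos t, sin t), so the angle can be recovered with atan2. A minimal sketch, with a hypothetical function name, that does not handle the improper and degenerate cases mentioned above:

```python
import math


def rotation_angle(a, d):
    """Angle in degrees of R in M = R S, from the first column (a, d) of M."""
    # With S = diag(sx, sy) and sx > 0, the first column is sx * (cos t, sin t).
    return math.degrees(math.atan2(d, a))
```
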

classmethod scale(*scaling)[source]

Create a scaling transform from a scalar or vector.

Parameters:

scaling (float or sequence) – The scaling factor. A scalar value will scale in both dimensions equally. A vector scaling value scales the dimensions independently.

Return type:

Affine

classmethod shear(x_angle=0, y_angle=0)[source]

Create a shear transform along one or both axes.

Parameters:
  • x_angle (float) – Shear angle in degrees parallel to the x-axis.

  • y_angle (float) – Shear angle in degrees parallel to the y-axis.

Return type:

Affine

to_gdal()[source]

Return same coefficient order as GDAL’s SetGeoTransform().

Return type:

tuple

to_shapely()[source]

Return an affine transformation matrix compatible with shapely

Shapely’s affinity module expects an affine transformation matrix in (a,b,d,e,xoff,yoff) order.

Return type:

tuple
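
The reordering can be sketched on a plain tuple as follows, using the xoff and yoff aliases for c and f. The helper name is hypothetical.

```python
def matrix_to_shapely_order(coeffs):
    """Reorder (a, b, c, d, e, f) into shapely's (a, b, d, e, xoff, yoff)."""
    a, b, c, d, e, f = coeffs
    return (a, b, d, e, c, f)  # xoff is c, yoff is f
```

A tuple in this order is what shapely.affinity.affine_transform() expects as its matrix argument for 2D geometries.
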

classmethod translation(xoff, yoff)[source]

Create a translation transform from an offset vector.

Parameters:
  • xoff (float) – Translation x offset.

  • yoff (float) – Translation y offset.

Return type:

Affine
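
Translation and scaling transforms are often combined into a single raster transform. The sketch below multiplies two coefficient tuples as augmented 3x3 matrices; compose is a hypothetical helper and the numbers are illustrative only.

```python
def compose(m1, m2):
    """Coefficients of the transform that applies m2 first, then m1."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * d2,
        a1 * b2 + b1 * e2,
        a1 * c2 + b1 * f2 + c1,
        d1 * a2 + e1 * d2,
        d1 * b2 + e1 * e2,
        d1 * c2 + e1 * f2 + f1,
    )


translation = (1.0, 0.0, 1000.0, 0.0, 1.0, 2000.0)  # xoff=1000, yoff=2000
scale = (10.0, 0.0, 0.0, 0.0, -10.0, 0.0)           # 10-unit pixels, y flipped
print(compose(translation, scale))
```
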

property xoff: float

Alias for ‘c’

property yoff: float

Alias for ‘f’

class pyorps.raster.handler.Any(*args, **kwargs)[source]

Bases: object

Special type indicating an unconstrained type.

  • Any is compatible with every type.

  • Any is assumed to have all methods.

  • All values are assumed to be instances of Any.

Note that all the above statements are true from the point of view of static type checkers. At runtime, Any should not be used with instance checks.

class pyorps.raster.handler.LineString(coordinates=None)[source]

Bases: BaseGeometry

A geometry type composed of one or more line segments.

A LineString is a one-dimensional feature and has a non-zero length but zero area. It may approximate a curve and need not be straight. A LineString may be closed.

Parameters:

coordinates (sequence) – A sequence of (x, y [,z]) numeric coordinate pairs or triples, or an array-like with shape (N, 2) or (N, 3). Also can be a sequence of Point objects, or combination of both.

Examples

Create a LineString with two segments

>>> from shapely import LineString
>>> a = LineString([[0, 0], [1, 0], [1, 1]])
>>> a.length
2.0
offset_curve(distance, quad_segs=16, join_style=BufferJoinStyle.round, mitre_limit=5.0)[source]

Return a (Multi)LineString at a distance from the object.

The side, left or right, is determined by the sign of the distance parameter (negative for right side offset, positive for left side offset). The resolution of the buffer around each vertex of the object increases by increasing the quad_segs keyword parameter.

The join style is for outside corners between line segments. Accepted values are JOIN_STYLE.round (1), JOIN_STYLE.mitre (2), and JOIN_STYLE.bevel (3).

The mitre ratio limit is used for very sharp corners. It is the ratio of the distance from the corner to the end of the mitred offset corner. When two line segments meet at a sharp angle, a mitre join will extend far beyond the original geometry. To prevent unreasonable geometry, the mitre limit allows controlling the maximum length of the join corner. Corners with a ratio that exceeds the limit will be beveled.

Note: the behaviour regarding orientation of the resulting line depends on the GEOS version. With GEOS < 3.11, the line retains the same direction for a left offset (positive distance) or has reverse direction for a right offset (negative distance), and this behaviour was documented as such in previous Shapely versions. Starting with GEOS 3.11, the function tries to preserve the orientation of the original line.

parallel_offset(distance, side='right', resolution=16, join_style=BufferJoinStyle.round, mitre_limit=5.0)[source]

Alternative method to offset_curve() method.

Older alternative method to the offset_curve() method, but uses resolution instead of quad_segs and a side keyword (‘left’ or ‘right’) instead of sign of the distance. This method is kept for backwards compatibility for now, but it is recommended to use offset_curve() instead.
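
When migrating from parallel_offset() to offset_curve(), the side keyword has to be folded into the sign of the distance. The sketch below shows that mapping for a non-negative distance; the helper name is hypothetical, and the resolution value would be passed on as quad_segs.

```python
def offset_curve_distance(distance, side="right"):
    """Signed distance for offset_curve() matching a parallel_offset() call."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    # offset_curve(): positive distance = left side, negative = right side.
    return distance if side == "left" else -distance
```
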

svg(scale_factor=1.0, stroke_color=None, opacity=None)[source]

Return SVG polyline element for the LineString geometry.

Parameters:
  • scale_factor (float) – Multiplication factor for the SVG stroke-width. Default is 1.

  • stroke_color (str, optional) – Hex string for stroke color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.

  • opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.8

property xy

Separate arrays of X and Y coordinate values.

Examples

>>> from shapely import LineString
>>> x, y = LineString([(0, 0), (1, 1)]).xy
>>> list(x)
[0.0, 1.0]
>>> list(y)
[0.0, 1.0]
class pyorps.raster.handler.MultiPoint(points=None)[source]

Bases: BaseMultipartGeometry

A collection of one or more Points.

A MultiPoint has zero area and zero length.

Parameters:

points (sequence) – A sequence of Points, or a sequence of (x, y [,z]) numeric coordinate pairs or triples, or an array-like of shape (N, 2) or (N, 3).

geoms

A sequence of Points

Type:

sequence

Examples

Construct a MultiPoint containing two Points

>>> from shapely import MultiPoint, Point
>>> ob = MultiPoint([[0.0, 0.0], [1.0, 2.0]])
>>> len(ob.geoms)
2
>>> type(ob.geoms[0]) == Point
True
svg(scale_factor=1.0, fill_color=None, opacity=None)[source]

Return a group of SVG circle elements for the MultiPoint geometry.

Parameters:
  • scale_factor (float) – Multiplication factor for the SVG circle diameters. Default is 1.

  • fill_color (str, optional) – Hex string for fill color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.

  • opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.6

class pyorps.raster.handler.Polygon(shell=None, holes=None)[source]

Bases: BaseGeometry

A geometry type representing an area that is enclosed by a linear ring.

A polygon is a two-dimensional feature and has a non-zero area. It may have one or more negative-space “holes” which are also bounded by linear rings. If any rings cross each other, the feature is invalid and operations on it may fail.

Parameters:
  • shell (sequence) – A sequence of (x, y [,z]) numeric coordinate pairs or triples, or an array-like with shape (N, 2) or (N, 3). Also can be a sequence of Point objects.

  • holes (sequence) – A sequence of objects which satisfy the same requirements as the shell parameters above

exterior

The ring which bounds the positive space of the polygon.

Type:

LinearRing

interiors

A sequence of rings which bound all existing holes.

Type:

sequence

Examples

Create a square polygon with no holes

>>> from shapely import Polygon
>>> coords = ((0., 0.), (0., 1.), (1., 1.), (1., 0.), (0., 0.))
>>> polygon = Polygon(coords)
>>> polygon.area
1.0
property coords

Not implemented for polygons.

property exterior

Return the exterior ring of the polygon.

classmethod from_bounds(xmin, ymin, xmax, ymax)[source]

Construct a Polygon() from spatial bounds.

property interiors

Return the sequence of interior rings of the polygon.

svg(scale_factor=1.0, fill_color=None, opacity=None)[source]

Return SVG path element for the Polygon geometry.

Parameters:
  • scale_factor (float) – Multiplication factor for the SVG stroke-width. Default is 1.

  • fill_color (str, optional) – Hex string for fill color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.

  • opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.6

class pyorps.raster.handler.RasterDataset(file_source, crs=None)[source]

Bases: GeoDataset, ABC

count: int
dtype: dtype
file_source: Any
shape: tuple[int, int]
transform: Affine
class pyorps.raster.handler.RasterHandler(raster_source, source_coords, target_coords, search_space_buffer_m=None, input_crs=None, apply_mask=True, outside_value=None, bands=None)[source]

Bases: object

Class for efficiently working with raster data while preserving geographic transformation information. Can be initialized with either a file path or directly with raster data, CRS, and transform.

_init_from_metadata(source_coords, target_coords, search_space_buffer_m=None, input_crs=None, apply_mask=True, outside_value=None, bands=None)[source]

Initialize using metadata and raster data.

This method contains the common initialization code used regardless of whether the input is a path or direct data components.

Parameters:
  • source_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Source point(s) as (x, y) tuple or list of tuples

  • target_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Target point(s) as (x, y) tuple or list of tuples

  • search_space_buffer_m (Optional[float]) – Buffer distance in map units (typically meters)

  • input_crs (Optional[str]) – CRS of the input coordinates (e.g., ‘EPSG:4326’). If None, assumes same as raster

  • apply_mask (bool) – If True, apply the buffer mask after loading data

  • outside_value (Optional[Any]) – Value to set for pixels outside the buffer (defaults to max value of the data type)

  • bands (Optional[List[int]]) – List of bands to modify if apply_mask is True (1-based). If None, all bands are modified

static _transform_coords(coords, input_crs, target_crs)[source]

Transform coordinates from input_crs to target_crs. Handles both single coordinates and lists of coordinates.

Parameters:
  • coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Coordinates to transform from input_crs to target_crs

  • input_crs (str) – Coordinate reference system of the input coordinates

  • target_crs (str) – Coordinate reference system of the target coordinates

Returns:

The transformed coordinates

apply_geometry_mask(geometry, outside_value=None, bands=None)[source]

Set pixel values outside the given geometry to the specified value.

Parameters:
  • geometry (Polygon) – A shapely geometry object (Polygon)

  • outside_value (Optional[int]) – Value to set for pixels outside the geometry

  • bands (Union[list[int], int, None]) – List of bands to modify (1-based). If None, all bands are modified.

buffer_geometry: Polygon
coords_to_indices(coords)[source]

Convert geographic coordinates to pixel row/column indices within this raster section.

Parameters:

coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – List of (x, y) coordinate tuples or a single coordinate tuple

Returns:

Array of (row, col) pixel indices

Return type:

numpy.ndarray
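
To illustrate the idea, the sketch below inverts an unrotated affine transform by hand. It is not the pyorps implementation: the function takes the transform as an explicit (a, b, c, d, e, f) tuple, returns a plain list instead of a numpy.ndarray, and rejects rotated transforms, all of which are simplifying assumptions.

```python
import math


def coords_to_indices(coords, transform):
    """Map (x, y) coordinates to (row, col) for an unrotated affine transform."""
    a, b, c, d, e, f = transform
    if b != 0 or d != 0:
        raise ValueError("this sketch only handles unrotated transforms")
    # Accept a single (x, y) pair as well as a list of pairs.
    if isinstance(coords[0], (int, float)):
        coords = [coords]
    return [(math.floor((y - f) / e), math.floor((x - c) / a)) for x, y in coords]
```
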

data: ndarray
estimate_buffer_width(source_coords, target_coords, min_buffer=200, max_buffer=4000, sample_radius=50)[source]

Estimate an appropriate buffer width for path finding based on terrain characteristics.

Parameters:
  • source_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – (x, y) coordinates of the source point

  • target_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – (x, y) coordinates of the target point

  • min_buffer (float) – Minimum buffer width to consider (meters)

  • max_buffer (float) – Maximum buffer width to consider (meters)

  • sample_radius (float) – Radius for sampling around the straight line to assess terrain complexity

Returns:

Estimated optimal buffer width in meters

indices_to_coords(indices)[source]

Convert pixel indices to geographic coordinates.

Parameters:

indices (List[Tuple[int, int]]) – List of (row, col) pixel indices

Returns:

Array of (x, y) coordinates

Return type:

numpy.ndarray
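
The reverse mapping applies the affine transform to the pixel position. The sketch below assumes that coordinates refer to pixel centres (hence the 0.5 offsets); whether pyorps uses centres or corners is not stated here, so treat that choice, the explicit transform argument, and the list return type as assumptions.

```python
def indices_to_coords(indices, transform):
    """Map (row, col) indices to (x, y) at pixel centres (an assumption)."""
    a, b, c, d, e, f = transform
    coords = []
    for row, col in indices:
        # The +0.5 offsets move from the pixel's upper-left corner to its centre.
        px, py = col + 0.5, row + 0.5
        coords.append((a * px + b * py + c, d * px + e * py + f))
    return coords
```
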

static max_distance_pair(coords1, coords2)[source]

Find the pair of coordinates (one from coords1, one from coords2) with the highest Euclidean distance.

Parameters:
  • coords1 (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Either a single coordinate tuple (x, y, …) or a list of coordinate tuples

  • coords2 (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Either a single coordinate tuple (x, y, …) or a list of coordinate tuples

Returns:

A tuple containing the two points with the maximum distance (point1, point2)
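
A brute-force version of this search can be sketched with the standard library. This is not the pyorps code: the helper _as_list is hypothetical, and comparing only the x and y components of each coordinate is an assumption.

```python
import math
from itertools import product


def _as_list(coords):
    """Wrap a single coordinate tuple in a list; copy lists of tuples."""
    return [coords] if isinstance(coords[0], (int, float)) else list(coords)


def max_distance_pair(coords1, coords2):
    """Return (point1, point2) with the greatest Euclidean distance."""
    pairs = product(_as_list(coords1), _as_list(coords2))
    # Compare on x and y only, so optional extra components are ignored.
    return max(pairs, key=lambda pair: math.dist(pair[0][:2], pair[1][:2]))
```
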

raster_dataset: RasterDataset
save_section_as_raster(output_path)[source]

Save the section as a new raster file with proper georeferencing.

Parameters:

output_path (str) – Path for the output raster file

search_space_buffer_m: float
window: Window
window_transform: Affine
class pyorps.raster.handler.Transformer(transformer_maker=None)[source]

Bases: object

The Transformer class is for facilitating re-using transforms without needing to re-create them. The goal is to make repeated transforms faster.

Additionally, it provides multiple methods for initialization.

Added in version 2.1.0.

property _transformer

The Cython _Transformer object for this thread.

Return type:

_Transformer

property accuracy: float

Expected accuracy of the transformation. -1 if unknown.

Type:

float

property area_of_use: AreaOfUse

Added in version 2.3.0.

Returns:

The area of use object with associated attributes.

Return type:

AreaOfUse

property definition: str

Definition of the projection.

Type:

str

property description: str

Description of the projection.

Type:

str

static from_crs(crs_from, crs_to, always_xy=False, area_of_interest=None, authority=None, accuracy=None, allow_ballpark=None, force_over=False, only_best=None)[source]

Make a Transformer from a pyproj.crs.CRS or input used to create one.

See:

  • proj_create_crs_to_crs()

  • proj_create_crs_to_crs_from_pj()

Added in version 2.2.0: always_xy

Added in version 2.3.0: area_of_interest

Added in version 3.1.0: authority, accuracy, allow_ballpark

Added in version 3.4.0: force_over

Added in version 3.5.0: only_best

Parameters:
  • crs_from (pyproj.crs.CRS or input used to create one) – Projection of input data.

  • crs_to (pyproj.crs.CRS or input used to create one) – Projection of output data.

  • always_xy (bool, default=False) – If true, the transform method will accept as input and return as output coordinates using the traditional GIS order, that is longitude, latitude for geographic CRS and easting, northing for most projected CRS.

  • area_of_interest (AreaOfInterest, optional) – The area of interest to help select the transformation.

  • authority (str, optional) – When not specified, coordinate operations from any authority will be searched, with the restrictions set in the authority_to_authority_preference database table related to the authority of the source/target CRS themselves. If authority is set to “any”, then coordinate operations from any authority will be searched. If authority is a non-empty string different from “any”, then coordinate operations will be searched only in that authority namespace (e.g. EPSG).

  • accuracy (float, optional) – The minimum desired accuracy (in metres) of the candidate coordinate operations.

  • allow_ballpark (bool, optional) – Set to False to disallow the use of Ballpark transformation in the candidate coordinate operations. Default is to allow.

  • force_over (bool, default=False) – If True, it will force the +over flag on the transformation. Requires PROJ 9+.

  • only_best (bool, optional) – Can be set to True to cause PROJ to error out if the best transformation known to PROJ and usable by PROJ if all grids known and usable by PROJ were accessible, cannot be used. Best transformation should be understood as the transformation returned by proj_get_suggested_operation() if all known grids were accessible (either locally or through network). Note that the default value for this option can be also set with the PROJ_ONLY_BEST_DEFAULT environment variable, or with the only_best_default setting of proj-ini. The only_best kwarg overrides the default value if set. Requires PROJ 9.2+.

Return type:

Transformer

static from_pipeline(proj_pipeline)[source]

Make a Transformer from a PROJ pipeline string.

See:

  • proj_create()

  • proj_create_from_database()

Added in version 3.1.0: AUTH:CODE string support (e.g. EPSG:1671)

Allowed input:
  • a PROJ string

  • a WKT string

  • a PROJJSON string

  • an object code (e.g. “EPSG:1671” or “urn:ogc:def:coordinateOperation:EPSG::1671”)

  • an object name, e.g. “ITRF2014 to ETRF2014 (1)”. In that case, as uniqueness is not guaranteed, heuristics are applied to determine the appropriate best match.

  • an OGC URN combining references for concatenated operations (e.g. “urn:ogc:def:coordinateOperation,coordinateOperation:EPSG::3895, coordinateOperation:EPSG::1618”)

Parameters:

proj_pipeline (str) – Projection pipeline string.

Return type:

Transformer

static from_proj(proj_from, proj_to, always_xy=False, area_of_interest=None)[source]

Make a Transformer from a pyproj.Proj or input used to create one.

Deprecated since version 3.4.1: from_crs() is preferred.

Added in version 2.2.0: always_xy

Added in version 2.3.0: area_of_interest

Parameters:
  • proj_from (pyproj.Proj or input used to create one) – Projection of input data.

  • proj_to (pyproj.Proj or input used to create one) – Projection of output data.

  • always_xy (bool, default=False) – If true, the transform method will accept as input and return as output coordinates using the traditional GIS order, that is longitude, latitude for geographic CRS and easting, northing for most projected CRS.

  • area_of_interest (AreaOfInterest, optional) – The area of interest to help select the transformation.

Return type:

Transformer

get_last_used_operation()[source]

Added in version 3.4.0.

Note

Requires PROJ 9.1+

See: proj_trans_get_last_used_operation()

Returns:

The operation used in the transform call.

Return type:

Transformer

property has_inverse: bool

True if an inverse mapping exists.

Type:

bool

is_exact_same(other)[source]

Check if the Transformer objects are the exact same. If it is not a Transformer, then it returns False.

Parameters:

other (Any)

Return type:

bool

property is_network_enabled: bool

Added in version 3.0.0.

Returns:

If the network is enabled.

Return type:

bool

itransform(points, switch=False, time_3rd=False, radians=False, errcheck=False, direction=TransformDirection.FORWARD)[source]

Iterator/generator version of the function pyproj.Transformer.transform.

See: proj_trans_generic()

Added in version 2.1.1: errcheck

Added in version 2.2.0: direction

Parameters:
  • points (list) – List of point tuples.

  • switch (bool, default=False) – If True x, y or lon,lat coordinates of points are switched to y, x or lat, lon. Default is False.

  • time_3rd (bool, default=False) – If the input coordinates are 3 dimensional and the 3rd dimension is time.

  • radians (bool, default=False) – If True, will expect input data to be in radians and will return radians if the projection is geographic. Otherwise, it uses degrees. Ignored for pipeline transformations with pyproj 2, but will work in pyproj 3.

  • errcheck (bool, default=False) – If True, an exception is raised if errors are found in the process. If False, inf is returned for errors.

  • direction (pyproj.enums.TransformDirection, optional) – The direction of the transform. Default is pyproj.enums.TransformDirection.FORWARD.

Return type:

Iterator[Iterable]

Example

>>> from pyproj import Transformer
>>> transformer = Transformer.from_crs(4326, 2100)
>>> points = [(22.95, 40.63), (22.81, 40.53), (23.51, 40.86)]
>>> for pt in transformer.itransform(points): '{:.3f} {:.3f}'.format(*pt)
'2221638.801 2637034.372'
'2212924.125 2619851.898'
'2238294.779 2703763.736'
>>> pipeline_str = (
...     "+proj=pipeline +step +proj=longlat +ellps=WGS84 "
...     "+step +proj=unitconvert +xy_in=rad +xy_out=deg"
... )
>>> pipe_trans = Transformer.from_pipeline(pipeline_str)
>>> for pt in pipe_trans.itransform([(2.1, 0.001)]):
...     '{:.3f} {:.3f}'.format(*pt)
'2.100 0.001'
>>> transproj = Transformer.from_crs(
...     {"proj":'geocent', "ellps":'WGS84', "datum":'WGS84'},
...     "EPSG:4326",
...     always_xy=True,
... )
>>> for pt in transproj.itransform(
...     [(-2704026.010, -4253051.810, 3895878.820)],
...     radians=True,
... ):
...     '{:.3f} {:.3f} {:.3f}'.format(*pt)
'-2.137 0.661 -20.531'
>>> transprojr = Transformer.from_crs(
...     "EPSG:4326",
...     {"proj":'geocent', "ellps":'WGS84', "datum":'WGS84'},
...     always_xy=True,
... )
>>> for pt in transprojr.itransform(
...     [(-2.137, 0.661, -20.531)],
...     radians=True
... ):
...     '{:.3f} {:.3f} {:.3f}'.format(*pt)
'-2704214.394 -4254414.478 3894270.731'
>>> transproj_eq = Transformer.from_crs(
...     'EPSG:4326',
...     '+proj=longlat +datum=WGS84 +no_defs +type=crs',
...     always_xy=True,
... )
>>> for pt in transproj_eq.itransform([(-2.137, 0.661)]):
...     '{:.3f} {:.3f}'.format(*pt)
'-2.137 0.661'
property name: str

Name of the projection.

Type:

str

property operations: tuple[CoordinateOperation] | None

Added in version 2.4.0.

Returns:

The operations in a concatenated operation.

Return type:

tuple[CoordinateOperation]

property remarks: str

Added in version 2.4.0.

Returns:

Remarks about object.

Return type:

str

property scope: str

Added in version 2.4.0.

Returns:

Scope of object.

Return type:

str

property source_crs: CRS | None

Added in version 3.3.0.

Returns:

The source CRS of a CoordinateOperation.

Return type:

CRS | None

property target_crs: CRS | None

Added in version 3.3.0.

Returns:

The target CRS of a CoordinateOperation.

Return type:

CRS | None

to_json(pretty=False, indentation=2)[source]

Convert the projection to a JSON string.

Added in version 2.4.0.

Parameters:
  • pretty (bool, default=False) – If True, it will set the output to be a multiline string.

  • indentation (int, default=2) – If pretty is True, it will set the width of the indentation.

Returns:

The JSON string.

Return type:

str

to_json_dict()[source]

Convert the projection to a JSON dictionary.

Added in version 2.4.0.

Returns:

The JSON dictionary.

Return type:

dict

to_proj4(version=ProjVersion.PROJ_5, pretty=False)[source]

Convert the projection to a PROJ string.

Added in version 3.1.0.

Parameters:
  • version (pyproj.enums.ProjVersion) – The version of the PROJ string output. Default is pyproj.enums.ProjVersion.PROJ_5.

  • pretty (bool, default=False) – If True, it will set the output to be a multiline string.

Returns:

The PROJ string.

Return type:

str

to_wkt(version=WktVersion.WKT2_2019, pretty=False)[source]

Convert the projection to a WKT string.

Version options:
  • WKT2_2015

  • WKT2_2015_SIMPLIFIED

  • WKT2_2019

  • WKT2_2019_SIMPLIFIED

  • WKT1_GDAL

  • WKT1_ESRI

Parameters:
  • version (pyproj.enums.WktVersion, optional) – The version of the WKT output. Default is pyproj.enums.WktVersion.WKT2_2019.

  • pretty (bool, default=False) – If True, it will set the output to be a multiline string.

Returns:

The WKT string.

Return type:

str

transform(xx, yy, zz=None, tt=None, radians=False, errcheck=False, direction=TransformDirection.FORWARD, inplace=False)[source]

Transform points between two coordinate systems.

See: proj_trans_generic()

Added in version 2.1.1: errcheck

Added in version 2.2.0: direction

Added in version 3.2.0: inplace

Accepted numeric scalar or array:

  • int

  • float

  • numpy.floating

  • numpy.integer

  • list

  • tuple

  • array.array

  • numpy.ndarray

  • xarray.DataArray

  • pandas.Series

Parameters:
  • xx (scalar or array) – Input x coordinate(s).

  • yy (scalar or array) – Input y coordinate(s).

  • zz (scalar or array, optional) – Input z coordinate(s).

  • tt (scalar or array, optional) – Input time coordinate(s).

  • radians (bool, default=False) – If True, will expect input data to be in radians and will return radians if the projection is geographic. Otherwise, it uses degrees. Ignored for pipeline transformations with pyproj 2, but will work in pyproj 3.

  • errcheck (bool, default=False) – If True, an exception is raised if errors are found in the process. If False, inf is returned for errors.

  • direction (pyproj.enums.TransformDirection, optional) – The direction of the transform. Default is pyproj.enums.TransformDirection.FORWARD.

  • inplace (bool, default=False) – If True, will attempt to write the results to the input array instead of returning a new array. This will fail if the input is not an array in C order with the double data type.

Example

>>> from pyproj import Transformer
>>> transformer = Transformer.from_crs("EPSG:4326", "EPSG:3857")
>>> x3, y3 = transformer.transform(33, 98)
>>> f"{x3:.3f}  {y3:.3f}"
'10909310.098  3895303.963'
>>> pipeline_str = (
...     "+proj=pipeline +step +proj=longlat +ellps=WGS84 "
...     "+step +proj=unitconvert +xy_in=rad +xy_out=deg"
... )
>>> pipe_trans = Transformer.from_pipeline(pipeline_str)
>>> xt, yt = pipe_trans.transform(2.1, 0.001)
>>> f"{xt:.3f}  {yt:.3f}"
'2.100  0.001'
>>> transproj = Transformer.from_crs(
...     {"proj":'geocent', "ellps":'WGS84', "datum":'WGS84'},
...     "EPSG:4326",
...     always_xy=True,
... )
>>> xpj, ypj, zpj = transproj.transform(
...     -2704026.010,
...     -4253051.810,
...     3895878.820,
...     radians=True,
... )
>>> f"{xpj:.3f} {ypj:.3f} {zpj:.3f}"
'-2.137 0.661 -20.531'
>>> transprojr = Transformer.from_crs(
...     "EPSG:4326",
...     {"proj":'geocent', "ellps":'WGS84', "datum":'WGS84'},
...     always_xy=True,
... )
>>> xpjr, ypjr, zpjr = transprojr.transform(xpj, ypj, zpj, radians=True)
>>> f"{xpjr:.3f} {ypjr:.3f} {zpjr:.3f}"
'-2704026.010 -4253051.810 3895878.820'
>>> transformer = Transformer.from_crs("EPSG:4326", 4326)
>>> xeq, yeq = transformer.transform(33, 98)
>>> f"{xeq:.0f}  {yeq:.0f}"
'33  98'
transform_bounds(left, bottom, right, top, densify_pts=21, radians=False, errcheck=False, direction=TransformDirection.FORWARD)[source]

Added in version 3.1.0.

See: proj_trans_bounds()

Transform boundary densifying the edges to account for nonlinear transformations along these edges and extracting the outermost bounds.

If the destination CRS is geographic and right < left then the bounds crossed the antimeridian. In this scenario there are two polygons, one on each side of the antimeridian. The first polygon should be constructed with (left, bottom, 180, top) and the second with (-180, bottom, right, top).

To construct the bounding polygons with shapely:

import shapely.geometry

def bounding_polygon(left, bottom, right, top):
    if right < left:
        return shapely.geometry.MultiPolygon(
            [
                shapely.geometry.box(left, bottom, 180, top),
                shapely.geometry.box(-180, bottom, right, top),
            ]
        )
    return shapely.geometry.box(left, bottom, right, top)
Parameters:
  • left (float) – Minimum bounding coordinate of the first axis in source CRS (or the target CRS if using the reverse direction).

  • bottom (float) – Minimum bounding coordinate of the second axis in source CRS (or the target CRS if using the reverse direction).

  • right (float) – Maximum bounding coordinate of the first axis in source CRS (or the target CRS if using the reverse direction).

  • top (float) – Maximum bounding coordinate of the second axis in source CRS (or the target CRS if using the reverse direction).

  • densify_pts (uint, default=21) – Number of points to add to each edge to account for nonlinear edges produced by the transform process. Large numbers will produce worse performance.

  • radians (bool, default=False) – If True, will expect input data to be in radians and will return radians if the projection is geographic. Otherwise, it uses degrees.

  • errcheck (bool, default=False) – If True, an exception is raised if errors are found in the process. If False, inf is returned for errors.

  • direction (pyproj.enums.TransformDirection, optional) – The direction of the transform. Default is pyproj.enums.TransformDirection.FORWARD.

Returns:

left, bottom, right, top – Outermost coordinates in target coordinate reference system.

Return type:

float
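
Why the edges are densified can be seen without pyproj. The following is a plain-Python sketch of the idea, not pyproj's implementation: densified_bounds and bend are hypothetical names, and bend stands in for any nonlinear coordinate transformation.

```python
def densified_bounds(left, bottom, right, top, transform_point, densify_pts=21):
    """Bounds of a box after a point-wise transform, sampling along each edge."""
    steps = densify_pts + 1  # densify_pts extra points split each edge into this many segments
    xs = [left + (right - left) * i / steps for i in range(steps + 1)]
    ys = [bottom + (top - bottom) * i / steps for i in range(steps + 1)]
    edge_points = (
        [(x, bottom) for x in xs]
        + [(x, top) for x in xs]
        + [(left, y) for y in ys]
        + [(right, y) for y in ys]
    )
    transformed = [transform_point(x, y) for x, y in edge_points]
    out_x = [p[0] for p in transformed]
    out_y = [p[1] for p in transformed]
    return min(out_x), min(out_y), max(out_x), max(out_y)


def bend(x, y):
    """Hypothetical nonlinear transform: straight horizontal edges become arcs."""
    return x, y - (x - 0.5) ** 2


corners_only = densified_bounds(0.0, 0.0, 1.0, 1.0, bend, densify_pts=0)
densified = densified_bounds(0.0, 0.0, 1.0, 1.0, bend, densify_pts=21)
print(corners_only)  # the top bound misses the bulge in the middle of the top edge
print(densified)     # the extra edge samples recover it
```

With only the four corners, the transformed top edge appears to end at 0.75; sampling along the edge finds the true maximum of 1.0.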

class pyorps.raster.handler.Window(col_off, row_off, width, height)[source]

Bases: object

Windows are rectangular subsets of rasters.

This class abstracts the 2-tuples mentioned in the module docstring and adds methods and new constructors.

col_off, row_off

The offset for the window.

Type:

float

width, height

Lengths of the window.

Type:

float

Notes

Previously the lengths were called ‘num_cols’ and ‘num_rows’ but this is a bit confusing in the new float precision world and the attributes have been changed. The originals are deprecated.

col_off
crop(height, width)[source]

Return a copy cropped to height and width

flatten()[source]

A flattened form of the window.

Returns:

col_off, row_off, width, height – Window offsets and lengths.

Return type:

float

classmethod from_slices(rows, cols, height=-1, width=-1, boundless=False)[source]

Construct a Window from row and column slices or tuples / lists of start and stop indexes. Converts the rows and cols to offsets, height, and width.

In general, indexes are defined relative to the upper left corner of the dataset: rows=(0, 10), cols=(0, 4) defines a window that is 4 columns wide and 10 rows high starting from the upper left.

Start indexes may be None and will default to 0. Stop indexes may be None and will default to width or height, which must be provided in this case.

Negative start indexes are evaluated relative to the lower right of the dataset: rows=(-2, None), cols=(-2, None) defines a window that is 2 rows high and 2 columns wide starting from the bottom right.

Parameters:
  • rows (slice, tuple, or list) – Slices or 2 element tuples/lists containing start, stop indexes.

  • cols (slice, tuple, or list) – Slices or 2 element tuples/lists containing start, stop indexes.

  • height (float) – A shape to resolve relative values against. Only used when a start or stop index is negative or a stop index is None.

  • width (float) – A shape to resolve relative values against. Only used when a start or stop index is negative or a stop index is None.

  • boundless (bool, optional) – Whether the inputs are bounded (default) or not.

Return type:

Window
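
The index rules above can be sketched in plain Python. This is an illustration of the documented behaviour for the bounded case only, not rasterio's code; resolve_range and window_from_ranges are hypothetical helper names, and the result is returned as a (col_off, row_off, width, height) tuple rather than a Window.

```python
def resolve_range(start, stop, length):
    """Turn one (start, stop) pair into (offset, size)."""
    needs_length = stop is None or stop < 0 or (start is not None and start < 0)
    if needs_length and length < 0:
        raise ValueError("height/width is required to resolve None or negative indexes")
    if start is None:
        start = 0
    elif start < 0:
        start += length  # negative start counts back from the far edge
    if stop is None:
        stop = length
    elif stop < 0:
        stop += length
    return start, max(stop - start, 0)


def window_from_ranges(rows, cols, height=-1, width=-1):
    """Return (col_off, row_off, width, height) for row and column index pairs."""
    row_off, num_rows = resolve_range(rows[0], rows[1], height)
    col_off, num_cols = resolve_range(cols[0], cols[1], width)
    return col_off, row_off, num_cols, num_rows


print(window_from_ranges((0, 10), (0, 4)))
print(window_from_ranges((-2, None), (-2, None), height=100, width=50))
```

The first call reproduces the 4-column by 10-row window from the upper left; the second resolves the negative starts against an assumed 100 by 50 dataset.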

height
intersection(other)[source]

Return the intersection of this window and another

Parameters:

other (Window) – Another window

Return type:

Window
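
As a rough illustration of what the intersection of two windows means, the sketch below uses a hypothetical SimpleWindow tuple in place of the real class. Raising ValueError for non-overlapping windows is an assumption of this sketch; the library may signal that case differently.

```python
from typing import NamedTuple


class SimpleWindow(NamedTuple):
    """Hypothetical stand-in for Window, for illustration only."""
    col_off: float
    row_off: float
    width: float
    height: float


def intersect(a, b):
    """Overlap of two windows: latest start and earliest stop on each axis."""
    col_off = max(a.col_off, b.col_off)
    row_off = max(a.row_off, b.row_off)
    col_stop = min(a.col_off + a.width, b.col_off + b.width)
    row_stop = min(a.row_off + a.height, b.row_off + b.height)
    if col_stop <= col_off or row_stop <= row_off:
        raise ValueError("windows do not intersect")
    return SimpleWindow(col_off, row_off, col_stop - col_off, row_stop - row_off)


overlap = intersect(SimpleWindow(0, 0, 10, 10), SimpleWindow(5, 5, 10, 10))
print(overlap)
```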

round(ndigits=None)[source]

Round a window’s offsets and lengths

Rounding to a very small fraction of a pixel can help treat floating point issues arising from computation of windows.

round_lengths(**kwds)[source]

Return a copy with width and height rounded.

Lengths are rounded to the nearest whole number. The offsets are not changed.

Parameters:

kwds (dict) – Collects keyword arguments that are no longer used.

Return type:

Window

round_offsets(**kwds)[source]

Return a copy with column and row offsets rounded.

Offsets are rounded to the preceding whole number. The lengths are not changed.

Parameters:

kwds (dict) – Collects keyword arguments that are no longer used.

Return type:

Window
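
The two rounding rules differ: offsets go down to the preceding whole number, while lengths go to the nearest one. A minimal sketch of that difference follows, with hypothetical function names; the library's implementation may differ in detail, for example in how it treats values within floating-point tolerance of a whole number.

```python
import math


def floor_offsets(col_off, row_off):
    """Round offsets down to the preceding whole number."""
    return math.floor(col_off), math.floor(row_off)


def nearest_lengths(width, height):
    """Round lengths to the nearest whole number (halves round up)."""
    return math.floor(width + 0.5), math.floor(height + 0.5)


print(floor_offsets(2.7, 3.2))
print(nearest_lengths(4.4, 4.5))
```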

round_shape(**kwds)[source]
row_off
todict()[source]

A mapping of attribute names and values.

Return type:

dict

toranges()[source]

Makes an equivalent pair of range tuples

toslices()[source]

Slice objects for use as an ndarray indexer.

Returns:

row_slice, col_slice – A pair of slices in row, column order

Return type:

slice

width
pyorps.raster.handler.create_test_tiff(output_path, width=100, height=100, transform=None, crs='EPSG:32632', pattern='random', bands=1, nodata=None)[source]

Creates a synthetic GeoTIFF file for testing with different patterns.

Parameters:
  • output_path (str) – Path to save the test GeoTIFF file

  • width (int) – Width of the raster in pixels

  • height (int) – Height of the raster in pixels

  • transform (Optional[Affine]) – Affine transformation for the raster

  • crs – Coordinate reference system

  • pattern (str) – Data pattern - “random”, “gradient”, or “checkerboard”

  • bands (int) – Number of bands to create

  • nodata (Optional[int]) – No data value

Returns:

An array which can be used as a test raster

pyorps.raster.handler.from_origin(west, north, xsize, ysize)[source]

Return an Affine transformation given upper left and pixel sizes.

Return an Affine transformation for a georeferenced raster given the coordinates of its upper left corner west, north and pixel sizes xsize, ysize.
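
For a north-up raster the resulting coefficients are simple to write down: a is the pixel width, e is the negative pixel height, and c, f hold the upper left corner. The sketch below works with plain (a, b, c, d, e, f) tuples instead of Affine objects; from_origin_coefficients and apply are hypothetical names.

```python
def from_origin_coefficients(west, north, xsize, ysize):
    """Return (a, b, c, d, e, f) for a north-up raster."""
    return (float(xsize), 0.0, float(west), 0.0, -float(ysize), float(north))


def apply(coeffs, col, row):
    """Map a pixel position (col, row) to (x, y)."""
    a, b, c, d, e, f = coeffs
    return a * col + b * row + c, d * col + e * row + f


coeffs = from_origin_coefficients(-180.0, 90.0, 0.5, 0.5)
upper_left = apply(coeffs, 0, 0)
lower_right = apply(coeffs, 720, 360)
print(coeffs, upper_left, lower_right)
```

A 720 by 360 grid at 0.5 degree resolution therefore spans the whole globe, from (-180, 90) to (180, -90).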

pyorps.raster.handler.rasterize(shapes, out_shape=None, fill=0, nodata=None, masked=False, out=None, transform=(1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0), all_touched=False, merge_alg=MergeAlg.replace, default_value=1, dtype=None, skip_invalid=True, dst_path=None, dst_kwds=None)[source]

Return an image array with input geometries burned in.

Warnings will be raised for any invalid or empty geometries, and an exception will be raised if there are no valid shapes to rasterize.

Parameters:
  • shapes (iterable of (geometry, value) pairs or geometries) – The geometry can either be an object that implements the geo interface or GeoJSON-like object. If no value is provided the default_value will be used. If value is None the fill value will be used.

  • out_shape (tuple or list with 2 integers) – Shape of output numpy.ndarray.

  • fill (int or float, optional) – Used as fill value for all areas not covered by input geometries.

  • nodata (float, optional) – nodata value to use in output file or masked array.

  • masked (bool, optional. Default: False.) – If True, return a masked array. Note: nodata is always set in the case of file output.

  • out (numpy.ndarray, optional) – Array in which to store results. If not provided, out_shape and dtype are required.

  • transform (Affine transformation object, optional) – Transformation from pixel coordinates of source to the coordinate system of the input shapes. See the transform property of dataset objects.

  • all_touched (boolean, optional) – If True, all pixels touched by geometries will be burned in. If False, only pixels whose center is within the polygon or that are selected by Bresenham’s line algorithm will be burned in.

  • merge_alg (MergeAlg, optional) –

    Merge algorithm to use. One of:
    MergeAlg.replace (default):

    the new value will overwrite the existing value.

    MergeAlg.add:

    the new value will be added to the existing raster.

  • default_value (int or float, optional) – Used as value for all geometries, if not provided in shapes.

  • dtype (rasterio or numpy.dtype, optional) – Used as data type for results, if out is not provided.

  • skip_invalid (bool, optional) – If True (default), invalid shapes will be skipped. If False, ValueError will be raised.

  • dst_path (str or PathLike, optional) – Path of output dataset

  • dst_kwds (dict, optional) – Dictionary of creation options and other parameters that will be overlaid on the profile of the output dataset.

Returns:

If out was not None then out is returned, it will have been modified in-place. If out was None, this will be a new array.

Return type:

numpy.ndarray

Notes

Valid data types for fill, default_value, out, dtype and shape values are “int16”, “int32”, “uint8”, “uint16”, “uint32”, “float32”, and “float64”.

This function requires significant memory resources. The shapes iterator will be materialized to a Python list and another C copy of that list will be made. The out array will be copied and additional temporary raster memory equal to 2x the smaller of out data or GDAL’s max cache size (controlled by GDAL_CACHEMAX, default is 5% of the computer’s physical memory) is required.

If GDAL max cache size is smaller than the output data, the array of shapes will be iterated multiple times. Performance is thus a linear function of buffer size. For maximum speed, ensure that GDAL_CACHEMAX is larger than the size of out or out_shape.
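
The difference between the two merge algorithms, and the pixel-center rule used when all_touched is False, can be shown on a toy grid. This sketch is deliberately limited: it handles only axis-aligned rectangles under an identity transform, uses nested lists instead of numpy arrays, and burn_rectangles is a hypothetical name rather than part of any library.

```python
def burn_rectangles(rects, out_shape, fill=0, merge="replace"):
    """Burn ((xmin, ymin, xmax, ymax), value) pairs into a grid.

    Pixel (row, col) covers x in [col, col + 1) and y in [row, row + 1);
    it is burned only if its center lies inside the rectangle.
    """
    rows, cols = out_shape
    grid = [[fill] * cols for _ in range(rows)]
    for (xmin, ymin, xmax, ymax), value in rects:
        for r in range(rows):
            for c in range(cols):
                cx, cy = c + 0.5, r + 0.5
                if xmin <= cx < xmax and ymin <= cy < ymax:
                    if merge == "replace":
                        grid[r][c] = value      # later shapes overwrite earlier ones
                    else:
                        grid[r][c] += value     # "add": values accumulate
    return grid


rects = [((0, 0, 2, 2), 1), ((1, 1, 3, 3), 5)]
replaced = burn_rectangles(rects, (3, 3), merge="replace")
added = burn_rectangles(rects, (3, 3), merge="add")
print(replaced)
print(added)
```

Only the pixel covered by both rectangles differs between the two results: it holds 5 with replace and 6 with add.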

pyorps.raster.handler.rio_open(fp, mode='r', driver=None, width=None, height=None, count=None, crs=None, transform=None, dtype=None, nodata=None, sharing=False, opener=None, **kwargs)

Open a dataset for reading or writing.

The dataset may be located in a local file, in a resource located by a URL, or contained within a stream of bytes. This function accepts different types of fp parameters. However, it is almost always best to pass a string that has a dataset name as its value. These are passed directly to GDAL protocol and format handlers. A path to a zipfile is more efficiently used by GDAL than a Python ZipFile object, for example.

In read (‘r’) or read/write (‘r+’) mode, no keyword arguments are required: these attributes are supplied by the opened dataset.

In write (‘w’ or ‘w+’) mode, the driver, width, height, count, and dtype keywords are strictly required.

Parameters:
  • fp (str, os.PathLike, file-like, or rasterio.io.MemoryFile) – A filename or URL, a file object opened in binary (‘rb’) mode, a Path object, or one of the rasterio classes that provides the dataset-opening interface (has an open method that returns a dataset). Use a string when possible: GDAL can more efficiently access a dataset if it opens it natively.

  • mode (str, optional) – ‘r’ (read, the default), ‘r+’ (read/write), ‘w’ (write), or ‘w+’ (write/read).

  • driver (str, optional) – A short format driver name (e.g. “GTiff” or “JPEG”) or a list of such names (see GDAL docs at https://gdal.org/drivers/raster/index.html). In ‘w’ or ‘w+’ modes a single name is required. In ‘r’ or ‘r+’ modes the driver can usually be omitted. Registered drivers will be tried sequentially until a match is found. When multiple drivers are available for a format such as JPEG2000, one of them can be selected by using this keyword argument.

  • width (int, optional) – The number of columns of the raster dataset. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • height (int, optional) – The number of rows of the raster dataset. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • count (int, optional) – The count of dataset bands. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • crs (str, dict, or CRS, optional) – The coordinate reference system. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • transform (affine.Affine, optional) – Affine transformation mapping the pixel space to geographic space. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • dtype (str or numpy.dtype, optional) – The data type for bands. For example: ‘uint8’ or rasterio.uint16. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • nodata (int, float, or nan, optional) – Defines the pixel value to be interpreted as not valid data. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • sharing (bool, optional) – To reduce overhead and prevent programs from running out of file descriptors, rasterio maintains a pool of shared low level dataset handles. If True this function will use a shared handle if one is available. Multithreaded programs must avoid sharing and should set sharing to False.

  • opener (callable, optional) – A custom dataset opener which can serve GDAL’s virtual filesystem machinery via Python file-like objects. The underlying file-like object is obtained by calling opener with (fp, mode) or (fp, mode + “b”) depending on the format driver’s native mode. opener must return a Python file-like object that provides read, seek, tell, and close methods. Note: only one opener at a time per fp, mode pair is allowed.

  • kwargs (optional) – These are passed to format drivers as directives for creating or interpreting datasets. For example: in ‘w’ or ‘w+’ modes a tiled=True keyword argument will direct the GeoTIFF format driver to create a tiled, rather than striped, TIFF.

Returns:

  • rasterio.io.DatasetReader – If mode is “r”.

  • rasterio.io.DatasetWriter – If mode is “r+”, “w”, or “w+”.

Raises:
  • TypeError – If arguments are of the wrong Python type.

  • rasterio.errors.RasterioIOError – If the dataset cannot be opened, such as when there is no dataset with the given name.

  • rasterio.errors.DriverCapabilityError – If the detected format driver does not support the requested opening mode.

Examples

To open a local GeoTIFF dataset for reading using standard driver discovery and no directives:

>>> import rasterio
>>> with rasterio.open('example.tif') as dataset:
...     print(dataset.profile)

To open a local JPEG2000 dataset using only the JP2OpenJPEG driver:

>>> with rasterio.open(
...         'example.jp2', driver='JP2OpenJPEG') as dataset:
...     print(dataset.profile)

To create a new 8-band, 16-bit unsigned, tiled, and LZW-compressed GeoTIFF with a global extent and 0.5 degree resolution:

>>> from rasterio.transform import from_origin
>>> with rasterio.open(
...         'example.tif', 'w', driver='GTiff', dtype='uint16',
...         width=720, height=360, count=8, crs='EPSG:4326',
...         transform=from_origin(-180.0, 90.0, 0.5, 0.5),
...         nodata=0, tiled=True, compress='lzw') as dataset:
...     dataset.write(...)
pyorps.raster.handler.rowcol(transform, xs, ys, zs=None, op=None, precision=None, **rpc_options)[source]

Get rows and cols of the pixels containing (x, y).

Parameters:
  • transform (Affine or sequence of GroundControlPoint or RPC) – Transform suitable for input to AffineTransformer, GCPTransformer, or RPCTransformer.

  • xs (list or float) – x values in coordinate reference system.

  • ys (list or float) – y values in coordinate reference system.

  • zs (list or float, optional) – Height associated with coordinates. Primarily used for RPC based coordinate transformations. Ignored for affine based transformations. Default: 0.

  • op (function, optional (default: numpy.floor)) – Function to convert fractional pixels to whole numbers (floor, ceiling, round)

  • precision (int or float, optional) – This parameter is unused, deprecated in rasterio 1.3.0, and will be removed in version 2.0.0.

  • rpc_options (dict, optional) – Additional arguments passed to GDALCreateRPCTransformer.

Returns:

  • rows (array of ints or floats)

  • cols (array of ints or floats) – Integers are the default. The numerical type is determined by the type returned by op().
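
For the affine case, finding the pixel that contains a coordinate amounts to inverting the transform and applying op. The sketch below shows that arithmetic on a plain (a, b, c, d, e, f) tuple for scalar inputs; rowcol_affine is a hypothetical name and the GCP and RPC cases are not covered.

```python
import math


def rowcol_affine(coeffs, x, y, op=math.floor):
    """Return (row, col) of the pixel containing (x, y)."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is degenerate and cannot be inverted")
    dx, dy = x - c, y - f
    col = (e * dx - b * dy) / det   # inverse of the 2x2 linear part
    row = (a * dy - d * dx) / det
    return op(row), op(col)


# 0.5 degree global grid with its upper left corner at (-180, 90)
coeffs = (0.5, 0.0, -180.0, 0.0, -0.5, 90.0)
print(rowcol_affine(coeffs, 0.0, 0.0))
print(rowcol_affine(coeffs, -179.9, 89.9))
print(rowcol_affine(coeffs, -179.9, 89.9, op=math.ceil))
```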

pyorps.raster.handler.transform_window(window, transform)

Construct an affine transform matrix relative to a window.

Parameters:
  • window (Window) – The input window.

  • transform (Affine) – an affine transform matrix.

Returns:

The affine transform matrix for the given window

Return type:

Affine
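
A window's transform keeps the pixel size and rotation of the dataset transform and moves only the origin to the window's upper left corner. A minimal sketch on coefficient tuples, assuming a hypothetical helper name window_transform:

```python
def window_transform(coeffs, col_off, row_off):
    """Shift the origin of (a, b, c, d, e, f) to pixel (col_off, row_off)."""
    a, b, c, d, e, f = coeffs
    new_c = a * col_off + b * row_off + c
    new_f = d * col_off + e * row_off + f
    return (a, b, new_c, d, e, new_f)


dataset_transform = (0.5, 0.0, -180.0, 0.0, -0.5, 90.0)
shifted = window_transform(dataset_transform, col_off=10, row_off=20)
print(shifted)
```

Ten columns to the right and twenty rows down at 0.5 units per pixel moves the origin from (-180, 90) to (-175, 80).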

pyorps.raster.handler.transform_xy(transform, rows, cols, zs=None, offset='center', **rpc_options)

Get the x and y coordinates of pixels at rows and cols.

The pixel’s center is returned by default, but a corner can be returned by setting offset to one of ul, ur, ll, lr.

Supports affine, Ground Control Point (GCP), or Rational Polynomial Coefficients (RPC) based coordinate transformations.

Parameters:
  • transform (Affine or sequence of GroundControlPoint or RPC) – Transform suitable for input to AffineTransformer, GCPTransformer, or RPCTransformer.

  • rows (list or int) – Pixel rows.

  • cols (int or sequence of ints) – Pixel columns.

  • zs (list or float, optional) – Height associated with coordinates. Primarily used for RPC based coordinate transformations. Ignored for affine based transformations. Default: 0.

  • offset (str, optional) – Determines if the returned coordinates are for the center of the pixel or for a corner.

  • rpc_options (dict, optional) – Additional arguments passed to GDALCreateRPCTransformer.

Returns:

  • xs (float or list of floats) – x coordinates in coordinate reference system

  • ys (float or list of floats) – y coordinates in coordinate reference system
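
For an affine transform, the offset argument only decides which fractional pixel position is pushed through the transform. The following sketch shows this for scalar inputs; xy_affine and OFFSETS are hypothetical names, and GCP or RPC transforms are out of scope.

```python
# (column shift, row shift) applied before the transform
OFFSETS = {
    "center": (0.5, 0.5),
    "ul": (0.0, 0.0),
    "ur": (1.0, 0.0),
    "ll": (0.0, 1.0),
    "lr": (1.0, 1.0),
}


def xy_affine(coeffs, row, col, offset="center"):
    """Return (x, y) for the chosen point of pixel (row, col)."""
    a, b, c, d, e, f = coeffs
    dcol, drow = OFFSETS[offset]
    px, py = col + dcol, row + drow
    return a * px + b * py + c, d * px + e * py + f


coeffs = (0.5, 0.0, -180.0, 0.0, -0.5, 90.0)
print(xy_affine(coeffs, 0, 0))
print(xy_affine(coeffs, 0, 0, offset="ul"))
print(xy_affine(coeffs, 0, 0, offset="lr"))
```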

pyorps.raster.rasterizer module

class pyorps.raster.rasterizer.Affine(a: float, b: float, c: float, d: float, e: float, f: float, g: float = 0.0, h: float = 0.0, i: float = 1.0)[source]

Bases: Affine

Two dimensional affine transform for 2D linear mapping.

Parameters:
  • a (float) –

    Coefficients of an augmented affine transformation matrix

    x’ | | a b c | | x |
    y’ | = | d e f | | y |
    1 | | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

  • b (float) –

    Coefficients of an augmented affine transformation matrix

    x’ | | a b c | | x |
    y’ | = | d e f | | y |
    1 | | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

  • c (float) –

    Coefficients of an augmented affine transformation matrix

    x’ | | a b c | | x |
    y’ | = | d e f | | y |
    1 | | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

  • d (float) –

    Coefficients of an augmented affine transformation matrix

    x’ | | a b c | | x |
    y’ | = | d e f | | y |
    1 | | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

  • e (float) –

    Coefficients of an augmented affine transformation matrix

    x’ | | a b c | | x |
    y’ | = | d e f | | y |
    1 | | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

  • f (float) –

    Coefficients of an augmented affine transformation matrix

    x’ | | a b c | | x |
    y’ | = | d e f | | y |
    1 | | 0 0 1 | | 1 |

    a, b, and c are the elements of the first row of the matrix. d, e, and f are the elements of the second row.

a, b, c, d, e, f, g, h, i

The coefficients of the 3x3 augmented affine transformation matrix

x’ | | a b c | | x |
y’ | = | d e f | | y |
1 | | g h i | | 1 |

g, h, and i are always 0, 0, and 1.

Type:

float

The Affine package is derived from Casey Duncan's Planar package. See the copyright statement below. Parallel lines are preserved by these transforms. Affine transforms can perform any combination of translations, scales/flips, shears, and rotations. Class methods are provided to conveniently compose transforms from these operations.

Internally the transform is stored as a 3x3 transformation matrix. The transform may be constructed directly by specifying the first two rows of matrix values as 6 floats. Since the matrix is an affine transform, the last row is always (0, 0, 1).

N.B.: multiplication of a transform and an (x, y) vector always returns the column vector that is the matrix multiplication product of the transform and (x, y) as a column vector, no matter which is on the left or right side. This is obviously not the case for matrices and vectors in general, but provides a convenience for users of this class.

property _scaling

The absolute scaling factors of the transformation.

This tuple represents the absolute value of the scaling factors of the transformation, sorted from bigger to smaller.

almost_equals(other, precision=1e-05)[source]

Compare transforms for approximate equality.

Parameters:

other (Affine) – Transform being compared.

Return type:

bool

Returns:

True if the absolute difference between each pair of corresponding elements of the two transform matrices is less than precision.

property column_vectors

The values of the transform as three 2D column vectors

property determinant

The determinant of the transform matrix.

This value is equal to the area scaling factor when the transform is applied to a shape.
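
The link between the determinant and area can be checked numerically. In this sketch the transform is a plain (a, b, c, d, e, f) tuple and all function names are hypothetical; the absolute value is taken because a transform that includes a reflection has a negative determinant.

```python
def determinant(coeffs):
    """Determinant of the 2x2 linear part of (a, b, c, d, e, f)."""
    a, b, c, d, e, f = coeffs
    return a * e - b * d


def apply(coeffs, x, y):
    a, b, c, d, e, f = coeffs
    return a * x + b * y + c, d * x + e * y + f


def polygon_area(points):
    """Shoelace formula for a simple polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


coeffs = (2.0, 0.0, 5.0, 0.0, 3.0, -1.0)   # scale by (2, 3), then translate
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
stretched = [apply(coeffs, x, y) for x, y in unit_square]
print(determinant(coeffs), polygon_area(stretched))
```

A transform whose determinant is zero, such as (1, 2, 0, 2, 4, 0), flattens every shape to zero area, which is the degenerate case.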

property eccentricity: float

The eccentricity of the affine transformation.

This value represents the eccentricity of an ellipse under this affine transformation.

Raises NotImplementedError for improper transformations.

classmethod from_gdal(c, a, b, f, d, e)[source]

Use same coefficient order as GDAL’s GetGeoTransform().

Parameters:

c, a, b, f, d, e (float) – 6 floats ordered by GDAL.

Return type:

Affine
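
The only difference between the two conventions is the order of the six numbers: GDAL puts each origin term first, (c, a, b, f, d, e), while the constructor takes them row by row, (a, b, c, d, e, f). A small sketch of the reordering on plain tuples, using hypothetical helper names:

```python
def gdal_to_affine_order(c, a, b, f, d, e):
    """Reorder a GDAL geotransform into (a, b, c, d, e, f)."""
    return (a, b, c, d, e, f)


def affine_to_gdal_order(coeffs):
    """Reorder (a, b, c, d, e, f) into GDAL geotransform order."""
    a, b, c, d, e, f = coeffs
    return (c, a, b, f, d, e)


# x origin, pixel width, row rotation, y origin, column rotation, pixel height
geotransform = (-180.0, 0.5, 0.0, 90.0, 0.0, -0.5)
coeffs = gdal_to_affine_order(*geotransform)
print(coeffs)
print(affine_to_gdal_order(coeffs) == geotransform)
```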

classmethod identity()[source]

Return the identity transform.

Return type:

Affine

property is_conformal: bool

True if the transform is conformal.

i.e., if angles between points are preserved after applying the transform, within rounding limits. This implies that the transform has no effective shear.

property is_degenerate

True if this transform is degenerate.

Which means that it will collapse a shape to an effective area of zero. Degenerate transforms cannot be inverted.

property is_identity: bool

True if this transform equals the identity matrix, within rounding limits.

property is_orthonormal: bool

True if the transform is orthonormal.

Which means that the transform represents a rigid motion, which has no effective scaling or shear. Mathematically, this means that the axis vectors of the transform matrix are perpendicular and unit-length. Applying an orthonormal transform to a shape always results in a congruent shape.

property is_proper

True if this transform is proper.

Which means that it does not include reflection.

property is_rectilinear: bool

True if the transform is rectilinear.

i.e., whether a shape would remain axis-aligned, within rounding limits, after applying the transform.

itransform(seq)[source]

Transform a sequence of points or vectors in place.

Parameters:

seq – Mutable sequence of Vec2 to be transformed.

Return type:

None

Returns:

None, the input sequence is mutated in place.

classmethod permutation(*scaling)[source]

Create the permutation transform

For 2x2 matrices, there is only one permutation matrix that is not the identity.

Return type:

Affine

precision = 1e-05
classmethod rotation(angle, pivot=None)[source]

Create a rotation transform at the specified angle.

A pivot point other than the coordinate system origin may be optionally specified.

Parameters:
  • angle (float) – Rotation angle in degrees, counter-clockwise about the pivot point.

  • pivot (sequence) – Point to rotate about, if omitted the rotation is about the origin.

Return type:

Affine
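
The coefficients of a counter-clockwise rotation about a pivot can be written out directly. The sketch below computes them with the standard math module and plain tuples; rotation_coefficients and apply are hypothetical names, and results carry ordinary floating-point rounding.

```python
import math


def rotation_coefficients(angle, pivot=None):
    """(a, b, c, d, e, f) for a counter-clockwise rotation by angle degrees."""
    ca = math.cos(math.radians(angle))
    sa = math.sin(math.radians(angle))
    if pivot is None:
        return (ca, -sa, 0.0, sa, ca, 0.0)
    px, py = pivot
    # Translate the pivot to the origin, rotate, then translate back.
    return (ca, -sa, px - px * ca + py * sa, sa, ca, py - px * sa - py * ca)


def apply(coeffs, x, y):
    a, b, c, d, e, f = coeffs
    return a * x + b * y + c, d * x + e * y + f


about_origin = apply(rotation_coefficients(90), 1.0, 0.0)
about_pivot = apply(rotation_coefficients(90, pivot=(1.0, 1.0)), 2.0, 1.0)
print(about_origin, about_pivot)
```

Rotating (1, 0) by 90 degrees about the origin lands on approximately (0, 1), and rotating (2, 1) about the pivot (1, 1) lands on approximately (1, 2).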

property rotation_angle: float

The rotation angle in degrees of the affine transformation.

This is the rotation angle in degrees of the affine transformation, assuming it is in the form M = R S, where R is a rotation and S is a scaling.

Raises UndefinedRotationError for improper and degenerate transformations.

classmethod scale(*scaling)[source]

Create a scaling transform from a scalar or vector.

Parameters:

scaling (float or sequence) – The scaling factor. A scalar value will scale in both dimensions equally. A vector scaling value scales the dimensions independently.

Return type:

Affine

classmethod shear(x_angle=0, y_angle=0)[source]

Create a shear transform along one or both axes.

Parameters:
  • x_angle (float) – Shear angle in degrees parallel to the x-axis.

  • y_angle (float) – Shear angle in degrees parallel to the y-axis.

Return type:

Affine

to_gdal()[source]

Return same coefficient order as GDAL’s SetGeoTransform().

Return type:

tuple

to_shapely()[source]

Return an affine transformation matrix compatible with shapely

Shapely’s affinity module expects an affine transformation matrix in (a,b,d,e,xoff,yoff) order.

Return type:

tuple

classmethod translation(xoff, yoff)[source]

Create a translation transform from an offset vector.

Parameters:
  • xoff (float) – Translation x offset.

  • yoff (float) – Translation y offset.

Return type:

Affine

property xoff: float

Alias for ‘c’

property yoff: float

Alias for ‘f’

class pyorps.raster.rasterizer.CostAssumptions(source=None)[source]

Bases: object

A class for handling cost assumptions for rasterization.

This class handles:

  • Loading cost assumptions from files (CSV, Excel, JSON), or generating them from a dictionary or a GeoDataFrame

  • Mapping costs to features in a GeoDataFrame

  • Managing hierarchical cost structures

_apply_nested_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on nested dictionary cost assumptions.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to update with cost values

  • main_feature (Optional[str]) – Column name for the primary feature

  • side_features (Optional[list[str]]) – List containing a single column name for the secondary feature

Returns:

None (modifies gdf in-place)

_apply_tuple_costs(gdf, main_feature=None, side_features=None)[source]

Apply costs to the GeoDataFrame based on tuple keys in cost assumptions.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to update with cost values

  • main_feature (Optional[str]) – Column name for the primary feature

  • side_features (Optional[list[str]]) – List of column names for secondary features

Returns:

None (modifies gdf in-place)

static _convert_numeric_columns(df)[source]

Convert columns to numeric, handling different decimal separators.

Parameters:
  • df (DataFrame) – DataFrame with potential numeric columns that might use different decimal separators

Return type:

DataFrame

Returns:

DataFrame with properly converted numeric columns
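
How the method detects separators is not described here. As one possible heuristic, and purely as an assumption for illustration, a text value can be normalised by treating whichever of "," and "." appears last as the decimal mark. parse_number is a hypothetical helper operating on single strings, not on DataFrame columns.

```python
def parse_number(text):
    """Parse '1.5', '1,5', '1.234,5' or '1,234.5'; return None if not numeric."""
    cleaned = text.strip().replace(" ", "")
    if "," in cleaned and "." in cleaned:
        # Assumption: the separator that appears last is the decimal mark.
        if cleaned.rfind(",") > cleaned.rfind("."):
            cleaned = cleaned.replace(".", "").replace(",", ".")
        else:
            cleaned = cleaned.replace(",", "")
    else:
        # A lone comma is read as a decimal comma, so '1,234' becomes 1.234.
        cleaned = cleaned.replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


for raw in ["1,5", "1.234,5", "1,234.5", "forest"]:
    print(raw, parse_number(raw))
```

A value with a single comma is ambiguous, which is why any such heuristic should be checked against the actual input files.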

_load_csv_cost_assumptions(filepath)[source]

Load cost assumptions from a CSV file with auto-detection of encoding, delimiter, and decimal separator.

Parameters:

filepath (str) – Path to the CSV file

Return type:

dict

Returns:

dictionary of cost assumptions

_load_excel_cost_assumptions(filepath)[source]

Load cost assumptions from an Excel file, handling different decimal separators.

Parameters:

filepath (str) – Path to the Excel file

Return type:

dict

Returns:

dictionary of cost assumptions

_load_json_cost_assumptions(filepath)[source]

Load cost assumptions from a JSON file with auto-detection of encoding.

Parameters:

filepath (str) – Path to the JSON file

Return type:

dict

Returns:

dictionary of cost assumptions

apply_to_geodataframe(gdf, main_feature=None, side_features=None)[source]

Apply cost assumptions to a GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame to apply costs to

  • main_feature (Optional[str]) – Main feature column name

  • side_features (Optional[list[str]]) – list of side feature column names or single side feature name

Returns:

GeoDataFrame with ‘cost’ column added

convert_df_to_cost_dict(df)[source]

Convert a DataFrame to a nested dictionary for cost assumptions.

Parameters:

df (DataFrame) – DataFrame containing cost assumptions with hierarchical structure

Return type:

dict

Returns:

dictionary of cost assumptions with nested structure based on DataFrame columns

Uses one numeric column for costs, and all other columns as a hierarchical index:

  • The first column is the ‘main_feature’

  • All additional columns are ‘side_features’
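
To make the hierarchical index idea concrete, the sketch below nests rows of the form (main feature, side features, cost) into a dictionary. It uses plain tuples rather than a DataFrame; rows_to_cost_dict, the feature names and the cost values are all made up for illustration, and the exact dictionary layout pyorps produces is an assumption.

```python
def rows_to_cost_dict(rows):
    """Nest (main_feature, *side_features, cost) rows into a dictionary."""
    cost_dict = {}
    for *keys, cost in rows:
        node = cost_dict
        for key in keys[:-1]:
            node = node.setdefault(key, {})   # descend, creating levels as needed
        node[keys[-1]] = cost
    return cost_dict


rows = [
    ("forest", "deciduous", 120.0),
    ("forest", "coniferous", 150.0),
    ("water", "lake", 900.0),
]
nested = rows_to_cost_dict(rows)
print(nested)
```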

cost_dict_to_df(cost_dict)[source]

Convert cost assumptions dictionary to DataFrame.

Parameters:

cost_dict (dict) – Dictionary of cost assumptions

Return type:

DataFrame

Returns:

DataFrame representation of cost assumptions

load(source)[source]

Load cost assumptions from a file or dictionary.

Parameters:

source (Union[str, dict]) – Path to a file or a dictionary containing cost assumptions

Return type:

dict

Returns:

dictionary of cost assumptions

to_csv(filepath, separator=';', decimal='.', encoding='ISO-8859-1')[source]

Save the cost assumptions to a CSV file.

Parameters:
  • filepath (str) – Path where to save the CSV file

  • separator (str) – Column separator character (default is ‘;’)

  • decimal (str) – Decimal separator character (default is ‘.’)

  • encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

to_excel(filepath, sheet_name='CostAssumptions', index=False)[source]

Save the cost assumptions to an Excel file.

Parameters:
  • filepath (str) – Path where to save the Excel file

  • sheet_name (str) – Name of the worksheet (default is ‘CostAssumptions’)

  • index (bool) – Whether to write row indices (default is False)

Return type:

None

to_json(filepath, indent=2, encoding='ISO-8859-1')[source]

Save the cost assumptions to a JSON file.

Parameters:
  • filepath (str) – Path where to save the JSON file

  • indent (int) – Number of spaces for indentation (default is 2)

  • encoding (str) – The encoding of the file (default is ‘ISO-8859-1’)

Return type:

None

class pyorps.raster.rasterizer.GeoDataFrame(data=None, *args, geometry=None, crs=None, **kwargs)[source]

Bases: GeoPandasBase, DataFrame

A GeoDataFrame object is a pandas.DataFrame that has one or more columns containing geometry. In addition to the standard DataFrame constructor arguments, GeoDataFrame also accepts the following keyword arguments:

Parameters:
  • crs (value (optional)) – Coordinate Reference System of the geometry objects. Can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. “EPSG:4326”) or a WKT string.

  • geometry (str or array-like (optional)) –

    Value to use as the active geometry column. If str, treated as column name to use. If array-like, it will be added as new column named ‘geometry’ on the GeoDataFrame and set as the active geometry column.

    Note that if geometry is a (Geo)Series with a name, the name will not be used, a column named “geometry” will still be added. To preserve the name, you can use rename_geometry() to update the geometry column name.

Examples

Constructing GeoDataFrame from a dictionary.

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Notice that the inferred dtype of ‘geometry’ columns is geometry.

>>> gdf.dtypes
col1          object
geometry    geometry
dtype: object

Constructing GeoDataFrame from a pandas DataFrame with a column of WKT geometries:

>>> import pandas as pd
>>> d = {'col1': ['name1', 'name2'], 'wkt': ['POINT (1 2)', 'POINT (2 1)']}
>>> df = pd.DataFrame(d)
>>> gs = geopandas.GeoSeries.from_wkt(df['wkt'])
>>> gdf = geopandas.GeoDataFrame(df, geometry=gs, crs="EPSG:4326")
>>> gdf
    col1          wkt     geometry
0  name1  POINT (1 2)  POINT (1 2)
1  name2  POINT (2 1)  POINT (2 1)
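
The note on named (Geo)Series can be checked with a short sketch. The series name "location" and the sample data are made up for illustration.

```python
import geopandas
from shapely.geometry import Point

# A GeoSeries that carries its own name.
gs = geopandas.GeoSeries([Point(1, 2), Point(2, 1)], name="location")

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1", "name2"]}, geometry=gs, crs="EPSG:4326"
)
# The constructor ignores the series name and adds a column called "geometry".
name_after_init = gdf.geometry.name

# rename_geometry() returns a new frame whose active geometry column is renamed.
renamed = gdf.rename_geometry("location")
name_after_rename = renamed.geometry.name
```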

See also

GeoSeries

Series object designed to store shapely geometry objects

_attrs: dict[Hashable, Any]
_cache: dict[str, Any]
property _constructor

Used when a manipulation result has the same dimensions as the original.

_constructor_from_mgr(mgr, axes)[source]
property _constructor_sliced

One-dimensional ndarray with axis labels (including time series).

Labels need not be unique but must be a hashable type. The object supports both integer- and label-based indexing and provides a host of methods for performing operations involving the index. Statistical methods from ndarray have been overridden to automatically exclude missing data (currently represented as NaN).

Operations between Series (+, -, /, *, **) align values based on their associated index values; they need not be the same length. The result index will be the sorted union of the two indexes.

Parameters:
  • data (array-like, Iterable, dict, or scalar value) – Contains data stored in Series. If data is a dict, argument order is maintained.

  • index (array-like or Index (1d)) – Values must be hashable and have the same length as data. Non-unique index values are allowed. Will default to RangeIndex (0, 1, 2, …, n) if not provided. If data is dict-like and index is None, then the keys in the data are used as the index. If the index is not None, the resulting Series is reindexed with the index values.

  • dtype (str, numpy.dtype, or ExtensionDtype, optional) – Data type for the output Series. If not specified, this will be inferred from data. See the user guide for more usages.

  • name (Hashable, default None) – The name to give to the Series.

  • copy (bool, default False) – Copy input data. Only affects Series or 1d ndarray input. See examples.

Notes

Please reference the User Guide for more information.

Examples

Constructing Series from a dictionary with an Index specified

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['a', 'b', 'c'])
>>> ser
a   1
b   2
c   3
dtype: int64

The keys of the dictionary match with the Index values, hence the Index values have no effect.

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> ser = pd.Series(data=d, index=['x', 'y', 'z'])
>>> ser
x   NaN
y   NaN
z   NaN
dtype: float64

Note that the Index is first built with the keys from the dictionary. After this, the Series is reindexed with the given Index values, hence we get all NaN as a result.

Constructing Series from a list with copy=False.

>>> r = [1, 2]
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
[1, 2]
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a copy of the original data even though copy=False, so the data is unchanged.

Constructing Series from a 1d ndarray with copy=False.

>>> r = np.array([1, 2])
>>> ser = pd.Series(r, copy=False)
>>> ser.iloc[0] = 999
>>> r
array([999,   2])
>>> ser
0    999
1      2
dtype: int64

Due to input data type the Series has a view on the original data, so the data is changed as well.
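
The index alignment described at the top of this docstring can be seen in a small sketch with made-up values: labels present in only one operand yield NaN, and the result index is the sorted union of both indexes.

```python
import pandas as pd

left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20], index=["b", "d"])

# Values are matched by label, not by position; only "b" exists in both.
total = left + right
print(total)
```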

_constructor_sliced_from_mgr(mgr, axes)[source]
_geometry_column_name = None
_get_geometry()[source]
_internal_names: list[str] = ['_mgr', '_cacher', '_item_cache', '_cache', '_is_copy', '_name', '_metadata', '_flags', 'geometry']
_internal_names_set: set[str] = {'_cache', '_cacher', '_flags', '_is_copy', '_item_cache', '_metadata', '_mgr', '_name', 'geometry'}
_metadata: list[str] = ['_geometry_column_name']
_mgr: BlockManager | ArrayManager
_persist_old_default_geometry_colname()[source]

Internal util to temporarily persist the default geometry column name of ‘geometry’ for backwards compatibility.

_set_geometry(col)[source]
property active_geometry_name

Return the name of the active geometry column

Returns a string name if a GeoDataFrame has an active geometry column set. Otherwise returns None. You can also access the active geometry column using the .geometry property. You can set a GeoSeries to be an active geometry using the set_geometry() method.

Returns:

name of an active geometry column or None

Return type:

str

See also

GeoDataFrame.set_geometry

set the active geometry

apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)[source]

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

Parameters:
  • data (ndarray (structured or homogeneous), Iterable, dict, or DataFrame) –

    Dict can contain Series, arrays, constants, dataclass or list-like objects. If data is a dict, column order follows insertion-order. If a dict contains Series which have an index defined, it is aligned by its index. This alignment also occurs if data is a Series or a DataFrame itself. Alignment is done on Series/DataFrame inputs.

    If data is a list of dicts, column order follows insertion-order.

  • index (Index or array-like) – Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided.

  • columns (Index or array-like) – Column labels to use for resulting frame when data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels, will perform column selection instead.

  • dtype (dtype, default None) – Data type to force. Only a single dtype is allowed. If None, infer.

  • copy (bool or None, default None) –

    Copy data from inputs. For dict data, the default of None behaves like copy=True. For DataFrame or 2d ndarray input, the default of None behaves like copy=False. If data is a dict containing one or more Series (possibly of different dtypes), copy=False will ensure that these inputs are not copied.

    Changed in version 1.3.0.

See also

DataFrame.from_records

Constructor from tuples, also record arrays.

DataFrame.from_dict

From dicts of Series, arrays, or dicts.

read_csv

Read a comma-separated values (csv) file into DataFrame.

read_table

Read general delimited file into DataFrame.

read_clipboard

Read text from clipboard into DataFrame.

Notes

Please reference the User Guide for more information.

Examples

Constructing DataFrame from a dictionary.

>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> df
   col1  col2
0     1     3
1     2     4

Notice that the inferred dtype is int64.

>>> df.dtypes
col1    int64
col2    int64
dtype: object

To enforce a single dtype:

>>> df = pd.DataFrame(data=d, dtype=np.int8)
>>> df.dtypes
col1    int8
col2    int8
dtype: object

Constructing DataFrame from a dictionary including Series:

>>> d = {'col1': [0, 1, 2, 3], 'col2': pd.Series([2, 3], index=[2, 3])}
>>> pd.DataFrame(data=d, index=[0, 1, 2, 3])
   col1  col2
0     0   NaN
1     1   NaN
2     2   2.0
3     3   3.0

Constructing DataFrame from numpy ndarray:

>>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
...                    columns=['a', 'b', 'c'])
>>> df2
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

Constructing DataFrame from a numpy ndarray that has labeled columns:

>>> data = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],
...                 dtype=[("a", "i4"), ("b", "i4"), ("c", "i4")])
>>> df3 = pd.DataFrame(data, columns=['c', 'a'])
...
>>> df3
   c  a
0  3  1
1  6  4
2  9  7

Constructing DataFrame from dataclass:

>>> from dataclasses import make_dataclass
>>> Point = make_dataclass("Point", [("x", int), ("y", int)])
>>> pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])
   x  y
0  0  0
1  0  3
2  2  3

Constructing DataFrame from Series/DataFrame:

>>> ser = pd.Series([1, 2, 3], index=["a", "b", "c"])
>>> df = pd.DataFrame(data=ser, index=["a", "c"])
>>> df
   0
a  1
c  3
>>> df1 = pd.DataFrame([1, 2, 3], index=["a", "b", "c"], columns=["x"])
>>> df2 = pd.DataFrame(data=df1, index=["a", "c"])
>>> df2
   x
a  1
c  3
astype(dtype, copy=None, errors='raise', **kwargs)[source]

Cast a pandas object to a specified dtype. Returns a GeoDataFrame when the geometry column is kept as geometries, otherwise returns a pandas DataFrame. See the pandas.DataFrame.astype docstring for more details.

Return type:

GeoDataFrame or DataFrame
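
The two possible return types can be illustrated with a small sketch on made-up data. Casting only an attribute column keeps the geometry intact, while casting the whole frame to str also converts the geometries.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"value": [1, 2]}, geometry=[Point(0, 0), Point(1, 1)], crs="EPSG:4326"
)

# Only "value" is cast, so the geometry column survives as geometries.
as_float = gdf.astype({"value": "float64"})

# Every column, including the geometry, is cast to strings.
as_str = gdf.astype(str)

print(type(as_float).__name__, type(as_str).__name__)
```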

clip(mask, keep_geom_type=False, sort=False)[source]

Clip points, lines, or polygon geometries to the mask extent.

Both layers must be in the same Coordinate Reference System (CRS). The GeoDataFrame will be clipped to the full extent of the mask object.

If there are multiple polygons in mask, data from the GeoDataFrame will be clipped to the total boundary of all polygons in mask.

Parameters:
  • mask (GeoDataFrame, GeoSeries, (Multi)Polygon, list-like) – Polygon vector layer used to clip the GeoDataFrame. The mask’s geometry is dissolved into one geometric feature and intersected with GeoDataFrame. If the mask is list-like with four elements (minx, miny, maxx, maxy), clip will use a faster rectangle clipping (clip_by_rect()), possibly leading to slightly different results.

  • keep_geom_type (boolean, default False) – If True, return only geometries of original type in case of intersection resulting in multiple geometry types or GeometryCollections. If False, return all resulting geometries (potentially mixed types).

  • sort (boolean, default False) – If True, the order of rows in the clipped GeoDataFrame will be preserved at a small performance cost. If False, the order of rows in the clipped GeoDataFrame will be random.

Returns:

Vector data (points, lines, polygons) from the GeoDataFrame clipped to polygon boundary from mask.

Return type:

GeoDataFrame

See also

clip

equivalent top-level function

Examples

Clip points (grocery stores) with polygons (the Near West Side community):

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> near_west_side = chicago[chicago["community"] == "NEAR WEST SIDE"]
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)
>>> groceries.shape
(148, 8)
>>> nws_groceries = groceries.clip(near_west_side)
>>> nws_groceries.shape
(7, 8)
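
The mask parameter also accepts a list-like of four bounds. The sketch below, using made-up points, clips once with a polygon and once with the equivalent (minx, miny, maxx, maxy) tuple.

```python
import geopandas
from shapely.geometry import Point, box

points = geopandas.GeoDataFrame(
    {"name": ["a", "b", "c"]},
    geometry=[Point(0, 0), Point(2, 2), Point(5, 5)],
    crs="EPSG:4326",
)

# Clip with a polygon mask.
clipped_poly = points.clip(box(-1, -1, 3, 3))

# Clip with a four-element list-like (minx, miny, maxx, maxy).
clipped_rect = points.clip((-1, -1, 3, 3))

names_poly = sorted(clipped_poly["name"])
names_rect = sorted(clipped_rect["name"])
```

The names are sorted before comparison because row order is only preserved when sort=True.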
copy(deep=True)[source]

property crs

The Coordinate Reference System (CRS) represented as a pyproj.CRS object.

Returns None if the CRS is not set. When setting, the value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (e.g. “EPSG:4326”) or a WKT string.

Examples

>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame.set_crs

assign CRS

GeoDataFrame.to_crs

re-project to another CRS
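
A short sketch of the typical workflow around this property, using a made-up point: the CRS starts out as None, set_crs() declares it without touching coordinates, and to_crs() re-projects them.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame({"name": ["Berlin"]}, geometry=[Point(13.4, 52.5)])
crs_before = gdf.crs  # no CRS has been assigned yet

# Declare that the coordinates are longitude/latitude in WGS 84.
gdf = gdf.set_crs("EPSG:4326")

# Re-project the coordinates to Web Mercator.
projected = gdf.to_crs("EPSG:3857")
```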

dissolve(by=None, aggfunc='first', as_index=True, level=None, sort=True, observed=False, dropna=True, method='unary', **kwargs)[source]

Dissolve geometries within groupby into a single observation. This is accomplished by applying the union_all method to all geometries within a group.

Observations associated with each groupby group will be aggregated using the aggfunc.

Parameters:
  • by (str or list-like, default None) – Column(s) whose values define the groups to be dissolved. If None, the entire GeoDataFrame is considered as a single group. If a list-like object is provided, the values in the list are treated as categorical labels, and polygons will be combined based on the equality of these categorical labels.

  • aggfunc (function or string, default "first") –

    Aggregation function for manipulation of data associated with each group. Passed to pandas groupby.agg method. Accepted combinations are:

    • function

    • string function name

    • list of functions and/or function names, e.g. [np.sum, ‘mean’]

    • dict of axis labels -> functions, function names or list of such.

  • as_index (boolean, default True) – If true, groupby columns become index of result.

  • level (int or str or sequence of int or sequence of str, default None) – If the axis is a MultiIndex (hierarchical), group by a particular level or levels.

  • sort (bool, default True) – Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. Groupby preserves the order of rows within each group.

  • observed (bool, default False) – This only applies if any of the groupers are Categoricals. If True: only show observed values for categorical groupers. If False: show all values for categorical groupers.

  • dropna (bool, default True) – If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.

  • method (str (default "unary")) –

    The method to use for the union. Options are:

    • "unary": use the unary union algorithm. This option is the most robust but can be slow for large numbers of geometries (default).

    • "coverage": use the coverage union algorithm. This option is optimized for non-overlapping polygons and can be significantly faster than the unary union algorithm. However, it can produce invalid geometries if the polygons overlap.

  • **kwargs

    Keyword arguments to be passed to the pandas DataFrameGroupby.agg method which is used by dissolve. In particular, numeric_only may be supplied, which will be required in pandas 2.0 for certain aggfuncs.

    Added in version 0.13.0.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Point
>>> d = {
...     "col1": ["name1", "name2", "name1"],
...     "geometry": [Point(1, 2), Point(2, 1), Point(0, 1)],
... }
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
2  name1  POINT (0 1)
>>> dissolved = gdf.dissolve('col1')
>>> dissolved
                        geometry
col1
name1  MULTIPOINT ((0 1), (1 2))
name2                POINT (2 1)

See also

GeoDataFrame.explode

explode multi-part geometries into single geometries
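
The dict form of aggfunc is not covered by the example above. A minimal sketch with hypothetical column names (zone, cost, count) shows a different aggregation per column.

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {
        "zone": ["A", "A", "B"],
        "cost": [1.0, 3.0, 5.0],
        "count": [1, 2, 3],
    },
    geometry=[Point(0, 0), Point(1, 1), Point(2, 2)],
    crs="EPSG:4326",
)

# Average the cost and sum the count within each zone.
dissolved = gdf.dissolve(by="zone", aggfunc={"cost": "mean", "count": "sum"})
print(dissolved)
```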

estimate_utm_crs(datum_name='WGS 84')[source]

Returns the estimated UTM CRS based on the bounds of the dataset.

Added in version 0.9.

Parameters:

datum_name (str, optional) – The name of the datum to use in the query. Default is WGS 84.

Return type:

pyproj.CRS

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.estimate_utm_crs()
<Derived Projected CRS: EPSG:32616>
Name: WGS 84 / UTM zone 16N
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: Between 90°W and 84°W, northern hemisphere between equator and 84°N...
- bounds: (-90.0, 0.0, -84.0, 84.0)
Coordinate Operation:
- name: UTM zone 16N
- method: Transverse Mercator
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
explode(column=None, ignore_index=False, index_parts=False, **kwargs)[source]

Explode multi-part geometries into multiple single geometries.

Each row containing a multi-part geometry will be split into multiple rows with single geometries, thereby increasing the vertical size of the GeoDataFrame.

Parameters:
  • column (string, default None) – Column to explode. In the case of a geometry column, multi-part geometries are converted to single-part. If None, the active geometry column is used.

  • ignore_index (bool, default False) – If True, the resulting index will be labelled 0, 1, …, n - 1, ignoring index_parts.

  • index_parts (boolean, default False) – If True, the resulting index will be a multi-index (original index with an additional level indicating the multiple geometries: a new zero-based index for each single part geometry per multi-part geometry).

Returns:

Exploded geodataframe with each single geometry as a separate entry in the geodataframe.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import MultiPoint
>>> d = {
...     "col1": ["name1", "name2"],
...     "geometry": [
...         MultiPoint([(1, 2), (3, 4)]),
...         MultiPoint([(2, 1), (0, 0)]),
...     ],
... }
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1               geometry
0  name1  MULTIPOINT ((1 2), (3 4))
1  name2  MULTIPOINT ((2 1), (0 0))
>>> exploded = gdf.explode(index_parts=True)
>>> exploded
      col1     geometry
0 0  name1  POINT (1 2)
  1  name1  POINT (3 4)
1 0  name2  POINT (2 1)
  1  name2  POINT (0 0)
>>> exploded = gdf.explode(index_parts=False)
>>> exploded
    col1     geometry
0  name1  POINT (1 2)
0  name1  POINT (3 4)
1  name2  POINT (2 1)
1  name2  POINT (0 0)
>>> exploded = gdf.explode(ignore_index=True)
>>> exploded
    col1     geometry
0  name1  POINT (1 2)
1  name1  POINT (3 4)
2  name2  POINT (2 1)
3  name2  POINT (0 0)

See also

GeoDataFrame.dissolve

dissolve geometries into a single observation.

explore(*args, **kwargs)[source]

Interactive map based on GeoPandas and folium/leaflet.js

Generate an interactive leaflet map based on GeoDataFrame

Parameters:
  • column (str, np.array, pd.Series (default None)) – The name of the dataframe column, numpy.array, or pandas.Series to be plotted. If numpy.array or pandas.Series are used then it must have same length as dataframe.

  • cmap (str, matplotlib.Colormap, branca.colormap or function (default None)) –

    The name of a colormap recognized by matplotlib, a list-like of colors, matplotlib.colors.Colormap, a branca.colormap.ColorMap or function that returns a named color or hex based on the column value, e.g.:

    def my_colormap(value):  # scalar value defined in 'column'
        if value > 1:
            return "green"
        return "red"
    

  • color (str, array-like (default None)) – Named color or a list-like of colors (named or hex).

  • m (folium.Map (default None)) – Existing map instance on which to draw the plot.

  • tiles (str, xyzservices.TileProvider (default 'OpenStreetMap Mapnik')) –

    Map tileset to use. Can choose from the list supported by folium, query a xyzservices.TileProvider by a name from xyzservices.providers, pass xyzservices.TileProvider object or pass custom XYZ URL. The current list of built-in providers (when xyzservices is not available):

    ["OpenStreetMap", "CartoDB positron", "CartoDB dark_matter"]

    You can pass a custom tileset to Folium by passing a Leaflet-style URL to the tiles parameter: http://{s}.yourtiles.com/{z}/{x}/{y}.png. Be sure to check their terms and conditions and to provide attribution with the attr keyword.

  • attr (str (default None)) – Map tile attribution; only required if passing custom tile URL.

  • tooltip (bool, str, int, list (default True)) – Display GeoDataFrame attributes when hovering over the object. True includes all columns. False removes tooltip. Pass string or list of strings to specify a column(s). Integer specifies first n columns to be included. Defaults to True.

  • popup (bool, str, int, list (default False)) – Input GeoDataFrame attributes for object displayed when clicking. True includes all columns. False removes popup. Pass string or list of strings to specify a column(s). Integer specifies first n columns to be included. Defaults to False.

  • highlight (bool (default True)) – Enable highlight functionality when hovering over a geometry.

  • categorical (bool (default False)) – If False, cmap will reflect numerical values of the column being plotted. For non-numerical columns, this will be set to True.

  • legend (bool (default True)) – Plot a legend in choropleth plots. Ignored if no column is given.

  • scheme (str (default None)) – Name of a choropleth classification scheme (requires mapclassify >= 2.4.0). A mapclassify.classify() will be used under the hood. Supported are all schemes provided by mapclassify (e.g. 'BoxPlot', 'EqualInterval', 'FisherJenks', 'FisherJenksSampled', 'HeadTailBreaks', 'JenksCaspall', 'JenksCaspallForced', 'JenksCaspallSampled', 'MaxP', 'MaximumBreaks', 'NaturalBreaks', 'Quantiles', 'Percentiles', 'StdMean', 'UserDefined'). Arguments can be passed in classification_kwds.

  • k (int (default 5)) – Number of classes

  • vmin (None or float (default None)) – Minimum value of cmap. If None, the minimum data value in the column to be plotted is used.

  • vmax (None or float (default None)) – Maximum value of cmap. If None, the maximum data value in the column to be plotted is used.

  • width (pixel int or percentage string (default: '100%')) – Width of the folium Map. If the argument m is given explicitly, width is ignored.

  • height (pixel int or percentage string (default: '100%')) – Height of the folium Map. If the argument m is given explicitly, height is ignored.

  • categories (list-like) – Ordered list-like object of categories to be used for categorical plot.

  • classification_kwds (dict (default None)) – Keyword arguments to pass to mapclassify

  • control_scale (bool, (default True)) – Whether to add a control scale on the map.

  • marker_type (str, folium.Circle, folium.CircleMarker, folium.Marker (default None)) – Allowed string options are (‘marker’, ‘circle’, ‘circle_marker’). Defaults to folium.CircleMarker.

  • marker_kwds (dict (default {})) –

    Additional keywords to be passed to the selected marker_type, e.g.:

    • radius (float, default 2 for circle_marker and 50 for circle) – Radius of the circle, in meters (for circle) or pixels (for circle_marker).

    • fill (bool, default True) – Whether to fill the circle or circle_marker with color.

    • icon (folium.map.Icon) – The folium.map.Icon object to use to render the marker.

    • draggable (bool, default False) – Set to True to be able to drag the marker around the map.

  • style_kwds (dict (default {})) –

    Additional style to be passed to folium style_function:

    • stroke (bool, default True) – Whether to draw stroke along the path. Set it to False to disable borders on polygons or circles.

    • color (str) – Stroke color.

    • weight (int) – Stroke width in pixels.

    • opacity (float, default 1.0) – Stroke opacity.

    • fill (boolean, default True) – Whether to fill the path with color. Set it to False to disable filling on polygons or circles.

    • fillColor (str) – Fill color. Defaults to the value of the color option.

    • fillOpacity (float, default 0.5) – Fill opacity.

    • style_function (callable) – Function mapping a GeoJson Feature to a style dict.

    • Style properties folium.vector_layers.path_options()

    • GeoJson features GeoDataFrame.__geo_interface__

    e.g.:

    lambda x: {"color":"red" if x["properties"]["gdp_md_est"]<10**6
                                 else "blue"}
    

    Plus all supported by folium.vector_layers.path_options(). See the documentation of folium.features.GeoJson for details.

  • highlight_kwds (dict (default {})) – Style to be passed to folium highlight_function. Uses the same keywords as style_kwds. When empty, defaults to {"fillOpacity": 0.75}.

  • tooltip_kwds (dict (default {})) – Additional keywords to be passed to folium.features.GeoJsonTooltip, e.g. aliases, labels, or sticky.

  • popup_kwds (dict (default {})) – Additional keywords to be passed to folium.features.GeoJsonPopup, e.g. aliases or labels.

  • legend_kwds (dict (default {})) –

    Additional keywords to be passed to the legend.

    Currently supported customisation:

    • caption (string) – Custom caption of the legend. Defaults to the column name.

    Additional accepted keywords when scheme is specified:

    • colorbar (bool, default True) – An option to control the style of the legend. If True, continuous colorbar will be used. If False, categorical legend will be used for bins.

    • scale (bool, default True) – Scale bins along the colorbar axis according to the bin edges (True) or use the equal length for each bin (False).

    • fmt (string, default "{:.2f}") – A formatting specification for the bin edges of the classes in the legend. For example, to have no decimals: {"fmt": "{:.0f}"}. Applies if colorbar=False.

    • labels (list-like) – A list of legend labels to override the auto-generated labels. Needs to have the same number of elements as the number of classes (k). Applies if colorbar=False.

    • interval (boolean, default False) – An option to control brackets from mapclassify legend. If True, open/closed interval brackets are shown in the legend. Applies if colorbar=False.

    • max_labels (int, default 10) – Maximum number of colorbar tick labels (requires branca>=0.5.0).

  • map_kwds (dict (default {})) – Additional keywords to be passed to folium Map, e.g. dragging, or scrollWheelZoom.

  • **kwargs (dict) – Additional options to be passed on to the folium object.

Returns:

m – folium Map instance

Return type:

folium.folium.Map

Examples

>>> import geodatasets
>>> df = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... )
>>> df.head(2)
   ComAreaID  ...                                           geometry
0         35  ...  POLYGON ((-87.60914 41.84469, -87.60915 41.844...
1         36  ...  POLYGON ((-87.59215 41.81693, -87.59231 41.816...

[2 rows x 87 columns]

>>> df.explore("Pop2012", cmap="Blues")
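
The callables accepted by cmap and by the style_function entry of style_kwds are plain Python functions, so they can be tried out without folium. The sketch below uses a hand-written feature dict shaped like the entries of GeoDataFrame.__geo_interface__. The property name gdp_md_est follows the lambda shown in the parameter description, and the "weight" value is an arbitrary choice.

```python
def my_colormap(value):
    """Return a named color for a scalar taken from the plotted column."""
    return "green" if value > 1 else "red"


def style_function(feature):
    """Map a GeoJSON-like feature dict to a style dict."""
    gdp = feature["properties"]["gdp_md_est"]
    return {"color": "red" if gdp < 10**6 else "blue", "weight": 2}


# Hand-written stand-in for one GeoJSON feature of a GeoDataFrame.
feature = {
    "type": "Feature",
    "properties": {"gdp_md_est": 250000},
    "geometry": {"type": "Point", "coordinates": (0.0, 0.0)},
}
style = style_function(feature)

# The legend "fmt" option is an ordinary format string applied to bin edges.
edge_label = "{:.0f}".format(12.7)
```

Such functions would then be passed as cmap=my_colormap or style_kwds={"style_function": style_function} when calling explore().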
classmethod from_arrow(table, geometry=None)[source]

Construct a GeoDataFrame from an Arrow table object based on GeoArrow extension types.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function accepts any tabular Arrow object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_array__ or __arrow_c_stream__ method).

Added in version 1.0.

Parameters:
  • table (pyarrow.Table or Arrow-compatible table) – Any tabular object implementing the Arrow PyCapsule Protocol (i.e. has an __arrow_c_array__ or __arrow_c_stream__ method). This table should have at least one column with a geoarrow geometry type.

  • geometry (str, default None) – The name of the geometry column to set as the active geometry column. If None, the first geometry column found will be used.

Return type:

GeoDataFrame

classmethod from_dict(data, geometry=None, crs=None, **kwargs)[source]

Construct a GeoDataFrame from a dict of array-likes or dicts. Overrides the DataFrame.from_dict method to add the geometry and crs arguments.

Parameters:
  • data (dict) – Of the form {field : array-like} or {field : dict}.

  • geometry (str or array (optional)) – If str, column to use as geometry. If array, will be set as ‘geometry’ column on GeoDataFrame.

  • crs (str or dict (optional)) – Coordinate reference system to set on the resulting frame.

  • kwargs (key-word arguments) – These arguments are passed to DataFrame.from_dict

Return type:

GeoDataFrame
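
A minimal usage sketch, with made-up column names and values, for the {field : array-like} form:

```python
import geopandas
from shapely.geometry import Point

data = {
    "col1": ["name1", "name2"],
    "geometry": [Point(1, 2), Point(2, 1)],
}

# "geometry" names the column that becomes the active geometry.
gdf = geopandas.GeoDataFrame.from_dict(data, geometry="geometry", crs="EPSG:4326")
print(gdf)
```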

classmethod from_features(features, crs=None, columns=None)[source]

Alternate constructor to create GeoDataFrame from an iterable of features or a feature collection.

Parameters:
  • features

    • Iterable of features, where each element must be a feature dictionary or implement the __geo_interface__.

    • Feature collection, where the ‘features’ key contains an iterable of features.

    • Object holding a feature collection that implements the __geo_interface__.

  • crs (str or dict (optional)) – Coordinate reference system to set on the resulting frame.

  • columns (list of column names, optional) – Optionally specify the column names to include in the output frame. This does not overwrite the property names of the input, but can ensure a consistent output format.

Return type:

GeoDataFrame

Notes

For more information about the __geo_interface__, see https://gist.github.com/sgillies/2217756

Examples

>>> feature_coll = {
...     "type": "FeatureCollection",
...     "features": [
...         {
...             "id": "0",
...             "type": "Feature",
...             "properties": {"col1": "name1"},
...             "geometry": {"type": "Point", "coordinates": (1.0, 2.0)},
...             "bbox": (1.0, 2.0, 1.0, 2.0),
...         },
...         {
...             "id": "1",
...             "type": "Feature",
...             "properties": {"col1": "name2"},
...             "geometry": {"type": "Point", "coordinates": (2.0, 1.0)},
...             "bbox": (2.0, 1.0, 2.0, 1.0),
...         },
...     ],
...     "bbox": (1.0, 1.0, 2.0, 2.0),
... }
>>> df = geopandas.GeoDataFrame.from_features(feature_coll)
>>> df
      geometry   col1
0  POINT (1 2)  name1
1  POINT (2 1)  name2
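
The columns parameter can be sketched as follows. The second property name, col2, is hypothetical and does not occur in any input feature, so it should appear in the output as a column of missing values:

```python
import geopandas

# Two GeoJSON-like feature dictionaries with a single property each.
features = [
    {
        "type": "Feature",
        "properties": {"col1": "name1"},
        "geometry": {"type": "Point", "coordinates": (1.0, 2.0)},
    },
    {
        "type": "Feature",
        "properties": {"col1": "name2"},
        "geometry": {"type": "Point", "coordinates": (2.0, 1.0)},
    },
]

# Request a fixed output layout; "col2" is absent from the input properties.
gdf = geopandas.GeoDataFrame.from_features(
    features, columns=["col1", "col2", "geometry"]
)

print(list(gdf.columns))
print(gdf["col2"].isna().all())
```

This is useful when several feature sources must yield frames with the same column layout.
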
classmethod from_file(filename, **kwargs)[source]

Alternate constructor to create a GeoDataFrame from a file.

It is recommended to use geopandas.read_file() instead.

Can load a GeoDataFrame from a file in any format recognized by pyogrio. See http://pyogrio.readthedocs.io/ for details.

Parameters:
  • filename (str) – File path or file handle to read from. Depending on which kwargs are included, the content of filename may vary. See pyogrio.read_dataframe() for usage details.

  • kwargs (key-word arguments) – These arguments are passed to pyogrio.read_dataframe(), and can be used to access multi-layer data, data stored within archives (zip files), etc.

Examples

>>> import geodatasets
>>> path = geodatasets.get_path('nybb')
>>> gdf = geopandas.GeoDataFrame.from_file(path)
>>> gdf
   BoroCode       BoroName     Shape_Leng    Shape_Area                                           geometry
0         5  Staten Island  330470.010332  1.623820e+09  MULTIPOLYGON (((970217.022 145643.332, 970227....
1         4         Queens  896344.047763  3.045213e+09  MULTIPOLYGON (((1029606.077 156073.814, 102957...
2         3       Brooklyn  741080.523166  1.937479e+09  MULTIPOLYGON (((1021176.479 151374.797, 102100...
3         1      Manhattan  359299.096471  6.364715e+08  MULTIPOLYGON (((981219.056 188655.316, 980940....
4         2          Bronx  464392.991824  1.186925e+09  MULTIPOLYGON (((1012821.806 229228.265, 101278...

The recommended method of reading files is geopandas.read_file():

>>> gdf = geopandas.read_file(path)

See also

read_file

read file to GeoDataFrame

GeoDataFrame.to_file

write GeoDataFrame to file

classmethod from_postgis(sql, con, geom_col='geom', crs=None, index_col=None, coerce_float=True, parse_dates=None, params=None, chunksize=None)[source]

Alternate constructor to create a GeoDataFrame from a sql query containing a geometry column in WKB representation.

Parameters:
  • sql (string)

  • con (sqlalchemy.engine.Connection or sqlalchemy.engine.Engine)

  • geom_col (string, default 'geom') – column name to convert to shapely geometries

  • crs (optional) – Coordinate reference system to use for the returned GeoDataFrame

  • index_col (string or list of strings, optional, default: None) – Column(s) to set as index (MultiIndex)

  • coerce_float (boolean, default True) – Attempt to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets

  • parse_dates (list or dict, default None) –

    • List of column names to parse as dates.

    • Dict of {column_name: format string} where format string is strftime compatible in case of parsing string times, or is one of (D, s, ns, ms, us) in case of parsing integer timestamps.

    • Dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). Especially useful with databases without native Datetime support, such as SQLite.

  • params (list, tuple or dict, optional, default None) – List of parameters to pass to execute method.

  • chunksize (int, default None) – If specified, return an iterator where chunksize is the number of rows to include in each chunk.

Examples

PostGIS

>>> from sqlalchemy import create_engine
>>> db_connection_url = "postgresql://myusername:mypassword@myhost:5432/mydb"
>>> con = create_engine(db_connection_url)
>>> sql = "SELECT geom, highway FROM roads"
>>> df = geopandas.GeoDataFrame.from_postgis(sql, con)

SpatiaLite

>>> sql = "SELECT ST_Binary(geom) AS geom, highway FROM roads"
>>> df = geopandas.GeoDataFrame.from_postgis(sql, con)

The recommended method of reading from PostGIS is geopandas.read_postgis():

>>> df = geopandas.read_postgis(sql, con)

See also

geopandas.read_postgis

read PostGIS database to GeoDataFrame

property geometry

Geometry data for GeoDataFrame

iterfeatures(na='null', show_bbox=False, drop_id=False)[source]

Returns an iterator that yields feature dictionaries that comply with __geo_interface__

Parameters:
  • na (str, optional) –

    Options are {‘null’, ‘drop’, ‘keep’}, default ‘null’. Indicates how to output missing (NaN) values in the GeoDataFrame

    • null: output the missing entries as JSON null

    • drop: remove the property from the feature. This applies to each feature individually so that features may have different properties

    • keep: output the missing entries as NaN

  • show_bbox (bool, optional) – Include bbox (bounds) in the geojson. Default False.

  • drop_id (bool, default: False) – Whether to drop the id property, which is otherwise taken from the index of the GeoDataFrame, from the generated GeoJSON. Default is False (the index is retained as id); True may be preferable if the index is just arbitrary row numbers.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> feature = next(gdf.iterfeatures())
>>> feature
{'id': '0', 'type': 'Feature', 'properties': {'col1': 'name1'}, 'geometry': {'type': 'Point', 'coordinates': (1.0, 2.0)}}
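
The three na options, together with show_bbox and drop_id, can be compared on a small frame. This is an illustrative sketch; the column name value and the data are assumptions:

```python
import geopandas
from shapely.geometry import Point

# The second row has a missing (NaN) attribute value.
gdf = geopandas.GeoDataFrame(
    {"value": [1.5, float("nan")], "geometry": [Point(1, 2), Point(2, 1)]}
)

# Properties of the second feature under each na option.
null_props = list(gdf.iterfeatures(na="null"))[1]["properties"]
drop_props = list(gdf.iterfeatures(na="drop"))[1]["properties"]
keep_props = list(gdf.iterfeatures(na="keep"))[1]["properties"]

print(null_props)  # missing entry becomes None (JSON null)
print(drop_props)  # missing entry is removed from the properties
print(keep_props)  # missing entry stays NaN

# Include the bounding box and omit the id taken from the index.
first = next(gdf.iterfeatures(show_bbox=True, drop_id=True))
print(sorted(first.keys()))
```
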
overlay(right, how='intersection', keep_geom_type=None, make_valid=True)[source]

Perform spatial overlay between GeoDataFrames.

Currently supports only GeoDataFrames with uniform geometry types, i.e. containing only (Multi)Polygons, or only (Multi)Points, or a combination of (Multi)LineString and LinearRing shapes. Implements several methods that are all effectively subsets of the union.

See the User Guide page ../../user_guide/set_operations for details.

Parameters:
  • right (GeoDataFrame)

  • how (string) – Method of spatial overlay: ‘intersection’, ‘union’, ‘identity’, ‘symmetric_difference’ or ‘difference’.

  • keep_geom_type (bool) – If True, return only geometries of the same geometry type the GeoDataFrame has; if False, return all resulting geometries. Default is None, which will set keep_geom_type to True but warn upon dropping geometries.

  • make_valid (bool, default True) – If True, any invalid input geometries are corrected with a call to make_valid(); if False, a ValueError is raised if any input geometries are invalid.

Returns:

df – GeoDataFrame with new set of polygons and attributes resulting from the overlay

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Polygon
>>> polys1 = geopandas.GeoSeries([Polygon([(0,0), (2,0), (2,2), (0,2)]),
...                               Polygon([(2,2), (4,2), (4,4), (2,4)])])
>>> polys2 = geopandas.GeoSeries([Polygon([(1,1), (3,1), (3,3), (1,3)]),
...                               Polygon([(3,3), (5,3), (5,5), (3,5)])])
>>> df1 = geopandas.GeoDataFrame({'geometry': polys1, 'df1_data':[1,2]})
>>> df2 = geopandas.GeoDataFrame({'geometry': polys2, 'df2_data':[1,2]})
>>> df1.overlay(df2, how='union')
   df1_data  df2_data                                           geometry
0       1.0       1.0                POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1       2.0       1.0                POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2       2.0       2.0                POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
3       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
4       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...
5       NaN       1.0  MULTIPOLYGON (((2 3, 2 2, 1 2, 1 3, 2 3)), ((3...
6       NaN       2.0      POLYGON ((3 5, 5 5, 5 3, 4 3, 4 4, 3 4, 3 5))
>>> df1.overlay(df2, how='intersection')
   df1_data  df2_data                             geometry
0         1         1  POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1         2         1  POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2         2         2  POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
>>> df1.overlay(df2, how='symmetric_difference')
   df1_data  df2_data                                           geometry
0       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
1       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...
2       NaN       1.0  MULTIPOLYGON (((2 3, 2 2, 1 2, 1 3, 2 3)), ((3...
3       NaN       2.0      POLYGON ((3 5, 5 5, 5 3, 4 3, 4 4, 3 4, 3 5))
>>> df1.overlay(df2, how='difference')
                                            geometry  df1_data
0      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))         1
1  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...         2
>>> df1.overlay(df2, how='identity')
   df1_data  df2_data                                           geometry
0       1.0       1.0                POLYGON ((2 2, 2 1, 1 1, 1 2, 2 2))
1       2.0       1.0                POLYGON ((2 2, 2 3, 3 3, 3 2, 2 2))
2       2.0       2.0                POLYGON ((4 4, 4 3, 3 3, 3 4, 4 4))
3       1.0       NaN      POLYGON ((2 0, 0 0, 0 2, 1 2, 1 1, 2 1, 2 0))
4       2.0       NaN  MULTIPOLYGON (((3 4, 3 3, 2 3, 2 4, 3 4)), ((4...

See also

GeoDataFrame.sjoin

spatial join

overlay

equivalent top-level function

Notes

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.
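
One way to see that the overlay methods are subsets of the union is to compare total areas. The sketch below reuses the polygons from the examples above; the areas dictionary is only an illustrative helper:

```python
import geopandas
from shapely.geometry import Polygon

polys1 = geopandas.GeoSeries([Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
                              Polygon([(2, 2), (4, 2), (4, 4), (2, 4)])])
polys2 = geopandas.GeoSeries([Polygon([(1, 1), (3, 1), (3, 3), (1, 3)]),
                              Polygon([(3, 3), (5, 3), (5, 5), (3, 5)])])
df1 = geopandas.GeoDataFrame({"geometry": polys1, "df1_data": [1, 2]})
df2 = geopandas.GeoDataFrame({"geometry": polys2, "df2_data": [1, 2]})

# Total area of the result for each overlay method.
methods = ("intersection", "union", "identity",
           "symmetric_difference", "difference")
areas = {how: df1.overlay(df2, how=how).area.sum() for how in methods}

for how, area in areas.items():
    print(how, area)
```

With these inputs, the union area equals the intersection area plus the symmetric_difference area, and the identity area equals the intersection area plus the difference area, which is also the total area of df1.
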

plot

alias of GeoplotAccessor

rename_geometry(col, inplace=False)[source]

Renames the GeoDataFrame geometry column to the specified name. By default yields a new object.

The original geometry column is replaced with the input.

Parameters:
  • col (new geometry column label)

  • inplace (boolean, default False) – Modify the GeoDataFrame in place (do not create a new object)

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> df = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> df1 = df.rename_geometry('geom1')
>>> df1.geometry.name
'geom1'
>>> df.rename_geometry('geom1', inplace=True)
>>> df.geometry.name
'geom1'
Returns:

geodataframe

Return type:

GeoDataFrame

See also

GeoDataFrame.set_geometry

set the active geometry

set_crs(crs=None, epsg=None, inplace=False, allow_override=False)[source]

Set the Coordinate Reference System (CRS) of the GeoDataFrame.

If there are multiple geometry columns within the GeoDataFrame, only the CRS of the active geometry column is set.

Pass None to remove CRS from the active geometry column.

Notes

The underlying geometries are not transformed to this CRS. To transform the geometries to a new CRS, use the to_crs method.

Parameters:
  • crs (pyproj.CRS | None, optional) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.

  • epsg (int, optional) – EPSG code specifying the projection.

  • inplace (bool, default False) – If True, the CRS of the GeoDataFrame will be changed in place (while still returning the result) instead of making a copy of the GeoDataFrame.

  • allow_override (bool, default False) – If the GeoDataFrame already has a CRS, allow replacing the existing CRS, even when both are not equal.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Setting CRS to a GeoDataFrame without one:

>>> gdf.crs is None
True
>>> gdf = gdf.set_crs('epsg:3857')
>>> gdf.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

Overriding existing CRS:

>>> gdf = gdf.set_crs(4326, allow_override=True)

Without allow_override=True, set_crs raises an error if you try to override an existing CRS.

See also

GeoDataFrame.to_crs

re-project to another CRS
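
The override behavior, and the fact that set_crs only relabels the CRS without touching coordinates, can be sketched like this. The sample point and the variable names are illustrative:

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1"]}, geometry=[Point(1, 2)], crs="EPSG:4326"
)

# Assigning a different CRS without allow_override is rejected.
try:
    gdf.set_crs("EPSG:3857")
    raised = False
except ValueError:
    raised = True

# With allow_override=True the CRS label changes, the coordinates do not.
relabeled = gdf.set_crs("EPSG:3857", allow_override=True)
point = relabeled.geometry.iloc[0]

print(raised)
print(relabeled.crs.to_epsg())
print(point.x, point.y)
```

Because inplace defaults to False, gdf itself keeps its original CRS here.
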

set_geometry(col, drop=None, inplace=False, crs=None)[source]

Set the GeoDataFrame geometry using either an existing column or the specified input. By default yields a new object.

The original geometry column is replaced with the input.

Parameters:
  • col (column label or array-like) – An existing column name or values to set as the new geometry column. If values (array-like, (Geo)Series) are passed, then if they are named (Series) the new geometry column will have the corresponding name, otherwise the existing geometry column will be replaced. If there is no existing geometry column, the new geometry column will use the default name “geometry”.

  • drop (boolean, default False) –

    When specifying a named Series or an existing column name for col, controls if the previous geometry column should be dropped from the result. The default of False keeps both the old and new geometry column.

    Deprecated since version 1.0.0.

  • inplace (boolean, default False) – Modify the GeoDataFrame in place (do not create a new object)

  • crs (pyproj.CRS, optional) – Coordinate system to use. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. If passed, overrides both DataFrame and col’s crs. Otherwise, tries to get crs from passed col values or DataFrame.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:4326")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)

Passing an array:

>>> df1 = gdf.set_geometry([Point(0,0), Point(1,1)])
>>> df1
    col1     geometry
0  name1  POINT (0 0)
1  name2  POINT (1 1)

Using existing column:

>>> gdf["buffered"] = gdf.buffer(2)
>>> df2 = gdf.set_geometry("buffered")
>>> df2.geometry
0    POLYGON ((3 2, 2.99037 1.80397, 2.96157 1.6098...
1    POLYGON ((4 1, 3.99037 0.80397, 3.96157 0.6098...
Name: buffered, dtype: geometry
Return type:

GeoDataFrame

See also

GeoDataFrame.rename_geometry

rename an active geometry column

sjoin(df, *args, **kwargs)[source]

Spatial join of two GeoDataFrames.

See the User Guide page ../../user_guide/mergingdata for details.

Parameters:
  • df (GeoDataFrame)

  • how (string, default 'inner') –

    The type of join:

    • ’left’: use keys from left_df; retain only left_df geometry column

    • ’right’: use keys from right_df; retain only right_df geometry column

    • ’inner’: use intersection of keys from both dfs; retain only left_df geometry column

  • predicate (string, default 'intersects') – Binary predicate. Valid values are determined by the spatial index used. You can check the valid values in left_df or right_df as left_df.sindex.valid_query_predicates or right_df.sindex.valid_query_predicates

  • lsuffix (string, default 'left') – Suffix to apply to overlapping column names (left GeoDataFrame).

  • rsuffix (string, default 'right') – Suffix to apply to overlapping column names (right GeoDataFrame).

  • distance (number or array_like, optional) – Distance(s) around each input geometry within which to query the tree for the ‘dwithin’ predicate. If array_like, must be one-dimensional with length equal to length of left GeoDataFrame. Required if predicate='dwithin'.

  • on_attribute (string, list or tuple) – Column name(s) to join on as an additional join restriction on top of the spatial predicate. These must be found in both DataFrames. If set, observations are joined only if the predicate applies and values in specified columns match.

Examples

>>> import geodatasets
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_commpop")
... )
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... ).to_crs(chicago.crs)
>>> chicago.head()
         community  ...                                           geometry
0          DOUGLAS  ...  MULTIPOLYGON (((-87.60914 41.84469, -87.60915 ...
1          OAKLAND  ...  MULTIPOLYGON (((-87.59215 41.81693, -87.59231 ...
2      FULLER PARK  ...  MULTIPOLYGON (((-87.62880 41.80189, -87.62879 ...
3  GRAND BOULEVARD  ...  MULTIPOLYGON (((-87.60671 41.81681, -87.60670 ...
4          KENWOOD  ...  MULTIPOLYGON (((-87.59215 41.81693, -87.59215 ...

[5 rows x 9 columns]

>>> groceries.head()
   OBJECTID     Ycoord  ...  Category                           geometry
0        16  41.973266  ...       NaN  MULTIPOINT ((-87.65661 41.97321))
1        18  41.696367  ...       NaN  MULTIPOINT ((-87.68136 41.69713))
2        22  41.868634  ...       NaN  MULTIPOINT ((-87.63918 41.86847))
3        23  41.877590  ...       new  MULTIPOINT ((-87.65495 41.87783))
4        27  41.737696  ...       NaN  MULTIPOINT ((-87.62715 41.73623))
[5 rows x 8 columns]
>>> groceries_w_communities = groceries.sjoin(chicago)
>>> groceries_w_communities[["OBJECTID", "community", "geometry"]].head()
   OBJECTID       community                           geometry
0        16          UPTOWN  MULTIPOINT ((-87.65661 41.97321))
1        18     MORGAN PARK  MULTIPOINT ((-87.68136 41.69713))
2        22  NEAR WEST SIDE  MULTIPOINT ((-87.63918 41.86847))
3        23  NEAR WEST SIDE  MULTIPOINT ((-87.65495 41.87783))
4        27         CHATHAM  MULTIPOINT ((-87.62715 41.73623))

Notes

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.

See also

GeoDataFrame.sjoin_nearest

nearest neighbor join

sjoin

equivalent top-level function
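
A small self-contained sketch of how and predicate, using made-up zones and points rather than a downloaded dataset. All names and coordinates are assumptions for illustration:

```python
import geopandas
from shapely.geometry import Point, box

# Two non-touching rectangular zones.
zones = geopandas.GeoDataFrame(
    {"zone": ["A", "B"]},
    geometry=[box(0, 0, 2, 2), box(3, 0, 5, 2)],
)

# p1 lies in zone A, p2 in zone B, p3 outside both.
points = geopandas.GeoDataFrame(
    {"name": ["p1", "p2", "p3"]},
    geometry=[Point(1, 1), Point(4, 1), Point(9, 9)],
)

# 'inner' keeps only points that match a zone.
inner = points.sjoin(zones, how="inner", predicate="within")

# 'left' keeps every point; unmatched rows get missing zone values.
left = points.sjoin(zones, how="left", predicate="within")

print(inner[["name", "zone"]])
print(left[["name", "zone"]])
```
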

sjoin_nearest(right, how='inner', max_distance=None, lsuffix='left', rsuffix='right', distance_col=None, exclusive=False)[source]

Spatial join of two GeoDataFrames based on the distance between their geometries.

Results will include multiple output records for a single input record where there are multiple equidistant nearest or intersected neighbors.

See the User Guide page https://geopandas.readthedocs.io/en/latest/docs/user_guide/mergingdata.html for more details.

Parameters:
  • right (GeoDataFrame)

  • how (string, default 'inner') –

    The type of join:

    • ’left’: use keys from left_df; retain only left_df geometry column

    • ’right’: use keys from right_df; retain only right_df geometry column

    • ’inner’: use intersection of keys from both dfs; retain only left_df geometry column

  • max_distance (float, default None) – Maximum distance within which to query for nearest geometry. Must be greater than 0. The max_distance used to search for nearest items in the tree may have a significant impact on performance by reducing the number of input geometries that are evaluated for nearest items in the tree.

  • lsuffix (string, default 'left') – Suffix to apply to overlapping column names (left GeoDataFrame).

  • rsuffix (string, default 'right') – Suffix to apply to overlapping column names (right GeoDataFrame).

  • distance_col (string, default None) – If set, save the distances computed between matching geometries under a column of this name in the joined GeoDataFrame.

  • exclusive (bool, optional, default False) – If True, the nearest geometries that are equal to the input geometry will not be returned, default False. Requires Shapely >= 2.0

Examples

>>> import geodatasets
>>> groceries = geopandas.read_file(
...     geodatasets.get_path("geoda.groceries")
... )
>>> chicago = geopandas.read_file(
...     geodatasets.get_path("geoda.chicago_health")
... ).to_crs(groceries.crs)
>>> chicago.head()
   ComAreaID  ...                                           geometry
0         35  ...  POLYGON ((-87.60914 41.84469, -87.60915 41.844...
1         36  ...  POLYGON ((-87.59215 41.81693, -87.59231 41.816...
2         37  ...  POLYGON ((-87.62880 41.80189, -87.62879 41.801...
3         38  ...  POLYGON ((-87.60671 41.81681, -87.60670 41.816...
4         39  ...  POLYGON ((-87.59215 41.81693, -87.59215 41.816...
[5 rows x 87 columns]
>>> groceries.head()
   OBJECTID     Ycoord  ...  Category                           geometry
0        16  41.973266  ...       NaN  MULTIPOINT ((-87.65661 41.97321))
1        18  41.696367  ...       NaN  MULTIPOINT ((-87.68136 41.69713))
2        22  41.868634  ...       NaN  MULTIPOINT ((-87.63918 41.86847))
3        23  41.877590  ...       new  MULTIPOINT ((-87.65495 41.87783))
4        27  41.737696  ...       NaN  MULTIPOINT ((-87.62715 41.73623))
[5 rows x 8 columns]
>>> groceries_w_communities = groceries.sjoin_nearest(chicago)
>>> groceries_w_communities[["Chain", "community", "geometry"]].head(2)
               Chain    community                                geometry
0     VIET HOA PLAZA       UPTOWN   MULTIPOINT ((1168268.672 1933554.35))
1  COUNTY FAIR FOODS  MORGAN PARK  MULTIPOINT ((1162302.618 1832900.224))

To include the distances:

>>> groceries_w_communities = groceries.sjoin_nearest(chicago, distance_col="distances")
>>> groceries_w_communities[["Chain", "community", "distances"]].head(2)
               Chain    community  distances
0     VIET HOA PLAZA       UPTOWN        0.0
1  COUNTY FAIR FOODS  MORGAN PARK        0.0

In the following example, we get multiple groceries for Uptown because all results are equidistant (in this case zero because they intersect). In fact, we get 4 results in total:

>>> chicago_w_groceries = groceries.sjoin_nearest(chicago, distance_col="distances", how="right")
>>> uptown_results = chicago_w_groceries[chicago_w_groceries["community"] == "UPTOWN"]
>>> uptown_results[["Chain", "community"]]
            Chain community
30  VIET HOA PLAZA    UPTOWN
30      JEWEL OSCO    UPTOWN
30          TARGET    UPTOWN
30       Mariano's    UPTOWN

See also

GeoDataFrame.sjoin

binary predicate joins

sjoin_nearest

equivalent top-level function

Notes

Since this join relies on distances, results will be inaccurate if your geometries are in a geographic CRS.

Every operation in GeoPandas is planar, i.e. the potential third dimension is not taken into account.
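
The effect of distance_col and max_distance can be sketched with planar toy data (no CRS set, so distances are in plain coordinate units). The shop and station names are hypothetical:

```python
import geopandas
from shapely.geometry import Point

shops = geopandas.GeoDataFrame(
    {"shop": ["s1", "s2"]},
    geometry=[Point(0, 0), Point(10, 0)],
)
stations = geopandas.GeoDataFrame(
    {"station": ["north", "east"]},
    geometry=[Point(0, 3), Point(14, 0)],
)

# Nearest station per shop, with the computed distance stored in "dist".
nearest = shops.sjoin_nearest(stations, distance_col="dist")

# Only consider stations within 3.5 units; with the default inner join,
# shops without a station in range are dropped.
limited = shops.sjoin_nearest(stations, max_distance=3.5, distance_col="dist")

print(nearest[["shop", "station", "dist"]])
print(limited[["shop", "station", "dist"]])
```

Here s1 is 3 units from north and s2 is 4 units from east, so the max_distance=3.5 query keeps only s1.
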

to_arrow(*, index=None, geometry_encoding='WKB', interleaved=True, include_z=None)[source]

Encode a GeoDataFrame to GeoArrow format.

See https://geoarrow.org/ for details on the GeoArrow specification.

This function returns a generic Arrow data object implementing the Arrow PyCapsule Protocol (i.e. having an __arrow_c_stream__ method). This object can then be consumed by your Arrow implementation of choice that supports this protocol.

Added in version 1.0.

Parameters:
  • index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.

  • geometry_encoding ({'WKB', 'geoarrow' }, default 'WKB') – The GeoArrow encoding to use for the data conversion.

  • interleaved (bool, default True) – Only relevant for ‘geoarrow’ encoding. If True, the geometries’ coordinates are interleaved in a single fixed size list array. If False, the coordinates are stored as separate arrays in a struct type.

  • include_z (bool, default None) – Only relevant for ‘geoarrow’ encoding (for WKB, the dimensionality of the individual geometries is preserved). If False, return 2D geometries. If True, include the third dimension in the output (if a geometry has no third dimension, the z-coordinates will be NaN). By default, will infer the dimensionality from the input geometries. Note that this inference can be unreliable with empty geometries (for a guaranteed result, it is recommended to specify the keyword).

Returns:

A generic Arrow table object with geometry columns encoded to GeoArrow.

Return type:

ArrowTable

Examples

>>> from shapely.geometry import Point
>>> data = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(data)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> arrow_table = gdf.to_arrow()
>>> arrow_table
<geopandas.io._geoarrow.ArrowTable object at ...>

The returned data object needs to be consumed by a library implementing the Arrow PyCapsule Protocol. For example, wrapping the data as a pyarrow.Table (requires pyarrow >= 14.0):

>>> import pyarrow as pa
>>> table = pa.table(arrow_table)
>>> table
pyarrow.Table
col1: string
geometry: binary
----
col1: [["name1","name2"]]
geometry: [[0101000000000000000000F03F0000000000000040,01010000000000000000000040000000000000F03F]]
to_crs(crs=None, epsg=None, inplace=False)[source]

Transform geometries to a new coordinate reference system.

Transform all geometries in an active geometry column to a different coordinate reference system. The crs attribute on the current GeoSeries must be set. Either crs or epsg may be specified for output.

This method will transform all points in all objects. It has no notion of projecting entire geometries. All segments joining points are assumed to be lines in the current projection, not geodesics. Objects crossing the dateline (or other projection boundary) will have undesirable behavior.

Parameters:
  • crs (pyproj.CRS, optional if epsg is specified) – The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string.

  • epsg (int, optional if crs is specified) – EPSG code specifying output projection.

  • inplace (bool, optional, default: False) – Whether to return a new GeoDataFrame or do the transformation in place.

Return type:

GeoDataFrame

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs=4326)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> gdf.crs
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
>>> gdf = gdf.to_crs(3857)
>>> gdf
    col1                       geometry
0  name1  POINT (111319.491 222684.209)
1  name2  POINT (222638.982 111325.143)
>>> gdf.crs
<Projected CRS: EPSG:3857>
Name: WGS 84 / Pseudo-Mercator
Axis Info [cartesian]:
- X[east]: Easting (metre)
- Y[north]: Northing (metre)
Area of Use:
- name: World - 85°S to 85°N
- bounds: (-180.0, -85.06, 180.0, 85.06)
Coordinate Operation:
- name: Popular Visualisation Pseudo-Mercator
- method: Popular Visualisation Pseudo Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

See also

GeoDataFrame.set_crs

assign CRS without re-projection
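
The difference between the default behavior and inplace=True can be sketched as follows. The single sample point is illustrative, and the printed coordinates should match the first row of the example above:

```python
import geopandas
from shapely.geometry import Point

gdf = geopandas.GeoDataFrame(
    {"col1": ["name1"]}, geometry=[Point(1, 2)], crs=4326
)

# Default: a new, re-projected GeoDataFrame is returned; gdf is unchanged.
projected = gdf.to_crs(epsg=3857)
crs_before = gdf.crs.to_epsg()

# inplace=True: gdf itself is transformed and nothing is returned.
result = gdf.to_crs(3857, inplace=True)
crs_after = gdf.crs.to_epsg()

pt = projected.geometry.iloc[0]
print(round(pt.x, 3), round(pt.y, 3))
print(crs_before, crs_after, result)
```
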

to_feather(path, index=None, compression=None, schema_version=None, **kwargs)[source]

Write a GeoDataFrame to the Feather format.

Any geometry columns present are serialized to WKB format in the file.

Requires ‘pyarrow’ >= 0.17.

Added in version 0.8.

Parameters:
  • path (str, path object)

  • index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output except RangeIndex which is stored as metadata only.

  • compression ({'zstd', 'lz4', 'uncompressed'}, optional) – Name of the compression to use. Use "uncompressed" for no compression. By default uses LZ4 if available, otherwise uncompressed.

  • schema_version ({'0.1.0', '0.4.0', '1.0.0', None}) – GeoParquet specification version; if not provided will default to latest supported version.

  • kwargs – Additional keyword arguments passed to pyarrow.feather.write_feather().

Examples

>>> gdf.to_feather('data.feather')

See also

GeoDataFrame.to_parquet

write GeoDataFrame to parquet

GeoDataFrame.to_file

write GeoDataFrame to file

to_file(filename, driver=None, schema=None, index=None, **kwargs)[source]

Write the GeoDataFrame to a file.

By default, an ESRI shapefile is written, but any OGR data source supported by Pyogrio or Fiona can be written. A dictionary of supported OGR providers is available via:

>>> import pyogrio
>>> pyogrio.list_drivers()
Parameters:
  • filename (string) – File path or file handle to write to. The path may specify a GDAL VSI scheme.

  • driver (string, default None) – The OGR format driver used to write the vector file. If not specified, it attempts to infer it from the file extension. If no extension is specified, it saves ESRI Shapefile to a folder.

  • schema (dict, default None) – If specified, the schema dictionary is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the schema based on each column’s dtype. Not supported for the “pyogrio” engine.

  • index (bool, default None) –

    If True, write index into one or more columns (for MultiIndex). Default None writes the index into one or more columns only if the index is named, is a MultiIndex, or has a non-integer data type. If False, no index is written.

    Added in version 0.7: Previously the index was not written.

  • mode (string, default 'w') – The write mode, ‘w’ to overwrite the existing file and ‘a’ to append. Not all drivers support appending. The drivers that support appending are listed in fiona.supported_drivers or https://github.com/Toblerity/Fiona/blob/master/fiona/drvsupport.py

  • crs (pyproj.CRS, default None) – If specified, the CRS is passed to Fiona to better control how the file is written. If None, GeoPandas will determine the crs based on crs df attribute. The value can be anything accepted by pyproj.CRS.from_user_input(), such as an authority string (eg “EPSG:4326”) or a WKT string. The keyword is not supported for the “pyogrio” engine.

  • engine (str, "pyogrio" or "fiona") – The underlying library that is used to write the file. Currently, the supported options are “pyogrio” and “fiona”. Defaults to “pyogrio” if installed, otherwise tries “fiona”.

  • metadata (dict[str, str], default None) – Optional metadata to be stored in the file. Keys and values must be strings. Supported only for “GPKG” driver.

  • **kwargs – Keyword args to be passed to the engine, and can be used to write to multi-layer data, store data within archives (zip files), etc. In case of the “pyogrio” engine, the keyword arguments are passed to pyogrio.write_dataframe. In case of the “fiona” engine, the keyword arguments are passed to fiona.open. For more information on possible keywords, type: import pyogrio; help(pyogrio.write_dataframe).

Notes

The format drivers will attempt to detect the encoding of your data, but may fail. In this case, the proper encoding can be specified explicitly by using the encoding keyword parameter, e.g. encoding='utf-8'.

See also

GeoSeries.to_file

GeoDataFrame.to_postgis

write GeoDataFrame to PostGIS database

GeoDataFrame.to_parquet

write GeoDataFrame to parquet

GeoDataFrame.to_feather

write GeoDataFrame to feather

Examples

>>> gdf.to_file('dataframe.shp')
>>> gdf.to_file('dataframe.gpkg', driver='GPKG', layer='name')
>>> gdf.to_file('dataframe.geojson', driver='GeoJSON')

With selected drivers you can also append to a file with mode="a":

>>> gdf.to_file('dataframe.shp', mode="a")

Using the engine-specific keyword arguments it is possible to e.g. create a spatialite file with a custom layer name:

>>> gdf.to_file(
...     'dataframe.sqlite', driver='SQLite', spatialite=True, layer='test'
... )
to_geo_dict(na='null', show_bbox=False, drop_id=False)[source]

Returns a Python feature collection representation of the GeoDataFrame as a dictionary with a list of features, based on the __geo_interface__ GeoJSON-like specification.

Parameters:
  • na (str, optional) –

    Options are {‘null’, ‘drop’, ‘keep’}, default ‘null’. Indicates how to output missing (NaN) values in the GeoDataFrame

    • null: output the missing entries as JSON null

    • drop: remove the property from the feature. This applies to each feature individually so that features may have different properties

    • keep: output the missing entries as NaN

  • show_bbox (bool, optional) – Include bbox (bounds) in the geojson. Default False.

  • drop_id (bool, default: False) – Whether to drop the index of the GeoDataFrame instead of writing it as the id property in the generated dictionary. Default is False (the index is retained), but you may want True if the index is just arbitrary row numbers.

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d)
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> gdf.to_geo_dict()
{'type': 'FeatureCollection', 'features': [{'id': '0', 'type': 'Feature', 'properties': {'col1': 'name1'}, 'geometry': {'type': 'Point', 'coordinates': (1.0, 2.0)}}, {'id': '1', 'type': 'Feature', 'properties': {'col1': 'name2'}, 'geometry': {'type': 'Point', 'coordinates': (2.0, 1.0)}}]}
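
The three na modes can be illustrated without geopandas. The following is a minimal sketch that applies them to the properties of a single feature held in a plain dictionary; apply_na is a hypothetical helper written for this illustration and is not part of the geopandas API.

```python
import math


def apply_na(properties, na="null"):
    """Mimic the documented `na` modes on one feature's properties.

    Hypothetical helper for illustration; not a geopandas function.
    """
    if na not in {"null", "drop", "keep"}:
        raise ValueError(f"Unknown na method {na!r}")

    def is_missing(value):
        return isinstance(value, float) and math.isnan(value)

    result = {}
    for key, value in properties.items():
        if is_missing(value):
            if na == "drop":
                continue      # omit the property from this feature only
            if na == "null":
                value = None  # the json module serializes None as null
            # na == "keep": leave the NaN float untouched
        result[key] = value
    return result


props = {"col1": "name1", "height": float("nan")}
as_null = apply_na(props, "null")  # {'col1': 'name1', 'height': None}
dropped = apply_na(props, "drop")  # {'col1': 'name1'}
kept = apply_na(props, "keep")     # 'height' is still NaN
```

Because drop is applied per feature, two features of the same collection can end up with different sets of properties.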

See also

GeoDataFrame.to_json

return a GeoDataFrame as a GeoJSON string

to_json(na='null', show_bbox=False, drop_id=False, to_wgs84=False, **kwargs)[source]

Returns a GeoJSON representation of the GeoDataFrame as a string.

Parameters:
  • na ({'null', 'drop', 'keep'}, default 'null') – Indicates how to output missing (NaN) values in the GeoDataFrame. See below.

  • show_bbox (bool, optional, default: False) – Include bbox (bounds) in the geojson

  • drop_id (bool, default: False) – Whether to drop the index of the GeoDataFrame instead of writing it as the id property in the generated GeoJSON. Default is False (the index is retained), but you may want True if the index is just arbitrary row numbers.

  • to_wgs84 (bool, optional, default: False) – If the CRS is set on the active geometry column it is exported as WGS84 (EPSG:4326) to meet the 2016 GeoJSON specification. Set to True to force re-projection and set to False to ignore CRS. False by default.

Notes

The remaining kwargs are passed to json.dumps().

Missing (NaN) values in the GeoDataFrame can be represented as follows:

  • null: output the missing entries as JSON null.

  • drop: remove the property from the feature. This applies to each feature individually so that features may have different properties.

  • keep: output the missing entries as NaN.

If the GeoDataFrame has a defined CRS, its definition will be included in the output unless it is equal to WGS84 (default GeoJSON CRS) or not possible to represent in the URN OGC format, or unless to_wgs84=True is specified.
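
Since the remaining keyword arguments go to json.dumps(), the standard-library behaviour of that function applies. The sketch below uses a hand-written dictionary as a stand-in for the feature collection being serialized (an assumption made to keep the example free of geopandas) and shows how indent, sort_keys and allow_nan behave.

```python
import json

# Hand-written stand-in for the feature collection that gets serialized.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "id": "0",
            "type": "Feature",
            "properties": {"col1": "name1", "height": float("nan")},
            "geometry": {"type": "Point", "coordinates": [1.0, 2.0]},
        }
    ],
}

# Keyword arguments such as indent or sort_keys change only the layout.
pretty = json.dumps(feature_collection, indent=2, sort_keys=True)

# A NaN value is written as the bare token NaN, which strict JSON
# parsers reject; allow_nan=False makes json.dumps raise instead.
compact = json.dumps(feature_collection)
try:
    json.dumps(feature_collection, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False
```

This is the practical reason to prefer null or drop over keep when the output must be consumed by strict GeoJSON readers.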

Examples

>>> from shapely.geometry import Point
>>> d = {'col1': ['name1', 'name2'], 'geometry': [Point(1, 2), Point(2, 1)]}
>>> gdf = geopandas.GeoDataFrame(d, crs="EPSG:3857")
>>> gdf
    col1     geometry
0  name1  POINT (1 2)
1  name2  POINT (2 1)
>>> gdf.to_json()
'{"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"col1": "name1"}, "geometry": {"type": "Point", "coordinates": [1.0, 2.0]}}, {"id": "1", "type": "Feature", "properties": {"col1": "name2"}, "geometry": {"type": "Point", "coordinates": [2.0, 1.0]}}], "crs": {"type": "name", "properties": {"name": "urn:ogc:def:crs:EPSG::3857"}}}'

Alternatively, you can write GeoJSON to file:

>>> gdf.to_file(path, driver="GeoJSON")

See also

GeoDataFrame.to_file

write GeoDataFrame to file

to_parquet(path, index=None, compression='snappy', geometry_encoding='WKB', write_covering_bbox=False, schema_version=None, **kwargs)[source]

Write a GeoDataFrame to the Parquet format.

By default, all geometry columns present are serialized to WKB format in the file.

Requires ‘pyarrow’.

Added in version 0.8.

Parameters:
  • path (str, path object)

  • index (bool, default None) – If True, always include the dataframe’s index(es) as columns in the file output. If False, the index(es) will not be written to the file. If None, the index(es) will be included as columns in the file output, except a RangeIndex, which is stored as metadata only.

  • compression ({'snappy', 'gzip', 'brotli', None}, default 'snappy') – Name of the compression to use. Use None for no compression.

  • geometry_encoding ({'WKB', 'geoarrow'}, default 'WKB') – The encoding to use for the geometry columns. Defaults to “WKB” for maximum interoperability. Specify “geoarrow” to use one of the native GeoArrow-based single-geometry type encodings. Note: the “geoarrow” option is part of the newer GeoParquet 1.1 specification, should be considered as experimental, and may not be supported by all readers.

  • write_covering_bbox (bool, default False) – Writes the bounding box column for each row entry with column name ‘bbox’. Writing a bbox column can be computationally expensive, but allows you to specify a bbox in read_parquet() for filtered reading. Note: this bbox column is part of the newer GeoParquet 1.1 specification and should be considered as experimental. While writing the column is backwards compatible, using it for filtering may not be supported by all readers.

  • schema_version ({'0.1.0', '0.4.0', '1.0.0', '1.1.0', None}) – GeoParquet specification version; if not provided, will default to latest supported stable version (1.0.0).

  • kwargs – Additional keyword arguments passed to pyarrow.parquet.write_table().

Examples

>>> gdf.to_parquet('data.parquet')

See also

GeoDataFrame.to_feather

write GeoDataFrame to feather

GeoDataFrame.to_file

write GeoDataFrame to file

to_postgis(name, con, schema=None, if_exists='fail', index=False, index_label=None, chunksize=None, dtype=None)[source]

Upload GeoDataFrame into PostGIS database.

This method requires SQLAlchemy and GeoAlchemy2, and a PostgreSQL Python driver (psycopg or psycopg2) to be installed.

It is also possible to use to_file() to write to a database. Especially for file geodatabases like GeoPackage or SpatiaLite this can be easier.

Parameters:
  • name (str) – Name of the target table.

  • con (sqlalchemy.engine.Connection or sqlalchemy.engine.Engine) – Active connection to the PostGIS database.

  • if_exists ({'fail', 'replace', 'append'}, default 'fail') –

    How to behave if the table already exists:

    • fail: Raise a ValueError.

    • replace: Drop the table before inserting new values.

    • append: Insert new values to the existing table.

  • schema (string, optional) – Specify the schema. If None, use default schema: ‘public’.

  • index (bool, default False) – Write DataFrame index as a column. Uses index_label as the column name in the table.

  • index_label (string or sequence, default None) – Column label for index column(s). If None is given (default) and index is True, then the index names are used.

  • chunksize (int, optional) – Rows will be written in batches of this size at a time. By default, all rows will be written at once.

  • dtype (dict of column name to SQL type, default None) – Specifying the datatype for columns. The keys should be the column names and the values should be the SQLAlchemy types.

Examples

>>> from sqlalchemy import create_engine
>>> engine = create_engine("postgresql://myusername:mypassword@myhost:5432/mydatabase")
>>> gdf.to_postgis("my_table", engine)

See also

GeoDataFrame.to_file

write GeoDataFrame to file

read_postgis

read PostGIS database to GeoDataFrame

to_wkb(hex=False, **kwargs)[source]

Encode all geometry columns in the GeoDataFrame to WKB.

Parameters:
  • hex (bool) – If true, export the WKB as a hexadecimal string. The default is to return a binary bytes object.

  • kwargs – Additional keyword args will be passed to shapely.to_wkb().

Returns:

geometry columns are encoded to WKB

Return type:

DataFrame
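
To make the difference between the binary and the hexadecimal form concrete, the following sketch encodes a single 2D point as WKB with the standard library only. point_wkb is a hypothetical helper for this illustration; the method itself delegates to shapely.to_wkb().

```python
import struct


def point_wkb(x, y):
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    byte_order = 1     # 1 = little endian (NDR)
    geometry_type = 1  # 1 = Point
    # "<" disables padding: 1 byte + 4-byte uint + two 8-byte doubles.
    return struct.pack("<BIdd", byte_order, geometry_type, x, y)


raw = point_wkb(1.0, 2.0)   # bytes object, comparable to hex=False
as_hex = raw.hex().upper()  # hexadecimal text, comparable to hex=True
```

The hexadecimal form is twice as long as the binary form but can be stored in text columns or CSV files.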

to_wkt(**kwargs)[source]

Encode all geometry columns in the GeoDataFrame to WKT.

Parameters:

kwargs – Keyword args will be passed to shapely.to_wkt().

Returns:

geometry columns are encoded to WKT

Return type:

DataFrame

class pyorps.raster.rasterizer.GeoDataset(file_source, crs=None)[source]

Bases: ABC

_abc_impl = <_abc._abc_data object>
crs: Optional[str] = None
data: Union[GeoDataFrame, ndarray, None] = None
file_source: Any
abstractmethod load_data(**kwargs)[source]
class pyorps.raster.rasterizer.GeoRasterizer(input_data, cost_assumptions, bbox=None, mask=None, default_crs=None, **kwargs)[source]

Bases: object

A class for preparing and rasterizing geospatial data with cost assumptions.

This class integrates:
  • GeoDataset for representing datasets with metadata

  • CostAssumptions for handling cost mappings

  • Rasterization functionality for converting vector data to rasters

_calculate_out_shape_from_bounding_box(bounding_box, resolution_m2=1.0)[source]

Calculate the output shape (rows, columns) based on a bounding box and resolution.

Parameters:
  • bounding_box (Polygon) – The bounding box defining the output shape in a planar CRS

  • resolution_m2 (float) – The resolution in square meters

Return type:

tuple[int, int]

Returns:

tuple of (rows, columns) representing the output shape
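
One plausible way to derive such a shape is sketched below. It assumes that resolution_m2 is the area of one square cell, so that the cell edge length is its square root, and that partially covered cells are rounded up; both are assumptions of this sketch, and out_shape_from_bounds is a hypothetical stand-alone function rather than the method itself.

```python
import math


def out_shape_from_bounds(bounds, resolution_m2=1.0):
    """Return (rows, columns) needed to cover bounds with square cells.

    bounds is (minx, miny, maxx, maxy) in a planar CRS with metre units.
    Assumption of this sketch: resolution_m2 is the area of one cell,
    so the cell edge length is its square root.
    """
    minx, miny, maxx, maxy = bounds
    cell_size = math.sqrt(resolution_m2)
    rows = math.ceil((maxy - miny) / cell_size)
    columns = math.ceil((maxx - minx) / cell_size)
    return rows, columns


shape_1m = out_shape_from_bounds((0.0, 0.0, 100.0, 50.0))       # (50, 100)
shape_2m = out_shape_from_bounds((0.0, 0.0, 100.0, 50.0), 4.0)  # (25, 50)
```

Note that the result is ordered (rows, columns), that is (height, width), which matches the out_shape convention of array-based raster tools.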

_calculate_out_shape_from_geodataframe(gdf, resolution_m2=1.0, bounding_box=None)[source]

Calculate the output shape (rows, columns) based on a GeoDataFrame and resolution.

Parameters:
  • gdf (GeoDataFrame) – The GeoDataFrame containing the geometries to cover

  • resolution_m2 (float) – The resolution in square meters

  • bounding_box (Optional[Polygon]) – Optional bounding box defining the output shape

Return type:

tuple[int, int]

Returns:

tuple of (rows, columns) representing the output shape

static _get_rows_and_columns(width, height, resolution_m2, total_area_m2)[source]

Calculate rows and columns based on width, height, and resolution.

Parameters:
  • width – Width of the area

  • height – Height of the area

  • resolution_m2 – Resolution in square meters

  • total_area_m2 – Total area in square meters

Returns:

tuple of (rows, columns)

_modify_raster_from_dataset_simple_cost_assumptions(gdf, cost_assumptions=None, ignore_value=65535, multiply=False, zone_field=None, forbidden_zone=None, forbidden_value=65535)[source]

Modify the raster with an additional GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame, used to modify the raster dataset

  • cost_assumptions (Union[dict, str, CostAssumptions, int, float, None]) – The CostAssumptionsType or numeric to apply as cost values to the base_dataset

  • ignore_value (Optional[float]) – Value in the raster to ignore

  • multiply (bool) – If True, multiply the raster values by the given value (in cost_assumptions)

  • zone_field (Optional[str]) – Field name for zones in the dataset

  • forbidden_zone (Optional[str]) – Zone value that should be treated as forbidden

  • forbidden_value (int) – Value to use for forbidden areas

Returns:

The modified raster

property base_data: GeoDataFrame

Property to directly access the data attribute of the base_dataset.

Returns:

The base dataset (GeoDataFrame)

clip_to_area(clip_geometry)[source]

Clip the base dataset to a specific area.

Parameters:

clip_geometry (Union[GeoDataFrame, Polygon]) – The geometry to clip by

Return type:

GeoDataset

Returns:

The clipped base dataset

create_bounds_geodataframe(target_crs=None)[source]

Creates a GeoDataFrame from the bounds of the base data in a specified CRS.

Parameters:

target_crs (Optional[str]) – The desired CRS for the new GeoDataFrame

Return type:

GeoDataFrame

Returns:

A new GeoDataFrame containing the bounds of the base data

static create_buffer(dataset, geometry_buffer_m, inplace=True)[source]

Add a buffer to geometries in a dataset.

Parameters:
  • dataset (Union[VectorDataset, GeoDataFrame]) – The dataset to buffer (GeoDataset or GeoDataFrame)

  • geometry_buffer_m (float) – Distance to buffer in dataset’s CRS units

  • inplace (bool) – If True, modify the dataset in place

Return type:

Union[VectorDataset, GeoDataFrame]

Returns:

The buffered dataset

property crs

Passes through the crs property of the base_dataset.

Returns:

The desired CRS of the base dataset

modify_raster_from_dataset(input_data, cost_assumptions=None, bbox=None, mask=None, transform=None, geometry_buffer_m=0, ignore_value=65535, multiply=False, zone_field=None, forbidden_zone=None, forbidden_value=65535, **kwargs)[source]

Modify the raster with an additional dataset.

Parameters:
  • input_data (Union[str, dict, GeoDataFrame, GeoSeries, ndarray]) – The additional dataset, given as a path to a dataset file or as one of the other listed input types

  • cost_assumptions (Union[dict, str, CostAssumptions, int, float, None]) – The CostAssumptionsType or numeric to apply as cost values to the base_dataset

  • bbox (Union[Polygon, GeoDataFrame, GeoSeries, tuple[float, float, float, float], None]) – The bounding box to apply to the input data

  • mask (Union[Polygon, GeoDataFrame, tuple, None]) – The geometry mask to apply to the input data

  • transform (Optional[Affine]) – The transform describing the input data

  • geometry_buffer_m (float) – Buffer to apply to the dataset geometries

  • ignore_value (Optional[float]) – Value in the raster to ignore

  • multiply (bool) – If True, multiply the raster values by the given value (in cost_assumptions)

  • zone_field (Optional[str]) – Field name for zones in the dataset

  • forbidden_zone (Optional[str]) – Zone value that should be treated as forbidden

  • forbidden_value (int) – Value to use for forbidden areas

  • **kwargs – Additional keyword arguments, passed to the loading function of the GeoDataset

Return type:

ndarray

Returns:

The modified raster

modify_raster_with_geodataframe(gdf, value, ignore_value=65535, multiply=False)[source]

Modifies the raster cells inside the polygons of a GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – The GeoDataFrame containing polygons to use for masking

  • value (float) – The value to set for the raster cells inside the polygons

  • ignore_value (Optional[float]) – Value in the raster to ignore during modification

  • multiply (bool) – If True, multiply the raster values by the given value

Return type:

ndarray

Returns:

The modified raster
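
The interplay of value, ignore_value and multiply can be sketched on plain nested lists. The boolean grid inside stands in for the cells covered by the polygons, and the sketch assumes that cells currently holding ignore_value are left untouched; modify_cells is a hypothetical function for illustration, not the pyorps implementation.

```python
def modify_cells(raster, inside, value, ignore_value=65535, multiply=False):
    """Return a copy of raster with the cells flagged in inside changed.

    raster and inside are equally shaped lists of rows; inside holds
    booleans marking the cells covered by the polygons.
    Assumption of this sketch: cells equal to ignore_value stay untouched.
    """
    result = [row[:] for row in raster]
    for r, mask_row in enumerate(inside):
        for c, covered in enumerate(mask_row):
            if not covered or result[r][c] == ignore_value:
                continue
            if multiply:
                result[r][c] = result[r][c] * value
            else:
                result[r][c] = value
    return result


raster = [[10, 10, 65535],
          [10, 20, 20]]
inside = [[True, False, True],
          [True, True, False]]

replaced = modify_cells(raster, inside, 5)
scaled = modify_cells(raster, inside, 2, multiply=True)
```

In both calls the cell holding 65535 keeps its value even though it lies inside the mask.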

rasterize(field_name='cost', resolution_in_m=1.0, fill_value=65535, save_path=None, dtype='uint16', geometry_buffer_m=0, bounding_box=None)[source]

Rasterize the base dataset based on a specified field.

Parameters:
  • field_name (str) – The field to use for rasterization values

  • resolution_in_m (float) – The resolution of the output raster in meters

  • fill_value (int) – Value to use for areas with no data

  • save_path (Optional[str]) – Path to save the rasterized output

  • dtype (str) – Data type for the output raster

  • geometry_buffer_m (float) – Buffer to apply to the dataset geometries

  • bounding_box (Optional[Polygon]) – Bounding box to define the rasterization extent

Return type:

RasterDataset

Returns:

tuple of (raster_data, transform)

save_raster(save_path)[source]

Save the rasterized data to a file.

Parameters:

save_path (str) – Path to save the raster file

Return type:

None

shrink_raster(exclude_value)[source]

Shrink the raster by removing outer bounds with a specific value.

Parameters:

exclude_value (int) – Value to exclude from the outer bounds

Return type:

ndarray

Returns:

The shrunk raster
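
A minimal sketch of this kind of trimming on nested lists follows. It assumes that only complete outer rows and columns consisting of exclude_value are removed, while interior cells with that value are kept; shrink is a hypothetical stand-alone function, not the method itself.

```python
def shrink(raster, exclude_value):
    """Trim outer rows and columns that contain only exclude_value."""
    keep_rows = [r for r, row in enumerate(raster)
                 if any(cell != exclude_value for cell in row)]
    if not keep_rows:
        return []  # the raster holds nothing but exclude_value
    n_cols = len(raster[0])
    keep_cols = [c for c in range(n_cols)
                 if any(row[c] != exclude_value for row in raster)]
    top, bottom = keep_rows[0], keep_rows[-1]
    left, right = keep_cols[0], keep_cols[-1]
    return [row[left:right + 1] for row in raster[top:bottom + 1]]


raster = [[65535, 65535, 65535, 65535],
          [65535,     3, 65535,     7],
          [65535, 65535, 65535, 65535]]
trimmed = shrink(raster, 65535)  # [[3, 65535, 7]]
```

When a raster is trimmed like this, its affine transform has to be shifted by the number of removed leading rows and columns so that the remaining cells keep their coordinates.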

class pyorps.raster.rasterizer.InMemoryRasterDataset(file_source, crs, transform)[source]

Bases: RasterDataset

_abc_impl = <_abc._abc_data object>
count: int
dtype: dtype
file_source: Any
load_data(**kwargs)[source]
shape: tuple[int, int]
transform: Affine
class pyorps.raster.rasterizer.Polygon(shell=None, holes=None)[source]

Bases: BaseGeometry

A geometry type representing an area that is enclosed by a linear ring.

A polygon is a two-dimensional feature and has a non-zero area. It may have one or more negative-space “holes” which are also bounded by linear rings. If any rings cross each other, the feature is invalid and operations on it may fail.

Parameters:
  • shell (sequence) – A sequence of (x, y [,z]) numeric coordinate pairs or triples, or an array-like with shape (N, 2) or (N, 3). Also can be a sequence of Point objects.

  • holes (sequence) – A sequence of objects which satisfy the same requirements as the shell parameters above

exterior

The ring which bounds the positive space of the polygon.

Type:

LinearRing

interiors

A sequence of rings which bound all existing holes.

Type:

sequence

Examples

Create a square polygon with no holes

>>> from shapely import Polygon
>>> coords = ((0., 0.), (0., 1.), (1., 1.), (1., 0.), (0., 0.))
>>> polygon = Polygon(coords)
>>> polygon.area
1.0
property coords

Not implemented for polygons.

property exterior

Return the exterior ring of the polygon.

classmethod from_bounds(xmin, ymin, xmax, ymax)[source]

Construct a Polygon() from spatial bounds.

property interiors

Return the sequence of interior rings of the polygon.

svg(scale_factor=1.0, fill_color=None, opacity=None)[source]

Return SVG path element for the Polygon geometry.

Parameters:
  • scale_factor (float) – Multiplication factor for the SVG stroke-width. Default is 1.

  • fill_color (str, optional) – Hex string for fill color. Default is to use “#66cc99” if geometry is valid, and “#ff3333” if invalid.

  • opacity (float) – Float number between 0 and 1 for color opacity. Default value is 0.6

class pyorps.raster.rasterizer.RasterDataset(file_source, crs=None)[source]

Bases: GeoDataset, ABC

_abc_impl = <_abc._abc_data object>
count: int
dtype: dtype
file_source: Any
shape: tuple[int, int]
transform: Affine
class pyorps.raster.rasterizer.VectorDataset(file_source, crs=None, bbox=None, mask=None)[source]

Bases: GeoDataset, ABC

_abc_impl = <_abc._abc_data object>
abstractmethod apply_bbox()[source]
abstractmethod apply_mask()[source]
bbox: Union[Polygon, GeoDataFrame, GeoSeries, tuple[float, float, float, float], None] = (None,)
abstractmethod correct_crs()[source]
file_source: Any
mask: Union[Polygon, GeoDataFrame, tuple, None] = (None,)
abstractmethod post_loading()[source]
pyorps.raster.rasterizer.box(minx, miny, maxx, maxy, ccw=True)[source]

Return a rectangular polygon with configurable normal vector.
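
The ccw flag controls the orientation of the exterior ring. The sketch below builds the ring by hand and checks its orientation with the shoelace formula; box_ring and signed_area are hypothetical helpers, and the starting vertex chosen here is an assumption of the sketch.

```python
def box_ring(minx, miny, maxx, maxy, ccw=True):
    """Return the closed ring of a rectangle as a list of (x, y) tuples."""
    ring = [(maxx, miny), (maxx, maxy), (minx, maxy), (minx, miny)]
    if not ccw:
        ring.reverse()
    return ring + [ring[0]]  # repeat the first vertex to close the ring


def signed_area(ring):
    """Shoelace formula: positive for counter-clockwise rings."""
    return 0.5 * sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(ring, ring[1:]))


area_ccw = signed_area(box_ring(0, 0, 2, 1))            # 2.0
area_cw = signed_area(box_ring(0, 0, 2, 1, ccw=False))  # -2.0
```

The unsigned area is the same in both cases; only the sign, and with it the direction of the normal vector, changes.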

pyorps.raster.rasterizer.deepcopy(x, memo=None, _nil=[])[source]

Deep copy operation on arbitrary Python objects.

See the module’s __doc__ string for more info.

pyorps.raster.rasterizer.from_bounds(west, south, east, north, width, height)[source]

Return an Affine transformation given bounds, width and height.

Return an Affine transformation for a georeferenced raster given its bounds west, south, east, north and its width and height in number of pixels.
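
The arithmetic behind this transform is short enough to write out. The sketch below computes the six coefficients (a, b, c, d, e, f) for a north-up raster as plain floats; affine_from_bounds and pixel_to_xy are hypothetical helpers that avoid importing rasterio or affine.

```python
def affine_from_bounds(west, south, east, north, width, height):
    """Return the six coefficients (a, b, c, d, e, f) of the transform."""
    a = (east - west) / width     # pixel width in CRS units
    e = (south - north) / height  # negative pixel height (north-up raster)
    return (a, 0.0, west, 0.0, e, north)


def pixel_to_xy(coefficients, col, row):
    """Map a (col, row) pixel corner to (x, y) coordinates."""
    a, b, c, d, e, f = coefficients
    return (a * col + b * row + c, d * col + e * row + f)


coefficients = affine_from_bounds(0.0, 0.0, 10.0, 5.0, width=20, height=10)
upper_left = pixel_to_xy(coefficients, 0, 0)     # (0.0, 5.0)
lower_right = pixel_to_xy(coefficients, 20, 10)  # (10.0, 0.0)
```

The origin of the transform is the upper-left corner (west, north), and e is negative because row numbers increase southwards.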

pyorps.raster.rasterizer.geometry_mask(geometries, out_shape, transform, all_touched=False, invert=False)[source]

Create a mask from shapes.

By default, mask is intended for use as a numpy mask, where pixels that overlap shapes are False.

Parameters:
  • geometries (iterable over geometries (GeoJSON-like objects))

  • out_shape (tuple or list) – Shape of output numpy.ndarray.

  • transform (Affine transformation object) – Transformation from pixel coordinates of source to the coordinate system of the input shapes. See the transform property of dataset objects.

  • all_touched (boolean, optional) – If True, all pixels touched by geometries will be burned in. If False, only pixels whose center is within the polygon or that are selected by Bresenham’s line algorithm will be burned in. False by default

  • invert (boolean, optional) – If True, mask will be True for pixels that overlap shapes. False by default.

Returns:

Type is numpy.bool_

Return type:

numpy.ndarray

Notes

See rasterize() for performance notes.
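
The mask convention and the effect of invert can be sketched for the simplest case, an axis-aligned rectangle tested against pixel centres (the all_touched=False rule for polygons). rectangle_mask is a hypothetical pure-Python helper; the real function handles arbitrary geometries through GDAL.

```python
def rectangle_mask(bounds, out_shape, transform, invert=False):
    """Mask pixels whose centre lies inside an axis-aligned rectangle.

    transform is (a, b, c, d, e, f) mapping (col, row) to (x, y).
    """
    minx, miny, maxx, maxy = bounds
    a, b, c, d, e, f = transform
    rows, cols = out_shape
    mask = []
    for row in range(rows):
        mask_row = []
        for col in range(cols):
            # Coordinates of the pixel centre.
            x = a * (col + 0.5) + b * (row + 0.5) + c
            y = d * (col + 0.5) + e * (row + 0.5) + f
            covered = minx <= x <= maxx and miny <= y <= maxy
            # Default convention: covered pixels are False.
            mask_row.append(covered if invert else not covered)
        mask.append(mask_row)
    return mask


transform = (1.0, 0.0, 0.0, 0.0, -1.0, 2.0)  # 1 m cells, origin at (0, 2)
bounds = (0.0, 1.0, 1.0, 2.0)                # covers the upper-left cell
default = rectangle_mask(bounds, (2, 2), transform)
inverted = rectangle_mask(bounds, (2, 2), transform, invert=True)
```

The default output can be passed directly as a numpy mask, where True marks cells to hide; invert=True instead yields a selector for the covered cells.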

pyorps.raster.rasterizer.initialize_geo_dataset(file_source, crs=None, bbox=None, mask=None, transform=None)[source]

Factory function to create the appropriate GeoDataset instance based on the provided input.

Parameters:
  • file_source (Union[str, dict, GeoDataFrame, GeoSeries, ndarray]) – Source data (file path, GeoDataFrame, URL dict, numpy array, etc.)

  • crs (Optional[str]) – Coordinate reference system

  • bbox (Union[Polygon, GeoDataFrame, GeoSeries, tuple[float, float, float, float], None]) – Bounding box for vector datasets

  • mask (Union[Polygon, GeoDataFrame, tuple, None]) – Mask for vector datasets

  • transform (Optional[Affine]) – Affine transform for in-memory raster datasets

Return type:

GeoDataset

Returns:

An appropriate GeoDataset subclass instance

Examples

# From local vector file
vector_dataset = initialize_geo_dataset("path/to/shapefile.shp", crs="EPSG:4326")

# From GeoDataFrame
vector_dataset = initialize_geo_dataset(gdf, bbox=(x1, y1, x2, y2))

# From WFS source
wfs_dataset = initialize_geo_dataset({"url": "https://example.com/wfs",
                                      "layer": "layer1"})

# From local raster file
raster_dataset = initialize_geo_dataset("path/to/dem.tif")

# From numpy array
raster_dataset = initialize_geo_dataset(array_data, transform=transform,
                                        crs="EPSG:4326")

pyorps.raster.rasterizer.rasterize(shapes, out_shape=None, fill=0, nodata=None, masked=False, out=None, transform=(1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0), all_touched=False, merge_alg=MergeAlg.replace, default_value=1, dtype=None, skip_invalid=True, dst_path=None, dst_kwds=None)[source]

Return an image array with input geometries burned in.

Warnings will be raised for any invalid or empty geometries, and an exception will be raised if there are no valid shapes to rasterize.

Parameters:
  • shapes (iterable of (geometry, value) pairs or geometries) – The geometry can either be an object that implements the geo interface or GeoJSON-like object. If no value is provided the default_value will be used. If value is None the fill value will be used.

  • out_shape (tuple or list with 2 integers) – Shape of output numpy.ndarray.

  • fill (int or float, optional) – Used as fill value for all areas not covered by input geometries.

  • nodata (float, optional) – nodata value to use in output file or masked array.

  • masked (bool, optional. Default: False.) – If True, return a masked array. Note: nodata is always set in the case of file output.

  • out (numpy.ndarray, optional) – Array in which to store results. If not provided, out_shape and dtype are required.

  • transform (Affine transformation object, optional) – Transformation from pixel coordinates of source to the coordinate system of the input shapes. See the transform property of dataset objects.

  • all_touched (boolean, optional) – If True, all pixels touched by geometries will be burned in. If false, only pixels whose center is within the polygon or that are selected by Bresenham’s line algorithm will be burned in.

  • merge_alg (MergeAlg, optional) –

    Merge algorithm to use. One of:
    MergeAlg.replace (default):

    the new value will overwrite the existing value.

    MergeAlg.add:

    the new value will be added to the existing raster.

  • default_value (int or float, optional) – Used as value for all geometries, if not provided in shapes.

  • dtype (rasterio or numpy.dtype, optional) – Used as data type for results, if out is not provided.

  • skip_invalid (bool, optional) – If True (default), invalid shapes will be skipped. If False, ValueError will be raised.

  • dst_path (str or PathLike, optional) – Path of output dataset

  • dst_kwds (dict, optional) – Dictionary of creation options and other parameters that will be overlaid on the profile of the output dataset.

Returns:

If out was not None then out is returned, it will have been modified in-place. If out was None, this will be a new array.

Return type:

numpy.ndarray

Notes

Valid data types for fill, default_value, out, dtype and shape values are “int16”, “int32”, “uint8”, “uint16”, “uint32”, “float32”, and “float64”.

This function requires significant memory resources. The shapes iterator will be materialized to a Python list and another C copy of that list will be made. The out array will be copied and additional temporary raster memory equal to 2x the smaller of out data or GDAL’s max cache size (controlled by GDAL_CACHEMAX, default is 5% of the computer’s physical memory) is required.

If GDAL max cache size is smaller than the output data, the array of shapes will be iterated multiple times. Performance is thus a linear function of buffer size. For maximum speed, ensure that GDAL_CACHEMAX is larger than the size of out or out_shape.
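
The difference between MergeAlg.replace and MergeAlg.add only shows where shapes overlap. The following sketch reproduces both behaviours on a small grid, with explicit lists of (row, col) cells standing in for the pixels each geometry covers; burn is a hypothetical helper and performs no actual geometry rasterization.

```python
def burn(cell_groups, out_shape, fill=0, add=False):
    """Burn (cells, value) pairs into a grid of the given shape.

    cells is an iterable of (row, col) indices standing in for the
    pixels a geometry covers. add=False mimics MergeAlg.replace,
    add=True mimics MergeAlg.add.
    """
    rows, cols = out_shape
    grid = [[fill] * cols for _ in range(rows)]
    for cells, value in cell_groups:
        for r, c in cells:
            grid[r][c] = grid[r][c] + value if add else value
    return grid


shapes = [
    ([(0, 0), (0, 1)], 5),  # first shape covers two cells
    ([(0, 1), (1, 1)], 3),  # second shape overlaps cell (0, 1)
]
replaced = burn(shapes, (2, 2))         # [[5, 3], [0, 3]]
added = burn(shapes, (2, 2), add=True)  # [[5, 8], [0, 3]]
```

With replace, the order of the shapes decides which value survives in an overlap; with add, the order does not matter.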

pyorps.raster.rasterizer.rio_open(fp, mode='r', driver=None, width=None, height=None, count=None, crs=None, transform=None, dtype=None, nodata=None, sharing=False, opener=None, **kwargs)

Open a dataset for reading or writing.

The dataset may be located in a local file, in a resource located by a URL, or contained within a stream of bytes. This function accepts different types of fp parameters. However, it is almost always best to pass a string that has a dataset name as its value. These are passed directly to GDAL protocol and format handlers. A path to a zipfile is more efficiently used by GDAL than a Python ZipFile object, for example.

In read (‘r’) or read/write (‘r+’) mode, no keyword arguments are required: these attributes are supplied by the opened dataset.

In write (‘w’ or ‘w+’) mode, the driver, width, height, count, and dtype keywords are strictly required.

Parameters:
  • fp (str, os.PathLike, file-like, or rasterio.io.MemoryFile) – A filename or URL, a file object opened in binary (‘rb’) mode, a Path object, or one of the rasterio classes that provides the dataset-opening interface (has an open method that returns a dataset). Use a string when possible: GDAL can more efficiently access a dataset if it opens it natively.

  • mode (str, optional) – ‘r’ (read, the default), ‘r+’ (read/write), ‘w’ (write), or ‘w+’ (write/read).

  • driver (str, optional) – A short format driver name (e.g. “GTiff” or “JPEG”) or a list of such names (see GDAL docs at https://gdal.org/drivers/raster/index.html). In ‘w’ or ‘w+’ modes a single name is required. In ‘r’ or ‘r+’ modes the driver can usually be omitted. Registered drivers will be tried sequentially until a match is found. When multiple drivers are available for a format such as JPEG2000, one of them can be selected by using this keyword argument.

  • width (int, optional) – The number of columns of the raster dataset. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • height (int, optional) – The number of rows of the raster dataset. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • count (int, optional) – The count of dataset bands. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • crs (str, dict, or CRS, optional) – The coordinate reference system. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • transform (affine.Affine, optional) – Affine transformation mapping the pixel space to geographic space. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • dtype (str or numpy.dtype, optional) – The data type for bands. For example: ‘uint8’ or rasterio.uint16. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • nodata (int, float, or nan, optional) – Defines the pixel value to be interpreted as not valid data. Required in ‘w’ or ‘w+’ modes, it is ignored in ‘r’ or ‘r+’ modes.

  • sharing (bool, optional) – To reduce overhead and prevent programs from running out of file descriptors, rasterio maintains a pool of shared low level dataset handles. If True this function will use a shared handle if one is available. Multithreaded programs must avoid sharing and should set sharing to False.

  • opener (callable, optional) – A custom dataset opener which can serve GDAL’s virtual filesystem machinery via Python file-like objects. The underlying file-like object is obtained by calling opener with (fp, mode) or (fp, mode + “b”) depending on the format driver’s native mode. opener must return a Python file-like object that provides read, seek, tell, and close methods. Note: only one opener at a time per fp, mode pair is allowed.

  • kwargs (optional) – These are passed to format drivers as directives for creating or interpreting datasets. For example: in ‘w’ or ‘w+’ modes a tiled=True keyword argument will direct the GeoTIFF format driver to create a tiled, rather than striped, TIFF.

Returns:

  • rasterio.io.DatasetReader – If mode is “r”.

  • rasterio.io.DatasetWriter – If mode is “r+”, “w”, or “w+”.

Raises:
  • TypeError – If arguments are of the wrong Python type.

  • rasterio.errors.RasterioIOError – If the dataset cannot be opened, for example when there is no dataset with the given name.

  • rasterio.errors.DriverCapabilityError – If the detected format driver does not support the requested opening mode.

Examples

To open a local GeoTIFF dataset for reading using standard driver discovery and no directives:

>>> import rasterio
>>> with rasterio.open('example.tif') as dataset:
...     print(dataset.profile)

To open a local JPEG2000 dataset using only the JP2OpenJPEG driver:

>>> with rasterio.open(
...         'example.jp2', driver='JP2OpenJPEG') as dataset:
...     print(dataset.profile)

To create a new 8-band, 16-bit unsigned, tiled, and LZW-compressed GeoTIFF with a global extent and 0.5 degree resolution:

>>> from rasterio.transform import from_origin
>>> with rasterio.open(
...         'example.tif', 'w', driver='GTiff', dtype='uint16',
...         width=720, height=360, count=8, crs='EPSG:4326',
...         transform=from_origin(-180.0, 90.0, 0.5, 0.5),
...         nodata=0, tiled=True, compress='lzw') as dataset:
...     dataset.write(...)

Module contents

Raster data processing functionality for geospatial analysis.

This module provides:
  1. Classes for handling and manipulating raster datasets

  2. Rasterization tools for converting vector data to rasters

  3. Cost surface generation capabilities

  4. Utility functions for creating test data and processing rasters

class pyorps.raster.GeoRasterizer(input_data, cost_assumptions, bbox=None, mask=None, default_crs=None, **kwargs)[source]

Bases: object

A class for preparing and rasterizing geospatial data with cost assumptions.

This class integrates:
  • GeoDataset for representing datasets with metadata

  • CostAssumptions for handling cost mappings

  • Rasterization functionality for converting vector data to rasters

_calculate_out_shape_from_bounding_box(bounding_box, resolution_m2=1.0)[source]

Calculate the output shape (rows, columns) based on a bounding box and resolution.

Parameters:
  • bounding_box (Polygon) – The bounding box defining the output shape in a planar CRS

  • resolution_m2 (float) – The resolution in square meters

Return type:

tuple[int, int]

Returns:

tuple of (rows, columns) representing the output shape

_calculate_out_shape_from_geodataframe(gdf, resolution_m2=1.0, bounding_box=None)[source]

Calculate the output shape (rows, columns) based on a GeoDataFrame and resolution.

Parameters:
  • gdf (GeoDataFrame) – The GeoDataFrame containing the geometries to cover

  • resolution_m2 (float) – The resolution in square meters

  • bounding_box (Optional[Polygon]) – Optional bounding box defining the output shape

Return type:

tuple[int, int]

Returns:

tuple of (rows, columns) representing the output shape

static _get_rows_and_columns(width, height, resolution_m2, total_area_m2)[source]

Calculate rows and columns based on width, height, and resolution.

Parameters:
  • width – Width of the area

  • height – Height of the area

  • resolution_m2 – Resolution in square meters

  • total_area_m2 – Total area in square meters

Returns:

tuple of (rows, columns)

_modify_raster_from_dataset_simple_cost_assumptions(gdf, cost_assumptions=None, ignore_value=65535, multiply=False, zone_field=None, forbidden_zone=None, forbidden_value=65535)[source]

Modify the raster with an additional GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – GeoDataFrame, used to modify the raster dataset

  • cost_assumptions (Union[dict, str, CostAssumptions, int, float, None]) – The cost assumptions (CostAssumptionsType) or a numeric value to apply as cost values to the base_dataset

  • ignore_value (Optional[float]) – Value in the raster to ignore

  • multiply (bool) – If True, multiply the raster values by the given value (in cost_assumptions)

  • zone_field (Optional[str]) – Field name for zones in the dataset

  • forbidden_zone (Optional[str]) – Zone value that should be treated as forbidden

  • forbidden_value (int) – Value to use for forbidden areas

Returns:

The modified raster

property base_data: GeoDataFrame

Property to directly access the data attribute of the base_dataset.

Returns:

The base dataset (GeoDataFrame)

clip_to_area(clip_geometry)[source]

Clip the base dataset to a specific area.

Parameters:

clip_geometry (Union[GeoDataFrame, Polygon]) – The geometry to clip by

Return type:

GeoDataset

Returns:

The clipped base dataset

create_bounds_geodataframe(target_crs=None)[source]

Creates a GeoDataFrame from the bounds of the base data in a specified CRS.

Parameters:

target_crs (Optional[str]) – The desired CRS for the new GeoDataFrame

Return type:

GeoDataFrame

Returns:

A new GeoDataFrame containing the bounds of the base data

static create_buffer(dataset, geometry_buffer_m, inplace=True)[source]

Add a buffer to geometries in a dataset.

Parameters:
  • dataset (Union[VectorDataset, GeoDataFrame]) – The dataset to buffer (GeoDataset or GeoDataFrame)

  • geometry_buffer_m (float) – Distance to buffer in dataset’s CRS units

  • inplace (bool) – If True, modify the dataset in place

Return type:

Union[VectorDataset, GeoDataFrame]

Returns:

The buffered dataset

property crs

Passes through the crs property of base_dataset.

Returns:

The desired CRS of the base dataset

modify_raster_from_dataset(input_data, cost_assumptions=None, bbox=None, mask=None, transform=None, geometry_buffer_m=0, ignore_value=65535, multiply=False, zone_field=None, forbidden_zone=None, forbidden_value=65535, **kwargs)[source]

Modify the raster with an additional dataset.

Parameters:
  • input_data (Union[str, dict, GeoDataFrame, GeoSeries, ndarray]) – The additional dataset, given either as a path to a dataset file or as an in-memory object of one of the listed types

  • cost_assumptions (Union[dict, str, CostAssumptions, int, float, None]) – The cost assumptions (CostAssumptionsType) or a numeric value to apply as cost values to the base_dataset

  • bbox (Union[Polygon, GeoDataFrame, GeoSeries, tuple[float, float, float, float], None]) – The bounding box to apply to the input data

  • mask (Union[Polygon, GeoDataFrame, tuple, None]) – The geometry mask to apply to the input data

  • transform (Optional[Affine]) – The transform describing the input data

  • geometry_buffer_m (float) – Buffer to apply to the dataset geometries

  • ignore_value (Optional[float]) – Value in the raster to ignore

  • multiply (bool) – If True, multiply the raster values by the given value (in cost_assumptions)

  • zone_field (Optional[str]) – Field name for zones in the dataset

  • forbidden_zone (Optional[str]) – Zone value that should be treated as forbidden

  • forbidden_value (int) – Value to use for forbidden areas

  • **kwargs – Additional keyword arguments, passed to the loading function of the GeoDataset

Return type:

ndarray

Returns:

The modified raster

modify_raster_with_geodataframe(gdf, value, ignore_value=65535, multiply=False)[source]

Modifies the raster cells inside the polygons of a GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – The GeoDataFrame containing polygons to use for masking

  • value (float) – The value to set for the raster cells inside the polygons

  • ignore_value (Optional[float]) – Value in the raster to ignore during modification

  • multiply (bool) – If True, multiply the raster values by the given value

Return type:

ndarray

Returns:

The modified raster
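
The interaction of value, ignore_value, and multiply can be sketched in plain Python. This is an illustration under stated assumptions, not the pyorps code: the polygon coverage is supplied as a precomputed boolean grid (in practice it would come from rasterizing gdf), cells equal to ignore_value are assumed to stay untouched, and the helper name modify_cells is hypothetical:

```python
def modify_cells(raster, inside, value, ignore_value=65535, multiply=False):
    """Return a copy of raster with the cells flagged in `inside` set or scaled."""
    result = [row[:] for row in raster]
    for r, row in enumerate(raster):
        for c, cell in enumerate(row):
            if not inside[r][c] or cell == ignore_value:
                continue  # outside the polygons, or explicitly ignored
            result[r][c] = cell * value if multiply else value
    return result


raster = [[1, 2, 65535],
          [3, 4, 5]]
inside = [[True, True, True],
          [False, True, False]]

replaced = modify_cells(raster, inside, value=10)
scaled = modify_cells(raster, inside, value=3, multiply=True)
print(replaced)
print(scaled)
```

Under that assumption, the cell holding 65535 keeps its value in both modes, so forbidden areas already present in the raster are not overwritten.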

rasterize(field_name='cost', resolution_in_m=1.0, fill_value=65535, save_path=None, dtype='uint16', geometry_buffer_m=0, bounding_box=None)[source]

Rasterize the base dataset based on a specified field.

Parameters:
  • field_name (str) – The field to use for rasterization values

  • resolution_in_m (float) – The resolution of the output raster in meters

  • fill_value (int) – Value to use for areas with no data

  • save_path (Optional[str]) – Path to save the rasterized output

  • dtype (str) – Data type for the output raster

  • geometry_buffer_m (float) – Buffer to apply to the dataset geometries

  • bounding_box (Optional[Polygon]) – Bounding box to define the rasterization extent

Return type:

RasterDataset

Returns:

tuple of (raster_data, transform)

save_raster(save_path)[source]

Save the rasterized data to a file.

Parameters:

save_path (str) – Path to save the raster file

Return type:

None

shrink_raster(exclude_value)[source]

Shrink the raster by removing outer bounds with a specific value.

Parameters:

exclude_value (int) – Value to exclude from the outer bounds

Return type:

ndarray

Returns:

The shrunk raster
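
One way to picture the shrinking step, as a sketch only: trim leading and trailing rows and columns that consist entirely of exclude_value, and keep everything in between. The helper name shrink and the nested-list raster are assumptions made for illustration; how pyorps implements it is not shown here:

```python
def shrink(raster, exclude_value):
    """Trim outer rows and columns that contain only exclude_value."""
    keep_rows = [r for r, row in enumerate(raster)
                 if any(cell != exclude_value for cell in row)]
    keep_cols = [c for c in range(len(raster[0]))
                 if any(row[c] != exclude_value for row in raster)]
    if not keep_rows:
        return []  # the raster holds nothing but exclude_value
    top, bottom = keep_rows[0], keep_rows[-1]
    left, right = keep_cols[0], keep_cols[-1]
    # Interior rows and columns are kept even if they are fully excluded.
    return [row[left:right + 1] for row in raster[top:bottom + 1]]


N = 65535
raster = [[N, N, N, N],
          [N, 1, N, 2],
          [N, N, N, N],
          [N, 3, N, N],
          [N, N, N, N]]
trimmed = shrink(raster, N)
print(trimmed)
```

For a georeferenced raster, the transform origin would also have to be shifted by the number of trimmed rows and columns at the top and left.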

class pyorps.raster.RasterHandler(raster_source, source_coords, target_coords, search_space_buffer_m=None, input_crs=None, apply_mask=True, outside_value=None, bands=None)[source]

Bases: object

Class for efficiently working with raster data while preserving geographic transformation information. Can be initialized with either a file path or directly with raster data, CRS, and transform.

_init_from_metadata(source_coords, target_coords, search_space_buffer_m=None, input_crs=None, apply_mask=True, outside_value=None, bands=None)[source]

Initialize using metadata and raster data.

This method contains the common initialization code used regardless of whether the input is a path or direct data components.

Parameters:
  • source_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Source point(s) as (x, y) tuple or list of tuples

  • target_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Target point(s) as (x, y) tuple or list of tuples

  • search_space_buffer_m (Optional[float]) – Buffer distance in map units (typically meters)

  • input_crs (Optional[str]) – CRS of the input coordinates (e.g., ‘EPSG:4326’). If None, assumes same as raster

  • apply_mask (bool) – If True, apply the buffer mask after loading data

  • outside_value (Optional[Any]) – Value to set for pixels outside the buffer (defaults to max value of the data type)

  • bands (Optional[List[int]]) – List of bands to modify if apply_mask is True (1-based). If None, all bands are modified

static _transform_coords(coords, input_crs, target_crs)[source]

Transform coordinates from input_crs to target_crs. Handles both single coordinates and lists of coordinates.

Parameters:
  • coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Coordinates to transform from input_crs to target_crs

  • input_crs (str) – Coordinate reference system of the input coordinates

  • target_crs (str) – Coordinate reference system of the target coordinates

Returns:

The transformed coordinates

apply_geometry_mask(geometry, outside_value=None, bands=None)[source]

Set pixel values outside the given geometry to the specified value.

Parameters:
  • geometry (Polygon) – A shapely geometry object (Polygon)

  • outside_value (Optional[int]) – Value to set for pixels outside the geometry

  • bands (Union[list[int], int, None]) – List of bands to modify (1-based). If None, all bands are modified.

buffer_geometry: Polygon

coords_to_indices(coords)[source]

Convert geographic coordinates to pixel row/column indices within this raster section.

Parameters:

coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – List of (x, y) coordinate tuples or a single coordinate tuple

Returns:

Array of (row, col) pixel indices

Return type:

numpy.ndarray
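
For a north-up raster section, the conversion amounts to inverting the section's transform. The following plain-Python sketch assumes an upper-left origin (west, north) and cell sizes xsize and ysize taken from that transform. The function name to_indices and its extra arguments are hypothetical, and it returns a list rather than an array:

```python
import math


def to_indices(coords, west, north, xsize, ysize):
    """Map (x, y) pairs to (row, col) indices of a north-up raster."""
    if isinstance(coords[0], (int, float)):
        coords = [coords]  # accept a single (x, y) pair as well
    return [(math.floor((north - y) / ysize),   # rows count down from north
             math.floor((x - west) / xsize))    # columns count up from west
            for x, y in coords]


indices = to_indices([(2.5, 97.5), (10.0, 90.0)],
                     west=0.0, north=100.0, xsize=1.0, ysize=1.0)
single = to_indices((0.0, 100.0), 0.0, 100.0, 1.0, 1.0)
print(indices, single)
```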

data: ndarray

estimate_buffer_width(source_coords, target_coords, min_buffer=200, max_buffer=4000, sample_radius=50)[source]

Estimate an appropriate buffer width for path finding based on terrain characteristics.

Parameters:
  • source_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – (x, y) coordinates of the source point

  • target_coords (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – (x, y) coordinates of the target point

  • min_buffer (float) – Minimum buffer width to consider (meters)

  • max_buffer (float) – Maximum buffer width to consider (meters)

  • sample_radius (float) – Radius for sampling around the straight line to assess terrain complexity

Returns:

Estimated optimal buffer width in meters

indices_to_coords(indices)[source]

Convert pixel indices to geographic coordinates.

Parameters:

indices (List[Tuple[int, int]]) – List of (row, col) pixel indices

Returns:

Array of (x, y) coordinates

Return type:

numpy.ndarray
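
The reverse mapping can be sketched the same way. This example assumes a north-up raster and returns the center of each cell; whether the method itself reports cell centers or corners is not stated here, so treat that choice, the function name to_coords, and its extra arguments as assumptions:

```python
def to_coords(indices, west, north, xsize, ysize):
    """Map (row, col) indices to the (x, y) of each cell center."""
    return [(west + (col + 0.5) * xsize,    # half a cell east of the left edge
             north - (row + 0.5) * ysize)   # half a cell south of the top edge
            for row, col in indices]


coords = to_coords([(0, 0), (2, 3)],
                   west=0.0, north=100.0, xsize=1.0, ysize=1.0)
print(coords)
```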

static max_distance_pair(coords1, coords2)[source]

Find the pair of coordinates (one from coords1, one from coords2) with the highest Euclidean distance.

Parameters:
  • coords1 (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Either a single coordinate tuple (x, y, …) or a list of coordinate tuples

  • coords2 (Union[tuple[float, float], list[float], list[Union[tuple[float, float], list[float]]]]) – Either a single coordinate tuple (x, y, …) or a list of coordinate tuples

Returns:

A tuple containing the two points with the maximum distance (point1, point2)
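
A brute-force sketch of the search: compare every point in the first input with every point in the second and keep the pair with the largest distance. The helper name farthest_pair is hypothetical, and the sketch assumes that only the x and y components enter the distance:

```python
import math
from itertools import product


def farthest_pair(coords1, coords2):
    """Return (point1, point2) with the largest Euclidean distance."""
    def as_list(coords):
        # A bare (x, y) pair starts with a number; wrap it in a list.
        return [coords] if isinstance(coords[0], (int, float)) else list(coords)

    return max(product(as_list(coords1), as_list(coords2)),
               key=lambda pair: math.dist(pair[0][:2], pair[1][:2]))


pair = farthest_pair([(0, 0), (1, 1)], [(3, 4), (10, 0)])
single = farthest_pair((0, 0), [(1, 0), (0, 2)])
print(pair, single)
```

The cost grows with the product of the two list lengths, which is acceptable for a handful of source and target points.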

raster_dataset: RasterDataset

save_section_as_raster(output_path)[source]

Save the section as a new raster file with proper georeferencing.

Parameters:

output_path (str) – Path for the output raster file

search_space_buffer_m: float

window: Window

window_transform: Affine