API (General)

High-level Interface

The high-level interface tries to provide an opinionated, Pythonic API to interact with openEO back-ends. Its aim is to hide some of the details of using a web service, so the user can produce concise and readable code.

Users who want to interact with openEO at a lower level and have more control can use the lower-level classes.

openeo

openeo.connect(url=None, *, auth_type=None, auth_options=None, session=None, default_timeout=None, auto_validate=True)[source]

This function is the entry point to openEO. You typically create one connection object in your script or application and re-use it for all calls to that back-end.

If the backend requires authentication, you can pass authentication data directly to this function, but it could be easier to authenticate as follows:

>>> # For basic authentication
>>> conn = connect(url).authenticate_basic(username="john", password="foo")
>>> # For OpenID Connect authentication
>>> conn = connect(url).authenticate_oidc(client_id="myclient")
Parameters:
  • url (Optional[str]) – The HTTP URL of the openEO back-end.

  • auth_type (Optional[str]) – Which authentication to use: None, “basic” or “oidc” (for OpenID Connect)

  • auth_options (Optional[dict]) – Options/arguments specific to the authentication type

  • default_timeout (Optional[int]) – default timeout (in seconds) for requests

  • auto_validate (bool) – toggle to automatically validate process graphs before execution

Return type:

Connection

Added in version 0.24.0: added auto_validate argument

openeo.rest.datacube

The main module for creating earth observation processes. It aims to make it easy to build complex process chains that can be evaluated by an openEO backend.

openeo.rest.datacube.THIS

Symbolic reference to the current data cube, to be used as argument in DataCube.process() calls
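
Example (a minimal sketch, assuming a hypothetical back-end process "my_process" that takes the cube as its data argument):

>>> from openeo.rest.datacube import THIS
>>> cube = cube.process("my_process", data=THIS, threshold=0.5)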

class openeo.rest.datacube.DataCube(graph, connection=None, metadata=None)[source]

Class representing an openEO (raster) data cube.

The data cube is represented by its corresponding openeo “process graph” and this process graph can be “grown” to a desired workflow by calling the appropriate methods.

__init__(graph, connection=None, metadata=None)[source]
add(other, reverse=False)[source]
Return type:

DataCube

See also

openeo.org documentation on process “add”.

add_dimension(name, label, type=None)[source]

Adds a new named dimension to the data cube. Afterwards, the dimension can be referenced with the specified name. If a dimension with the specified name exists, the process fails with a DimensionExists error. The dimension label of the dimension is set to the specified label.

This call does not modify the datacube in place, but returns a new datacube with the additional dimension.

Parameters:
  • name (str) – The name of the dimension to add

  • label (str) – The dimension label.

  • type (Optional[str]) – Dimension type, allowed values: ‘spatial’, ‘temporal’, ‘bands’, ‘other’, default value is ‘other’

Returns:

The data cube with a newly added dimension. The new dimension has exactly one dimension label. All other dimensions remain unchanged.
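
Example (a minimal sketch; the dimension name and label are hypothetical):

>>> cube = cube.add_dimension(name="bands", label="ndvi", type="bands")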

See also

openeo.org documentation on process “add_dimension”.

aggregate_spatial(geometries, reducer, target_dimension=None, crs=None, context=None)[source]

Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions.

Parameters:
  • geometries (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) –

    The geometries to aggregate in. Can be provided in different ways:

    • a shapely geometry

    • a GeoJSON-style dictionary,

    • a public URL to the geometries in a vector format that is supported by the backend (also see Connection.list_file_formats()), e.g. GeoJSON, GeoParquet, etc. A load_url process will automatically be added to the process graph.

    • a path (str or Path) to a local, client-side GeoJSON file, which will be loaded automatically to get the geometries as GeoJSON construct.

    • a VectorCube instance.

    • a Parameter instance.

  • reducer (Union[str, Callable, PGNode]) –

    the “child callback”: the name of a single openEO process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives an array of numerical values and returns a single numerical value; see the usage sketch after this parameter list.

  • target_dimension (Optional[str]) – The new dimension name to be used for storing the results.

  • crs (Union[int, str, None]) –

    The spatial reference system of the provided polygon. By default, longitude-latitude (EPSG:4326) is assumed. See openeo.util.normalize_crs() for more details about additional normalization that is applied to this argument.

    Note

    this crs argument is a non-standard/experimental feature, only supported by specific back-ends. See https://github.com/Open-EO/openeo-processes/issues/235 for details.

  • context (Optional[dict]) – Additional data to be passed to the reducer process.
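
Example (a usage sketch, assuming a shapely geometry in a variable polygon):

>>> # Reducer given as a single process name
>>> means = cube.aggregate_spatial(geometries=polygon, reducer="mean")
>>> # Reducer given as a callback function
>>> medians = cube.aggregate_spatial(geometries=polygon, reducer=lambda data: data.median())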

Return type:

VectorCube

Changed in version 0.36.0: Support passing a URL as geometries argument, which will be loaded with the load_url process.

Changed in version 0.36.0: Support for passing a backend-side path as geometries argument was removed (also see Legacy read_vector usage). Instead, it’s possible to provide a client-side path to a GeoJSON file (which will be loaded client-side to get the geometries as GeoJSON construct).

See also

openeo.org documentation on process “aggregate_spatial”.

aggregate_spatial_window(reducer, size, boundary='pad', align='upper-left', context=None)[source]

Aggregates statistics over the horizontal spatial dimensions (axes x and y) of the data cube.

The pixel grid for the axes x and y is divided into non-overlapping windows with the size specified in the parameter size. If the number of values for the axes x and y is not a multiple of the corresponding window size, the behavior specified in the parameters boundary and align is applied. For each of these windows, the reducer process computes the result.

Parameters:
  • reducer (Union[str, Callable, PGNode]) – the “child callback”: the name of a single openEO process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

  • size (List[int]) – Window size in pixels along the horizontal spatial dimensions. The first value corresponds to the x axis, the second value corresponds to the y axis.

  • boundary (str) –

    Behavior to apply if the number of values for the axes x and y is not a multiple of the corresponding value in the size parameter. Options are:

    • pad (default): pad the data cube with the no-data value null to fit the required window size.

    • trim: trim the data cube to fit the required window size.

    Use the parameter align to align the data to the desired corner.

  • align (str) – If the data requires padding or trimming (see parameter boundary), specifies to which corner of the spatial extent the data is aligned. For example, if the data is aligned to the upper left, the process pads/trims at the lower right.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Returns:

A data cube with the newly computed values and the same dimensions.
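
Example (a minimal sketch): compute the mean in non-overlapping 8x8 pixel windows:

>>> cube = cube.aggregate_spatial_window(reducer="mean", size=[8, 8])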

See also

openeo.org documentation on process “aggregate_spatial_window”.

aggregate_temporal(intervals, reducer, labels=None, dimension=None, context=None)[source]

Computes a temporal aggregation based on an array of date and/or time intervals.

Calendar hierarchies such as year, month, week etc. must be transformed into specific intervals by the clients. For each interval, all data along the dimension will be passed through the reducer. The computed values will be projected to the labels, so the number of labels and the number of intervals need to be equal.

If the dimension is not set, the data cube is expected to only have one temporal dimension.

Parameters:
  • intervals (List[list]) – Temporal left-closed intervals so that the start time is contained, but not the end time.

  • reducer (Union[str, Callable, PGNode]) –

    the “child callback”: the name of a single openEO process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives an array of numerical values and returns a single numerical value; see the usage sketch after this parameter list.

  • labels (Optional[List[str]]) – Labels for the intervals. The number of labels and the number of groups need to be equal.

  • dimension (Optional[str]) – The temporal dimension for aggregation. All data along the dimension will be passed through the specified reducer. If the dimension is not set, the data cube is expected to only have one temporal dimension.

  • context (Optional[dict]) – Additional data to be passed to the reducer. Not set by default.
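
Example (a minimal sketch): monthly mean composites for two months:

>>> cube = cube.aggregate_temporal(
...     intervals=[["2019-01-01", "2019-02-01"], ["2019-02-01", "2019-03-01"]],
...     reducer="mean",
...     labels=["2019-01", "2019-02"],
... )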

Return type:

DataCube

Returns:

A DataCube containing a result for each time window

See also

openeo.org documentation on process “aggregate_temporal”.

aggregate_temporal_period(period, reducer, dimension=None, context=None)[source]

Computes a temporal aggregation based on calendar hierarchies such as years, months or seasons. For other calendar hierarchies aggregate_temporal can be used.

For each interval, all data along the dimension will be passed through the reducer.

If the dimension is not set or is set to null, the data cube is expected to only have one temporal dimension.

The period argument specifies the time intervals to aggregate. The following pre-defined values are available:

  • hour: Hour of the day

  • day: Day of the year

  • week: Week of the year

  • dekad: Ten-day periods, counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 - end of month). The third dekad of the month can range from 8 to 11 days. For example, the fourth dekad is February 1 - February 10 each year.

  • month: Month of the year

  • season: Three month periods of the calendar seasons (December - February, March - May, June - August, September - November).

  • tropical-season: Six month periods of the tropical seasons (November - April, May - October).

  • year: Proleptic years

  • decade: Ten year periods (0-to-9 decade), from a year ending in a 0 to the next year ending in a 9.

  • decade-ad: Ten year periods (1-to-0 decade) better aligned with the Anno Domini (AD) calendar era, from a year ending in a 1 to the next year ending in a 0.

Parameters:
  • period (str) – The period of the time intervals to aggregate.

  • reducer (Union[str, PGNode, Callable]) – A reducer to be applied on all values along the specified dimension. The reducer must be a callable process (or a set of processes) that accepts an array and computes a single return value of the same type as the input values, for example median.

  • dimension (Optional[str]) – The temporal dimension for aggregation. All data along the dimension will be passed through the specified reducer. If the dimension is not set, the data cube is expected to only have one temporal dimension.

  • context (Optional[Dict]) – Additional data to be passed to the reducer.

Return type:

DataCube

Returns:

A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.
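
Example (a minimal sketch): monthly median composites:

>>> composite = cube.aggregate_temporal_period(period="month", reducer="median")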

See also

openeo.org documentation on process “aggregate_temporal_period”.

apply(process, context=None)[source]

Applies a unary process (a local operation) to each value of the specified or all dimensions in the data cube.

Parameters:
  • process (Union[str, Callable, UDF, PGNode]) –

    the “child callback”: the name of a single process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives a single numerical value and returns a single numerical value; see the usage sketch after this parameter list.

  • context (Optional[dict]) – Additional data to be passed to the process.
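
Example (a usage sketch):

>>> # Process given as a single process name
>>> cube = cube.apply("absolute")
>>> # Process given as a callback function, using an openeo.processes helper
>>> from openeo.processes import clip
>>> cube = cube.apply(lambda x: clip(x, 0, 255))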

Return type:

DataCube

Returns:

A data cube with the newly computed values. The resolution, cardinality and the number of dimensions are the same as for the original data cube.

See also

openeo.org documentation on process “apply”.

apply_dimension(code=None, runtime=None, process=None, version=None, dimension='t', target_dimension=None, context=None)[source]

Applies a process to all pixel values along a dimension of a raster data cube. For example, if the temporal dimension is specified the process will work on a time series of pixel values.

The process to apply is specified by either code and runtime in case of a UDF, or by providing a callback function in the process argument.

The process reduce_dimension also applies a process to pixel values along a dimension, but drops the dimension afterwards. The process apply applies a process to each pixel value in the data cube.

The target dimension is the source dimension if not specified otherwise in the target_dimension parameter. The pixel values in the target dimension get replaced by the computed pixel values. The name, type and reference system are preserved.

The dimension labels are preserved when the target dimension is the source dimension and the number of pixel values in the source dimension is equal to the number of values computed by the process. Otherwise, the dimension labels will be incrementing integers starting from zero, which can be changed using rename_labels afterwards. The number of labels will equal the number of values computed by the process.

Parameters:
  • code (Optional[str]) – [deprecated] UDF code or process identifier (optional)

  • runtime – [deprecated] UDF runtime to use (optional)

  • process (Union[str, Callable, UDF, PGNode]) –

    the “child callback”: the name of a single process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives an array of numerical values and returns an array of numerical values; see the usage sketch after this parameter list.

  • version (Optional[str]) – [deprecated] Version of the UDF runtime to use

  • dimension (str) – The name of the source dimension to apply the process on. Fails with a DimensionNotAvailable error if the specified dimension does not exist.

  • target_dimension (Optional[str]) – The name of the target dimension or null (the default) to use the source dimension specified in the parameter dimension. By specifying a target dimension, the source dimension is removed. The target dimension with the specified name and the type other (see add_dimension) is created, if it doesn’t exist yet.

  • context (Optional[dict]) – Additional data to be passed to the process.
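
Example (a minimal sketch): a cumulative sum along the temporal dimension, using the openEO process cumsum:

>>> cube = cube.apply_dimension(process="cumsum", dimension="t")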

Return type:

DataCube

Returns:

A datacube with the UDF applied to the given dimension.

Raises:

DimensionNotAvailable

Changed in version 0.13.0: arguments code, runtime and version are deprecated in favor of the standard approach of using a UDF object in the process argument. See openeo.UDF API and usage changes in version 0.13.0 for more background about the changes.

See also

openeo.org documentation on process “apply_dimension”.

apply_kernel(kernel, factor=1.0, border=0, replace_invalid=0)[source]

Applies a focal operation based on a weighted kernel to each value of the specified dimensions in the data cube.

The border parameter determines how the data is extended when the kernel overlaps with the borders. The following options are available:

  • numeric value - fill with a user-defined constant number n: nnnnnn|abcdefgh|nnnnnn (default, with n = 0)

  • replicate - repeat the value from the pixel at the border: aaaaaa|abcdefgh|hhhhhh

  • reflect - mirror/reflect from the border: fedcba|abcdefgh|hgfedc

  • reflect_pixel - mirror/reflect from the center of the pixel at the border: gfedcb|abcdefgh|gfedcb

  • wrap - repeat/wrap the image: cdefgh|abcdefgh|abcdef

Parameters:
  • kernel (Union[ndarray, List[List[float]]]) – The kernel to be applied on the data cube. The kernel has to have as many dimensions as the data cube has dimensions.

  • factor – A factor that is multiplied to each value computed by the focal operation. This is basically a shortcut for explicitly multiplying each value by a factor afterwards, which is often required for some kernel-based algorithms such as the Gaussian blur.

  • border – Determines how the data is extended when the kernel overlaps with the borders. Defaults to fill the border with zeroes.

  • replace_invalid – This parameter specifies the value to replace non-numerical or infinite numerical values with. By default, those values are replaced with zeroes.

Return type:

DataCube

Returns:

A data cube with the newly computed values. The resolution, cardinality and the number of dimensions are the same as for the original data cube.
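
Example (a minimal sketch, assuming a cube with two spatial dimensions): a 3x3 box blur, using the factor to normalize the kernel weights:

>>> kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
>>> cube = cube.apply_kernel(kernel=kernel, factor=1.0 / 9)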

See also

openeo.org documentation on process “apply_kernel”.

apply_neighborhood(process, size, overlap=None, context=None)[source]

Applies a focal process to a data cube.

A focal process is a process that works on a ‘neighbourhood’ of pixels. The neighbourhood can extend into multiple dimensions; this extent is specified by the size argument. It is not only (part of) the size of the input window, but also the size of the output for a given position of the sliding window. The sliding window moves with multiples of size.

An overlap can be specified so that neighbourhoods can have overlapping boundaries. This allows for continuity of the output. The values included in the data cube as overlap can’t be modified by the given process.

The neighbourhood size should be kept small enough to avoid exceeding computational resources, but a size that is too small will result in a larger number of process invocations, which may slow down processing. Window sizes for spatial dimensions typically are in the range of 64 to 512 pixels, while overlaps of 8 to 32 pixels are common.

The process must not add new dimensions, or remove entire dimensions, but the result can have different dimension labels.

For the special case of 2D convolution, it is recommended to use apply_kernel().

Parameters:
  • size (List[Dict])

  • overlap (List[dict])

  • process (Union[str, PGNode, Callable, UDF]) – a callback function that creates a process graph, see Processes with child “callbacks”

  • context (Optional[dict]) – Additional data to be passed to the process.
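
Example (a sketch, assuming a client-side UDF file udf_smooth.py): process in 128x128 pixel chunks with a 16 pixel overlap, with size and overlap given as lists of dimension/value/unit dictionaries:

>>> from openeo import UDF
>>> udf = UDF.from_file("udf_smooth.py")
>>> cube = cube.apply_neighborhood(
...     process=udf,
...     size=[{"dimension": "x", "value": 128, "unit": "px"},
...           {"dimension": "y", "value": 128, "unit": "px"}],
...     overlap=[{"dimension": "x", "value": 16, "unit": "px"},
...              {"dimension": "y", "value": 16, "unit": "px"}],
... )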

Return type:

DataCube

Returns:

A data cube with the newly computed values. The dimensions remain unchanged, but the dimension labels can differ.

See also

openeo.org documentation on process “apply_neighborhood”.

apply_polygon(geometries=None, process=None, mask_value=None, context=None, **kwargs)[source]

Apply a process to segments of the data cube that are defined by the given polygons. For each polygon provided, all pixels for which the point at the pixel center intersects with the polygon (as defined in the Simple Features standard by the OGC) are collected into sub data cubes. If a pixel is part of multiple of the provided polygons (e.g., when the polygons overlap), the GeometriesOverlap exception is thrown. Each sub data cube is passed individually to the given process.

Parameters:
  • geometries (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) –

    Can be provided in different ways:

    • a shapely geometry

    • a GeoJSON-style dictionary,

    • a public URL to the geometries in a vector format that is supported by the backend (also see Connection.list_file_formats()), e.g. GeoJSON, GeoParquet, etc. A load_url process will automatically be added to the process graph.

    • a path (str or Path) to a local, client-side GeoJSON file, which will be loaded automatically to get the geometries as GeoJSON construct.

    • a VectorCube instance.

    • a Parameter instance.

  • process (Union[str, PGNode, Callable, UDF]) – “child callback” function, see Processes with child “callbacks”

  • mask_value (Optional[float]) – The value used for pixels outside the polygon.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Warning

experimental process: not generally supported, API subject to change.

Changed in version 0.32.0: Argument polygons was renamed to geometries. While deprecated, the old name polygons is still supported as keyword argument for backwards compatibility.

Changed in version 0.36.0: Support passing a URL as geometries argument, which will be loaded with the load_url process.

Changed in version 0.36.0: Support for passing a backend-side path as geometries argument was removed (also see Legacy read_vector usage). Instead, it’s possible to provide a client-side path to a GeoJSON file (which will be loaded client-side to get the geometries as GeoJSON construct).

See also

openeo.org documentation on process “apply_polygon”.

ard_normalized_radar_backscatter(elevation_model=None, contributing_area=False, ellipsoid_incidence_angle=False, noise_removal=True)[source]

Computes CARD4L compliant backscatter (gamma0) from SAR input. This method is a variant of sar_backscatter(), with restricted parameters to generate backscatter according to CARD4L specifications.

Note that backscatter computation may require instrument specific metadata that is tightly coupled to the original SAR products. As a result, this process may only work in combination with loading data from specific collections, not with general data cubes.

Parameters:
  • elevation_model (str) – The digital elevation model to use. Set to None (the default) to allow the back-end to choose, which will improve portability, but reduce reproducibility.

  • contributing_area – If set to true, a DEM-based local contributing area band named contributing_area is added. The values are given in square meters.

  • ellipsoid_incidence_angle (bool) – If set to True, an ellipsoidal incidence angle band named ellipsoid_incidence_angle is added. The values are given in degrees.

  • noise_removal (bool) – If set to false, no noise removal is applied. Defaults to True, which removes noise.

Return type:

DataCube

Returns:

Backscatter values expressed as gamma0. The data returned is CARD4L compliant and contains metadata. By default, the backscatter values are given in linear scale.

See also

openeo.org documentation on process “ard_normalized_radar_backscatter”.

ard_surface_reflectance(atmospheric_correction_method, cloud_detection_method, elevation_model=None, atmospheric_correction_options=None, cloud_detection_options=None)[source]

Computes CARD4L compliant surface reflectance values from optical input.

Parameters:
  • atmospheric_correction_method (str) – The atmospheric correction method to use.

  • cloud_detection_method (str) – The cloud detection method to use.

  • elevation_model (str) – The digital elevation model to use, leave empty to allow the back-end to make a suitable choice.

  • atmospheric_correction_options (dict) – Proprietary options for the atmospheric correction method.

  • cloud_detection_options (dict) – Proprietary options for the cloud detection method.

Return type:

DataCube

Returns:

Data cube containing bottom of atmosphere reflectances with atmospheric disturbances like clouds and cloud shadows removed. The data returned is CARD4L compliant and contains metadata.

See also

openeo.org documentation on process “ard_surface_reflectance”.

atmospheric_correction(method=None, elevation_model=None, options=None)[source]

Applies an atmospheric correction that converts top of atmosphere reflectance values into bottom of atmosphere/top of canopy reflectance values.

Note that multiple atmospheric correction methods exist, but they may not all be supported by every backend. The method parameter gives you the option of requiring a specific method, but this may result in an error if the backend does not support it.

Parameters:
  • method (str) – The atmospheric correction method to use. To get reproducible results, you have to set a specific method. Set to null to allow the back-end to choose, which will improve portability, but reduce reproducibility as you may get different results if you run the processes multiple times.

  • elevation_model (str) – The digital elevation model to use, leave empty to allow the back-end to make a suitable choice.

  • options (dict) – Proprietary options for the atmospheric correction method.

Return type:

DataCube

Returns:

datacube with bottom of atmosphere reflectances

See also

openeo.org documentation on process “atmospheric_correction”.

band(band)[source]

Filter out a single band

Parameters:

band (Union[str, int]) – band name, band common name or band index.

Return type:

DataCube

Returns:

a DataCube instance
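
Example (the band name is hypothetical):

>>> red = cube.band("B04")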

band_filter(bands)
Return type:

DataCube

Deprecated since version 0.1.0: Usage of this legacy method is deprecated. Use filter_bands() instead.

chunk_polygon(chunks, process, mask_value=None, context=None)[source]
Return type:

DataCube

Deprecated since version 0.26.0: Use apply_polygon().

count_time()[source]

Counts the number of images with a valid mask in a time series for all bands of the input dataset.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “count”.

classmethod create_collection(collection_id, connection=None, spatial_extent=None, temporal_extent=None, bands=None, fetch_metadata=True, properties=None, max_cloud_cover=None)
Return type:

DataCube

Deprecated since version 0.4.6: Usage of this legacy class method is deprecated. Use load_collection() instead.

create_job(out_format=None, *, title=None, description=None, plan=None, budget=None, additional=None, job_options=None, validate=None, auto_add_save_result=True, **format_options)[source]

Sends the datacube’s process graph as a batch job to the back-end and returns a BatchJob instance.

Note that the batch job will just be created at the back-end, it still needs to be started and tracked explicitly. Use execute_batch() instead to have the openEO Python client take care of that job management.

Parameters:
  • out_format (Optional[str]) – output file format.

  • title (Optional[str]) – job title

  • description (Optional[str]) – job description

  • plan (Optional[str]) – The billing plan to process and charge the job with

  • budget (Optional[float]) – Maximum budget to be spent on executing the job. Note that some backends do not honor this limit.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_add_save_result (bool) – Automatically add a save_result node to the process graph if there is none yet.

Return type:

BatchJob

Returns:

Created job.

Added in version 0.32.0: Added auto_add_save_result option

Added in version 0.36.0: Added additional argument.

dimension_labels(dimension)[source]

Gives all labels for a dimension in the data cube. The labels have the same order as in the data cube.

Parameters:

dimension (str) – The name of the dimension to get the labels for.

Return type:

DataCube

See also

openeo.org documentation on process “dimension_labels”.

divide(other, reverse=False)[source]
Return type:

DataCube

See also

openeo.org documentation on process “divide”.

download(outputfile=None, format=None, options=None, *, validate=None, auto_add_save_result=True, additional=None, job_options=None)[source]

Execute synchronously and download the raster data cube, e.g. as GeoTIFF.

If outputfile is provided, the result is stored on disk locally; otherwise, a bytes object is returned. The bytes object can be passed on to a suitable decoder for decoding.

Parameters:
  • outputfile (Union[str, Path, None]) – Optional, an output file if the result needs to be stored on disk.

  • format (Optional[str]) – Optional, an output format supported by the backend.

  • options (Optional[dict]) – Optional, file format options

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_add_save_result (bool) – Automatically add a save_result node to the process graph if there is none yet.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

Return type:

Optional[bytes]

Returns:

None if the result is stored to disk, or a bytes object returned by the backend.
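
Example (a minimal sketch):

>>> cube.download("result.tif", format="GTiff")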

Changed in version 0.32.0: Added auto_add_save_result option

Added in version 0.36.0: Added arguments additional and job_options.

drop_dimension(name)[source]

Drops a dimension from the data cube. Dropping a dimension only works on dimensions with a single dimension label left, otherwise the process fails with a DimensionLabelCountMismatch exception. Dimension values can be reduced to a single value with a filter such as filter_bands or the reduce_dimension process. If a dimension with the specified name does not exist, the process fails with a DimensionNotAvailable exception.

Parameters:

name (str) – The name of the dimension to drop

Returns:

The data cube with the given dimension dropped.

See also

openeo.org documentation on process “drop_dimension”.

execute(*, validate=None, auto_decode=True)[source]

Execute a process graph synchronously and return the result. If the result is a JSON object, it will be parsed.

Parameters:
  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_decode (bool) – Boolean flag to enable/disable automatic JSON decoding of the response. Defaults to True.

Return type:

Union[dict, Response]

Returns:

parsed JSON response as a dict if auto_decode is True, otherwise response object

execute_batch(outputfile=None, out_format=None, *, title=None, description=None, plan=None, budget=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, additional=None, job_options=None, validate=None, auto_add_save_result=True, **format_options)[source]

Evaluate the process graph by creating a batch job, and retrieving the results when it is finished. This method is mostly recommended if the batch job is expected to run in a reasonable amount of time.

For very long-running jobs, you probably do not want to keep the client running.

Parameters:
  • outputfile (Union[str, Path, None]) – The path of a file to which a result can be written

  • out_format (Optional[str]) – (optional) File format to use for the job result.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_add_save_result (bool) – Automatically add a save_result node to the process graph if there is none yet.

Return type:

BatchJob
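
Example (a minimal sketch):

>>> job = cube.execute_batch(outputfile="result.tif", title="My batch job")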

Changed in version 0.32.0: Added auto_add_save_result option

Added in version 0.36.0: Added argument additional.

static execute_local_udf(udf, datacube=None, fmt='netcdf')[source]

Deprecated since version 0.7.0: Use openeo.udf.run_code.execute_local_udf() instead

filter_bands(bands)[source]

Filter the data cube by the given bands

Parameters:

bands (Union[List[Union[str, int]], str]) – list of band names, common names or band indices. Single band name can also be given as string.

Return type:

DataCube

Returns:

a DataCube instance
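
Example (the band names are hypothetical):

>>> cube = cube.filter_bands(["B04", "B08"])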

See also

openeo.org documentation on process “filter_bands”.

filter_bbox(*args, west=None, south=None, east=None, north=None, crs=None, base=None, height=None, bbox=None)[source]

Limits the data cube to the specified bounding box.

The bounding box can be specified in multiple ways.

  • With keyword arguments:

    >>> cube.filter_bbox(west=3, south=51, east=4, north=52, crs=4326)
    
  • With a (west, south, east, north) list or tuple (note that EPSG:4326 is the default CRS, so it’s not necessary to specify it explicitly):

    >>> cube.filter_bbox([3, 51, 4, 52])
    >>> cube.filter_bbox(bbox=[3, 51, 4, 52])
    
  • With a bbox dictionary:

    >>> bbox = {"west": 3, "south": 51, "east": 4, "north": 52, "crs": 4326}
    >>> cube.filter_bbox(bbox)
    >>> cube.filter_bbox(bbox=bbox)
    >>> cube.filter_bbox(**bbox)
    
  • With a shapely geometry (of which the bounding box will be used):

    >>> cube.filter_bbox(geometry)
    >>> cube.filter_bbox(bbox=geometry)
    
  • Passing a parameter:

    >>> bbox_param = Parameter(name="my_bbox", schema="object")
    >>> cube.filter_bbox(bbox_param)
    >>> cube.filter_bbox(bbox=bbox_param)
    
  • With a CRS other than EPSG 4326:

    >>> cube.filter_bbox(
    ... west=652000, east=672000, north=5161000, south=5181000,
    ... crs=32632
    ... )
    
  • Deprecated: positional arguments are also supported, but follow a non-standard order for legacy reasons:

    >>> west, east, north, south = 3, 4, 52, 51
    >>> cube.filter_bbox(west, east, north, south)
    
Parameters:

crs (Union[int, str, None]) – value describing the coordinate reference system. Typically just an int (interpreted as EPSG code, e.g. 4326) or a string (handled as authority string, e.g. "EPSG:4326"). See openeo.util.normalize_crs() for more details about additional normalization that is applied to this argument.

Return type:

DataCube

See also

openeo.org documentation on process “filter_bbox”.

filter_labels(condition, dimension, context=None)[source]

Filters the dimension labels in the data cube for the given dimension. Only the dimension labels that match the specified condition are preserved, all other labels with their corresponding data get removed.

Parameters:
  • condition (Union[PGNode, Callable]) – the “child callback” which will be given a single label value (number or string) and returns a boolean expressing if the label should be preserved. Also see Processes with child “callbacks”.

  • dimension (str) – The name of the dimension to filter on.

Return type:

DataCube

Added in version 0.27.0.

See also

openeo.org documentation on process “filter_labels”.

filter_spatial(geometries)[source]

Limits the data cube over the spatial dimensions to the specified geometries.

  • For polygons, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).

  • For points, the process considers the closest pixel center.

  • For lines (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.

More specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. All pixels inside the bounding box that are not retained will be set to null (no data).

Parameters:

geometries (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) –

One or more geometries used for filtering. Can be provided in different ways:

  • a shapely geometry

  • a GeoJSON-style dictionary,

  • a public URL to the geometries in a vector format that is supported by the backend (also see Connection.list_file_formats()), e.g. GeoJSON, GeoParquet, etc. A load_url process will automatically be added to the process graph.

  • a path (str or Path) to a local, client-side GeoJSON file, which will be loaded automatically to get the geometries as GeoJSON construct.

  • a VectorCube instance.

  • a Parameter instance.

Return type:

DataCube

Returns:

A data cube restricted to the specified geometries. The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the spatial dimensions have less (or the same) dimension labels.

Changed in version 0.36.0: Support passing a URL as geometries argument, which will be loaded with the load_url process.

Changed in version 0.36.0: Support for passing a backend-side path as geometries argument was removed (also see Legacy read_vector usage). Instead, it’s possible to provide a client-side path to a GeoJSON file (which will be loaded client-side to get the geometries as GeoJSON construct).

See also

openeo.org documentation on process “filter_spatial”.

filter_temporal(*args, start_date=None, end_date=None, extent=None)[source]

Limit the DataCube to a certain date range, which can be specified in several ways:

>>> cube.filter_temporal("2019-07-01", "2019-08-01")
>>> cube.filter_temporal(["2019-07-01", "2019-08-01"])
>>> cube.filter_temporal(extent=["2019-07-01", "2019-08-01"])
>>> cube.filter_temporal(start_date="2019-07-01", end_date="2019-08-01")

See Filter on temporal extent for more details on temporal extent handling and shorthand notation.

Parameters:
  • start_date (Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]) – start date of the filter (inclusive), as a string or date object

  • end_date (Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]) – end date of the filter (exclusive), as a string or date object

  • extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) – temporal extent. Typically, specified as a two-item list or tuple containing start and end date.

Return type:

DataCube

Changed in version 0.23.0: Arguments start_date, end_date and extent: add support for year/month shorthand notation as discussed at Year/month shorthand notation.

See also

openeo.org documentation on process “filter_temporal”.

fit_curve(parameters, function, dimension)[source]

Use non-linear least squares to fit a model function y = f(x, parameters) to data.

The process throws an InvalidValues exception if invalid values are encountered. Valid values are finite numbers (see also is_valid()).

Warning

experimental process: not generally supported, API subject to change. https://github.com/Open-EO/openeo-processes/pull/240

Parameters:

See also

openeo.org documentation on process “fit_curve”.

flat_graph()

Get the process graph in internal flat dict representation.

Return type:

Dict[str, dict]

Warning

This method is mainly intended for internal use. It is not recommended for general use and is subject to change.

Instead, it is recommended to use to_json() or print_json() to obtain a standardized, interoperable JSON representation of the process graph. See Export a process graph for more information.

flatten_dimensions(dimensions, target_dimension, label_separator=None)[source]

Combines multiple given dimensions into a single dimension by flattening the values and merging the dimension labels with the given label_separator. Non-string dimension labels will be converted to strings. This process is the opposite of the process unflatten_dimension() but executing both processes subsequently doesn’t necessarily create a data cube that is equal to the original data cube.

Parameters:
  • dimensions (List[str]) – The names of the dimensions to combine.

  • target_dimension (str) – The name of a target dimension with a single dimension label to replace.

  • label_separator (Optional[str]) – The string that will be used as a separator for the concatenated dimension labels.

Returns:

A data cube with the new shape.

Warning

experimental process: not generally supported, API subject to change.

Added in version 0.10.0.

See also

openeo.org documentation on process “flatten_dimensions”.

graph_add_node(process_id, arguments=None, metadata=None, namespace=None, **kwargs)
Return type:

DataCube

Deprecated since version 0.1.1: Usage of this legacy method is deprecated. Use process() instead.

linear_scale_range(input_min, input_max, output_min, output_max)[source]

Performs a linear transformation between the input and output range.

The given number in x is clipped to the bounds specified in inputMin and inputMax so that the underlying formula

((x - inputMin) / (inputMax - inputMin)) * (outputMax - outputMin) + outputMin

never returns any value lower than outputMin or greater than outputMax.

Potential use cases include scaling values to the 8-bit range (0 - 255) often used for numeric representation of values in one of the channels of the RGB colour model, or calculating percentages (0 - 100).

The no-data value null is passed through and therefore gets propagated.

Parameters:
  • input_min – Minimum input value

  • input_max – Maximum input value

  • output_min – Minimum value of the desired output range.

  • output_max – Maximum value of the desired output range.

Return type:

DataCube

Returns:

a DataCube instance
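
Example (a minimal sketch): scale an assumed input range of 0 - 2000 to the 8-bit range:

>>> cube = cube.linear_scale_range(0, 2000, 0, 255)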

See also

openeo.org documentation on process “linear_scale_range”.

ln()[source]
Return type:

DataCube

See also

openeo.org documentation on process “ln”.

classmethod load_collection(collection_id, connection=None, spatial_extent=None, temporal_extent=None, bands=None, fetch_metadata=True, properties=None, max_cloud_cover=None)[source]

Create a new raster data cube.

Parameters:
  • collection_id (Union[str, Parameter]) – image collection identifier

  • connection (Optional[Connection]) – The backend connection to use. Can be None to work without connection and collection metadata.

  • spatial_extent (Union[Dict[str, float], Parameter, None]) – limit data to specified bounding box or polygons

  • temporal_extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) – limit data to specified temporal interval. Typically, just a two-item list or tuple containing start and end date. See Filter on temporal extent for more details on temporal extent handling and shorthand notation.

  • bands (Union[None, List[str], Parameter]) – only add the specified bands.

  • properties (Union[None, Dict[str, Union[str, PGNode, Callable]], List[CollectionProperty], CollectionProperty]) – limit data by metadata property predicates. See collection_property() for easy construction of such predicates.

  • max_cloud_cover (Optional[float]) – shortcut to set maximum cloud cover (“eo:cloud_cover” collection property)

Return type:

DataCube

Returns:

new DataCube containing the collection
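
Example (a minimal sketch; the collection id and band names are hypothetical, and in practice this is usually called through Connection.load_collection()):

>>> cube = DataCube.load_collection(
...     "SENTINEL2_L2A",
...     connection=connection,
...     spatial_extent={"west": 3.7, "south": 51.0, "east": 3.8, "north": 51.1},
...     temporal_extent=["2021-05-01", "2021-06-01"],
...     bands=["B04", "B08"],
... )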

Changed in version 0.13.0: added the max_cloud_cover argument.

Changed in version 0.23.0: Argument temporal_extent: add support for year/month shorthand notation as discussed at Year/month shorthand notation.

Changed in version 0.26.0: Add collection_property() support to properties argument.

See also

openeo.org documentation on process “load_collection”.

classmethod load_disk_collection(connection, file_format, glob_pattern, **options)[source]

Loads image data from disk as a DataCube. This is backed by a non-standard process (‘load_disk_data’). This will eventually be replaced by standard options such as openeo.rest.connection.Connection.load_stac() or https://processes.openeo.org/#load_uploaded_files

Parameters:
  • connection (Connection) – The connection to use to connect with the backend.

  • file_format (str) – the file format, e.g. ‘GTiff’

  • glob_pattern (str) – a glob pattern that matches the files to load from disk

  • options – options specific to the file format

Return type:

DataCube

Returns:

the data as a DataCube

Deprecated since version 0.25.0: Depends on non-standard process, replace with openeo.rest.connection.Connection.load_stac() where possible.

classmethod load_stac(url, spatial_extent=None, temporal_extent=None, bands=None, properties=None, connection=None)[source]

Loads data from a static STAC catalog or a STAC API Collection and returns the data as a processable DataCube. A batch job result can be loaded by providing a reference to it.

If supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters spatial_extent, temporal_extent and bands. If no data is available for the given extents, a NoDataAvailable error is thrown.

Remarks:

  • The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the bands parameter is set to null.

  • If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.

Parameters:
  • url (str) –

    The URL to a static STAC catalog (STAC Item, STAC Collection, or STAC Catalog) or a specific STAC API Collection that allows to filter items and to download assets. This includes batch job results, which themselves are compliant with STAC. For external URLs, authentication details such as API keys or tokens may need to be included in the URL.

    Batch job results can be specified in two ways:

    • For Batch job results at the same back-end, a URL pointing to the corresponding batch job results endpoint should be provided. The URL usually ends with /jobs/{id}/results and {id} is the corresponding batch job ID.

    • For external results, a signed URL must be provided. Not all back-ends support signed URLs, which are provided as a link with the link relation canonical in the batch job result metadata.

  • spatial_extent (Union[Dict[str, float], Parameter, None]) –

    Limits the data to load to the specified bounding box or polygons.

    For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).

    For vector data, the process loads the geometry into the data cube if the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC). Empty geometries may only be in the data cube if no spatial extent has been provided.

    The GeoJSON can be one of the following feature types:

    • A Polygon or MultiPolygon geometry,

    • a Feature with a Polygon or MultiPolygon geometry, or

    • a FeatureCollection containing at least one Feature with Polygon or MultiPolygon geometries.

    Set this parameter to None to set no limit for the spatial extent. Be careful with this when loading large datasets. It is recommended to use this parameter instead of using filter_bbox() or filter_spatial() directly after loading unbounded data.

  • temporal_extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) –

    Limits the data to load to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:

    1. The first element is the start of the temporal interval. The specified instance in time is included in the interval.

    2. The second element is the end of the temporal interval. The specified instance in time is excluded from the interval.

    The second element must always be greater/later than the first element. Otherwise, a TemporalExtentEmpty exception is thrown.

    Also supports open intervals by setting one of the boundaries to None, but never both.

    Set this parameter to None to set no limit for the temporal extent. Be careful with this when loading large datasets. It is recommended to use this parameter instead of using filter_temporal() directly after loading unbounded data.

  • bands (Optional[List[str]]) –

    Only adds the specified bands into the data cube so that bands that don’t match the list of band names are not available. Applies to all dimensions of type bands.

    Either the unique band name (metadata field name in bands) or one of the common band names (metadata field common_name in bands) can be specified. If the unique band name and the common name conflict, the unique band name has a higher priority.

    The order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order.

    It is recommended to use this parameter instead of using filter_bands() directly after loading unbounded data.

  • properties (Optional[Dict[str, Union[str, PGNode, Callable]]]) –

    Limits the data by metadata properties to include only data in the data cube which all given conditions return True for (AND operation).

    Specify key-value-pairs with the key being the name of the metadata property, which can be retrieved with the openEO Data Discovery for Collections. The value must be a condition (user-defined process) to be evaluated against a STAC API. This parameter is not supported for static STAC.

  • connection (Optional[Connection]) – The connection to use to connect with the backend.

Return type:

DataCube

Added in version 0.33.0.
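
Example (a minimal sketch; the URL and band names are hypothetical):

>>> cube = DataCube.load_stac(
...     url="https://example.com/stac/collections/my-collection",
...     temporal_extent=["2021-01-01", "2021-02-01"],
...     bands=["B04", "B08"],
...     connection=connection,
... )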

log10()[source]
Return type:

DataCube

See also

openeo.org documentation on process “log”.

log2()[source]
Return type:

DataCube

See also

openeo.org documentation on process “log”.

logarithm(base)[source]
Return type:

DataCube

See also

openeo.org documentation on process “log”.

logical_and(other)[source]

Apply element-wise logical and operation

Parameters:

other (DataCube)

Return type:

DataCube

Returns:

logical_and(this, other)

See also

openeo.org documentation on process “and”.

logical_or(other)[source]

Apply element-wise logical or operation

Parameters:

other (DataCube)

Return type:

DataCube

Returns:

logical_or(this, other)

See also

openeo.org documentation on process “or”.

mask(mask=None, replacement=None)[source]

Applies a mask to a raster data cube. To apply a vector mask use mask_polygon.

A mask is a raster data cube for which corresponding pixels among data and mask are compared and those pixels in data are replaced whose pixels in mask are non-zero (for numbers) or true (for boolean values). The pixel values are replaced with the value specified for replacement, which defaults to null (no data).

Parameters:
  • mask (DataCube) – the raster mask

  • replacement – the value to replace the masked pixels with
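
Example (a sketch, assuming a compatible boolean cloud mask cube in a variable cloud_mask):

>>> masked = cube.mask(mask=cloud_mask)
>>> # Replace masked pixels with 0 instead of null
>>> masked = cube.mask(mask=cloud_mask, replacement=0)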

Return type:

DataCube

See also

openeo.org documentation on process “mask”.

mask_polygon(mask, srs=None, replacement=None, inside=None)[source]

Applies a polygon mask to a raster data cube. To apply a raster mask use mask.

All pixels for which the point at the pixel center does not intersect with any polygon (as defined in the Simple Features standard by the OGC) are replaced. This behaviour can be inverted by setting the parameter inside to true.

The pixel values are replaced with the value specified for replacement, which defaults to no data.

Parameters:
  • mask (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) –

    The geometry to mask with. Can be provided in different ways:

    • a shapely geometry

    • a GeoJSON-style dictionary,

    • a public URL to the geometries in a vector format that is supported by the backend (also see Connection.list_file_formats()), e.g. GeoJSON, GeoParquet, etc. A load_url process will automatically be added to the process graph.

    • a path (str or Path) to a local, client-side GeoJSON file, which will be loaded automatically to get the geometries as GeoJSON construct.

    • a VectorCube instance.

    • a Parameter instance.

  • srs (str) –

    The spatial reference system of the provided polygon. By default longitude-latitude (EPSG:4326) is assumed.

    Note

    this srs argument is a non-standard/experimental feature, only supported by specific back-ends. See https://github.com/Open-EO/openeo-processes/issues/235 for details.

  • replacement – the value to replace the masked pixels with

Return type:

DataCube

Changed in version 0.36.0: Support passing a URL as geometries argument, which will be loaded with the load_url process.

Changed in version 0.36.0: Support for passing a backend-side path as geometries argument was removed (also see Legacy read_vector usage). Instead, it’s possible to provide a client-side path to a GeoJSON file (which will be loaded client-side to get the geometries as GeoJSON construct).

See also

openeo.org documentation on process “mask_polygon”.

max_time()[source]

Finds the maximum value of a time series for all bands of the input dataset.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “max”.

mean_time()[source]

Finds the mean value of a time series for all bands of the input dataset.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “mean”.

median_time()[source]

Finds the median value of a time series for all bands of the input dataset.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “median”.

merge(other, overlap_resolver=None, context=None)
Return type:

DataCube

Deprecated since version 0.4.6: Usage of this legacy method is deprecated. Use merge_cubes() instead.

merge_cubes(other, overlap_resolver=None, context=None)[source]

Merging two data cubes

The data cubes have to be compatible. A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes. The process performs the join on overlapping dimensions, with the same name and type. An overlapping dimension has the same name, type, reference system and resolution in both data cubes, but can have different labels. One of the dimensions can have different labels; for all other dimensions the labels must be equal. If data overlaps, the parameter overlap_resolver must be specified to resolve the overlap.

Examples for merging two data cubes:

  1. Data cubes with the dimensions x, y, t and bands have the same dimension labels in x, y and t, but the labels for the dimension bands are B1 and B2 for the first cube and B3 and B4 for the second. An overlap resolver is not needed. The merged data cube has the dimensions x, y, t and bands and the dimension bands has four dimension labels: B1, B2, B3, B4.

  2. Data cubes with the dimensions x, y, t and bands have the same dimension labels in x,y and t, but the labels for the dimension bands are B1 and B2 for the first data cube and B2 and B3 for the second. An overlap resolver is required to resolve overlap in band B2. The merged data cube has the dimensions x, y, t and bands and the dimension bands has three dimension labels: B1, B2, B3.

  3. Data cubes with the dimensions x, y and t have the same dimension labels in x,y and t. There are two options:
    • Keep the overlapping values separately in the merged data cube: An overlap resolver is not needed, but for each data cube you need to add a new dimension using add_dimension. The new dimensions must be equal, except that the labels for the new dimensions must differ by name. The merged data cube has the same dimensions and labels as the original data cubes, plus the dimension added with add_dimension, which has the two dimension labels after the merge.

    • Combine the overlapping values into a single value: An overlap resolver is required to resolve the overlap for all pixels. The merged data cube has the same dimensions and labels as the original data cubes, but all pixel values have been processed by the overlap resolver.

  4. Merging a data cube with dimensions x, y, t with another cube with dimensions x, y will join on the x, y dimension, so the lower dimension cube is merged with each time step in the higher dimensional cube. This can for instance be used to apply a digital elevation model to a spatiotemporal data cube.

Parameters:
  • other (DataCube) – The data cube to merge with.

  • overlap_resolver (Union[str, PGNode, Callable]) – A reduction operator that resolves the conflict if the data overlaps. The reducer must return a value of the same data type as the input values are. The reduction operator may be a single process such as multiply or consist of multiple sub-processes. null (the default) can be specified if no overlap resolver is required.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Returns:

The merged data cube.
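
Example (a sketch, assuming two compatible cubes cube_a and cube_b):

>>> # Band-disjoint cubes need no overlap resolver
>>> merged = cube_a.merge_cubes(cube_b)
>>> # Resolve overlapping values with a reducer process
>>> merged = cube_a.merge_cubes(cube_b, overlap_resolver="max")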

See also

openeo.org documentation on process “merge_cubes”.

min_time()[source]

Finds the minimum value of a time series for all bands of the input dataset.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “min”.

multiply(other, reverse=False)[source]
Return type:

DataCube

See also

openeo.org documentation on process “multiply”.

ndvi(nir=None, red=None, target_band=None)[source]

Normalized Difference Vegetation Index (NDVI)

Parameters:
  • nir (str) – (optional) name of NIR band

  • red (str) – (optional) name of red band

  • target_band (str) – (optional) name of the newly created band

Return type:

DataCube

Returns:

a DataCube instance
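
A usage sketch (the collection id and the band names "B04"/"B08" are assumptions that depend on the collection's band metadata):

cube = connection.load_collection("SENTINEL2_L2A", bands=["B04", "B08"])
ndvi_cube = cube.ndvi(nir="B08", red="B04", target_band="NDVI")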

See also

openeo.org documentation on process “ndvi”.

normalized_difference(other)[source]
Return type:

DataCube

See also

openeo.org documentation on process “normalized_difference”.

polygonal_histogram_timeseries(polygon)[source]

Extract a histogram time series for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

VectorCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'histogram'.

polygonal_mean_timeseries(polygon)[source]

Extract a mean time series for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

VectorCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'mean'.
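
For migration, a minimal sketch (assuming polygon is a shapely geometry in EPSG:4326):

timeseries = cube.aggregate_spatial(geometries=polygon, reducer="mean")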

polygonal_median_timeseries(polygon)[source]

Extract a median time series for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

VectorCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'median'.

polygonal_standarddeviation_timeseries(polygon)[source]

Extract a time series of standard deviations for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

VectorCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'sd'.

power(p)[source]

See also

openeo.org documentation on process “power”.

predict_curve(parameters, function, dimension, labels=None)[source]

Predict values using a model function and pre-computed parameters.

Warning

experimental process: not generally supported, API subject to change. https://github.com/Open-EO/openeo-processes/pull/240

See also

openeo.org documentation on process “predict_curve”.

predict_random_forest(model, dimension='bands')[source]

Apply reduce_dimension process with a predict_random_forest reducer.

Parameters:
  • model (Union[str, BatchJob, MlModel]) –

    a reference to a trained model, one of

    • a MlModel instance (e.g. loaded from Connection.load_ml_model())

    • a BatchJob instance of a batch job that saved a single random forest model

    • a job id (str) of a batch job that saved a single random forest model

    • a STAC item URL (str) to load the random forest from. (The STAC Item must implement the ml-model extension.)

  • dimension (str) – dimension along which to apply the reduce_dimension process.

Added in version 0.10.0.
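
A sketch, assuming a model was trained and saved earlier by a batch job (the job id is hypothetical):

model = connection.load_ml_model("j-abc123")
predicted = cube.predict_random_forest(model=model, dimension="bands")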

See also

openeo.org documentation on process “predict_random_forest”.

preview(center=None, zoom=None)[source]

Creates a service from the process graph and displays it in a map widget. Only the XYZ service type is supported.

Parameters:
  • center (Optional[Iterable]) – (optional) Map center. Default is (0,0).

  • zoom (Optional[int]) – (optional) Zoom level of the map. Default is 1.

Returns:

ipyleaflet Map object and the displayed Service

Warning

experimental feature, subject to change.

Added in version 0.19.0.

print_json(*, file=None, indent=2, separators=None, end='\n')

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

  • end (str) – additional string to be printed at the end (newline by default).

Added in version 0.12.0.

Added in version 0.23.0: added the end argument.
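
For example:

cube.print_json()  # print to standard output
cube.print_json(file="process_graph.json")  # write to a file instead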

process(process_id, arguments=None, metadata=None, namespace=None, **kwargs)[source]

Generic helper to create a new DataCube by applying a process.

Parameters:
  • process_id (str) – process id of the process.

  • arguments (Optional[dict]) – argument dictionary for the process.

  • metadata (Optional[CollectionMetadata]) – optional: metadata to override original cube metadata (e.g. when reducing dimensions)

  • namespace (Optional[str]) – optional: process namespace

Return type:

DataCube

Returns:

new DataCube instance
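
For instance, with the THIS symbolic reference as value for the data argument (a sketch; the chosen process and arguments are illustrative):

from openeo.rest.datacube import THIS

cube = cube.process(
    process_id="ndvi",
    arguments={"data": THIS, "nir": "B08", "red": "B04"},
)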

process_with_node(pg, metadata=None)[source]

Generic helper to create a new DataCube by applying a process (given as process graph node)

Parameters:
  • pg (PGNode) – process graph node (containing process id and arguments)

  • metadata (Optional[CollectionMetadata]) – optional: metadata to override original cube metadata (e.g. when reducing dimensions)

Return type:

DataCube

Returns:

new DataCube instance

raster_to_vector()[source]

Converts this raster data cube into a VectorCube. Bounding polygons of homogeneous pixel areas are constructed.

Warning

experimental process: not generally supported, API subject to change.

Return type:

VectorCube

Returns:

a VectorCube

reduce_bands(reducer)[source]

Shortcut for reduce_dimension() along the band dimension

Parameters:

reducer (Union[str, PGNode, Callable, UDF]) – “child callback” function, see Processes with child “callbacks”

Return type:

DataCube

reduce_bands_udf(code, runtime=None, version=None)[source]

Use the reduce_dimension process with a given UDF along the band/spectral dimension.

Return type:

DataCube

Deprecated since version 0.13.0: Use reduce_bands() with UDF as reducer.

reduce_dimension(dimension, reducer, context=None, process_id='reduce_dimension', band_math_mode=False)[source]

Add a reduce process with given reducer callback along given dimension

Parameters:
  • dimension (str) – the label of the dimension to reduce

  • reducer (Union[str, Callable, UDF, PGNode]) –

    the “child callback”: the name of a single openEO process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives an array of numerical values and returns a single numerical value. For example, see the sketch after this parameter list.

  • context (Optional[dict]) – Additional data to be passed to the process.
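
A sketch of both reducer styles (the dimension name "t" assumes the cube has a temporal dimension):

# Reference a single openEO process by name:
cube = cube.reduce_dimension(dimension="t", reducer="mean")
# Or pass a callback function operating on the array of values:
cube = cube.reduce_dimension(dimension="t", reducer=lambda data: data.mean())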

Return type:

DataCube

See also

openeo.org documentation on process “reduce_dimension”.

reduce_spatial(reducer, context=None)[source]

Add a reduce process with given reducer callback along the spatial dimensions

Parameters:
  • reducer (Union[str, Callable, UDF, PGNode]) –

    the “child callback”: the name of a single openEO process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives an array of numerical values and returns a single numerical value. For example, see the sketch after this parameter list.

  • context (Optional[dict]) – Additional data to be passed to the process.
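
A minimal sketch, reducing the x and y dimensions with the "mean" process:

cube = cube.reduce_spatial(reducer="mean")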

Return type:

DataCube

See also

openeo.org documentation on process “reduce_spatial”.

reduce_temporal(reducer)[source]

Shortcut for reduce_dimension() along the temporal dimension

Parameters:

reducer (Union[str, PGNode, Callable, UDF]) – “child callback” function, see Processes with child “callbacks”

Return type:

DataCube

reduce_temporal_simple(reducer)
Return type:

DataCube

Deprecated since version 0.13.0: Usage of this legacy method is deprecated. Use reduce_temporal() instead.

reduce_temporal_udf(code, runtime='Python', version='latest')[source]

Apply reduce (reduce_dimension) process with given UDF along temporal dimension.

Parameters:
  • code (str) – The UDF code, compatible with the given runtime and version

  • runtime – The UDF runtime

  • version – The UDF runtime version

Deprecated since version 0.13.0: Use reduce_temporal() with UDF as reducer

reduce_tiles_over_time(code, runtime='Python', version='latest')

Deprecated since version 0.1.1: Usage of this legacy method is deprecated. Use reduce_temporal_udf() instead.

rename_dimension(source, target)[source]

Renames a dimension in the data cube while preserving all other properties.

Parameters:
  • source (str) – The current name of the dimension. Fails with a DimensionNotAvailable error if the specified dimension does not exist.

  • target (str) – A new name for the dimension. Fails with a DimensionExists error if a dimension with the specified name exists.

Returns:

A new datacube with the dimension renamed.

See also

openeo.org documentation on process “rename_dimension”.

rename_labels(dimension, target, source=None)[source]

Renames the labels of the specified dimension in the data cube from source to target.

Parameters:
  • dimension (str) – Dimension name

  • target (list) – The new names for the labels.

  • source (list) – The names of the labels as they are currently in the data cube.

Return type:

DataCube

Returns:

A DataCube instance
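
For example (the band names are illustrative assumptions):

cube = cube.rename_labels(dimension="bands", target=["red", "nir"], source=["B04", "B08"])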

See also

openeo.org documentation on process “rename_labels”.

resample_cube_spatial(target, method='near')[source]

Resamples the spatial dimensions (x,y) from a source data cube to align with the corresponding dimensions of the given target data cube. Returns a new data cube with the resampled dimensions.

To resample a data cube to a specific resolution or projection regardless of an existing target data cube, refer to resample_spatial().
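
A sketch (the cube variables are hypothetical and support for the "bilinear" method depends on the back-end):

cube_20m = cube_20m.resample_cube_spatial(target=cube_10m, method="bilinear")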

Parameters:
  • target (DataCube) – A data cube that describes the spatial target resolution.

  • method (str) – Resampling method to use.

Return type:

DataCube

Returns:

A new data cube with the spatial dimensions resampled to align with the target data cube.

resample_cube_temporal(target, dimension=None, valid_within=None)[source]

Resamples one or more given temporal dimensions from a source data cube to align with the corresponding dimensions of the given target data cube using the nearest neighbor method. Returns a new data cube with the resampled dimensions.

By default, this process simply takes the nearest neighbor independent of the value (including values such as no-data / null). Depending on the data cubes this may lead to values being assigned to two target timestamps. To only consider valid values in a specific range around the target timestamps, use the parameter valid_within.

The rare case of ties is resolved by choosing the earlier timestamps.

Parameters:
  • target (DataCube) – A data cube that describes the temporal target resolution.

  • dimension (Optional[str]) – The name of the temporal dimension to resample.

  • valid_within (Optional[int])

Return type:

DataCube

Returns:

A new data cube with the temporal dimension(s) resampled to align with the target data cube.

Added in version 0.10.0.

See also

openeo.org documentation on process “resample_cube_temporal”.

resample_spatial(resolution, projection=None, method='near', align='upper-left')[source]
Return type:

DataCube
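
For instance (a sketch; the EPSG code and method are illustrative assumptions):

cube = cube.resample_spatial(resolution=10, projection=32631, method="bilinear")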

See also

openeo.org documentation on process “resample_spatial”.

resolution_merge(high_resolution_bands, low_resolution_bands, method=None)[source]

Resolution merging algorithms try to improve the spatial resolution of lower resolution bands (e.g. Sentinel-2 20M) based on higher resolution bands (e.g. Sentinel-2 10M).

External references:

Pansharpening explained

Example publication: ‘Improving the Spatial Resolution of Land Surface Phenology by Fusing Medium- and Coarse-Resolution Inputs’

Warning

experimental process: not generally supported, API subject to change.

Parameters:
  • high_resolution_bands (List[str]) – A list of band names to use as ‘high-resolution’ band. Either the unique band name (metadata field name in bands) or one of the common band names (metadata field common_name in bands). If unique band name and common name conflict, the unique band name has higher priority. The order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order. These bands will remain unmodified.

  • low_resolution_bands (List[str]) – A list of band names for which the spatial resolution should be increased. Either the unique band name (metadata field name in bands) or one of the common band names (metadata field common_name in bands). If unique band name and common name conflict, the unique band name has higher priority. The order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order. These bands will be modified by the process.

  • method (str) – The method to use. The supported algorithms can vary between back-ends. Set to null (the default) to allow the back-end to choose, which improves portability but reduces reproducibility.

Return type:

DataCube

Returns:

A datacube with the same bands and metadata as the input, but algorithmically increased spatial resolution for the selected bands.

See also

openeo.org documentation on process “resolution_merge”.

result_node()

Get the current result node (PGNode) of the process graph.

Return type:

PGNode

Added in version 0.10.1.

sar_backscatter(coefficient='gamma0-terrain', elevation_model=None, mask=False, contributing_area=False, local_incidence_angle=False, ellipsoid_incidence_angle=False, noise_removal=True, options=None)[source]

Computes backscatter from SAR input.

Note that backscatter computation may require instrument specific metadata that is tightly coupled to the original SAR products. As a result, this process may only work in combination with loading data from specific collections, not with general data cubes.
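
A sketch (the collection id is a placeholder for a SAR collection supported by the back-end):

s1 = connection.load_collection("SENTINEL1_GRD")
backscatter = s1.sar_backscatter(coefficient="sigma0-ellipsoid", mask=True)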

Parameters:
  • coefficient (Optional[str]) –

    Select the radiometric correction coefficient. The following options are available:

    • “beta0”: radar brightness

    • “sigma0-ellipsoid”: ground area computed with ellipsoid earth model

    • “sigma0-terrain”: ground area computed with terrain earth model

    • “gamma0-ellipsoid”: ground area computed with ellipsoid earth model in sensor line of sight

    • “gamma0-terrain”: ground area computed with terrain earth model in sensor line of sight (default)

    • None: non-normalized backscatter

  • elevation_model (Optional[str]) – The digital elevation model to use. Set to None (the default) to allow the back-end to choose, which will improve portability, but reduce reproducibility.

  • mask (bool) – If set to true, a data mask is added to the bands with the name mask. It indicates which values are valid (1), invalid (0) or contain no-data (null).

  • contributing_area (bool) – If set to true, a DEM-based local contributing area band named contributing_area is added. The values are given in square meters.

  • local_incidence_angle (bool) – If set to true, a DEM-based local incidence angle band named local_incidence_angle is added. The values are given in degrees.

  • ellipsoid_incidence_angle (bool) – If set to true, an ellipsoidal incidence angle band named ellipsoid_incidence_angle is added. The values are given in degrees.

  • noise_removal (bool) – If set to false, no noise removal is applied. Defaults to true, which removes noise.

  • options (Optional[dict]) – dictionary with additional (backend-specific) options.

Return type:

DataCube

Returns:

A data cube with the computed backscatter values, plus any additional bands requested via the mask, contributing_area, local_incidence_angle and ellipsoid_incidence_angle flags.

Added in version 0.4.9.

Changed in version 0.4.10: replace orthorectify and rtc arguments with coefficient.

See also

openeo.org documentation on process “sar_backscatter”.

save_result(format='GTiff', options=None)[source]
Return type:

DataCube

See also

openeo.org documentation on process “save_result”.

save_user_defined_process(user_defined_process_id, public=False, summary=None, description=None, returns=None, categories=None, examples=None, links=None)[source]

Saves this process graph in the backend as a user-defined process for the authenticated user.

Parameters:
  • user_defined_process_id (str) – unique identifier for the process

  • public (bool) – visible to other users?

  • summary (Optional[str]) – A short summary of what the process does.

  • description (Optional[str]) – Detailed description to explain the entity. CommonMark 0.29 syntax MAY be used for rich text representation.

  • returns (Optional[dict]) – Description and schema of the return value.

  • categories (Optional[List[str]]) – A list of categories.

  • examples (Optional[List[dict]]) – A list of examples.

  • links (Optional[List[dict]]) – A list of links.

Return type:

RESTUserDefinedProcess

Returns:

a RESTUserDefinedProcess instance
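
A minimal sketch (the process id and summary are illustrative):

udp = cube.save_user_defined_process(
    user_defined_process_id="my_ndvi",
    summary="Compute NDVI and reduce over time.",
)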

send_job(out_format=None, *, title=None, description=None, plan=None, budget=None, additional=None, job_options=None, validate=None, auto_add_save_result=True, **format_options)
Return type:

BatchJob

Deprecated since version 0.10.0: Usage of this legacy method is deprecated. Use create_job() instead.

subtract(other, reverse=False)[source]
Return type:

DataCube

See also

openeo.org documentation on process “subtract”.

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string

unflatten_dimension(dimension, target_dimensions, label_separator=None)[source]

Splits a single dimension into multiple dimensions by systematically extracting values and splitting the dimension labels by the given label_separator. This process is the opposite of flatten_dimensions(), but executing both processes in sequence doesn't necessarily recreate a data cube that is equal to the original one.

Parameters:
  • dimension (str) – The name of the dimension to split.

  • target_dimensions (List[str]) – The names of the target dimensions.

  • label_separator (Optional[str]) – The string that will be used as a separator to split the dimension labels.

Returns:

A data cube with the new shape.
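
A hypothetical sketch, assuming combined band labels like "B01_2021" that join a band name and a year with "_":

cube = cube.unflatten_dimension(
    dimension="bands", target_dimensions=["bands", "year"], label_separator="_"
)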

Warning

experimental process: not generally supported, API subject to change.

Added in version 0.10.0.

See also

openeo.org documentation on process “unflatten_dimension”.

validate()[source]

Validate a process graph without executing it.

Return type:

List[dict]

Returns:

list of errors (dictionaries with “code” and “message” fields)

class openeo.rest._datacube.UDF(code, runtime=None, data=None, version=None, context=None, _source=None)[source]

Helper class to load UDF code (e.g. from a file) and embed it as a “callback” or child process in a process graph.

Usage example:

udf = UDF.from_file("my-udf-code.py")
cube = cube.apply(process=udf)

Changed in version 0.13.0: Added auto-detection of runtime. Specifying the data argument is not necessary anymore, and actually deprecated. Added from_file() to simplify loading UDF code from a file. See openeo.UDF API and usage changes in version 0.13.0 for more background about the changes.

classmethod from_file(path, runtime=None, version=None, context=None)[source]

Load a UDF from a local file.

See also

from_url() for loading from a URL.

Parameters:
  • path (Union[str, Path]) – path to the local file with UDF source code

  • runtime (Optional[str]) – optional UDF runtime identifier, will be auto-detected from source code if omitted.

  • version (Optional[str]) – optional UDF runtime version string

  • context (Optional[dict]) – optional additional UDF context data

Return type:

UDF

classmethod from_url(url, runtime=None, version=None, context=None)[source]

Load a UDF from a URL.

See also

from_file() for loading from a local file.

Parameters:
  • url (str) – URL path to load the UDF source code from

  • runtime (Optional[str]) – optional UDF runtime identifier, will be auto-detected from source code if omitted.

  • version (Optional[str]) – optional UDF runtime version string

  • context (Optional[dict]) – optional additional UDF context data

Return type:

UDF

get_run_udf_callback(connection=None, data_parameter='data')[source]

For internal use: construct run_udf node to be used as callback in apply, reduce_dimension, …

Return type:

PGNode

openeo.rest.vectorcube

class openeo.rest.vectorcube.VectorCube(graph, connection, metadata=None)[source]

A Vector Cube, or ‘Vector Collection’, is a data structure containing ‘Features’: https://www.w3.org/TR/sdw-bp/#dfn-feature

The features in this cube are restricted to have a geometry. Geometries can be points, lines, polygons, etc. A geometry is specified in a ‘coordinate reference system’: https://www.w3.org/TR/sdw-bp/#dfn-coordinate-reference-system-(crs)

apply_dimension(process, dimension, target_dimension=None, context=None)[source]

Applies a process to all values along a dimension of a data cube. For example, if the temporal dimension is specified the process will work on the values of a time series.

The process to apply is specified by providing a callback function in the process argument.

Parameters:
  • process (Union[str, Callable, UDF, PGNode]) –

    the “child callback”: the name of a single process, or a callback function as discussed in Processes with child “callbacks”, or a UDF instance.

    The callback should correspond to a process that receives an array of numerical values and returns an array of numerical values. For example, see the sketch after this parameter list.

  • dimension (str) – The name of the source dimension to apply the process on. Fails with a DimensionNotAvailable error if the specified dimension does not exist.

  • target_dimension (Optional[str]) – The name of the target dimension or null (the default) to use the source dimension specified in the parameter dimension. By specifying a target dimension, the source dimension is removed. The target dimension with the specified name and the type other (see add_dimension) is created, if it doesn’t exist yet.

  • context (Optional[dict]) – Additional data to be passed to the process.
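
A sketch using the openEO "sort" process, which maps an array to an array (the dimension name is an assumption about this cube's layout):

vc = vc.apply_dimension(process="sort", dimension="properties")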

Return type:

VectorCube

Returns:

A datacube with the UDF applied to the given dimension.

Raises:

DimensionNotAvailable

Added in version 0.22.0.

See also

openeo.org documentation on process “apply_dimension”.

create_job(out_format=None, *, title=None, description=None, plan=None, budget=None, additional=None, job_options=None, validate=None, auto_add_save_result=True, **format_options)[source]

Sends a job to the backend and returns a BatchJob instance.

Parameters:
  • out_format (Optional[str]) – (optional) output format of the job result.

  • title (Optional[str]) – job title

  • description (Optional[str]) – job description

  • plan (Optional[str]) – The billing plan to process and charge the job with

  • budget (Optional[float]) – Maximum budget to be spent on executing the job. Note that some backends do not honor this limit.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

  • format_options – (optional) additional string parameters for the job result format

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_add_save_result (bool) – Automatically add a save_result node to the process graph if there is none yet.

Return type:

BatchJob

Returns:

Created job.

Changed in version 0.32.0: Added auto_add_save_result option

download(outputfile=None, format=None, options=None, *, validate=None, auto_add_save_result=True)[source]

Execute synchronously and download the vector cube.

The result is stored at the given output path, if specified. If no output path is given (or None), the raw download content is returned as a bytes object.

Parameters:
  • outputfile (Union[str, Path, None]) – (optional) output file to store the result to

  • format (Optional[str]) – (optional) output format to use.

  • options (Optional[dict]) – (optional) additional output format options.

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_add_save_result (bool) – Automatically add a save_result node to the process graph if there is none yet.

Return type:

Optional[bytes]

Changed in version 0.21.0: When not specified explicitly, output format is guessed from output file extension.

Changed in version 0.32.0: Added auto_add_save_result option

execute(*, validate=None)[source]

Executes the process graph.

Return type:

dict

execute_batch(outputfile=None, out_format=None, *, title=None, description=None, plan=None, budget=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, additional=None, job_options=None, validate=None, auto_add_save_result=True, **format_options)[source]

Evaluate the process graph by creating a batch job, and retrieving the results when it is finished. This method is mostly recommended if the batch job is expected to run in a reasonable amount of time.

For very long running jobs, you probably do not want to keep the client running.

Parameters:
  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

  • outputfile (Union[str, Path, None]) – The path of a file to which a result can be written

  • out_format (Optional[str]) – (optional) output format to use.

  • format_options – (optional) additional output format options

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_add_save_result (bool) – Automatically add a save_result node to the process graph if there is none yet.

Return type:

BatchJob

Changed in version 0.21.0: When not specified explicitly, output format is guessed from output file extension.

Changed in version 0.32.0: Added auto_add_save_result option

Added in version 0.36.0: Added argument additional.

filter_bands(bands)[source]
Return type:

VectorCube

Added in version 0.22.0.

See also

openeo.org documentation on process “filter_bands”.

filter_bbox(*, west=None, south=None, east=None, north=None, extent=None, crs=None)[source]
Return type:

VectorCube

Added in version 0.22.0.

See also

openeo.org documentation on process “filter_bbox”.

filter_labels(condition, dimension, context=None)[source]

Filters the dimension labels in the data cube for the given dimension. Only the dimension labels that match the specified condition are preserved, all other labels with their corresponding data get removed.

Parameters:
  • condition (Union[PGNode, Callable]) – the “child callback” which will be given a single label value (number or string) and returns a boolean expressing if the label should be preserved. Also see Processes with child “callbacks”.

  • dimension (str) – The name of the dimension to filter on.

Return type:

VectorCube

Added in version 0.22.0.

See also

openeo.org documentation on process “filter_labels”.

filter_vector(geometries, relation='intersects')[source]
Return type:

VectorCube

Added in version 0.22.0.

See also

openeo.org documentation on process “filter_vector”.

fit_class_random_forest(target, max_variables=None, num_trees=100, seed=None)[source]

Executes the fit of a random forest classification based on the user input of target and predictors. The Random Forest classification model is based on the approach by Breiman (2001).

Warning

EXPERIMENTAL: not generally supported, API subject to change.

Parameters:
  • target (dict) – The training sites for the classification model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).

  • max_variables (Optional[int]) – Specifies how many split variables will be used at a node. Default value is null, which corresponds to the number of predictors divided by 3.

  • num_trees (int) – The number of trees built within the Random Forest classification.

  • seed (Optional[int]) – A randomization seed to use for the random sampling in training.

Return type:

MlModel

Added in version 0.16.0: Originally added in version 0.10.0 as DataCube method, but moved to VectorCube in version 0.16.0.

See also

openeo.org documentation on process “fit_class_random_forest”.

fit_regr_random_forest(target, max_variables=None, num_trees=100, seed=None)[source]

Executes the fit of a random forest regression based on training data. The Random Forest regression model is based on the approach by Breiman (2001).

Warning

EXPERIMENTAL: not generally supported, API subject to change.

Parameters:
  • target (dict) – The training sites for the regression model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).

  • max_variables (Optional[int]) – Specifies how many split variables will be used at a node. Default value is null, which corresponds to the number of predictors divided by 3.

  • num_trees (int) – The number of trees built within the Random Forest classification.

  • seed (Optional[int]) – A randomization seed to use for the random sampling in training.

Return type:

MlModel

Added in version 0.16.0: Originally added in version 0.10.0 as DataCube method, but moved to VectorCube in version 0.16.0.

See also

openeo.org documentation on process “fit_regr_random_forest”.

flat_graph()

Get the process graph in internal flat dict representation.

Return type:

Dict[str, dict]

Warning

This method is mainly intended for internal use. It is not recommended for general use and is subject to change.

Instead, it is recommended to use to_json() or print_json() to obtain a standardized, interoperable JSON representation of the process graph. See Export a process graph for more information.

classmethod load_geojson(connection, data, properties=None)[source]

Converts GeoJSON data as defined by RFC 7946 into a vector data cube.

Parameters:
  • connection (Connection) – the connection to use to connect with the openEO back-end.

  • data (Union[dict, str, Path, BaseGeometry, Parameter]) –

    the geometry to load. One of:

    • GeoJSON-style data structure: e.g. a dictionary with "type": "Polygon" and "coordinates" fields

    • a path to a local GeoJSON file

    • a GeoJSON string

    • a shapely geometry object

  • properties (Optional[List[str]]) – A list of properties from the GeoJSON file to construct an additional dimension from.

Return type:

VectorCube

Returns:

new VectorCube instance
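
For example, with an inline GeoJSON-style dictionary (a minimal sketch):

polygon = {
    "type": "Polygon",
    "coordinates": [[[3.0, 51.0], [4.0, 51.0], [4.0, 52.0], [3.0, 52.0], [3.0, 51.0]]],
}
vc = VectorCube.load_geojson(connection, data=polygon)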

Warning

EXPERIMENTAL: this process is experimental with the potential for major things to change.

Added in version 0.22.0.

See also

openeo.org documentation on process “load_geojson”.

classmethod load_url(connection, url, format, options=None)[source]

Loads a file from a URL.

Parameters:
  • connection (Connection) – the connection to use to connect with the openEO back-end.

  • url (str) – The URL to read from. Authentication details such as API keys or tokens may need to be included in the URL.

  • format (str) – The file format to use when loading the data.

  • options (Optional[dict]) – The file format parameters to use when reading the data. Must correspond to the parameters that the server reports as supported parameters for the chosen format

Return type:

VectorCube

Returns:

new VectorCube instance

Warning

EXPERIMENTAL: this process is experimental with the potential for major things to change.

Added in version 0.22.0.

See also

openeo.org documentation on process “load_url”.

print_json(*, file=None, indent=2, separators=None, end='\n')

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

  • end (str) – additional string to be printed at the end (newline by default).

Added in version 0.12.0.

Added in version 0.23.0: added the end argument.

process(process_id, arguments=None, metadata=None, namespace=None, **kwargs)[source]

Generic helper to create a new VectorCube by applying a process.

Parameters:
  • process_id (str) – process id of the process.

  • arguments (Optional[dict]) – argument dictionary for the process.

Return type:

VectorCube

Returns:

new VectorCube instance

result_node()

Get the current result node (PGNode) of the process graph.

Return type:

PGNode

Added in version 0.10.1.

run_udf(udf, runtime=None, version=None, context=None)[source]

Run a UDF on the vector cube.

It is recommended to provide the UDF as a UDF instance; the other arguments can be used to override UDF parameters if necessary.
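
For example (the UDF file name is hypothetical):

udf = UDF.from_file("my_vector_udf.py")
vc = vc.run_udf(udf)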

Parameters:
  • udf (Union[str, UDF]) – UDF code as a string or UDF instance

  • runtime (Optional[str]) – UDF runtime

  • version (Optional[str]) – UDF version

  • context (Optional[dict]) – UDF context

Return type:

VectorCube

Warning

EXPERIMENTAL: not generally supported, API subject to change.

Added in version 0.10.0.

Changed in version 0.16.0: Added support to pass self-contained UDF instance.

See also

openeo.org documentation on process “run_udf”.

save_result(format='GeoJSON', options=None)[source]

See also

openeo.org documentation on process “save_result”.

send_job(out_format=None, *, title=None, description=None, plan=None, budget=None, additional=None, job_options=None, validate=None, auto_add_save_result=True, **format_options)
Return type:

BatchJob

Deprecated since version 0.10.0: Usage of this legacy method is deprecated. Use create_job() instead.

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string

vector_to_raster(target)[source]

Converts this vector cube (VectorCube) into a raster data cube (DataCube), using the grid of the given target cube.

Parameters:

target (DataCube) – a reference raster data cube to adopt the CRS/projection/resolution from.

Return type:

DataCube

Warning

vector_to_raster is an experimental, non-standard process. It is not widely supported, and its API is subject to change.

Added in version 0.28.0.

openeo.rest.mlmodel

class openeo.rest.mlmodel.MlModel(graph, connection)[source]

A machine learning model.

It is the result of a training procedure, e.g. output of a fit_... process, and can be used for prediction (classification or regression) with the corresponding predict_... process.

Added in version 0.10.0.

create_job(*, title=None, description=None, plan=None, budget=None, additional=None, job_options=None)[source]

Sends a job to the backend and returns a BatchJob instance.

Parameters:
  • title (Optional[str]) – job title

  • description (Optional[str]) – job description

  • plan (Optional[str]) – The billing plan to process and charge the job with

  • budget (Optional[float]) – Maximum budget to be spent on executing the job. Note that some backends do not honor this limit.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

  • format_options – (optional) additional string parameters for the job result format

Return type:

BatchJob

Returns:

Created job.

Added in version 0.36.0: Added argument additional.

execute_batch(outputfile, *, title=None, description=None, plan=None, budget=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, additional=None, job_options=None)[source]

Evaluate the process graph by creating a batch job, and retrieving the results when it is finished. This method is mostly recommended if the batch job is expected to run in a reasonable amount of time.

For very long running jobs, you probably do not want to keep the client running.

Parameters:
  • outputfile (Union[str, Path]) – The path of a file to which a result can be written

  • out_format – (optional) output format of the job result.

  • format_options – (optional) additional string parameters for the job result format

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

Return type:

BatchJob

Added in version 0.36.0: Added argument additional.

flat_graph()

Get the process graph in internal flat dict representation.

Return type:

Dict[str, dict]

Warning

This method is mainly intended for internal use. It is not recommended for general use and is subject to change.

Instead, it is recommended to use to_json() or print_json() to obtain a standardized, interoperable JSON representation of the process graph. See Export a process graph for more information.

static load_ml_model(connection, id)[source]

Loads a machine learning model from a STAC Item.

Parameters:
  • connection (Connection) – connection object

  • id (Union[str, BatchJob]) – STAC item reference, as URL, batch job (id) or user-uploaded file

Return type:

MlModel

Returns:

an MlModel instance

Added in version 0.10.0.

See also

openeo.org documentation on process “load_ml_model”.

print_json(*, file=None, indent=2, separators=None, end='\n')

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

  • end (str) – additional string to be printed at the end (newline by default).

Added in version 0.12.0.

Added in version 0.23.0: added the end argument.

result_node()

Get the current result node (PGNode) of the process graph.

Return type:

PGNode

Added in version 0.10.1.

save_ml_model(options=None)[source]

Saves a machine learning model as part of a batch job.

Parameters:

options (Optional[dict]) – Additional parameters to create the file(s).

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string

openeo.rest.multiresult

class openeo.rest.multiresult.MultiResult(leaves, connection=None)[source]

Helper to create and run batch jobs with process graphs that contain multiple result nodes or, more generally speaking, multiple process graph “leaf” nodes.

Provide multiple DataCube/VectorCube instances to the constructor, and start a batch job from that, for example as follows:

from openeo import MultiResult

cube1 = ...
cube2 = ...
multi_result = MultiResult([cube1, cube2])
job = multi_result.create_job()

Added in version 0.35.0.

__init__(leaves, connection=None)[source]

Build a MultiResult instance from multiple leaf nodes

Parameters:
  • leaves (List[FlatGraphableMixin]) – list of objects that can be converted to an openEO-style (flat) process graph representation, typically DataCube or VectorCube instances.

  • connection (Optional[Connection]) – Optional connection to use for creating/starting batch jobs, for special use cases where the provided leaf instances are not already associated with a connection.

print_json(*, file=None, indent=2, separators=None, end='\n')

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

  • end (str) – additional string to be printed at the end (newline by default).

Added in version 0.12.0.

Added in version 0.23.0: added the end argument.

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string

openeo.metadata

class openeo.metadata.BandDimension(name, bands)[source]
append_band(band)[source]

Create new BandDimension with appended band.

Return type:

BandDimension

band_index(band)[source]

Resolve a given band (common) name/index to band index

Parameters:

band (Union[int, str]) – band name, common name or index

Returns:

band index

Return type:

int

band_name(band, allow_common=True)[source]

Resolve (common) name or index to a valid (common) name

Return type:

str

filter_bands(bands)[source]

Construct new BandDimension with subset of bands, based on given band indices or (common) names

Return type:

BandDimension

rename(name)[source]

Create new dimension with new name.

Return type:

Dimension

rename_labels(target, source)[source]

Rename labels, if the type of dimension allows it.

Parameters:
  • target – List of target labels

  • source – Source labels, or empty list

Return type:

Dimension

Returns:

A new dimension with modified labels, or the same if no change is applied.

class openeo.metadata.CollectionMetadata(metadata, dimensions=None)[source]

Wrapper for EO Data Collection metadata.

Simplifies getting values from deeply nested mappings, and allows additional parsing and normalization of compatibility issues.

Metadata is expected to follow the format defined by https://openeo.org/documentation/1.0/developers/api/reference.html#operation/describe-collection (with partial support for older versions).

class openeo.metadata.SpatialDimension(name, extent, crs=4326, step=None)[source]
rename(name)[source]

Create new dimension with new name.

Return type:

Dimension

class openeo.metadata.TemporalDimension(name, extent)[source]
rename(name)[source]

Create new dimension with new name.

Return type:

Dimension

rename_labels(target, source)[source]

Rename labels, if the type of dimension allows it.

Parameters:
  • target – List of target labels

  • source – Source labels, or empty list

Return type:

Dimension

Returns:

A new dimension with modified labels, or the same if no change is applied.

openeo.api.process

class openeo.api.process.Parameter(name, description=None, schema=None, default=<object object>, optional=None)[source]

A (process) parameter to build parameterized user-defined processes.

Parameter objects can be defined with at least a name and expected schema (e.g. is the parameter a placeholder for a string, a bounding box, a date, …) and can then be used with various functions and classes, like DataCube, to build parameterized user-defined processes.

Apart from the generic Parameter constructor, this class also provides various helpers (class methods) to easily create parameters for common parameter types.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

  • schema (Union[list, dict, str, None]) – JSON schema describing the expected data type and structure of the parameter.

  • default – default value for the parameter when it’s optional.

  • optional (Optional[bool]) – toggle to indicate whether the parameter is optional or required.
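
For example, defining a required string parameter and an optional integer parameter (a sketch; the names are illustrative):

from openeo.api.process import Parameter

collection_id = Parameter(
    name="collection_id",
    description="Id of the collection to load.",
    schema={"type": "string"},
)
window = Parameter(name="window", schema={"type": "integer"}, default=3, optional=True)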

classmethod array(name, description=None, *, item_schema=None, **kwargs)[source]

Helper to easily create parameter with an ‘array’ schema.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

  • item_schema (Union[str, dict, None]) – Schema of the array items given in JSON Schema style, e.g. {"type": "string"}. Simple schemas can also be specified as a single string: e.g. "string" will be expanded to {"type": "string"}.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Changed in version 0.23.0: Added item_schema argument.
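
For instance, using the single-string shorthand for the item schema:

bands = Parameter.array("bands", description="List of band names.", item_schema="string")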

classmethod boolean(name, description=None, **kwargs)[source]

Helper to easily create a ‘boolean’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

classmethod bounding_box(name, description="Spatial extent specified as a bounding box with 'west', 'south', 'east' and 'north' fields.", **kwargs)[source]

Helper to easily create a ‘bounding box’ parameter, which allows specifying a spatial extent with “west”, “south”, “east” and “north” bounds (and optionally a CRS identifier).

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.30.0.

classmethod datacube(name='data', description='A data cube.', **kwargs)[source]

Helper to easily create a ‘datacube’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.22.0.

classmethod date(name, description='A date.', **kwargs)[source]

Helper to easily create a ‘date’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.30.0.

classmethod date_time(name, description='A date with time.', **kwargs)[source]

Helper to easily create a ‘date-time’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.30.0.

classmethod geojson(name, description='Geometries specified as GeoJSON object.', **kwargs)[source]

Helper to easily create a ‘geojson’ parameter, which allows specifying geometries as an inline GeoJSON object.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.30.0.

classmethod integer(name, description=None, **kwargs)[source]

Helper to create an ‘integer’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

classmethod number(name, description=None, **kwargs)[source]

Helper to easily create a ‘number’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

classmethod object(name, description=None, *, subtype=None, **kwargs)[source]

Helper to create an ‘object’ type parameter

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

  • subtype (Optional[str]) – subtype of the ‘object’ schema

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.26.0.

classmethod raster_cube(name='data', description='A data cube.', **kwargs)[source]

Helper to easily create a ‘raster-cube’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

classmethod spatial_extent(name='spatial_extent', description=None, **kwargs)[source]

Helper to easily create a ‘spatial_extent’ parameter, which is compatible with the load_collection argument of the same name. This makes it convenient to create user-defined processes that can be applied to a bounding box or to vector data for spatial filtering. Users can also set it to null and define spatial filtering with other processes.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.32.0.

classmethod string(name, description=None, *, values=None, subtype=None, format=None, **kwargs)[source]

Helper to easily create a ‘string’ parameter.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (Optional[str]) – human-readable description of the parameter.

  • values (Optional[List[str]]) – Optional list of allowed string values to make this an “enum”.

  • subtype (Optional[str]) – Optional subtype of the ‘string’ schema.

  • format (Optional[str]) – Optional format of the ‘string’ schema.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

classmethod temporal_interval(name='temporal_extent', description='Temporal extent specified as two-element array with start and end date/date-time.', **kwargs)[source]

Helper to easily create a ‘temporal-interval’ parameter, which allows specifying a temporal extent as a two-element array with start and end date/date-time.

Parameters:
  • name (str) – parameter name, which will be used to assign concrete values to. It is recommended to stick to the convention of snake case naming (using lowercase with underscores).

  • description (str) – human-readable description of the parameter.

Return type:

Parameter

See the generic Parameter constructor for information on additional arguments (except schema).

Added in version 0.30.0.

to_dict()[source]

Convert to dictionary for JSON-serialization.

Return type:

dict

openeo.api.logs

class openeo.api.logs.LogEntry(*args, **kwargs)[source]

Log message and info for jobs and services

Fields:
  • id: Unique ID for the log, string, REQUIRED

  • code: Error code, string, optional

  • level: Severity level, string (error, warning, info or debug), REQUIRED

  • message: Error message, string, REQUIRED

  • time: Date and time of the error event as RFC3339 date-time, string, available since API 1.1.0

  • path: A “stack trace” for the process, array of dicts

  • links: Related links, array of dicts

  • usage: Usage metrics available as property ‘usage’, dict, available since API 1.1.0. May contain the following metrics: cpu, memory, duration, network, disk, storage and other custom ones. Each of the metrics is also a dict with the following parts: value (numeric) and unit (string).

  • data: Arbitrary data the user wants to “log” for debugging purposes. Please note that this property may not exist as there’s a difference between None and non-existing. None for example refers to no-data in many cases while the absence of the property means that the user did not provide any data for debugging.

openeo.api.logs.normalize_log_level(log_level, default=10)[source]

Helper function to convert an openEO API log level (e.g. the string “error”) to the integer constants defined in Python’s standard library logging module (e.g. logging.ERROR).

Parameters:
  • log_level (Union[int, str, None]) – log level to normalize: a log level string in the style of the openEO API (“error”, “warning”, “info”, or “debug”), an integer value (e.g. a logging constant), or None.

  • default (int) – fallback log level to return on unknown log level strings or None input.

Raises:

TypeError – when log_level is any other type than str, an int or None.

Return type:

int

Returns:

One of the following log level constants from the standard module logging: logging.ERROR, logging.WARNING, logging.INFO, or logging.DEBUG .
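
A couple of illustrative calls:

>>> import logging
>>> normalize_log_level("error") == logging.ERROR
True
>>> normalize_log_level(None) == logging.DEBUG
True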

openeo.rest.connection

This module provides a Connection object to manage and persist settings when interacting with the OpenEO API.

class openeo.rest.connection.Connection(url, *, session=None, default_timeout=None, auto_validate=True, slow_response_threshold=None, auth_config=None, refresh_token_store=None, oidc_auth_renewer=None, auth=None)[source]

Connection to an openEO backend.

Parameters:
  • url (str) – Backend root url

  • session (Optional[Session]) – Optional requests.Session object to use for requests.

  • default_timeout (Optional[int]) – Default timeout for requests in seconds.

  • auto_validate (bool) – toggle to automatically validate process graphs before execution

  • slow_response_threshold (Optional[float]) – Optional threshold in seconds to consider a response as slow and log a warning.

  • auth_config (Optional[AuthConfig]) – Optional AuthConfig object to fetch authentication related configuration from.

  • refresh_token_store (Optional[RefreshTokenStore]) – For advanced usage: custom RefreshTokenStore object to use for storing/loading refresh tokens.

  • oidc_auth_renewer (Optional[OidcAuthenticator]) – For advanced usage: optional OidcAuthenticator object to use for renewing OIDC tokens.

  • auth (Optional[AuthBase]) – Optional requests.auth.AuthBase object to use for requests. Usage of this parameter is deprecated, use the specific authentication methods instead.

as_curl(data, *, path='/result', method='POST', obfuscate_auth=False, additional=None, job_options=None)[source]

Build curl command to evaluate given process graph or data cube (including authorization and content-type headers).

>>> print(connection.as_curl(cube))
curl -i -X POST -H 'Content-Type: application/json' -H 'Authorization: Bearer ...' \
    --data '{"process":{"process_graph":{...}}' \
    https://openeo.example/openeo/1.1/result
Parameters:
  • data (Union[dict, DataCube, FlatGraphableMixin]) – something that is convertible to an openEO process graph: a dictionary, a DataCube object, a ProcessBuilder, …

  • path – endpoint to send request to: typically "/result" (default) for synchronous requests or "/jobs" for batch jobs

  • method – HTTP method to use (typically "POST")

  • obfuscate_auth (bool) – don’t show actual bearer token

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

Return type:

str

Returns:

curl command as a string

Added in version 0.36.0: Added arguments additional and job_options.

assert_user_defined_process_support()[source]

Verify, based on the back-end’s capabilities document, that it supports user-defined processes.

Added in version 0.23.0.

authenticate_basic(username=None, password=None)[source]

Authenticate a user to the backend using basic username and password.

Parameters:
  • username (Optional[str]) – User name

  • password (Optional[str]) – User passphrase

Return type:

Connection

authenticate_oidc(provider_id=None, client_id=None, client_secret=None, *, store_refresh_token=True, use_pkce=None, display=<built-in function print>, max_poll_time=300)[source]

Generic method to do OpenID Connect authentication.

In the context of interactive usage, this method first tries to use refresh tokens and falls back on device code flow.

For non-interactive, machine-to-machine contexts, it is also possible to trigger the usage of the “client_credentials” flow through environment variables. Assuming you have set up an OIDC client (with a secret): set OPENEO_AUTH_METHOD to client_credentials, set OPENEO_AUTH_CLIENT_ID to the client id, and set OPENEO_AUTH_CLIENT_SECRET to the client secret, as sketched below.

See OIDC Authentication: Dynamic Method Selection for more details.
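
For example, a minimal sketch of this environment variable based setup (the client id and secret values are placeholders):

import os
import openeo

# Placeholders: use the id/secret of your own OIDC client.
os.environ["OPENEO_AUTH_METHOD"] = "client_credentials"
os.environ["OPENEO_AUTH_CLIENT_ID"] = "my-client-id"
os.environ["OPENEO_AUTH_CLIENT_SECRET"] = "my-client-secret"

connection = openeo.connect("https://openeo.example").authenticate_oidc()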

Parameters:
  • provider_id (Optional[str]) – provider id to use

  • client_id (Optional[str]) – client id to use

  • client_secret (Optional[str]) – client secret to use

  • max_poll_time (float) – maximum time in seconds to keep polling for successful authentication.

Added in version 0.6.0.

Changed in version 0.17.0: Add max_poll_time argument

Changed in version 0.18.0: Add support for client credentials flow.

authenticate_oidc_access_token(access_token, provider_id=None, *, skip_verification=False)[source]

Set up authorization headers directly with an OIDC access token.

Connection provides multiple methods to handle various OIDC authentication flows end-to-end. If you already obtained a valid OIDC access token in another “out-of-band” way, you can use this method to set up the authorization headers appropriately.

Parameters:
  • access_token (str) – OIDC access token

  • provider_id (Optional[str]) – id of the OIDC provider as listed by the openEO backend (/credentials/oidc). If not specified, the first (default) OIDC provider will be used.

  • skip_verification – Skip client-side verification of the provider_id against the back-end’s list of providers and the related OIDC configuration lookup.

Return type:

Connection

Added in version 0.31.0.

Changed in version 0.33.0: Return connection object to support chaining.

authenticate_oidc_authorization_code(client_id=None, client_secret=None, provider_id=None, timeout=None, server_address=None, webbrowser_open=None, store_refresh_token=False)[source]

OpenID Connect Authorization Code Flow (with PKCE).

Return type:

Connection

Deprecated since version 0.19.0: Usage of the Authorization Code flow is deprecated (because of its complexity) and will be removed. It is recommended to use the Device Code flow with authenticate_oidc_device() or Client Credentials flow with authenticate_oidc_client_credentials().

authenticate_oidc_client_credentials(client_id=None, client_secret=None, provider_id=None)[source]

Authenticate with OIDC Client Credentials flow

Client id, secret and provider id can be specified directly through the available arguments. It is also possible to leave these arguments empty and specify them through environment variables OPENEO_AUTH_CLIENT_ID, OPENEO_AUTH_CLIENT_SECRET and OPENEO_AUTH_PROVIDER_ID respectively as discussed in OIDC Client Credentials Using Environment Variables.

Parameters:
  • client_id (Optional[str]) – client id to use

  • client_secret (Optional[str]) – client secret to use

  • provider_id (Optional[str]) – provider id to use. Fallback value can be set through environment variable OPENEO_AUTH_PROVIDER_ID.

Return type:

Connection

Changed in version 0.18.0: Allow specifying client id, secret and provider id through environment variables.

authenticate_oidc_device(client_id=None, client_secret=None, provider_id=None, *, store_refresh_token=False, use_pkce=None, max_poll_time=300, **kwargs)[source]

Authenticate with the OIDC Device Code flow

Parameters:
  • client_id (Optional[str]) – client id to use instead of the default one

  • client_secret (Optional[str]) – client secret to use instead of the default one

  • provider_id (Optional[str]) – provider id to use. Fallback value can be set through environment variable OPENEO_AUTH_PROVIDER_ID.

  • store_refresh_token (bool) – whether to store the received refresh token automatically

  • use_pkce (Optional[bool]) – Use PKCE instead of a client secret. If not set explicitly to True (use PKCE) or False (use client secret), the best mode is detected automatically. Note that PKCE for device code is not widely supported among OIDC providers.

  • max_poll_time (float) – maximum time in seconds to keep polling for successful authentication.

Return type:

Connection

Changed in version 0.5.1: Add use_pkce argument

Changed in version 0.17.0: Add max_poll_time argument

Changed in version 0.19.0: Support fallback provider id through environment variable OPENEO_AUTH_PROVIDER_ID.

authenticate_oidc_refresh_token(client_id=None, refresh_token=None, client_secret=None, provider_id=None, *, store_refresh_token=False)[source]

Authenticate with OIDC Refresh Token flow

Parameters:
  • client_id (Optional[str]) – client id to use

  • refresh_token (Optional[str]) – refresh token to use

  • client_secret (Optional[str]) – client secret to use

  • provider_id (Optional[str]) – provider id to use. Fallback value can be set through environment variable OPENEO_AUTH_PROVIDER_ID.

  • store_refresh_token (bool) – whether to store the received refresh token automatically

Return type:

Connection

Changed in version 0.19.0: Support fallback provider id through environment variable OPENEO_AUTH_PROVIDER_ID.

authenticate_oidc_resource_owner_password_credentials(username, password, client_id=None, client_secret=None, provider_id=None, store_refresh_token=False)[source]

OpenID Connect Resource Owner Password Credentials flow.

Return type:

Connection

capabilities()[source]

Loads all available capabilities.

Return type:

RESTCapabilities

collection_items(name, spatial_extent=None, temporal_extent=None, limit=None)[source]

Loads items for a specific image collection. May not be available for all collections.

This is an experimental API and is subject to change.

Parameters:
  • name – String Id of the collection

  • spatial_extent (Optional[List[float]]) – Limits the items to the given bounding box in WGS84:

    1. Lower left corner, coordinate axis 1

    2. Lower left corner, coordinate axis 2

    3. Upper right corner, coordinate axis 1

    4. Upper right corner, coordinate axis 2

  • temporal_extent (Optional[List[Union[str, datetime]]]) – Limits the items to the specified temporal interval. The interval has to be specified as an array with exactly two elements (start, end). Also supports open intervals by setting one of the boundaries to None, but never both.

  • limit (Optional[int]) – The amount of items per request/page. If None, the back-end decides.

Return type:

Iterator[dict]

Returns:

an iterator of collection item dictionaries

create_job(process_graph, *, title=None, description=None, plan=None, budget=None, additional=None, job_options=None, validate=None)[source]

Create a new job from given process graph on the back-end.

Parameters:
  • process_graph (Union[dict, FlatGraphableMixin, str, Path, List[FlatGraphableMixin]]) – openEO-style (flat) process graph representation, or an object that can be converted to such a representation: a dictionary, a DataCube object, a string with a JSON representation, a local file path or URL to a JSON representation, a MultiResult object, …

  • title (Optional[str]) – job title

  • description (Optional[str]) – job description

  • plan (Optional[str]) – The billing plan to process and charge the job with

  • budget (Optional[float]) – Maximum budget to be spent on executing the job. Note that some backends do not honor this limit.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

Return type:

BatchJob

Returns:

Created job

Changed in version 0.35.0: Add multi-result support.

Added in version 0.36.0: Added argument job_options.
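
A short sketch, assuming cube is a DataCube built on this connection:

>>> job = connection.create_job(cube, title="My batch job")
>>> job.start()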

datacube_from_flat_graph(flat_graph, parameters=None)[source]

Construct a DataCube from a flat dictionary representation of a process graph.

Parameters:
  • flat_graph (dict) – flat dictionary representation of a process graph or a process dictionary with such a flat process graph under a “process_graph” field (and optionally parameter metadata under a “parameters” field).

  • parameters (Optional[dict]) – Optional dictionary mapping parameter names to parameter values to use for parameters occurring in the process graph (e.g. as used in user-defined processes)

Return type:

DataCube

Returns:

A DataCube corresponding with the operations encoded in the process graph

datacube_from_json(src, parameters=None)[source]

Construct a DataCube from JSON resource containing (flat) process graph representation.

Parameters:
  • src (Union[str, Path]) – raw JSON string, URL to JSON resource or path to local JSON file

  • parameters (Optional[dict]) – Optional dictionary mapping parameter names to parameter values to use for parameters occurring in the process graph (e.g. as used in user-defined processes)

Return type:

DataCube

Returns:

A DataCube corresponding with the operations encoded in the process graph

datacube_from_process(process_id, namespace=None, **kwargs)[source]

Load a data cube from a (custom) process.

Parameters:
  • process_id (str) – The process id.

  • namespace (Optional[str]) – optional: process namespace

  • kwargs – The arguments of the custom process

Return type:

DataCube

Returns:

A DataCube, without valid metadata, as the client is not aware of this custom process.
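
An illustrative sketch (process id, namespace and arguments are hypothetical):

>>> cube = connection.datacube_from_process(
...     "my_custom_process", namespace="my_namespace", size=10
... )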

describe_account()[source]

Describes the currently authenticated user account.

Return type:

dict

describe_collection(collection_id)[source]

Get full collection metadata for given collection id.

See also

list_collection_ids() to list all collection ids provided by the back-end.

Parameters:

collection_id (str) – collection id

Return type:

dict

Returns:

collection metadata.

describe_process(id, namespace=None)[source]

Returns a single process from the back end.

Parameters:
  • id (str) – The id of the process.

  • namespace (Optional[str]) – The namespace of the process.

Return type:

dict

Returns:

The process definition.

download(graph, outputfile=None, *, timeout=None, validate=None, chunk_size=10000000, additional=None, job_options=None)[source]

Downloads the result of a process graph synchronously, saving it to the given file, or returns a bytes object if no outputfile is specified. This method is useful to export binary content such as images. For JSON content, the execute() method is recommended.

Parameters:
  • graph (Union[dict, FlatGraphableMixin, str, Path, List[FlatGraphableMixin]]) – (flat) dict representing a process graph, or process graph as raw JSON string, or as local file path or URL

  • outputfile (Union[Path, str, None]) – output file

  • timeout (Optional[int]) – timeout to wait for response

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • chunk_size (int) – chunk size for streaming response.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

Return type:

Optional[bytes]

Added in version 0.36.0: Added arguments additional and job_options.

execute(process_graph, *, timeout=None, validate=None, auto_decode=True, additional=None, job_options=None)[source]

Execute a process graph synchronously and return the result. If the result is a JSON object, it will be parsed.

Parameters:
  • process_graph (Union[dict, FlatGraphableMixin, str, Path, List[FlatGraphableMixin]]) – (flat) dict representing a process graph, or process graph as raw JSON string, or as local file path or URL

  • validate (Optional[bool]) – Optional toggle to enable/prevent validation of the process graphs before execution (overruling the connection’s auto_validate setting).

  • auto_decode (bool) – Boolean flag to enable/disable automatic JSON decoding of the response. Defaults to True.

  • additional (Optional[dict]) – additional (top-level) properties to set in the request body

  • job_options (Optional[dict]) – dictionary of job options to pass to the backend (under top-level property “job_options”)

Return type:

Union[dict, Response]

Returns:

parsed JSON response as a dict if auto_decode is True, otherwise response object

Added in version 0.36.0: Added arguments additional and job_options.

get_file(path, metadata=None)[source]

Gets a handle to a user-uploaded file in the user workspace on the back-end.

Parameters:

path (Union[str, PurePosixPath]) – The path on the user workspace.

Return type:

UserFile

imagecollection(collection_id, spatial_extent=None, temporal_extent=None, bands=None, properties=None, max_cloud_cover=None, fetch_metadata=True)
Return type:

DataCube

Deprecated since version 0.4.10: Usage of this legacy method is deprecated. Use load_collection() instead.

job(job_id)[source]

Get the job based on the id. The job with the given id should already exist.

Use openeo.rest.connection.Connection.create_job() to create new jobs

Parameters:

job_id (str) – the job id of an existing job

Return type:

BatchJob

Returns:

A job object.

job_logs(job_id, offset)[source]

Get batch job logs.

Return type:

list

Deprecated since version 0.4.10: Use openeo.rest.job.BatchJob.logs() instead.

job_results(job_id)[source]

Get batch job results metadata.

Return type:

dict

Deprecated since version 0.4.10: Use openeo.rest.job.BatchJob.get_results() instead.

list_collection_ids()[source]

List all collection ids provided by the back-end.

See also

describe_collection() to get the metadata of a particular collection.

Return type:

List[str]

Returns:

list of collection ids

list_collections()[source]

List basic metadata of all collections provided by the back-end.

Caution

Only the basic collection metadata will be returned. To obtain full metadata of a particular collection, it is recommended to use describe_collection() instead.

Return type:

List[dict]

Returns:

list of dictionaries with basic collection metadata.

list_file_formats()[source]

Get available input and output formats

Return type:

dict

list_file_types()
Return type:

dict

Deprecated since version 0.4.6: Usage of this legacy method is deprecated. Use list_output_formats() instead.

list_files()[source]

Lists all user-uploaded files in the user workspace on the back-end.

Return type:

List[UserFile]

Returns:

List of the user-uploaded files.

list_jobs(limit=None)[source]

Lists all jobs of the authenticated user.

Parameters:

limit (Optional[int]) – maximum number of jobs to return. Setting this limit enables pagination.

Return type:

List[dict]

Returns:

list of dictionaries with metadata of the user’s jobs.

Added in version 0.36.0: Added limit argument

list_processes(namespace=None)[source]

Loads all available processes of the back end.

Parameters:

namespace (Optional[str]) – The namespace for which to list processes.

Return type:

List[dict]

Returns:

list of dictionaries describing all available processes of the back-end.

list_service_types()[source]

Loads all available service types.

Return type:

dict

Returns:

dictionary with all available secondary web service types.

list_services()[source]

Loads all available services of the authenticated user.

Return type:

dict

Returns:

dictionary with all secondary web services of the authenticated user.

list_udf_runtimes()[source]

List information about the available UDF runtimes.

Return type:

dict

Returns:

A dictionary with metadata about each available UDF runtime.

list_user_defined_processes()[source]

Lists all user-defined processes of the authenticated user.

Return type:

List[dict]

load_collection(collection_id, spatial_extent=None, temporal_extent=None, bands=None, properties=None, max_cloud_cover=None, fetch_metadata=True)[source]

Load a DataCube by collection id.

Parameters:
  • collection_id (Union[str, Parameter]) – image collection identifier

  • spatial_extent (Union[Dict[str, float], Parameter, None]) – limit data to specified bounding box or polygons

  • temporal_extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) – limit data to specified temporal interval. Typically, just a two-item list or tuple containing start and end date. See Filter on temporal extent for more details on temporal extent handling and shorthand notation.

  • bands (Union[None, List[str], Parameter]) – only add the specified bands.

  • properties (Union[None, Dict[str, Union[str, PGNode, Callable]], List[CollectionProperty], CollectionProperty]) – limit data by collection metadata property predicates. See collection_property() for easy construction of such predicates.

  • max_cloud_cover (Optional[float]) – shortcut to set maximum cloud cover (“eo:cloud_cover” collection property)

Return type:

DataCube

Returns:

a datacube containing the requested data

Added in version 0.13.0: added the max_cloud_cover argument.

Changed in version 0.23.0: Argument temporal_extent: add support for year/month shorthand notation as discussed at Year/month shorthand notation.

Changed in version 0.26.0: Add collection_property() support to properties argument.

See also

openeo.org documentation on process “load_collection”.
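
An illustrative sketch (collection id and band names depend on the back-end):

>>> cube = connection.load_collection(
...     "SENTINEL2_L2A",
...     spatial_extent={"west": 5.05, "south": 51.21, "east": 5.10, "north": 51.23},
...     temporal_extent=["2022-05-01", "2022-06-01"],
...     bands=["B04", "B08"],
...     max_cloud_cover=75,
... )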

load_disk_collection(format, glob_pattern, options=None)[source]

Loads image data from disk as a DataCube.

This is backed by a non-standard process (‘load_disk_data’). This will eventually be replaced by standard options such as openeo.rest.connection.Connection.load_stac() or https://processes.openeo.org/#load_uploaded_files

Parameters:
  • format (str) – the file format, e.g. ‘GTiff’

  • glob_pattern (str) – a glob pattern that matches the files to load from disk

  • options (Optional[dict]) – options specific to the file format

Return type:

DataCube

Deprecated since version 0.25.0: Depends on non-standard process, replace with openeo.rest.connection.Connection.load_stac() where possible.

load_geojson(data, properties=None)[source]

Converts GeoJSON data as defined by RFC 7946 into a vector data cube.

Parameters:
  • data (Union[dict, str, Path, BaseGeometry, Parameter]) –

    the geometry to load. One of:

    • GeoJSON-style data structure: e.g. a dictionary with "type": "Polygon" and "coordinates" fields

    • a path to a local GeoJSON file

    • a GeoJSON string

    • a shapely geometry object

  • properties (Optional[List[str]]) – A list of properties from the GeoJSON file to construct an additional dimension from.

Returns:

new VectorCube instance

Warning

EXPERIMENTAL: this process is experimental with the potential for major things to change.

Added in version 0.22.0.

See also

openeo.org documentation on process “load_geojson”.
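
An illustrative sketch with an inline GeoJSON-style polygon:

>>> polygon = {
...     "type": "Polygon",
...     "coordinates": [
...         [[5.0, 51.2], [5.1, 51.2], [5.1, 51.3], [5.0, 51.3], [5.0, 51.2]]
...     ],
... }
>>> vector_cube = connection.load_geojson(polygon)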

load_ml_model(id)[source]

Loads a machine learning model from a STAC Item.

Parameters:

id (Union[str, BatchJob]) – STAC item reference, as URL, batch job (id) or user-uploaded file

Return type:

MlModel

Added in version 0.10.0.

load_result(id, spatial_extent=None, temporal_extent=None, bands=None)[source]

Loads batch job results by job id from the server-side user workspace. The job must have been stored by the authenticated user on the back-end currently connected to.

Parameters:
  • id (str) – The id of a batch job with results.

  • spatial_extent (Optional[Dict[str, float]]) – limit data to specified bounding box or polygons

  • temporal_extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) – limit data to specified temporal interval. Typically, just a two-item list or tuple containing start and end date. See Filter on temporal extent for more details on temporal extent handling and shorthand notation.

  • bands (Optional[List[str]]) – only add the specified bands

Return type:

DataCube

Returns:

a DataCube

Changed in version 0.23.0: Argument temporal_extent: add support for year/month shorthand notation as discussed at Year/month shorthand notation.

See also

openeo.org documentation on process “load_result”.

load_stac(url, spatial_extent=None, temporal_extent=None, bands=None, properties=None)[source]

Loads data from a static STAC catalog or a STAC API Collection and returns the data as a processable DataCube. A batch job result can be loaded by providing a reference to it.

If supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters spatial_extent, temporal_extent and bands. If no data is available for the given extents, a NoDataAvailable error is thrown.

Remarks:

  • The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the bands parameter is set to None.

  • If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.

Parameters:
  • url (str) –

    The URL to a static STAC catalog (STAC Item, STAC Collection, or STAC Catalog) or a specific STAC API Collection that allows filtering items and downloading assets. This includes batch job results, which themselves are compliant with STAC. For external URLs, authentication details such as API keys or tokens may need to be included in the URL.

    Batch job results can be specified in two ways:

    • For batch job results on the same back-end, a URL pointing to the corresponding batch job results endpoint should be provided. The URL usually ends with /jobs/{id}/results, where {id} is the corresponding batch job ID.

    • For external results, a signed URL must be provided. Not all back-ends support signed URLs, which are provided as a link with the link relation canonical in the batch job result metadata.

  • spatial_extent (Union[Dict[str, float], Parameter, None]) –

    Limits the data to load to the specified bounding box or polygons.

    For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).

    For vector data, the process loads the geometry into the data cube if the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC). Empty geometries may only be in the data cube if no spatial extent has been provided.

    The GeoJSON can be one of the following feature types:

    • A Polygon or MultiPolygon geometry,

    • a Feature with a Polygon or MultiPolygon geometry, or

    • a FeatureCollection containing at least one Feature with Polygon or MultiPolygon geometries.

    Set this parameter to None to set no limit for the spatial extent. Be careful with this when loading large datasets. It is recommended to use this parameter instead of using filter_bbox() or filter_spatial() directly after loading unbounded data.

  • temporal_extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) –

    Limits the data to load to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:

    1. The first element is the start of the temporal interval. The specified instance in time is included in the interval.

    2. The second element is the end of the temporal interval. The specified instance in time is excluded from the interval.

    The second element must always be greater/later than the first element. Otherwise, a TemporalExtentEmpty exception is thrown.

    Also supports open intervals by setting one of the boundaries to None, but never both.

    Set this parameter to None to set no limit for the temporal extent. Be careful with this when loading large datasets. It is recommended to use this parameter instead of using filter_temporal() directly after loading unbounded data.

  • bands (Optional[List[str]]) –

    Only adds the specified bands into the data cube so that bands that don’t match the list of band names are not available. Applies to all dimensions of type bands.

    Either the unique band name (metadata field name in bands) or one of the common band names (metadata field common_name in bands) can be specified. If the unique band name and the common name conflict, the unique band name has a higher priority.

    The order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order.

    It is recommended to use this parameter instead of using filter_bands() directly after loading unbounded data.

  • properties (Optional[Dict[str, Union[str, PGNode, Callable]]]) –

    Limits the data by metadata properties to include only data in the data cube which all given conditions return True for (AND operation).

    Specify key-value-pairs with the key being the name of the metadata property, which can be retrieved with the openEO Data Discovery for Collections. The value must be a condition (user-defined process) to be evaluated against a STAC API. This parameter is not supported for static STAC.

Return type:

DataCube

Added in version 0.17.0.

Changed in version 0.23.0: Argument temporal_extent: add support for year/month shorthand notation as discussed at Year/month shorthand notation.

See also

openeo.org documentation on process “load_stac”.
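
An illustrative sketch (the STAC collection URL is a placeholder):

>>> cube = connection.load_stac(
...     "https://stac.example/collections/sentinel-2-l2a",
...     spatial_extent={"west": 5.05, "south": 51.21, "east": 5.10, "north": 51.23},
...     temporal_extent=["2022-05-01", "2022-06-01"],
...     bands=["B04", "B08"],
... )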

load_stac_from_job(job, spatial_extent=None, temporal_extent=None, bands=None, properties=None)[source]

Convenience function to directly load the results of a finished openEO job (as a STAC collection) with load_stac() in a new openEO process graph.

When available, the “canonical” link (signed URL) of the job results will be used.

Parameters:
  • job (Union[BatchJob, str]) – a BatchJob or job id pointing to a finished job. Note that the BatchJob approach allows pointing to a batch job on a different back-end.

  • spatial_extent (Union[Dict[str, float], Parameter, None]) – limit data to specified bounding box or polygons

  • temporal_extent (Union[Sequence[Union[str, date, Parameter, PGNode, ProcessBuilderBase, None]], Parameter, str, None]) – limit data to specified temporal interval.

  • bands (Optional[List[str]]) – limit data to the specified bands

Return type:

DataCube

Added in version 0.30.0.

load_url(url, format, options=None)[source]

Loads a file from a URL

Parameters:
  • url (str) – The URL to read from. Authentication details such as API keys or tokens may need to be included in the URL.

  • format (str) – The file format to use when loading the data.

  • options (Optional[dict]) – The file format parameters to use when reading the data. Must correspond to the parameters that the server reports as supported parameters for the chosen format

Returns:

new VectorCube instance

Warning

EXPERIMENTAL: this process is experimental with the potential for major things to change.

Added in version 0.22.0.

See also

openeo.org documentation on process “load_url”.

remove_service(service_id)[source]

Stop and remove a secondary web service.

Parameters:

service_id (str) – service identifier

Deprecated since version 0.8.0: Use openeo.rest.service.Service.delete_service() instead.

request(method, path, headers=None, auth=None, check_error=True, expected_status=None, **kwargs)[source]

Send a generic HTTP request to the back-end.

save_user_defined_process(user_defined_process_id, process_graph, parameters=None, public=False, summary=None, description=None, returns=None, categories=None, examples=None, links=None)[source]

Store a process graph and its metadata on the backend as a user-defined process for the authenticated user.

Parameters:
  • user_defined_process_id (str) – unique identifier for the user-defined process

  • process_graph (Union[dict, ProcessBuilderBase]) – a process graph

  • parameters (List[Union[dict, Parameter]]) – a list of parameters

  • public (bool) – visible to other users?

  • summary (Optional[str]) – A short summary of what the process does.

  • description (Optional[str]) – Detailed description to explain the entity. CommonMark 0.29 syntax MAY be used for rich text representation.

  • returns (Optional[dict]) – Description and schema of the return value.

  • categories (Optional[List[str]]) – A list of categories.

  • examples (Optional[List[dict]]) – A list of examples.

  • links (Optional[List[dict]]) – A list of links.

Return type:

RESTUserDefinedProcess

Returns:

a RESTUserDefinedProcess instance
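
A minimal sketch, building a simple Fahrenheit-to-Celsius conversion and storing it as a user-defined process:

>>> from openeo.api.process import Parameter
>>> from openeo.processes import divide, subtract
>>> f = Parameter.number("f", description="Degrees Fahrenheit.")
>>> fahrenheit_to_celsius = divide(x=subtract(x=f, y=32), y=1.8)
>>> connection.save_user_defined_process(
...     user_defined_process_id="fahrenheit_to_celsius",
...     process_graph=fahrenheit_to_celsius,
...     parameters=[f],
... )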

service(service_id)[source]

Get the secondary web service based on the id. The service with the given id should already exist.

Use openeo.rest.connection.Connection.create_service() to create new services

Parameters:

service_id (str) – the service id of an existing secondary web service

Return type:

Service

Returns:

A service object.

upload_file(source, target=None)[source]

Uploads a file to the given target location in the user workspace on the back-end.

If a file at the target path exists in the user workspace it will be replaced.

Parameters:
  • source (Union[Path, str]) – A path to a file on the local file system to upload.

  • target (Union[str, PurePosixPath, None]) – The desired path (which can contain a folder structure if desired) on the user workspace. If not set: defaults to the original filename (without any folder structure) of the local file.

Return type:

UserFile

user_defined_process(user_defined_process_id)[source]

Get the user-defined process based on its id. The process with the given id should already exist.

Parameters:

user_defined_process_id (str) – the id of the user-defined process

Return type:

RESTUserDefinedProcess

Returns:

a RESTUserDefinedProcess instance

user_jobs()[source]
Return type:

List[dict]

Deprecated since version 0.4.10: use list_jobs() instead

validate_process_graph(process_graph)[source]

Validate a process graph without executing it.

Parameters:

process_graph (Union[dict, FlatGraphableMixin, str, Path, List[FlatGraphableMixin]]) – openEO-style (flat) process graph representation, or an object that can be converted to such a representation: a dictionary, a DataCube object, a string with a JSON representation, a local file path or URL to a JSON representation, a MultiResult object, …

Return type:

List[dict]

Returns:

list of errors (dictionaries with “code” and “message” fields)

vectorcube_from_paths(paths, format, options={})[source]

Loads one or more files, referenced by URL or path, that are accessible by the back-end.

Parameters:
  • paths (List[str]) – The files to read.

  • format (str) – The file format to read from. It must be one of the values that the server reports as supported input file formats.

  • options (dict) – The file format parameters to be used to read the files. Must correspond to the parameters that the server reports as supported parameters for the chosen format.

Return type:

VectorCube

Returns:

A VectorCube.

Added in version 0.14.0.

classmethod version_discovery(url, session=None, timeout=None)[source]

Do automatic openEO API version discovery from given url, using a “well-known URI” strategy.

Parameters:

url (str) – initial backend url (not including “/.well-known/openeo”)

Return type:

str

Returns:

root url of highest supported backend version

version_info()[source]

List version of the openEO client, API, back-end, etc.

openeo.rest.job

class openeo.rest.job.BatchJob(job_id, connection)[source]

Handle for an openEO batch job, allowing it to describe, start, cancel, inspect results, etc.

Added in version 0.11.0: This class originally had the more cryptic name RESTJob, which is still available as legacy alias, but BatchJob is recommended since version 0.11.0.

delete()[source]

Delete this batch job.

Added in version 0.20.0: This method was previously called delete_job().

This method uses openEO endpoint DELETE /jobs/{job_id}

delete_job()

Delete this batch job.

Deprecated since version 0.20.0: Usage of this legacy method is deprecated. Use delete() instead.

describe()[source]

Get detailed metadata about a submitted batch job (title, process graph, status, progress, …).

Return type:

dict

Added in version 0.20.0: This method was previously called describe_job().

This method uses openEO endpoint GET /jobs/{job_id}

describe_job()

Get detailed metadata about a submitted batch job (title, process graph, status, progress, …).

Return type:

dict

Deprecated since version 0.20.0: Usage of this legacy method is deprecated. Use describe() instead.

download_result(target=None)[source]

Download single job result to the target file path or into folder (current working dir by default).

Fails if there are multiple result files.

Parameters:

target (Union[str, Path]) – String or path where the file should be downloaded to.

Return type:

Path

download_results(target=None)[source]

Download all job result files into given folder (current working dir by default).

The names of the files are taken directly from the backend.

Parameters:

target (Union[str, Path]) – String/path, folder where to put the result files.

Return type:

Dict[Path, dict]

Returns:

dictionary mapping the path of each downloaded file to the corresponding asset metadata

Deprecated since version 0.4.10: Instead use BatchJob.get_results() and the more flexible download functionality of JobResults

estimate()[source]

Calculate time/cost estimate for a job.

This method uses openEO endpoint GET /jobs/{job_id}/estimate

estimate_job()

Calculate time/cost estimate for a job.

Deprecated since version 0.20.0: Usage of this legacy method is deprecated. Use estimate() instead.

get_result()[source]

Deprecated since version 0.4.10: Use BatchJob.get_results() instead.

get_results()[source]

Get handle to batch job results for result metadata inspection or downloading resulting assets.

Return type:

JobResults

Added in version 0.4.10.

get_results_metadata_url(*, full=False)[source]

Get results metadata URL

Return type:

str

job_id

Unique identifier of the batch job (string).

list_results()[source]

Get batch job results metadata.

Return type:

dict

Deprecated since version 0.4.10: Use get_results() instead.

logs(offset=None, level=None)[source]

Retrieve job logs.

Parameters:
  • offset (Optional[str]) –

    The last identifier (property id of a LogEntry) the client has received.

    If provided, the back-end only sends the entries that occurred after the specified identifier. If not provided or empty, the listing starts with the first entry.

    Defaults to None.

  • level (Union[int, str, None]) –

    Minimum log level to retrieve.

    You can use either constants from Python’s standard module logging or their names (case-insensitive).

    For example:

    logging.INFO, "info" or "INFO" can all be used to show the messages for level logging.INFO and above, i.e. also logging.WARNING and logging.ERROR will be included.

    Default is to show all log levels, in other words logging.DEBUG. This is also the result when you explicitly pass log_level=None or log_level=””.

Return type:

List[LogEntry]

Returns:

A list containing the log entries for the batch job.
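
For example, to only fetch warnings and errors (a minimal sketch, assuming job is an existing BatchJob):

>>> import logging
>>> for entry in job.logs(level=logging.WARNING):
...     print(entry.level, entry.message)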

run_synchronous(outputfile=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30)[source]

Start the job, wait for it to finish and download the result.

Return type:

BatchJob

start()[source]

Start this batch job.

Return type:

BatchJob

Returns:

Started batch job

Added in version 0.20.0: This method was previously called start_job().

This method uses openEO endpoint POST /jobs/{job_id}/results

start_and_wait(print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, soft_error_max=10)[source]

Start the batch job, poll its status and wait till it finishes (or fails)

Parameters:
  • print – print/logging function to show progress/status

  • max_poll_interval (int) – maximum number of seconds to sleep between status polls

  • connection_retry_interval (int) – how long to wait when status poll failed due to connection issue

  • soft_error_max – maximum number of soft errors (e.g. temporary connection glitches) to allow

Return type:

BatchJob

start_job()

Start this batch job.

Return type:

BatchJob

Deprecated since version 0.20.0: Usage of this legacy method is deprecated. Use start() instead.

status()[source]

Get the status of the batch job

Return type:

str

Returns:

batch job status, one of “created”, “queued”, “running”, “canceled”, “finished” or “error”.

stop()[source]

Stop this batch job.

Added in version 0.20.0: This method was previously called stop_job().

This method uses openEO endpoint DELETE /jobs/{job_id}/results

stop_job()

Stop this batch job.

Deprecated since version 0.20.0: Usage of this legacy method is deprecated. Use stop() instead.

class openeo.rest.job.JobResults(job)[source]

Results of a batch job: listing of one or more output files (assets) and some metadata.

Added in version 0.4.10.
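
A typical usage sketch, starting from a finished batch job:

>>> results = job.get_results()
>>> results.download_files("./output")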

download_file(target=None, name=None)[source]

Download single asset. Can be used when there is only one asset in the JobResults, or when the desired asset name is given explicitly.

Parameters:
  • target (Union[Path, str]) – path to download to. Can be an existing directory (in which case the filename advertised by backend will be used) or full file name. By default, the working directory will be used.

  • name (str) – asset name to download (not required when there is only one asset)

Return type:

Path

Returns:

path of downloaded asset

download_files(target=None, include_stac_metadata=True)[source]

Download all assets to given folder.

Parameters:
  • target (Union[Path, str]) – path to folder to download to (must be a folder if it already exists)

  • include_stac_metadata (bool) – whether to download the job result metadata as a STAC (JSON) file.

Return type:

List[Path]

Returns:

list of paths to the downloaded assets.

get_asset(name=None)[source]

Get single asset by name or without name if there is only one.

Return type:

ResultAsset

get_assets()[source]

Get all assets from the job results.

Return type:

List[ResultAsset]

get_metadata(force=False)[source]

Get batch job results metadata (parsed JSON)

Return type:

dict

class openeo.rest.job.RESTJob(job_id, connection)[source]

Legacy alias for BatchJob.

Deprecated since version 0.11.0: Use BatchJob instead

class openeo.rest.job.ResultAsset(job, name, href, metadata)[source]

Result asset of a batch job (e.g. a GeoTIFF or JSON file)

Added in version 0.4.10.

download(target=None, *, chunk_size=10000000)[source]

Download asset to given location

Parameters:
  • target (Union[str, Path, None]) – download target path. Can be an existing folder (in which case the filename advertised by backend will be used) or full file name. By default, the working directory will be used.

  • chunk_size (int) – chunk size for streaming response.

Return type:

Path

href

Download URL of the asset.

load_bytes()[source]

Load asset in memory as raw bytes.

Return type:

bytes

load_json()[source]

Load asset in memory and parse as JSON.

Return type:

dict

metadata

Asset metadata provided by the backend, possibly containing keys “type” (for media type), “roles”, “title”, “description”.

name

Asset name as advertised by the backend.

openeo.rest.conversions

Helpers for data conversions between Python ecosystem data types and openEO data structures.

exception openeo.rest.conversions.InvalidTimeSeriesException[source]
openeo.rest.conversions.datacube_from_file(filename, fmt='netcdf')[source]
Return type:

XarrayDataCube

Deprecated since version 0.7.0: Use XarrayDataCube.from_file() instead.

openeo.rest.conversions.datacube_plot(datacube, *args, **kwargs)[source]

Deprecated since version 0.7.0: Use XarrayDataCube.plot() instead.

openeo.rest.conversions.datacube_to_file(datacube, filename, fmt='netcdf')[source]

Deprecated since version 0.7.0: Use XarrayDataCube.save_to_file() instead.

openeo.rest.conversions.timeseries_json_to_pandas(timeseries, index='date', auto_collapse=True)[source]

Convert a timeseries JSON object as returned by the aggregate_spatial process to a pandas DataFrame object

This timeseries data has three dimensions in general: date, polygon index and band index. One of these will be used as index of the resulting dataframe (as specified by the index argument), and the other two will be used as multilevel columns. When there is just a single polygon or band in play, the dataframe will be simplified by removing the corresponding dimension if auto_collapse is enabled (on by default).

Parameters:
  • timeseries (dict) – dictionary as returned by aggregate_spatial

  • index (str) – which dimension should be used for the DataFrame index: ‘date’ or ‘polygon’

  • auto_collapse – whether single band or single polygon cases should be simplified automatically

Return type:

DataFrame

Returns:

pandas DataFrame or Series

openeo.rest.udp

class openeo.rest.udp.RESTUserDefinedProcess(user_defined_process_id, connection)[source]

Wrapper for a user-defined process stored (or to be stored) on an openEO back-end

delete()[source]

Remove user-defined process from back-end

Return type:

None

describe()[source]

Get metadata of this user-defined process.

Return type:

dict

store(process_graph, parameters=None, public=False, summary=None, description=None, returns=None, categories=None, examples=None, links=None)[source]

Store a process graph and its metadata on the backend as a user-defined process

update(process_graph, parameters=None, public=False, summary=None, description=None)[source]

Deprecated since version 0.4.11: Use store instead. Method update is misleading: OpenEO API does not provide (partial) updates of user-defined processes, only fully overwriting ‘store’ operations.

openeo.rest.udp.build_process_dict(process_graph, process_id=None, summary=None, description=None, parameters=None, returns=None, categories=None, examples=None, links=None)[source]

Build a dictionary describing a process with metadata (process_graph, parameters, description, …)

Parameters:
  • process_graph (Union[dict, FlatGraphableMixin, Path, List[FlatGraphableMixin]]) – dict or builder representing a process graph

  • process_id (Optional[str]) – identifier of the process

  • summary (Optional[str]) – short summary of what the process does

  • description (Optional[str]) – detailed description

  • parameters (Optional[List[Union[dict, Parameter]]]) – list of process parameters (which have name, schema, default value, …)

  • returns (Optional[dict]) – description and schema of what the process returns

  • categories (Optional[List[str]]) – list of categories

  • examples (Optional[List[dict]]) – list of examples, may be used for unit tests

  • links (Optional[List[dict]]) – list of links related to the process

Return type:

dict

Returns:

dictionary in openEO “process graph with metadata” format

openeo.rest.userfile

class openeo.rest.userfile.UserFile(path, *, connection, metadata=None)[source]

Handle to a (user-uploaded) file in the user workspace on a openEO back-end.

delete()[source]

Delete the user-uploaded file from the user workspace on the back-end.

download(target=None)[source]

Downloads a user-uploaded file from the user workspace on the back-end locally to the given location.

Parameters:

target (Union[Path, str]) – local download target path. Can be an existing folder (in which case the file name advertised by backend will be used) or full file name. By default, the working directory will be used.

Return type:

Path

classmethod from_metadata(metadata, connection)[source]

Build UserFile from a workspace file metadata dictionary.

Return type:

UserFile

to_dict()[source]

Returns the provided metadata as dict.

Return type:

Dict[str, Any]

upload(source)[source]

Uploads a local file to the path corresponding to this UserFile in the user workspace and returns a new UserFile instance for the newly uploaded file.

Tip

Usually you’ll just need Connection.upload_file() instead of this UserFile method.

If the file exists in the user workspace it will be replaced.

Parameters:

source (Union[Path, str]) – A path to a file on the local file system to upload.

Return type:

UserFile

Returns:

new UserFile instance of the newly uploaded file

openeo.udf

class openeo.udf.udf_data.UdfData(proj=None, datacube_list=None, feature_collection_list=None, structured_data_list=None, user_context=None)[source]

Container for data passed to a user defined function (UDF)

property datacube_list: List[XarrayDataCube] | None

Get the data cube list

property feature_collection_list: List[FeatureCollection] | None

get all feature collections as list

classmethod from_dict(udf_dict)[source]

Create a UdfData object from a Python dictionary that was created from the JSON definition of the UdfData class

Parameters:

udf_dict (dict) – The dictionary that contains the udf data definition

Return type:

UdfData

get_datacube_list()[source]

Get the data cube list

Return type:

Optional[List[XarrayDataCube]]

get_feature_collection_list()[source]

get all feature collections as list

Return type:

Optional[List[FeatureCollection]]

get_structured_data_list()[source]

Get all structured data entries

Return type:

Optional[List[StructuredData]]

Returns:

A list of StructuredData objects

set_datacube_list(datacube_list)[source]

Set the data cube list

Parameters:

datacube_list (Optional[List[XarrayDataCube]]) – A list of data cubes

set_structured_data_list(structured_data_list)[source]

Set the list of structured data

Parameters:

structured_data_list (Optional[List[StructuredData]]) – A list of StructuredData objects

property structured_data_list: List[StructuredData] | None

Get all structured data entries

Returns:

A list of StructuredData objects

to_dict()[source]

Convert this UdfData object into a dictionary that can be converted into a valid JSON representation

Return type:

dict

property user_context: dict

Return the user context that was passed to the run_udf function

class openeo.udf.xarraydatacube.XarrayDataCube(array)[source]

This is a thin wrapper around xarray.DataArray providing a basic “DataCube” interface for openEO UDF usage around multi-dimensional data.

property array: DataArray

Get the xarray.DataArray that contains the data and dimension definition

classmethod from_dict(xdc_dict)[source]

Create a XarrayDataCube from a Python dictionary that was created from the JSON definition of the data cube

Parameters:

data – The dictionary that contains the data cube definition

Return type:

XarrayDataCube

classmethod from_file(path, fmt=None, **kwargs)[source]

Load data file as XarrayDataCube in memory

Parameters:
  • path (Union[str, Path]) – the file on disk

  • fmt – format to load from, e.g. “netcdf” or “json” (will be auto-detected when not specified)

Return type:

XarrayDataCube

Returns:

loaded data cube

get_array()[source]

Get the xarray.DataArray that contains the data and dimension definition

Return type:

DataArray

plot(title=None, limits=None, show_bandnames=True, show_dates=True, show_axeslabels=False, fontsize=10.0, oversample=1, cmap='RdYlBu_r', cbartext=None, to_file=None, to_show=True)[source]

Visualize a XarrayDataCube with matplotlib

Parameters:
  • title (str) – title text drawn in the top left corner (default: nothing)

  • limits – range of the contour plot as a tuple(min,max) (default: None, in which case the min/max is computed from the data)

  • show_bandnames (bool) – whether to plot the column names (default: True)

  • show_dates (bool) – whether to show the dates for each row (default: True)

  • show_axeslabels (bool) – whether to show the labels on the axes (default: False)

  • fontsize (float) – font size in pixels (default: 10)

  • oversample (float) – one value is plotted into oversample x oversample number of pixels (default: 1 which means each value is plotted as a single pixel)

  • cmap (Union[str, ‘matplotlib.colors.Colormap’]) – built-in matplotlib color map name or ColorMap object (default: RdYlBu_r which is a blue-yellow-red rainbow)

  • cbartext (str) – text on top of the legend (default: nothing)

  • to_file (str) – filename to save the image to (default: None, which means no file is generated)

  • to_show (bool) – whether to show the image in a matplotlib window (default: True)

Returns:

None

save_to_file(path, fmt=None, **kwargs)[source]

Store XarrayDataCube to file

Parameters:
  • path (Union[str, Path]) – destination file on disk

  • fmt – format to save as, e.g. “netcdf” or “json” (will be auto-detected when not specified)

to_dict()[source]

Convert this hypercube into a dictionary that can be converted into a valid JSON representation

Return type:

dict

For example, such a dictionary representation could look like:

>>> example = {
...     "id": "test_data",
...     "data": [
...         [[0.0, 0.1], [0.2, 0.3]],
...         [[0.0, 0.1], [0.2, 0.3]],
...     ],
...     "dimension": [
...         {"name": "time", "coordinates": ["2001-01-01", "2001-01-02"]},
...         {"name": "X", "coordinates": [50.0, 60.0]},
...         {"name": "Y"},
...     ],
... }
class openeo.udf.structured_data.StructuredData(data, description=None, type=None)[source]

This class represents structured data that is produced by a UDF and cannot be represented as a raster or vector data cube. For example: the result of a statistical computation.

Usage example:

>>> StructuredData([3, 5, 8, 13])
>>> StructuredData({"mean": 5, "median": 8})
>>> StructuredData([('col_1', 'col_2'), (1, 2), (2, 3)], type="table")

Note: this module was initially developed under the openeo-udf project (https://github.com/Open-EO/openeo-udf)

openeo.udf.run_code.execute_local_udf(udf, datacube, fmt='netcdf')[source]

Locally executes a user-defined function on a previously downloaded data cube.

Parameters:
  • udf (Union[str, UDF]) – the code of the user defined function

  • datacube (Union[str, DataArray, XarrayDataCube]) – the path to the downloaded data on disk, or a data cube object

  • fmt – format of the file if datacube is string

Returns:

the resulting DataCube
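
A short usage sketch (the UDF code string and the netCDF path are placeholders):

>>> from openeo.udf.run_code import execute_local_udf
>>> result = execute_local_udf(my_udf_code, "path/to/cube.nc", fmt="netcdf")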

openeo.udf.run_code.extract_udf_dependencies(udf)[source]

Extract dependencies from UDF code declared in a top-level comment block following the inline script metadata specification (PEP 723).

Basic example UDF snippet declaring expected dependencies as embedded metadata in a comment block:

# /// script
# dependencies = [
#     "geojson",
# ]
# ///

import geojson

def apply_datacube(cube: xarray.DataArray, context: dict) -> xarray.DataArray:
    ...

See also

Standard for declaring Python UDF dependencies for more in-depth information.

Parameters:

udf (Union[str, UDF]) – UDF code as a string or UDF object

Return type:

Optional[List[str]]

Returns:

List of extracted dependencies or None when no valid metadata block with dependencies was found.

Added in version 0.30.0.
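
Applied to the snippet above (assuming that code is stored in the string udf_code):

>>> extract_udf_dependencies(udf_code)
['geojson']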

Debug utilities for UDFs

openeo.udf.debug.inspect(data=None, message='', code='User', level='info')[source]

Implementation of the openEO inspect process for UDF contexts.

Note that it is up to the back-end implementation to properly capture this logging and include it in the batch job logs.

Parameters:
  • data – data to log

  • message (str) – message to send in addition to the data

  • code (str) – A label to help identify one or more log entries

  • level (str) – The severity level of this message. Allowed values: “error”, “warning”, “info”, “debug”

Added in version 0.10.1.
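
A minimal sketch of usage inside a UDF (the logged data and message are illustrative):

from openeo.udf.debug import inspect

def apply_datacube(cube, context):
    # Log the shape of the incoming data for debugging.
    inspect(data=str(cube.shape), message="cube shape", level="debug")
    return cube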

openeo.util

Various utilities and helpers.

class openeo.util.BBoxDict(*, west, south, east, north, crs=None)[source]

Dictionary based helper to easily create/work with bounding box dictionaries (having keys “west”, “south”, “east”, “north”, and optionally “crs”).

Parameters:

crs (Union[int, str, None]) – value describing the coordinate reference system. Typically just an int (interpreted as EPSG code, e.g. 4326) or a string (handled as authority string, e.g. "EPSG:4326"). See openeo.util.normalize_crs() for more details about additional normalization that is applied to this argument.

Added in version 0.10.1.

classmethod from_dict(data)[source]

Build from dictionary with at least keys “west”, “south”, “east”, and “north”.

Return type:

BBoxDict

classmethod from_sequence(seq, crs=None)[source]

Build from sequence of 4 bounds (west, south, east and north).

Return type:

BBoxDict

openeo.util.load_json_resource(src)[source]

Helper to load some kind of JSON resource

Parameters:

src (Union[str, Path]) – a JSON resource: a raw JSON string, a path to (local) JSON file, or a URL to a remote JSON resource

Return type:

dict

Returns:

data structured parsed from JSON

openeo.util.normalize_crs(crs, *, use_pyproj=True)[source]

Normalize the given value (describing a CRS or Coordinate Reference System) to an openEO compatible EPSG code (int) or WKT2 CRS string.

At minimum, the following input values are handled:

  • an integer value (e.g. 4326) is interpreted as an EPSG code

  • a string containing just an integer (e.g. "4326"), or with an additional "EPSG:" prefix (e.g. "EPSG:4326"), will also be interpreted as an EPSG value

Additional support and behavior depends on the availability of the pyproj library:

  • When available, it will be used for parsing and validation: everything supported by pyproj.CRS.from_user_input is allowed. See the pyproj docs for more details.

  • Otherwise, some best-effort validation is done: EPSG-looking integer or string values will be parsed as discussed above, other strings will be assumed to be WKT2 already, and other data structures will not be accepted.

Parameters:
  • crs (Any) – value that encodes a coordinate reference system, typically just an int (EPSG code) or string (authority string). If the pyproj library is available, everything supported by it is allowed.

  • use_pyproj (bool) – whether pyproj should be leveraged at all (mainly useful for testing the “no pyproj available” code path)

Return type:

Union[None, int, str]

Returns:

EPSG code as int, or WKT2 string. Or None if input was empty.

Raises:

ValueError – When the given CRS data can not be parsed/converted/normalized.
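
Some illustrative calls (a sketch; exact behavior for more exotic input depends on pyproj availability):

>>> normalize_crs(4326)
4326
>>> normalize_crs("EPSG:4326")
4326
>>> normalize_crs("32631")
32631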

openeo.util.to_bbox_dict(x, *, crs=None)[source]

Convert given data or object to a bounding box dictionary (having keys “west”, “south”, “east”, “north”, and optionally “crs”).

Supports various input types/formats:

  • list/tuple (assumed to be in west-south-east-north order)

    >>> to_bbox_dict([3, 50, 4, 51])
    {'west': 3, 'south': 50, 'east': 4, 'north': 51}
    
  • dictionary (unnecessary items will be stripped)

    >>> to_bbox_dict({
    ...     "color": "red", "shape": "triangle",
    ...     "west": 1, "south": 2, "east": 3, "north": 4, "crs": "EPSG:4326",
    ... })
    {'west': 1, 'south': 2, 'east': 3, 'north': 4, 'crs': 'EPSG:4326'}
    
  • a shapely geometry
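
    For example (a sketch; note that shapely bounds are floats):

    >>> from shapely.geometry import box
    >>> to_bbox_dict(box(3, 50, 4, 51))
    {'west': 3.0, 'south': 50.0, 'east': 4.0, 'north': 51.0}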

Added in version 0.10.1.

Parameters:
  • x (Any) – input data that describes west-south-east-north bounds in some way, e.g. as a dictionary, a list, a tuple, a shapely geometry, …

  • crs (Union[int, str, None]) – (optional) CRS field

Return type:

BBoxDict

Returns:

dictionary (subclass) with keys “west”, “south”, “east”, “north”, and optionally “crs”.

openeo.processes

openeo.processes.process(process_id, arguments=None, namespace=None, **kwargs)

Apply a process, using the given arguments.

Parameters:
  • process_id (str) – process id of the process.

  • arguments (dict) – argument dictionary for the process.

  • namespace (Optional[str]) – process namespace (only necessary to specify for non-predefined or non-user-defined processes)

Returns:

new ProcessBuilder instance
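
For example, to invoke the standard "ndvi" process on a data cube expression (a sketch; `cube` and the band names are hypothetical):

from openeo.processes import process

result = process("ndvi", data=cube, nir="B08", red="B04")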

Graph building

Various utilities and helpers to simplify the construction of openEO process graphs.

Public openEO process graph building utilities

class openeo.rest.graph_building.CollectionProperty(name, _builder=None)[source]

Helper object to easily create simple collection metadata property filters to be used with Connection.load_collection().

Note

This class should not be used directly by end user code. Use the collection_property() factory instead.

Warning

This is an experimental feature; naming might change.

openeo.rest.graph_building.collection_property(name)[source]

Helper to easily create simple collection metadata property filters to be used with Connection.load_collection().

Usage example:

from openeo import collection_property
...

connection.load_collection(
    ...
    properties=[
        collection_property("eo:cloud_cover") <= 75,
        collection_property("platform") == "Sentinel-2B",
    ]
)

Warning

This is an experimental feature; naming might change.

Added in version 0.26.0.

Parameters:

name (str) – name of the collection property to filter on

Return type:

CollectionProperty

Returns:

an object that supports operators like <=, == to easily build simple property filters.

Internal openEO process graph building utilities

Internal functionality for abstracting, building, manipulating and processing openEO process graphs.

class openeo.internal.graph_building.FlatGraphableMixin[source]

Mixin for classes that can be exported/converted to a “flat graph” representation of an openEO process graph.

print_json(*, file=None, indent=2, separators=None, end='\n')[source]

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

  • end (str) – additional string to be printed at the end (newline by default).

Added in version 0.12.0.

Added in version 0.23.0: added the end argument.

to_json(*, indent=2, separators=None)[source]

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string
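
A small sketch with a DataCube (which implements this mixin); the collection id and file name are hypothetical:

cube = connection.load_collection("SENTINEL2_L2A")

# Get the process graph as a JSON string.
json_str = cube.to_json(indent=2)

# Print to stdout, or write directly to a file path.
cube.print_json()
cube.print_json(file="process_graph.json")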

class openeo.internal.graph_building.PGNode(process_id, arguments=None, namespace=None, **kwargs)[source]

A process node in a process graph: has at least a process_id and arguments.

Note that a full openEO “process graph” is essentially a directed acyclic graph of nodes pointing to each other. A full process graph is practically equivalent with its “result” node, as it points (directly or indirectly) to all the other nodes it depends on.

Warning

This class is an implementation detail meant for internal use. It is not recommended for general use in normal user code. Instead, use process graph abstraction builders like Connection.load_collection(), Connection.datacube_from_process(), Connection.datacube_from_flat_graph(), Connection.datacube_from_json(), Connection.load_ml_model(), openeo.processes.process(), etc.

flat_graph()[source]

Get the process graph in internal flat dict representation.

Return type:

Dict[str, dict]

static from_flat_graph(flat_graph, parameters=None)[source]

Unflatten a given flat dict representation of a process graph and return the result node.

Return type:

PGNode

to_dict()[source]

Convert process graph to a nested dictionary structure. Uses deep copy style: nodes that are reused in the graph will be duplicated.

Return type:

dict

static to_process_graph_argument(value)[source]

Normalize the given argument to a proper “process_graph” argument, to be used as reducer/subprocess for processes like reduce_dimension, aggregate_spatial, apply, merge_cubes, resample_cube_temporal.

Return type:

dict

update_arguments(**kwargs)[source]

Add/Update arguments of the process node.

Added in version 0.10.1.

walk_nodes()[source]

Walk this node and all its parents.

Return type:

Iterator[PGNode]
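
For illustration only (given the warning above, prefer the high-level builders in user code), a sketch of manual node construction; the collection id is hypothetical:

from openeo.internal.graph_building import PGNode

load = PGNode("load_collection", id="SENTINEL2_L2A", spatial_extent=None, temporal_extent=None)
save = PGNode("save_result", data={"from_node": load}, format="GTiff")
print(save.flat_graph())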

Testing

Various utilities for testing use cases (unit tests, integration tests, benchmarking, …)

openeo.testing

Utilities for testing of openEO client workflows.

class openeo.testing.TestDataLoader(root)[source]

Helper to resolve paths to test data files, load them as JSON, optionally preprocess them, etc.

It’s intended to be used as a pytest fixture, e.g. from conftest.py:

@pytest.fixture
def test_data() -> TestDataLoader:
    return TestDataLoader(root=Path(__file__).parent / "data")

Added in version 0.30.0.

get_path(filename)[source]

Get absolute path to a test data file

Return type:

Path

load_json(filename, preprocess=None)[source]

Parse data from a test JSON file

Return type:

dict
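
Usage in a test, building on the fixture above (a sketch; the file names are hypothetical):

def test_example(test_data):
    path = test_data.get_path("graphs/basic.json")
    assert path.exists()
    data = test_data.load_json("graphs/basic.json")
    assert "process_graph" in data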

openeo.testing.results

Assert functions for comparing actual (batch job) results against expected reference data.

openeo.testing.results.assert_job_results_allclose(actual, expected, *, rtol=1e-06, atol=1e-06, tmp_path=None)[source]

Assert that two job result sets are equal (with tolerance).

Parameters:
  • actual (Union[BatchJob, JobResults, str, Path]) – actual job results, provided as BatchJob object, JobResults() object or path to directory with downloaded assets.

  • expected (Union[BatchJob, JobResults, str, Path]) – expected job results, provided as BatchJob object, JobResults() object or path to directory with downloaded assets.

  • rtol (float) – relative tolerance

  • atol (float) – absolute tolerance

  • tmp_path (Optional[Path]) – root temp path to download results if needed. It’s recommended to pass pytest’s tmp_path fixture here

Raises:

AssertionError – if not equal within the given tolerance

Added in version 0.31.0.

Warning

This function is experimental and subject to change.
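
A sketch of a typical integration test (the cube construction and reference path are hypothetical):

from openeo.testing.results import assert_job_results_allclose

def test_batch_job(tmp_path):
    job = cube.execute_batch()  # `cube` defined elsewhere
    assert_job_results_allclose(
        actual=job,
        expected="tests/data/expected/batch_job",
        rtol=1e-5,
        tmp_path=tmp_path,
    )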

openeo.testing.results.assert_xarray_allclose(actual, expected, *, rtol=1e-06, atol=1e-06)[source]

Assert that two Xarray Dataset or DataArray instances are equal (with tolerance).

Parameters:
  • actual (Union[Dataset, DataArray, str, Path]) – actual data, provided as Xarray object or path to NetCDF/GeoTIFF file.

  • expected (Union[Dataset, DataArray, str, Path]) – expected or reference data, provided as Xarray object or path to NetCDF/GeoTIFF file.

  • rtol (float) – relative tolerance

  • atol (float) – absolute tolerance

Raises:

AssertionError – if not equal within the given tolerance

Added in version 0.31.0.

Warning

This function is experimental and subject to change.

openeo.testing.results.assert_xarray_dataarray_allclose(actual, expected, *, rtol=1e-06, atol=1e-06)[source]

Assert that two Xarray DataArray instances are equal (with tolerance).

Parameters:
  • actual (Union[DataArray, str, Path]) – actual data, provided as Xarray DataArray object or path to NetCDF/GeoTIFF file.

  • expected (Union[DataArray, str, Path]) – expected or reference data, provided as Xarray DataArray object or path to NetCDF/GeoTIFF file.

  • rtol (float) – relative tolerance

  • atol (float) – absolute tolerance

Raises:

AssertionError – if not equal within the given tolerance

Added in version 0.31.0.

Warning

This function is experimental and subject to change.

openeo.testing.results.assert_xarray_dataset_allclose(actual, expected, *, rtol=1e-06, atol=1e-06)[source]

Assert that two Xarray Dataset instances are equal (with tolerance).

Parameters:
  • actual (Union[Dataset, str, Path]) – actual data, provided as Xarray Dataset object or path to NetCDF/GeoTIFF file.

  • expected (Union[Dataset, str, Path]) – expected or reference data, provided as Xarray Dataset object or path to NetCDF/GeoTIFF file.

  • rtol (float) – relative tolerance

  • atol (float) – absolute tolerance

Raises:

AssertionError – if not equal within the given tolerance

Added in version 0.31.0.

Warning

This function is experimental and subject to change.
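
For example, comparing a downloaded result against a reference file (a sketch; the paths are hypothetical):

from openeo.testing.results import assert_xarray_dataset_allclose

assert_xarray_dataset_allclose(
    actual="tmp/actual_result.nc",
    expected="tests/data/expected_result.nc",
    atol=1e-4,
)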