API

High level Interface

The high-level interface tries to provide an opinionated, Pythonic API to interact with openEO back-ends. Its aim is to hide some of the details of using a web service, so the user can produce concise and readable code.

Users that want to interact with openEO on a lower level, and have more control, can use the lower level classes.

openeo

openeo.connect(url=None, auth_type=None, auth_options=None, session=None, default_timeout=None)[source]

This method is the entry point to OpenEO. You typically create one connection object in your script or application and re-use it for all calls to that backend.

If the backend requires authentication, you can pass authentication data directly to this function, but it may be easier to authenticate as follows:

>>> # For basic authentication
>>> conn = connect(url).authenticate_basic(username="john", password="foo")
>>> # For OpenID Connect authentication
>>> conn = connect(url).authenticate_oidc(client_id="myclient")
Parameters:
  • url (Optional[str]) – The http url of the OpenEO back-end.

  • auth_type (Optional[str]) – Which authentication to use: None, “basic” or “oidc” (for OpenID Connect)

  • auth_options (Optional[dict]) – Options/arguments specific to the authentication type

  • default_timeout (Optional[int]) – default timeout (in seconds) for requests

Return type:

openeo.connections.Connection

openeo.rest.datacube

The main module for creating earth observation processes. It aims to make it easy to build complex process chains that can be evaluated by an openEO backend.

openeo.rest.datacube.THIS

Symbolic reference to the current data cube, to be used as argument in DataCube.process() calls

class openeo.rest.datacube.DataCube(graph, connection, metadata=None)[source]

Class representing an openEO (raster) data cube.

The data cube is represented by its corresponding openeo “process graph” and this process graph can be “grown” to a desired workflow by calling the appropriate methods.

add(other, reverse=False)[source]

See also

openeo.org documentation on process “add”.

Return type:

DataCube

add_dimension(name, label, type=None)[source]

Adds a new named dimension to the data cube. Afterwards, the dimension can be referenced with the specified name. If a dimension with the specified name exists, the process fails with a DimensionExists error. The dimension label of the dimension is set to the specified label.

This call does not modify the datacube in place, but returns a new datacube with the additional dimension.

Parameters:
  • name (str) – The name of the dimension to add

  • label (str) – The dimension label.

  • type (Optional[str]) – Dimension type, allowed values: ‘spatial’, ‘temporal’, ‘bands’, ‘other’, default value is ‘other’

Returns:

The data cube with a newly added dimension. The new dimension has exactly one dimension label. All other dimensions remain unchanged.

See also

openeo.org documentation on process “add_dimension”.

aggregate_spatial(geometries, reducer, target_dimension=None, crs=None, context=None)[source]

Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions.

Parameters:
  • geometries (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) – a shapely geometry, a GeoJSON-style dictionary, a public GeoJSON URL, or a path (that is valid for the back-end) to a GeoJSON file.

  • reducer (Union[str, PGNode, Callable]) – a callback function that creates a process graph, see Processes with child “callbacks”

  • target_dimension (Optional[str]) – The new dimension name to be used for storing the results.

  • crs (Optional[str]) –

    The spatial reference system of the provided polygon. By default longitude-latitude (EPSG:4326) is assumed.

    Note

    this crs argument is a non-standard/experimental feature, only supported by specific back-ends. See https://github.com/Open-EO/openeo-processes/issues/235 for details.

  • context (Optional[dict]) – Additional data to be passed to the reducer process.

See also

openeo.org documentation on process “aggregate_spatial”.

Return type:

DataCube

aggregate_temporal(intervals, reducer, labels=None, dimension=None, context=None)[source]

Computes a temporal aggregation based on an array of date and/or time intervals.

Calendar hierarchies such as year, month, week etc. must be transformed into specific intervals by the clients. For each interval, all data along the dimension will be passed through the reducer. The computed values will be projected to the labels, so the number of labels and the number of intervals need to be equal.

If the dimension is not set, the data cube is expected to only have one temporal dimension.
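The interval semantics can be illustrated with a small pure-Python sketch. It is not the client or back-end implementation: the function name `aggregate_by_intervals`, the (date, value) input format and the choice to return None for an empty interval are assumptions made for illustration only.

```python
from datetime import date
from statistics import median


def aggregate_by_intervals(series, intervals, labels, reducer):
    """Reduce (date, value) pairs per left-closed interval [start, end)."""
    if len(labels) != len(intervals):
        raise ValueError("the number of labels and intervals must be equal")
    result = {}
    for (start, end), label in zip(intervals, labels):
        # Left-closed: the start date is included, the end date is not.
        values = [value for day, value in series if start <= day < end]
        # Assumption: an interval without data yields None.
        result[label] = reducer(values) if values else None
    return result


series = [
    (date(2021, 1, 5), 1.0),
    (date(2021, 1, 20), 3.0),
    (date(2021, 2, 1), 10.0),
    (date(2021, 2, 15), 20.0),
]
intervals = [
    (date(2021, 1, 1), date(2021, 2, 1)),
    (date(2021, 2, 1), date(2021, 3, 1)),
]
monthly = aggregate_by_intervals(series, intervals, ["2021-01", "2021-02"], median)
```

Note that the observation on 2021-02-01 falls in the second interval only, because the end of the first interval is exclusive.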

Parameters:
  • intervals (List[list]) – Temporal left-closed intervals so that the start time is contained, but not the end time.

  • reducer (Union[str, PGNode, Callable]) – A reducer to be applied on all values along the specified dimension. The reducer must be a callable process (or a set of processes) that accepts an array and computes a single return value of the same type as the input values, for example median.

  • labels (Optional[List[str]]) – Labels for the intervals. The number of labels and the number of groups need to be equal.

  • dimension (Optional[str]) – The temporal dimension for aggregation. All data along the dimension will be passed through the specified reducer. If the dimension is not set, the data cube is expected to only have one temporal dimension.

  • context (Optional[dict]) – Additional data to be passed to the reducer. Not set by default.

Return type:

DataCube

Returns:

A DataCube containing a result for each time window

See also

openeo.org documentation on process “aggregate_temporal”.

aggregate_temporal_period(period, reducer, dimension=None, context=None)[source]

Computes a temporal aggregation based on calendar hierarchies such as years, months or seasons. For other calendar hierarchies aggregate_temporal can be used.

For each interval, all data along the dimension will be passed through the reducer.

If the dimension is not set or is set to null, the data cube is expected to only have one temporal dimension.

The period argument specifies the time intervals to aggregate. The following pre-defined values are available:

  • hour: Hour of the day

  • day: Day of the year

  • week: Week of the year

  • dekad: Ten day periods, counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 - end of month). The third dekad of the month can range from 8 to 11 days. For example, the fourth dekad is Feb, 1 - Feb, 10 each year.

  • month: Month of the year

  • season: Three month periods of the calendar seasons (December - February, March - May, June - August, September - November).

  • tropical-season: Six month periods of the tropical seasons (November - April, May - October).

  • year: Proleptic years

  • decade: Ten year periods (0-to-9 decade), from a year ending in a 0 to the next year ending in a 9.

  • decade-ad: Ten year periods (1-to-0 decade) better aligned with the Anno Domini (AD) calendar era, from a year ending in a 1 to the next year ending in a 0.
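As an illustration of the dekad and decade definitions above, the following sketch computes the period a date or year belongs to. The helper names `dekad_of_year` and `decade_start` are hypothetical and not part of the openEO client; back-ends may label the periods differently.

```python
from datetime import date


def dekad_of_year(day):
    """Return the 1-based dekad index (1..36) of a date within its year."""
    # Days 1-10 -> 0, days 11-20 -> 1, day 21 to end of month -> 2.
    dekad_in_month = min((day.day - 1) // 10, 2)
    return (day.month - 1) * 3 + dekad_in_month + 1


def decade_start(year, anno_domini=False):
    """First year of the decade containing `year`.

    anno_domini=False follows the 0-to-9 convention ("decade"),
    anno_domini=True follows the 1-to-0 convention ("decade-ad").
    """
    if anno_domini:
        return (year - 1) // 10 * 10 + 1
    return year // 10 * 10
```

With these definitions, 1 February and 10 February both fall in the fourth dekad, and the year 2020 belongs to the decade starting in 2020 under `decade` but to the one starting in 2011 under `decade-ad`.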

Parameters:
  • period (str) – The period of the time intervals to aggregate.

  • reducer (Union[str, PGNode, Callable]) – A reducer to be applied on all values along the specified dimension. The reducer must be a callable process (or a set of processes) that accepts an array and computes a single return value of the same type as the input values, for example median.

  • dimension (Optional[str]) – The temporal dimension for aggregation. All data along the dimension will be passed through the specified reducer. If the dimension is not set, the data cube is expected to only have one temporal dimension.

  • context (Optional[Dict]) – Additional data to be passed to the reducer.

Return type:

DataCube

Returns:

A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.

See also

openeo.org documentation on process “aggregate_temporal_period”.

apply(process=None, context=None)[source]

Applies a unary process (a local operation) to each value of the specified or all dimensions in the data cube.

Parameters:
  • process (Union[str, PGNode, Callable, None]) – the name of a process, or a callback function that creates a process graph, see Processes with child “callbacks”

  • dimensions – The names of the dimensions to apply the process on. Defaults to an empty array so that all dimensions are used.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Returns:

A data cube with the newly computed values. The resolution, cardinality and the number of dimensions are the same as for the original data cube.

See also

openeo.org documentation on process “apply”.

apply_dimension(code=None, runtime=None, process=None, version='latest', dimension='t', target_dimension=None, context=None)[source]

Applies a process to all pixel values along a dimension of a raster data cube. For example, if the temporal dimension is specified the process will work on a time series of pixel values.

The process to apply is specified by either code and runtime in case of a UDF, or by providing a callback function in the process argument.

The process reduce_dimension also applies a process to pixel values along a dimension, but drops the dimension afterwards. The process apply applies a process to each pixel value in the data cube.

The target dimension is the source dimension if not specified otherwise in the target_dimension parameter. The pixel values in the target dimension get replaced by the computed pixel values. The name, type and reference system are preserved.

The dimension labels are preserved when the target dimension is the source dimension and the number of pixel values in the source dimension is equal to the number of values computed by the process. Otherwise, the dimension labels will be incrementing integers starting from zero, which can be changed using rename_labels afterwards. The number of labels will be equal to the number of values computed by the process.
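The label rule described above can be written down as a short sketch. The function `resulting_labels` is a hypothetical helper for illustration, and treating a target dimension with the same name as the source dimension as "the source dimension" is an assumption.

```python
def resulting_labels(source_labels, n_computed, source_dimension, target_dimension=None):
    """Return the dimension labels expected after apply_dimension."""
    # Assumption: None or the same name both mean "write back to the source dimension".
    same_dimension = target_dimension is None or target_dimension == source_dimension
    if same_dimension and n_computed == len(source_labels):
        # Same dimension and same number of values: labels are preserved.
        return list(source_labels)
    # Otherwise: incrementing integers starting from zero.
    return list(range(n_computed))
```

For example, a process that returns three values for the three bands `["B02", "B03", "B04"]` keeps those labels, while a process that returns two values produces the labels `[0, 1]`.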

Parameters:
  • code (Optional[str]) – UDF code or process identifier (optional)

  • runtime – UDF runtime to use (optional)

  • process (Union[str, PGNode, Callable, None]) – a callback function that creates a process graph, see Processes with child “callbacks”

  • version – Version of the UDF runtime to use

  • dimension – The name of the source dimension to apply the process on. Fails with a DimensionNotAvailable error if the specified dimension does not exist.

  • target_dimension – The name of the target dimension or null (the default) to use the source dimension specified in the parameter dimension. By specifying a target dimension, the source dimension is removed. The target dimension with the specified name and the type other (see add_dimension) is created, if it doesn’t exist yet.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Returns:

A datacube with the UDF applied to the given dimension.

Raises:

DimensionNotAvailable

See also

openeo.org documentation on process “apply_dimension”.

apply_kernel(kernel, factor=1.0, border=0, replace_invalid=0)[source]

Applies a focal operation based on a weighted kernel to each value of the specified dimensions in the data cube.

The border parameter determines how the data is extended when the kernel overlaps with the borders. The following options are available:

  • numeric value - fill with a user-defined constant number n: nnnnnn|abcdefgh|nnnnnn (default, with n = 0)

  • replicate - repeat the value from the pixel at the border: aaaaaa|abcdefgh|hhhhhh

  • reflect - mirror/reflect from the border: fedcba|abcdefgh|hgfedc

  • reflect_pixel - mirror/reflect from the center of the pixel at the border: gfedcb|abcdefgh|gfedcb

  • wrap - repeat/wrap the image: cdefgh|abcdefgh|abcdef
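The border modes listed above can be reproduced in one dimension with a few lines of Python. This is only a sketch of the padding patterns, not of how a back-end implements apply_kernel; the function name `extend_border` and the mode name "constant" for the numeric fill are assumptions, and the padding width is assumed to be smaller than the sequence length.

```python
def extend_border(seq, pad, mode, fill="n"):
    """Extend a string `seq` by `pad` items on both sides using a border mode."""
    if mode == "constant":
        left, right = fill * pad, fill * pad
    elif mode == "replicate":
        left, right = seq[0] * pad, seq[-1] * pad
    elif mode == "reflect":
        # Mirror at the outer edge of the border pixel (border pixel repeated).
        left, right = seq[:pad][::-1], seq[-pad:][::-1]
    elif mode == "reflect_pixel":
        # Mirror at the center of the border pixel (border pixel not repeated).
        left, right = seq[1:pad + 1][::-1], seq[-pad - 1:-1][::-1]
    elif mode == "wrap":
        left, right = seq[-pad:], seq[:pad]
    else:
        raise ValueError(f"unknown border mode: {mode}")
    return f"{left}|{seq}|{right}"


modes = ["constant", "replicate", "reflect", "reflect_pixel", "wrap"]
patterns = {mode: extend_border("abcdefgh", 6, mode) for mode in modes}
```

The resulting `patterns` dictionary contains the same strings as the list above, one per mode.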

Parameters:
  • kernel (Union[ndarray, List[List[float]]]) – The kernel to be applied on the data cube. The kernel has to have as many dimensions as the data cube.

  • factor – A factor that is multiplied to each value computed by the focal operation. This is basically a shortcut for explicitly multiplying each value by a factor afterwards, which is often required for some kernel-based algorithms such as the Gaussian blur.

  • border – Determines how the data is extended when the kernel overlaps with the borders. Defaults to fill the border with zeroes.

  • replace_invalid – This parameter specifies the value to replace non-numerical or infinite numerical values with. By default, those values are replaced with zeroes.

Return type:

DataCube

Returns:

A data cube with the newly computed values. The resolution, cardinality and the number of dimensions are the same as for the original data cube.

See also

openeo.org documentation on process “apply_kernel”.

apply_neighborhood(process, size, overlap=None, context=None)[source]

Applies a focal process to a data cube.

A focal process is a process that works on a ‘neighbourhood’ of pixels. The neighbourhood can extend into multiple dimensions; this extent is specified by the size argument. It is not only (part of) the size of the input window, but also the size of the output for a given position of the sliding window. The sliding window moves with multiples of size.

An overlap can be specified so that neighbourhoods can have overlapping boundaries. This allows for continuity of the output. The values included in the data cube as overlap can’t be modified by the given process.

The neighbourhood size should be kept small enough, to avoid running beyond computational resources, but a too small size will result in a larger number of process invocations, which may slow down processing. Window sizes for spatial dimensions typically are in the range of 64 to 512 pixels, while overlaps of 8 to 32 pixels are common.

The process must not add new dimensions, or remove entire dimensions, but the result can have different dimension labels.

For the special case of 2D convolution, it is recommended to use apply_kernel().
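To make the relation between size and overlap more concrete, here is a one-dimensional sketch of how a sliding window could be laid out. The helper `neighbourhood_windows` is hypothetical, and clamping the overlap at the edges of the data is an assumption; an actual back-end may pad or handle edges differently.

```python
def neighbourhood_windows(length, size, overlap=0):
    """Return (read_start, read_stop, write_start, write_stop) for each window.

    The process sees the samples in [read_start, read_stop), but only the
    samples in [write_start, write_stop) end up in the output.
    """
    windows = []
    # The window moves in multiples of `size`.
    for write_start in range(0, length, size):
        write_stop = min(write_start + size, length)
        # Assumption: overlap is clamped to the data extent at the edges.
        read_start = max(write_start - overlap, 0)
        read_stop = min(write_stop + overlap, length)
        windows.append((read_start, read_stop, write_start, write_stop))
    return windows


windows = neighbourhood_windows(length=10, size=4, overlap=1)
```

Each output sample is written by exactly one window, while the overlap only provides read-only context for the process.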

Parameters:
  • size (List[Dict]) –

  • overlap (Optional[List[dict]]) –

  • process (Union[str, PGNode, Callable]) – a callback function that creates a process graph, see Processes with child “callbacks”

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Returns:

See also

openeo.org documentation on process “apply_neighborhood”.

ard_normalized_radar_backscatter(elevation_model=None, contributing_area=False, ellipsoid_incidence_angle=False, noise_removal=True)[source]

Computes CARD4L compliant backscatter (gamma0) from SAR input. This method is a variant of sar_backscatter(), with restricted parameters to generate backscatter according to CARD4L specifications.

Note that backscatter computation may require instrument specific metadata that is tightly coupled to the original SAR products. As a result, this process may only work in combination with loading data from specific collections, not with general data cubes.

Parameters:
  • elevation_model (Optional[str]) – The digital elevation model to use. Set to None (the default) to allow the back-end to choose, which will improve portability, but reduce reproducibility.

  • contributing_area – If set to true, a DEM-based local contributing area band named contributing_area is added. The values are given in square meters.

  • ellipsoid_incidence_angle (bool) – If set to True, an ellipsoidal incidence angle band named ellipsoid_incidence_angle is added. The values are given in degrees.

  • noise_removal (bool) – If set to false, no noise removal is applied. Defaults to True, which removes noise.

Returns:

Backscatter values expressed as gamma0. The data returned is CARD4L compliant and contains metadata. By default, the backscatter values are given in linear scale.

See also

openeo.org documentation on process “ard_normalized_radar_backscatter”.

ard_surface_reflectance(atmospheric_correction_method, cloud_detection_method, elevation_model=None, atmospheric_correction_options=None, cloud_detection_options=None)[source]

Computes CARD4L compliant surface reflectance values from optical input.

Parameters:
  • atmospheric_correction_method (str) – The atmospheric correction method to use.

  • cloud_detection_method (str) – The cloud detection method to use.

  • elevation_model (Optional[str]) – The digital elevation model to use, leave empty to allow the back-end to make a suitable choice.

  • atmospheric_correction_options (Optional[dict]) – Proprietary options for the atmospheric correction method.

  • cloud_detection_options (Optional[dict]) – Proprietary options for the cloud detection method.

Return type:

DataCube

Returns:

Data cube containing bottom of atmosphere reflectances with atmospheric disturbances like clouds and cloud shadows removed. The data returned is CARD4L compliant and contains metadata.

See also

openeo.org documentation on process “ard_surface_reflectance”.

atmospheric_correction(method=None, elevation_model=None, options=None)[source]

Applies an atmospheric correction that converts top of atmosphere reflectance values into bottom of atmosphere/top of canopy reflectance values.

Note that multiple atmospheric methods exist, but may not be supported by all backends. The method parameter gives you the option of requiring a specific method, but this may result in an error if the backend does not support it.

Parameters:
  • method (Optional[str]) – The atmospheric correction method to use. To get reproducible results, you have to set a specific method. Set to null to allow the back-end to choose, which will improve portability, but reduce reproducibility as you may get different results if you run the processes multiple times.

  • elevation_model (Optional[str]) – The digital elevation model to use, leave empty to allow the back-end to make a suitable choice.

  • options (Optional[dict]) – Proprietary options for the atmospheric correction method.

Return type:

DataCube

Returns:

datacube with bottom of atmosphere reflectances

See also

openeo.org documentation on process “atmospheric_correction”.

band(band)[source]

Select a single band from the data cube.

Parameters:

band (Union[str, int]) – band name, band common name or band index.

Return type:

DataCube

Returns:

a DataCube instance

band_filter(bands)

Use of this legacy method is deprecated, use filter_bands() instead.

Return type:

DataCube

chunk_polygon(chunks, process, mask_value=None, context=None)[source]

Apply a process to spatial chunks of a data cube.

Warning

experimental process: not generally supported, API subject to change.

Parameters:
  • chunks (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) – Polygons, provided as a shapely geometry, a GeoJSON-style dictionary, a public GeoJSON URL, or a path (that is valid for the back-end) to a GeoJSON file.

  • process (Union[str, PGNode, Callable]) – “child callback” function, see Processes with child “callbacks”

  • mask_value (Optional[float]) – The value used for cells outside the polygon. This provides a distinction between NoData cells within the polygon (due to e.g. clouds) and masked cells outside it. If no value is provided, NoData cells are used outside the polygon.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

count_time()[source]

Counts the number of images with a valid mask in a time series for all bands of the input dataset.

Returns:

a DataCube instance

See also

openeo.org documentation on process “count”.

Return type:

DataCube

classmethod create_collection(cls, collection_id, connection=None, spatial_extent=None, temporal_extent=None, bands=None, fetch_metadata=True, properties=None)

Use of this legacy class method is deprecated, use load_collection() instead.

Return type:

DataCube

create_job(out_format=None, title=None, description=None, plan=None, budget=None, job_options=None, **format_options)[source]

Sends a job to the backend and returns a BatchJob instance. The job will still need to be started and managed explicitly. The execute_batch() method allows you to run batch jobs without managing them yourself.

Parameters:
  • out_format (str) – Format of the job result.

  • job_options – A dictionary containing (custom) job options

  • format_options – Parameters for the job result format, given as keyword arguments.

Return type:

BatchJob

Returns:

The resulting batch job.

dimension_labels(dimension)[source]

Gives all labels for a dimension in the data cube. The labels have the same order as in the data cube.

Parameters:

dimension (str) – The name of the dimension to get the labels for.

See also

openeo.org documentation on process “dimension_labels”.

Return type:

DataCube

divide(other, reverse=False)[source]

See also

openeo.org documentation on process “divide”.

Return type:

DataCube

download(outputfile=None, format=None, options=None)[source]

Download image collection, e.g. as GeoTIFF. If outputfile is provided, the result is stored on disk locally, otherwise, a bytes object is returned. The bytes object can be passed on to a suitable decoder for decoding.

Parameters:
  • outputfile (Union[str, Path, None]) – Optional, an output file if the result needs to be stored on disk.

  • format (Optional[str]) – Optional, an output format supported by the backend.

  • options (Optional[dict]) – Optional, file format options

Returns:

None if the result is stored to disk, or a bytes object returned by the backend.

drop_dimension(name)[source]

Drops a dimension from the data cube. Dropping a dimension only works on dimensions with a single dimension label left, otherwise the process fails with a DimensionLabelCountMismatch exception. Dimension values can be reduced to a single value with a filter such as filter_bands or the reduce_dimension process. If a dimension with the specified name does not exist, the process fails with a DimensionNotAvailable exception.
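The two failure conditions can be modelled on a toy representation of a data cube, here a plain dict that maps dimension names to their labels. This is an illustrative sketch only: the dict representation, the function name `drop_dimension_sketch` and the locally defined exception classes are assumptions and do not correspond to classes of the openEO client.

```python
class DimensionNotAvailable(Exception):
    """Raised when the named dimension does not exist (local stand-in)."""


class DimensionLabelCountMismatch(Exception):
    """Raised when the dimension has more than one label left (local stand-in)."""


def drop_dimension_sketch(dimensions, name):
    """Return a copy of `dimensions` without `name`, following the rules above."""
    if name not in dimensions:
        raise DimensionNotAvailable(name)
    if len(dimensions[name]) != 1:
        raise DimensionLabelCountMismatch(name)
    # The input is not modified; a new mapping is returned.
    return {key: labels for key, labels in dimensions.items() if key != name}


cube = {"x": [0, 1], "y": [0, 1], "bands": ["B04"]}
without_bands = drop_dimension_sketch(cube, "bands")
```

Here the bands dimension can be dropped because a single label is left, whereas dropping "x" would fail with the label count error.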

Parameters:

name (str) – The name of the dimension to drop

Returns:

The data cube with the given dimension dropped.

See also

openeo.org documentation on process “drop_dimension”.

execute()[source]

Executes the process graph of the imagery.

Return type:

Dict

execute_batch(outputfile=None, out_format=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, job_options=None, **format_options)[source]

Evaluate the process graph by creating a batch job, and retrieving the results when it is finished. This method is mostly recommended if the batch job is expected to run in a reasonable amount of time.

For very long running jobs, you probably do not want to keep the client running.

Parameters:
  • job_options

  • outputfile (Union[str, Path, None]) – The path of a file to which a result can be written

  • out_format (Optional[str]) – (optional) Format of the job result.

  • format_options – Parameters for the job result format, given as keyword arguments.

Return type:

BatchJob

static execute_local_udf(udf, datacube=None, fmt='netcdf')[source]

Deprecated since version 0.7.0: Use openeo.udf.run_code.execute_local_udf() instead

filter_bands(bands)[source]

Filter the data cube by the given bands

Parameters:

bands (Union[List[Union[int, str]], str]) – list of band names, common names or band indices. Single band name can also be given as string.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “filter_bands”.

filter_bbox(*args, west=None, south=None, east=None, north=None, crs=None, base=None, height=None, bbox=None)[source]

Limits the data cube to the specified bounding box.

The bounding box can be specified in multiple ways.

  • With keyword arguments:

    >>> cube.filter_bbox(west=3, south=51, east=4, north=52, crs=4326)
    
  • With a (west, south, east, north) list or tuple (note that EPSG:4326 is the default CRS, so it’s not necessary to specify it explicitly):

    >>> cube.filter_bbox([3, 51, 4, 52])
    >>> cube.filter_bbox(bbox=[3, 51, 4, 52])
    
  • With a bbox dictionary:

    >>> bbox = {"west": 3, "south": 51, "east": 4, "north": 52, "crs": 4326}
    >>> cube.filter_bbox(bbox)
    >>> cube.filter_bbox(bbox=bbox)
    >>> cube.filter_bbox(**bbox)
    
  • With a shapely geometry (of which the bounding box will be used):

    >>> cube.filter_bbox(geometry)
    >>> cube.filter_bbox(bbox=geometry)
    
  • Passing a parameter:

    >>> bbox_param = Parameter(name="my_bbox", schema="object")
    >>> cube.filter_bbox(bbox_param)
    >>> cube.filter_bbox(bbox=bbox_param)
    
  • With a CRS other than EPSG 4326:

    >>> cube.filter_bbox(west=652000, east=672000, north=5181000, south=5161000, crs=32632)
    
  • Deprecated: positional arguments are also supported, but follow a non-standard order for legacy reasons:

    >>> west, east, north, south = 3, 4, 52, 51
    >>> cube.filter_bbox(west, east, north, south)
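How several of the argument forms listed above map to the same bounding box can be sketched with a small stand-alone function. This is not the client's actual argument handling: `normalize_bbox` is a hypothetical helper that covers only the keyword, list/tuple, dictionary and legacy positional forms, and it ignores shapely geometries and Parameter objects.

```python
def normalize_bbox(*args, west=None, south=None, east=None, north=None,
                   crs=None, bbox=None):
    """Return a bounding box dict from one of the supported argument forms."""
    if len(args) == 4:
        # Deprecated positional form: legacy order is west, east, north, south.
        west, east, north, south = args
    elif len(args) == 1:
        bbox = args[0]
    elif args:
        raise ValueError("expected 1 or 4 positional arguments")

    if isinstance(bbox, dict):
        west, south, east, north = (bbox[key] for key in ("west", "south", "east", "north"))
        crs = bbox.get("crs", crs)
    elif bbox is not None:
        # List or tuple in (west, south, east, north) order.
        west, south, east, north = bbox

    result = {"west": west, "south": south, "east": east, "north": north}
    if crs is not None:
        result["crs"] = crs
    return result
```

Calling `normalize_bbox([3, 51, 4, 52])`, `normalize_bbox(bbox=(3, 51, 4, 52))` and the legacy `normalize_bbox(3, 4, 52, 51)` all yield the same dictionary, which shows why the positional order of the deprecated form is easy to get wrong.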
    

See also

openeo.org documentation on process “filter_bbox”.

Return type:

DataCube

filter_spatial(geometries)[source]

Limits the data cube over the spatial dimensions to the specified geometries.

  • For polygons, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).

  • For points, the process considers the closest pixel center.

  • For lines (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.

More specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. All pixels inside the bounding box that are not retained will be set to null (no data).
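The polygon rule and the bounding box behaviour can be illustrated on a tiny grid. The sketch below is not how a back-end implements filter_spatial. It assumes a hypothetical pixel layout where the pixel at row r and column c has its center at (c + 0.5, r + 0.5), it decides bounding box membership by pixel center, and it uses a simple ray-casting test in place of a full Simple Features intersection.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that cross the horizontal line through y can toggle the state.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_spatial_sketch(grid, polygon):
    """Crop `grid` to the polygon's bounding box and null pixels outside the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    result = []
    for row, values in enumerate(grid):
        cy = row + 0.5
        if not min(ys) <= cy <= max(ys):
            continue  # outside the bounding box: not available afterwards
        out_row = []
        for col, value in enumerate(values):
            cx = col + 0.5
            if not min(xs) <= cx <= max(xs):
                continue
            # Inside the bounding box: keep the value or set it to None (no data).
            out_row.append(value if point_in_polygon(cx, cy, polygon) else None)
        result.append(out_row)
    return result


grid = [[4 * r + c for c in range(4)] for r in range(4)]
triangle = [(0, 0), (3.2, 0), (0, 3.2)]
filtered = filter_spatial_sketch(grid, triangle)
```

The last row and column of the 4 x 4 grid lie outside the triangle's bounding box and disappear, while pixels inside the bounding box but outside the triangle are kept as None.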

Parameters:

geometries – One or more geometries used for filtering, specified as GeoJSON in EPSG:4326.

Return type:

DataCube

Returns:

A data cube restricted to the specified geometries. The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the spatial dimensions have fewer (or the same) dimension labels.

See also

openeo.org documentation on process “filter_spatial”.

filter_temporal(*args, start_date=None, end_date=None, extent=None)[source]

Limit the DataCube to a certain date range, which can be specified in several ways:

>>> im.filter_temporal("2019-07-01", "2019-08-01")
>>> im.filter_temporal(["2019-07-01", "2019-08-01"])
>>> im.filter_temporal(extent=["2019-07-01", "2019-08-01"])
>>> im.filter_temporal(start_date="2019-07-01", end_date="2019-08-01")
Parameters:
  • start_date (Union[str, datetime, date, None]) – start date of the filter (inclusive), as a string or date object

  • end_date (Union[str, datetime, date, None]) – end date of the filter (exclusive), as a string or date object

  • extent (Union[list, tuple, None]) – two element list/tuple start and end date of the filter
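The inclusive start and exclusive end can be illustrated with a plain list of dated observations. This is a sketch of the date range semantics only: `filter_temporal_sketch` is a hypothetical helper that works on `datetime.date` values and ISO date strings, and does not handle datetime objects or open-ended ranges.

```python
from datetime import date


def _as_date(value):
    """Accept an ISO date string or a date object."""
    return date.fromisoformat(value) if isinstance(value, str) else value


def filter_temporal_sketch(observations, start_date, end_date):
    """Keep (date, value) pairs with start_date <= date < end_date."""
    start, end = _as_date(start_date), _as_date(end_date)
    return [(day, value) for day, value in observations if start <= day < end]


observations = [
    (date(2019, 6, 30), "a"),
    (date(2019, 7, 1), "b"),
    (date(2019, 7, 31), "c"),
    (date(2019, 8, 1), "d"),
]
july = filter_temporal_sketch(observations, "2019-07-01", "2019-08-01")
```

The observation on 2019-07-01 is kept, while the one on 2019-08-01 is excluded because the end date is exclusive.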

Return type:

DataCube

Returns:

A DataCube filtered by date.

https://open-eo.github.io/openeo-api/processreference/#filter_temporal

See also

openeo.org documentation on process “filter_temporal”.

fit_class_random_forest(target, max_variables=None, num_trees=100, seed=None)[source]

Executes the fit of a random forest classification based on the user input of target and predictors. The Random Forest classification model is based on the approach by Breiman (2001).

Warning

EXPERIMENTAL: not generally supported, API subject to change.

Parameters:
  • target (dict) – The training sites for the classification model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).

  • max_variables (Optional[int]) – Specifies how many split variables will be used at a node. Default value is null, which corresponds to the number of predictors divided by 3.

  • num_trees (int) – The number of trees built within the Random Forest classification.

  • seed (Optional[int]) – A randomization seed to use for the random sampling in training.

New in version 0.10.0.

See also

openeo.org documentation on process “fit_class_random_forest”.

Return type:

MlModel

fit_curve(parameters, function, dimension)[source]

Use non-linear least squares to fit a model function y = f(x, parameters) to data.

The process throws an InvalidValues exception if invalid values are encountered. Valid values are finite numbers (see also is_valid()).

Warning

experimental process: not generally supported, API subject to change. https://github.com/Open-EO/openeo-processes/pull/240

Parameters:

See also

openeo.org documentation on process “fit_curve”.

fit_regr_random_forest(target, max_variables=None, num_trees=100, seed=None)[source]

Executes the fit of a random forest regression based on training data. The Random Forest regression model is based on the approach by Breiman (2001).

Warning

EXPERIMENTAL: not generally supported, API subject to change.

Parameters:
  • target (dict) – The training sites for the regression model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).

  • max_variables (Optional[int]) – Specifies how many split variables will be used at a node. Default value is null, which corresponds to the number of predictors divided by 3.

  • num_trees (int) – The number of trees built within the Random Forest regression.

  • seed (Optional[int]) – A randomization seed to use for the random sampling in training.

New in version 0.10.1.

See also

openeo.org documentation on process “fit_regr_random_forest”.

Return type:

MlModel

flat_graph()

Get the process graph in internal flat dict representation.

Warning

This method is mainly intended for internal use. It is not recommended for general use and is subject to change.

Instead, it is recommended to use to_json() or print_json() to obtain a standardized, interoperable JSON representation of the process graph. See Export a process graph for more information.

Return type:

dict

flatten_dimensions(dimensions, target_dimension, label_separator=None)[source]

Combines multiple given dimensions into a single dimension by flattening the values and merging the dimension labels with the given label_separator. Non-string dimension labels will be converted to strings. This process is the opposite of the process unflatten_dimension() but executing both processes subsequently doesn’t necessarily create a data cube that is equal to the original data cube.

Parameters:
  • dimensions (List[str]) – The names of the dimensions to combine.

  • target_dimension (str) – The name of a target dimension with a single dimension label to replace.

  • label_separator (Optional[str]) – The string that will be used as a separator for the concatenated dimension labels.

Returns:

A data cube with the new shape.

Warning

experimental process: not generally supported, API subject to change.

New in version 0.10.0.

See also

openeo.org documentation on process “flatten_dimensions”.

graph_add_node(process_id, arguments=None, metadata=None, namespace=None, **kwargs)

Use of this legacy method is deprecated, use process() instead.

Return type:

DataCube

linear_scale_range(input_min, input_max, output_min, output_max)[source]

Performs a linear transformation between the input and output range.

The given number in x is clipped to the bounds specified in inputMin and inputMax so that the underlying formula

((x - inputMin) / (inputMax - inputMin)) * (outputMax - outputMin) + outputMin

never returns any value lower than outputMin or greater than outputMax.

Potential use cases include scaling values to the 8-bit range (0 - 255) often used for numeric representation of values in one of the channels of the RGB colour model or calculating percentages (0 - 100).

The no-data value null is passed through and therefore gets propagated.

Parameters:
  • input_min – Minimum input value

  • input_max – Maximum input value

  • output_min – Minimum value of the desired output range.

  • output_max – Maximum value of the desired output range.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “linear_scale_range”.
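
The clipping and the formula above can be written out per value as a small sketch. It is plain Python, independent of the openeo client; the sample values and the input range of 0 to 4000 are assumptions chosen only for illustration.

```python
def linear_scale_range(x, input_min, input_max, output_min, output_max):
    """Clip x to [input_min, input_max], then apply the linear formula from the text."""
    if x is None:
        # The no-data value (null) is passed through unchanged.
        return None
    clipped = min(max(x, input_min), input_max)
    return ((clipped - input_min) / (input_max - input_min)) * (output_max - output_min) + output_min


# Assumed example: scale values in [0, 4000] to the 8-bit range [0, 255].
values = [-100, 0, 1000, 4000, 5000, None]
scaled = [linear_scale_range(v, 0, 4000, 0, 255) for v in values]
print(scaled)
```

Values below input_min or above input_max end up at output_min or output_max respectively, so the result never leaves the output range.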

ln()[source]

See also

openeo.org documentation on process “ln”.

Return type:

DataCube

classmethod load_collection(collection_id, connection=None, spatial_extent=None, temporal_extent=None, bands=None, fetch_metadata=True, properties=None)[source]

Create a new Raster Data cube.

Parameters:
  • collection_id (str) – image collection identifier

  • connection (Optional[Connection]) – The connection to use to connect with the backend.

  • spatial_extent (Optional[Dict[str, float]]) – limit data to specified bounding box or polygons

  • temporal_extent (Optional[List[Union[str, datetime, date, PGNode]]]) – limit data to specified temporal interval

  • bands (Optional[List[str]]) – only add the specified bands

  • properties (Optional[Dict[str, Union[str, PGNode, Callable]]]) – limit data by metadata property predicates

Return type:

DataCube

Returns:

new DataCube containing the collection

See also

openeo.org documentation on process “load_collection”.
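
In a process graph, a load_collection call corresponds to a single node. The sketch below builds such a node as a plain dictionary without contacting a back-end. The collection id, the extents, the band names and the node key "loadcollection1" are placeholder assumptions; the exact output of the client may differ.

```python
import json


def load_collection_node(collection_id, spatial_extent=None, temporal_extent=None, bands=None):
    """Build a flat process graph with one load_collection node (illustrative only)."""
    arguments = {
        "id": collection_id,
        "spatial_extent": spatial_extent,
        "temporal_extent": temporal_extent,
    }
    if bands is not None:
        # Only restrict the bands when the caller asked for specific ones.
        arguments["bands"] = bands
    return {
        "loadcollection1": {
            "process_id": "load_collection",
            "arguments": arguments,
            "result": True,
        }
    }


graph = load_collection_node(
    "SENTINEL2_L2A",  # assumed collection id
    spatial_extent={"west": 5.0, "south": 51.0, "east": 5.1, "north": 51.1},
    temporal_extent=["2021-01-01", "2021-02-01"],
    bands=["B04", "B08"],
)
print(json.dumps(graph, indent=2))
```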

classmethod load_disk_collection(connection, file_format, glob_pattern, **options)[source]

Loads image data from disk as a DataCube. This is backed by a non-standard process (‘load_disk_data’). This will eventually be replaced by standard options such as https://processes.openeo.org/#load_uploaded_files

Parameters:
  • connection (Connection) – The connection to use to connect with the backend.

  • file_format (str) – the file format, e.g. ‘GTiff’

  • glob_pattern (str) – a glob pattern that matches the files to load from disk

  • options – options specific to the file format

Return type:

DataCube

Returns:

the data as a DataCube

log10()[source]

See also

openeo.org documentation on process “log”.

Return type:

DataCube

log2()[source]

See also

openeo.org documentation on process “log”.

Return type:

DataCube

logarithm(base)[source]

See also

openeo.org documentation on process “log”.

Return type:

DataCube
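
log10(), log2() and logarithm(base) all refer to the same “log” process and differ only in the base. A per-value sketch using the standard library, assuming the process is applied independently to each pixel value (the sample values are made up):

```python
import math


def logarithm(x, base):
    """Logarithm of x to the given base."""
    return math.log(x, base)


def log10(x):
    # Same operation with the base fixed to 10.
    return logarithm(x, 10)


def log2(x):
    # Same operation with the base fixed to 2.
    return logarithm(x, 2)


pixels = [1, 8, 100]
print([log2(v) for v in pixels])
print([log10(v) for v in pixels])
```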

logical_and(other)[source]

Apply element-wise logical and operation

Parameters:

other (DataCube) –

Return type:

DataCube

Returns:

logical_and(this, other)

See also

openeo.org documentation on process “and”.

logical_or(other)[source]

Apply element-wise logical or operation

Parameters:

other (DataCube) –

Return type:

DataCube

Returns:

logical_or(this, other)

See also

openeo.org documentation on process “or”.

mask(mask=None, replacement=None)[source]

Applies a mask to a raster data cube. To apply a vector mask use mask_polygon.

A mask is a raster data cube whose pixels are compared with the corresponding pixels in data. Pixels in data are replaced wherever the corresponding mask pixel is non-zero (for numbers) or true (for boolean values). The replaced pixels take the value specified for replacement, which defaults to null (no data).

Parameters:
  • mask (Optional[DataCube]) – the raster mask

  • replacement – the value to replace the masked pixels with

See also

openeo.org documentation on process “mask”.

Return type:

DataCube
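
The replacement rule can be sketched on small nested lists. This is plain Python rather than a call to the back-end; the function name apply_mask and the sample grids are assumptions, and no-data values inside the mask itself are not treated specially here.

```python
def apply_mask(data, mask, replacement=None):
    """Replace pixels in data where the mask pixel is non-zero or True."""
    return [
        [replacement if m else d for d, m in zip(data_row, mask_row)]
        for data_row, mask_row in zip(data, mask)
    ]


data = [[10, 20], [30, 40]]
cloud_mask = [[0, 1], [True, False]]  # numeric and boolean mask values mixed on purpose

masked = apply_mask(data, cloud_mask)                # replaced pixels become None (no data)
zeroed = apply_mask(data, cloud_mask, replacement=0)
print(masked, zeroed)
```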

mask_polygon(mask, srs=None, replacement=None, inside=None)[source]

Applies a polygon mask to a raster data cube. To apply a raster mask use mask.

All pixels for which the point at the pixel center does not intersect with any polygon (as defined in the Simple Features standard by the OGC) are replaced. This behaviour can be inverted by setting the parameter inside to true.

The pixel values are replaced with the value specified for replacement, which defaults to no data.

Parameters:
  • mask (Union[BaseGeometry, dict, str, Path, Parameter, VectorCube]) – The geometry to mask with: a shapely geometry, a GeoJSON-style dictionary, a public GeoJSON URL, or a path (that is valid for the back-end) to a GeoJSON file.

  • srs (Optional[str]) –

    The spatial reference system of the provided polygon. By default longitude-latitude (EPSG:4326) is assumed.

    Note

    this srs argument is a non-standard/experimental feature, only supported by specific back-ends. See https://github.com/Open-EO/openeo-processes/issues/235 for details.

  • replacement – the value to replace the masked pixels with

See also

openeo.org documentation on process “mask_polygon”.

Return type:

DataCube
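
The pixel-centre rule and the effect of inside can be sketched with a simple ray-casting test on a toy grid. This is a plain-Python illustration, not the back-end implementation. It assumes that pixel (row, col) has its centre at (col + 0.5, row + 0.5), handles a single polygon ring without holes, and does not treat points exactly on the boundary specially.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is the point (x, y) inside the polygon ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mask_polygon(grid, ring, replacement=None, inside=False):
    """Replace pixels whose centre is outside the ring, or inside it if inside=True."""
    result = []
    for row, values in enumerate(grid):
        out_row = []
        for col, value in enumerate(values):
            hit = point_in_polygon(col + 0.5, row + 0.5, ring)
            replace = hit if inside else not hit
            out_row.append(replacement if replace else value)
        result.append(out_row)
    return result


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
square = [(1, 1), (3, 1), (3, 3), (1, 3)]  # contains the pixel centres of rows 1-2, columns 1-2

kept_inside = mask_polygon(grid, square)
kept_outside = mask_polygon(grid, square, inside=True)
print(kept_inside, kept_outside)
```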

max_time()[source]

Finds the maximum value of a time series for all bands of the input dataset.

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “max”.

mean_time()[source]

Finds the mean value of a time series for all bands of the input dataset.

Returns:

a DataCube instance

See also

openeo.org documentation on process “mean”.

Return type:

DataCube

median_time()[source]

Finds the median value of a time series for all bands of the input dataset.

Returns:

a DataCube instance

See also

openeo.org documentation on process “median”.

Return type:

DataCube

merge(other, overlap_resolver=None, context=None)

Use of this legacy method is deprecated, use merge_cubes() instead.

Return type:

DataCube

merge_cubes(other, overlap_resolver=None, context=None)[source]

Merges two data cubes.

The data cubes have to be compatible. A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes. The process performs the join on overlapping dimensions, with the same name and type. An overlapping dimension has the same name, type, reference system and resolution in both dimensions, but can have different labels. One of the dimensions can have different labels; for all other dimensions the labels must be equal. If data overlaps, the parameter overlap_resolver must be specified to resolve the overlap.

Examples for merging two data cubes:

  1. Data cubes with the dimensions x, y, t and bands have the same dimension labels in x,y and t, but the labels for the dimension bands are B1 and B2 for the first cube and B3 and B4 for the second. An overlap resolver is not needed. The merged data cube has the dimensions x, y, t and bands and the dimension bands has four dimension labels: B1, B2, B3, B4.

  2. Data cubes with the dimensions x, y, t and bands have the same dimension labels in x,y and t, but the labels for the dimension bands are B1 and B2 for the first data cube and B2 and B3 for the second. An overlap resolver is required to resolve overlap in band B2. The merged data cube has the dimensions x, y, t and bands and the dimension bands has three dimension labels: B1, B2, B3.

  3. Data cubes with the dimensions x, y and t have the same dimension labels in x,y and t. There are two options:
    • Keep the overlapping values separately in the merged data cube: An overlap resolver is not needed, but for each data cube you need to add a new dimension using add_dimension. The new dimensions must be equal, except that the labels for the new dimensions must differ by name. The merged data cube has the same dimensions and labels as the original data cubes, plus the dimension added with add_dimension, which has the two dimension labels after the merge.

    • Combine the overlapping values into a single value: An overlap resolver is required to resolve the overlap for all pixels. The merged data cube has the same dimensions and labels as the original data cubes, but all pixel values have been processed by the overlap resolver.

  4. Merging a data cube with dimensions x, y, t with another cube with dimensions x, y will join on the x, y dimension, so the lower dimension cube is merged with each time step in the higher dimensional cube. This can for instance be used to apply a digital elevation model to a spatiotemporal data cube.

Parameters:
  • other (DataCube) – The data cube to merge with.

  • overlap_resolver (Union[str, PGNode, Callable, None]) – A reduction operator that resolves the conflict if the data overlaps. The reducer must return a value of the same data type as the input values. The reduction operator may be a single process such as multiply or consist of multiple sub-processes. null (the default) can be specified if no overlap resolver is required.

  • context (Optional[dict]) – Additional data to be passed to the process.

Return type:

DataCube

Returns:

The merged data cube.

See also

openeo.org documentation on process “merge_cubes”.
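
Examples 1 and 2 above can be sketched with cubes reduced to their bands dimension. This is a plain-Python illustration under strong simplifying assumptions: each cube is a dictionary mapping a band label to a list of pixel values, all other dimensions are taken to be identical, and a missing resolver is reported with ValueError rather than with the error a back-end would return.

```python
def merge_bands(cube_a, cube_b, overlap_resolver=None):
    """Merge two cubes represented as {band_label: [pixel values]} dictionaries."""
    merged = dict(cube_a)
    for band, values_b in cube_b.items():
        if band not in merged:
            # No overlap for this band: simply add it.
            merged[band] = values_b
        elif overlap_resolver is None:
            raise ValueError(f"Band {band!r} overlaps but no overlap_resolver was given")
        else:
            # Overlap: combine the two values of every pixel with the resolver.
            merged[band] = [overlap_resolver(a, b) for a, b in zip(merged[band], values_b)]
    return merged


first = {"B1": [1, 2], "B2": [3, 4]}

# Example 1: disjoint band labels, no resolver needed.
disjoint = merge_bands(first, {"B3": [5, 6], "B4": [7, 8]})

# Example 2: band B2 overlaps, so a resolver (here: max) is required.
overlapping = merge_bands(first, {"B2": [30, 1], "B3": [5, 6]}, overlap_resolver=max)
print(disjoint, overlapping)
```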

min_time()[source]

Finds the minimum value of a time series for all bands of the input dataset.

Returns:

a DataCube instance

See also

openeo.org documentation on process “min”.

Return type:

DataCube

multiply(other, reverse=False)[source]

See also

openeo.org documentation on process “multiply”.

Return type:

DataCube

ndvi(nir=None, red=None, target_band=None)[source]

Normalized Difference Vegetation Index (NDVI)

Parameters:
  • nir (Optional[str]) – (optional) name of NIR band

  • red (Optional[str]) – (optional) name of red band

  • target_band (Optional[str]) – (optional) name of the newly created band

Return type:

DataCube

Returns:

a DataCube instance

See also

openeo.org documentation on process “ndvi”.
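
The index is computed per pixel as (nir - red) / (nir + red). A minimal sketch of that formula on plain lists, with made-up reflectance values; pixels where nir + red is zero are not handled here.

```python
def ndvi(nir, red):
    """Per-pixel NDVI: (nir - red) / (nir + red)."""
    return [(n - r) / (n + r) for n, r in zip(nir, red)]


nir_band = [0.5, 0.4, 0.3]
red_band = [0.1, 0.4, 0.6]
print(ndvi(nir_band, red_band))
```

Values close to 1 indicate a much stronger near-infrared than red signal, 0 means both are equal, and negative values mean red dominates.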

normalized_difference(other)[source]

See also

openeo.org documentation on process “normalized_difference”.

Return type:

DataCube

polygonal_histogram_timeseries(polygon)[source]

Extract a histogram time series for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

DataCube

Returns:

DataCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'histogram'.

polygonal_mean_timeseries(polygon)[source]

Extract a mean time series for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

DataCube

Returns:

DataCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'mean'.

polygonal_median_timeseries(polygon)[source]

Extract a median time series for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

DataCube

Returns:

DataCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'median'.

polygonal_standarddeviation_timeseries(polygon)[source]

Extract a time series of standard deviations for the given (multi)polygon. Its points are expected to be in the EPSG:4326 coordinate reference system.

Parameters:

polygon (Union[Polygon, MultiPolygon, str]) – The (multi)polygon; or a file path or HTTP URL to a GeoJSON file or shape file

Return type:

DataCube

Returns:

DataCube

Deprecated since version 0.10.0: Use aggregate_spatial() with reducer 'sd'.

power(p)[source]

See also

openeo.org documentation on process “power”.

predict_curve(parameters, function, dimension, labels=None)[source]

Predict values using a model function and pre-computed parameters.

Warning

experimental process: not generally supported, API subject to change. https://github.com/Open-EO/openeo-processes/pull/240

See also

openeo.org documentation on process “predict_curve”.

predict_random_forest(model, dimension='bands')[source]

Apply reduce_dimension process with a predict_random_forest reducer.

Parameters:
  • model (Union[str, BatchJob, MlModel]) –

    a reference to a trained model, one of

    • a MlModel instance (e.g. loaded from Connection.load_ml_model())

    • a BatchJob instance of a batch job that saved a single random forest model

    • a job id (str) of a batch job that saved a single random forest model

    • a STAC item URL (str) to load the random forest from. (The STAC Item must implement the ml-model extension.)

  • dimension (str) – dimension along which to apply the reduce_dimension process.

New in version 0.10.0.

See also

openeo.org documentation on process “predict_random_forest”.

print_json(*, file=None, indent=2, separators=None)

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

New in version 0.12.0.

process(process_id, arguments=None, metadata=None, namespace=None, **kwargs)[source]

Generic helper to create a new DataCube by applying a process.

Parameters:
  • process_id (str) – process id of the process.

  • arguments (Optional[dict]) – argument dictionary for the process.

  • metadata (Optional[CollectionMetadata]) – optional: metadata to override original cube metadata (e.g. when reducing dimensions)

  • namespace (Optional[str]) – optional: process namespace

Return type:

DataCube

Returns:

new DataCube instance

process_with_node(pg, metadata=None)[source]

Generic helper to create a new DataCube by applying a process (given as process graph node)

Parameters:
  • pg (PGNode) – process graph node (containing process id and arguments)

  • metadata (Optional[CollectionMetadata]) – optional: metadata to override original cube metadata (e.g. when reducing dimensions)

Return type:

DataCube

Returns:

new DataCube instance

raster_to_vector()[source]

Converts this raster data cube into a VectorCube. The bounding polygon of homogeneous areas of pixels is constructed.

Warning

experimental process: not generally supported, API subject to change.

Return type:

VectorCube

Returns:

a VectorCube

reduce_bands_udf(code, runtime='Python', version='latest')[source]

Apply reduce (reduce_dimension) process with given UDF along band/spectral dimension.

Return type:

DataCube

reduce_dimension(dimension, reducer, context=None, process_id='reduce_dimension', band_math_mode=False)[source]

Add a reduce process with given reducer callback along given dimension

Parameters:
  • dimension (str) – the label of the dimension to reduce

  • reducer (Union[str, PGNode, Callable]) – “child callback” function, see Processes with child “callbacks”

  • context (Optional[dict]) – Additional data to be passed to the process.

See also

openeo.org documentation on process “reduce_dimension”.

Return type:

DataCube
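
What reducing along a dimension means can be sketched on a small two-dimensional cube. This is plain Python, not the client API: the cube is a nested list indexed by the two dimension names in dims, and the reducer is an ordinary Python callable that receives the values along the reduced dimension. All names and values are assumptions for illustration.

```python
def reduce_dimension(data, dims, dimension, reducer):
    """Reduce a 2-D nested list along the named dimension.

    data[i][j] is indexed by dims[0] and dims[1]. Returns the reduced values
    together with the names of the remaining dimensions.
    """
    axis = dims.index(dimension)
    # Along the first dimension, gather one series per index of the second one.
    series = zip(*data) if axis == 0 else data
    remaining = [d for d in dims if d != dimension]
    return [reducer(list(values)) for values in series], remaining


# Three time steps (rows) with two bands each (columns).
cube = [[1, 10], [3, 30], [2, 20]]

max_over_time, dims_after_t = reduce_dimension(cube, ["t", "bands"], "t", max)
sum_over_bands, dims_after_bands = reduce_dimension(cube, ["t", "bands"], "bands", sum)
print(max_over_time, dims_after_t)
print(sum_over_bands, dims_after_bands)
```

The reduced dimension disappears from the result: reducing over t leaves one value per band, which is the pattern behind helpers such as max_time() and mean_time().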

reduce_temporal_simple(process_id)[source]

Do temporal reduce with a simple given process as callback.

See also

openeo.org documentation on process “reduce_dimension”.

Return type:

DataCube

reduce_temporal_udf(code, runtime='Python', version='latest')[source]

Apply reduce (reduce_dimension) process with given UDF along temporal dimension.

Parameters:
  • code (str) – The UDF code, compatible with the given runtime and version

  • runtime – The UDF runtime

  • version – The UDF runtime version

See also

openeo.org documentation on process “reduce_dimension”.

reduce_tiles_over_time(code, runtime='Python', version='latest')

Use of this legacy method is deprecated, use reduce_temporal_udf() instead.

rename_dimension(source, target)[source]

Renames a dimension in the data cube while preserving all other properties.

Parameters:
  • source (str) – The current name of the dimension. Fails with a DimensionNotAvailable error if the specified dimension does not exist.

  • target (str) – A new name for the dimension. Fails with a DimensionExists error if a dimension with the specified name exists.

Returns:

A new datacube with the dimension renamed.

See also

openeo.org documentation on process “rename_dimension”.

rename_labels(dimension, target, source=None)[source]

Renames the labels of the specified dimension in the data cube from source to target.

Parameters:
  • dimension (str) – Dimension name

  • target (list) – The new names for the labels.

  • source (Optional[list]) – The names of the labels as they are currently in the data cube.

Return type:

DataCube

Returns:

A DataCube instance

See also

openeo.org documentation on process “rename_labels”.

resample_cube_spatial(target, method='near')[source]

Resamples the spatial dimensions (x,y) from a source data cube to align with the corresponding dimensions of the given target data cube. Returns a new data cube with the resampled dimensions.

To resample a data cube to a specific resolution or projection regardless of an existing target data cube, refer to resample_spatial().

Parameters:
  • target (DataCube) – A data cube that describes the spatial target resolution.

  • method (str) – Resampling method to use.

Return type:

DataCube

resample_cube_temporal(target, dimension=None, valid_within=None)[source]

Resamples one or more given temporal dimensions from a source data cube to align with the corresponding dimensions of the given target data cube using the nearest neighbor method. Returns a new data cube with the resampled dimensions.

By default, this process simply takes the nearest neighbor independent of the value (including values such as no-data / null). Depending on the data cubes this may lead to values being assigned to two target timestamps. To only consider valid values in a specific range around the target timestamps, use the parameter valid_within.

The rare case of ties is resolved by choosing the earlier timestamps.

Parameters:
  • target (DataCube) – A data cube that describes the temporal target resolution.

  • dimension (Optional[str]) – The name of the temporal dimension to resample.

  • valid_within (Optional[int]) –

Return type:

DataCube

New in version 0.10.0.

See also

openeo.org documentation on process “resample_cube_temporal”.
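
The nearest-neighbour selection and the tie rule can be sketched with the standard library. This is a simplified illustration: the cube is a dictionary from date to a single value, the function names are hypothetical, and the valid_within parameter is not modelled.

```python
from datetime import date


def nearest_timestamp(target, sources):
    """Return the source timestamp nearest to target; ties go to the earlier one."""
    return min(sources, key=lambda s: (abs(s - target), s))


def resample_temporal(values_by_date, target_dates):
    """Pick, for every target date, the value at the nearest source date."""
    sources = list(values_by_date)
    return {t: values_by_date[nearest_timestamp(t, sources)] for t in target_dates}


source_cube = {date(2021, 1, 1): 0.2, date(2021, 1, 11): 0.5, date(2021, 1, 21): None}
targets = [date(2021, 1, 6), date(2021, 1, 12), date(2021, 1, 30)]

resampled = resample_temporal(source_cube, targets)
print(resampled)
```

The first target lies exactly between two source dates and takes the earlier one. The last target receives the no-data value of its nearest neighbour, because the nearest neighbour is chosen independently of the value.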

resample_spatial(resolution, projection=None, method='near', align='upper-left')[source]

See also

openeo.org documentation on process “resample_spatial”.

Return type:

DataCube

resolution_merge(high_resolution_bands, low_resolution_bands, method=None)[source]

Resolution merging algorithms try to improve the spatial resolution of lower resolution bands (e.g. Sentinel-2 20M) based on higher resolution bands (e.g. Sentinel-2 10M).

External references:

Pansharpening explained

Example publication: ‘Improving the Spatial Resolution of Land Surface Phenology by Fusing Medium- and Coarse-Resolution Inputs’

Warning

experimental process: not generally supported, API subject to change.

Parameters:
  • high_resolution_bands (List[str]) – A list of band names to use as ‘high-resolution’ band. Either the unique band name (metadata field name in bands) or one of the common band names (metadata field common_name in bands). If unique band name and common name conflict, the unique band name has higher priority. The order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order. These bands will remain unmodified.

  • low_resolution_bands (List[str]) – A list of band names for which the spatial resolution should be increased. Either the unique band name (metadata field name in bands) or one of the common band names (metadata field common_name in bands). If unique band name and common name conflict, the unique band name has higher priority. The order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order. These bands will be modified by the process.

  • method (Optional[str]) – The method to use. The supported algorithms can vary between back-ends. Set to null (the default) to allow the back-end to choose, which will improve portability, but reduce reproducibility.

Return type:

DataCube

Returns:

A datacube with the same bands and metadata as the input, but algorithmically increased spatial resolution for the selected bands.

See also

openeo.org documentation on process “resolution_merge”.

result_node()

Get the current result node (PGNode) of the process graph.

New in version 0.10.1.

Return type:

PGNode

sar_backscatter(coefficient='gamma0-terrain', elevation_model=None, mask=False, contributing_area=False, local_incidence_angle=False, ellipsoid_incidence_angle=False, noise_removal=True, options=None)[source]

Computes backscatter from SAR input.

Note that backscatter computation may require instrument specific metadata that is tightly coupled to the original SAR products. As a result, this process may only work in combination with loading data from specific collections, not with general data cubes.

Parameters:
  • coefficient (Optional[str]) –

    Select the radiometric correction coefficient. The following options are available:

    • “beta0”: radar brightness

    • “sigma0-ellipsoid”: ground area computed with ellipsoid earth model

    • “sigma0-terrain”: ground area computed with terrain earth model

    • “gamma0-ellipsoid”: ground area computed with ellipsoid earth model in sensor line of sight

    • “gamma0-terrain”: ground area computed with terrain earth model in sensor line of sight (default)

    • None: non-normalized backscatter

  • elevation_model (Optional[str]) – The digital elevation model to use. Set to None (the default) to allow the back-end to choose, which will improve portability, but reduce reproducibility.

  • mask (bool) – If set to true, a data mask is added to the bands with the name mask. It indicates which values are valid (1), invalid (0) or contain no-data (null).

  • contributing_area (bool) – If set to true, a DEM-based local contributing area band named contributing_area is added. The values are given in square meters.

  • local_incidence_angle (bool) – If set to true, a DEM-based local incidence angle band named local_incidence_angle is added. The values are given in degrees.

  • ellipsoid_incidence_angle (bool) – If set to true, an ellipsoidal incidence angle band named ellipsoid_incidence_angle is added. The values are given in degrees.

  • noise_removal (bool) – If set to false, no noise removal is applied. Defaults to true, which removes noise.

  • options (Optional[dict]) – dictionary with additional (backend-specific) options.

Return type:

DataCube

New in version 0.4.9.

Changed in version 0.4.10: replace orthorectify and rtc arguments with coefficient.

See also

openeo.org documentation on process “sar_backscatter”.

save_result(format='GTiff', options=None)[source]

See also

openeo.org documentation on process “save_result”.

Return type:

DataCube

save_user_defined_process(user_defined_process_id, public=False, summary=None, description=None, returns=None, categories=None, examples=None, links=None)[source]

Saves this process graph in the backend as a user-defined process for the authenticated user.

Parameters:
  • user_defined_process_id (str) – unique identifier for the process

  • public (bool) – visible to other users?

  • summary (Optional[str]) – A short summary of what the process does.

  • description (Optional[str]) – Detailed description to explain the entity. CommonMark 0.29 syntax MAY be used for rich text representation.

  • returns (Optional[dict]) – Description and schema of the return value.

  • categories (Optional[List[str]]) – A list of categories.

  • examples (Optional[List[dict]]) – A list of examples.

  • links (Optional[List[dict]]) – A list of links.

Return type:

RESTUserDefinedProcess

Returns:

a RESTUserDefinedProcess instance

send_job(out_format=None, title=None, description=None, plan=None, budget=None, job_options=None, **format_options)

Use of this legacy method is deprecated, use create_job() instead.

Return type:

BatchJob

subtract(other, reverse=False)[source]

See also

openeo.org documentation on process “subtract”.

Return type:

DataCube

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string
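
The indent and separators options follow json.dumps from the standard library. The sketch below applies them to a hand-written process graph dictionary; the graph content and the helper name graph_to_json are assumptions for illustration and are not produced by the client.

```python
import json

# Hand-written stand-in for a process graph (assumed content).
flat_graph = {
    "loadcollection1": {
        "process_id": "load_collection",
        "arguments": {"id": "SENTINEL2_L2A", "spatial_extent": None, "temporal_extent": None},
        "result": True,
    }
}


def graph_to_json(graph, indent=2, separators=None):
    """Serialize a process graph dictionary with the given formatting options."""
    return json.dumps(graph, indent=indent, separators=separators)


pretty = graph_to_json(flat_graph)  # multi-line, indented by 2 spaces
compact = graph_to_json(flat_graph, indent=None, separators=(",", ":"))  # single line, no spaces
print(pretty)
print(compact)
```

Both strings describe the same graph; the compact form is convenient for logging or embedding, the indented form for reading.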

unflatten_dimension(dimension, target_dimensions, label_separator=None)[source]

Splits a single dimension into multiple dimensions by systematically extracting values and splitting the dimension labels by the given label_separator. This process is the opposite of the process flatten_dimensions() but executing both processes subsequently doesn’t necessarily create a data cube that is equal to the original data cube.

Parameters:
  • dimension (str) – The name of the dimension to split.

  • target_dimensions (List[str]) – The names of the target dimensions.

  • label_separator (Optional[str]) – The string that will be used as a separator to split the dimension labels.

Returns:

A data cube with the new shape.

Warning

experimental process: not generally supported, API subject to change.

New in version 0.10.0.

See also

openeo.org documentation on process “unflatten_dimension”.
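
Splitting merged labels back into separate dimensions can be sketched as follows. This is plain Python with a hypothetical helper name; it assumes that every merged label contains exactly one part per target dimension and that the separator does not occur inside a part.

```python
def unflatten_labels(labels, n_targets, label_separator="~"):
    """Split merged labels and collect the distinct labels of each target dimension."""
    parts = [label.split(label_separator) for label in labels]
    result = []
    for i in range(n_targets):
        seen = []
        for part in parts:
            # Keep the first occurrence of every label, preserving order.
            if part[i] not in seen:
                seen.append(part[i])
        result.append(seen)
    return result


merged = ["2021-01-01~B04", "2021-01-01~B08", "2021-01-02~B04", "2021-01-02~B08"]
time_labels, band_labels = unflatten_labels(merged, 2)
print(time_labels, band_labels)
```

All recovered labels are strings, whatever type they had before flattening, which illustrates why the round trip does not necessarily restore the original data cube.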

validate()[source]

Validate a process graph without executing it.

Return type:

List[dict]

Returns:

list of errors (dictionaries with “code” and “message” fields)
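
The returned list can be turned into a short report as sketched below. The error entry is a made-up sample with the two documented fields; real codes and messages depend on the back-end, and the helper name is hypothetical.

```python
def summarize_validation(errors):
    """Format a list of {"code": ..., "message": ...} dictionaries for display."""
    if not errors:
        return "Process graph is valid."
    return "\n".join(f"{error['code']}: {error['message']}" for error in errors)


# Made-up sample in the documented shape, not actual back-end output.
sample_errors = [{"code": "ExampleError", "message": "Something is wrong with argument 'x'."}]

print(summarize_validation(sample_errors))
print(summarize_validation([]))
```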

openeo.rest.vectorcube

class openeo.rest.vectorcube.VectorCube(graph, connection, metadata=None)[source]

A Vector Cube, or ‘Vector Collection’ is a data structure containing ‘Features’: https://www.w3.org/TR/sdw-bp/#dfn-feature

The features in this cube are restricted to have a geometry. Geometries can be points, lines, polygons etcetera. A geometry is specified in a ‘coordinate reference system’. https://www.w3.org/TR/sdw-bp/#dfn-coordinate-reference-system-(crs)

create_job(out_format=None, job_options=None, **format_options)[source]

Sends a job to the backend and returns a BatchJob instance.

Parameters:
  • out_format (str) – Format of the job result.

  • job_options

  • format_options – Parameters for the job result format

Return type:

BatchJob

Returns:

the resulting BatchJob

execute_batch(outputfile, out_format=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, job_options=None, **format_options)[source]

Evaluate the process graph by creating a batch job, and retrieving the results when it is finished. This method is mostly recommended if the batch job is expected to run in a reasonable amount of time.

For very long running jobs, you probably do not want to keep the client running.

Parameters:
  • job_options

  • outputfile (Union[str, Path]) – The path of a file to which a result can be written

  • out_format (Optional[str]) – (optional) Format of the job result.

  • format_options – Parameters for the job result format

Return type:

BatchJob

flat_graph()

Get the process graph in internal flat dict representation.

Warning

This method is mainly intended for internal use. It is not recommended for general use and is subject to change.

Instead, it is recommended to use to_json() or print_json() to obtain a standardized, interoperable JSON representation of the process graph. See Export a process graph for more information.

Return type:

dict

print_json(*, file=None, indent=2, separators=None)

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

New in version 0.12.0.

process(process_id, arguments=None, metadata=None, namespace=None, **kwargs)[source]

Generic helper to create a new VectorCube by applying a process.

Parameters:
  • process_id (str) – process id of the process.

  • arguments – argument dictionary for the process.

Return type:

VectorCube

Returns:

new VectorCube instance

result_node()

Get the current result node (PGNode) of the process graph.

New in version 0.10.1.

Return type:

PGNode

run_udf(udf, runtime, version=None, context=None)[source]

New in version 0.10.0.

See also

openeo.org documentation on process “run_udf”.

Return type:

VectorCube

save_result(format='GeoJson', options=None)[source]

See also

openeo.org documentation on process “save_result”.

send_job(out_format=None, job_options=None, **format_options)

Use of this legacy method is deprecated, use create_job() instead.

Return type:

BatchJob

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string

openeo.rest.mlmodel

class openeo.rest.mlmodel.MlModel(graph, connection)[source]

A machine learning model.

It is the result of a training procedure, e.g. output of a fit_... process, and can be used for prediction (classification or regression) with the corresponding predict_... process.

New in version 0.10.0.

create_job(**kwargs)[source]

Sends a job to the backend and returns a BatchJob instance.

See Connection.create_job() for additional arguments (e.g. to set job title, description, …)

Return type:

BatchJob

Returns:

resulting job.

execute_batch(outputfile, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, job_options=None)[source]

Evaluate the process graph by creating a batch job, and retrieving the results when it is finished. This method is mostly recommended if the batch job is expected to run in a reasonable amount of time.

For very long running jobs, you probably do not want to keep the client running.

Parameters:
  • job_options

  • outputfile (Union[str, Path]) – The path of a file to which a result can be written

  • out_format – (optional) Format of the job result.

  • format_options – Parameters for the job result format

Return type:

BatchJob

flat_graph()

Get the process graph in internal flat dict representation.

Warning

This method is mainly intended for internal use. It is not recommended for general use and is subject to change.

Instead, it is recommended to use to_json() or print_json() to obtain a standardized, interoperable JSON representation of the process graph. See Export a process graph for more information.

Return type:

dict

static load_ml_model(connection, id)[source]

Loads a machine learning model from a STAC Item.

Parameters:
  • connection (Connection) – connection object

  • id (Union[str, BatchJob]) – STAC item reference, as URL, batch job (id) or user-uploaded file

Return type:

MlModel

New in version 0.10.0.

See also

openeo.org documentation on process “load_ml_model”.

print_json(*, file=None, indent=2, separators=None)

Print interoperable JSON representation of the process graph.

See DataCube.to_json() to get the JSON representation as a string and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • file – file-like object (stream) to print to (current sys.stdout by default). Or a path (string or pathlib.Path) to a file to write to.

  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

New in version 0.12.0.

result_node()

Get the current result node (PGNode) of the process graph.

New in version 0.10.1.

Return type:

PGNode

save_ml_model(options=None)[source]

Saves a machine learning model as part of a batch job.

Parameters:

options (Optional[dict]) – Additional parameters to create the file(s).

to_json(*, indent=2, separators=None)

Get interoperable JSON representation of the process graph.

See DataCube.print_json() to directly print the JSON representation and Export a process graph for more usage information.

Also see json.dumps docs for more information on the JSON formatting options.

Parameters:
  • indent (Optional[int]) – JSON indentation level.

  • separators (Optional[Tuple[str, str]]) – (optional) tuple of item/key separators.

Return type:

str

Returns:

JSON string
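
The indent and separators options follow the json.dumps conventions. As a minimal sketch using only the standard library, the effect of both options can be shown on a hand-written flat graph (a hypothetical stand-in for a process graph, not actual output of to_json()):

```python
import json

# Hypothetical flat process graph with a single node (illustration only).
flat_graph = {
    "add1": {
        "process_id": "add",
        "arguments": {"x": 3, "y": 5},
        "result": True,
    }
}

# Pretty-printed form, comparable to the default indent=2.
pretty = json.dumps(flat_graph, indent=2)

# Compact form: no indentation and separators without extra whitespace.
compact = json.dumps(flat_graph, indent=None, separators=(",", ":"))

print(pretty)
print(compact)
```

Both strings decode to the same dictionary; only the whitespace differs.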

openeo.api.process

class openeo.api.process.Parameter(name, description=None, schema=None, default=<object object>, optional=None)[source]

Wrapper for a process parameter, as used in predefined and user-defined processes.

classmethod array(name, description=None, default=<object object>)[source]

Helper to create an ‘array’ type parameter.

Return type:

Parameter

classmethod boolean(name, description=None, default=<object object>)[source]

Helper to create a ‘boolean’ type parameter.

Return type:

Parameter

classmethod integer(name, description=None, default=<object object>)[source]

Helper to create an ‘integer’ type parameter.

Return type:

Parameter

classmethod number(name, description=None, default=<object object>)[source]

Helper to create a ‘number’ type parameter.

Return type:

Parameter

classmethod raster_cube(name='data', description='A data cube.')[source]

Helper to easily create a ‘raster-cube’ parameter.

Parameters:
  • name (str) – name of the parameter.

  • description (str) – description of the parameter

Return type:

Parameter

Returns:

Parameter

classmethod string(name, description=None, default=<object object>, values=None)[source]

Helper to create a ‘string’ type parameter.

Return type:

Parameter

to_dict()[source]

Convert to dictionary for JSON-serialization.

Return type:

dict
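
The default=<object object> in the signatures above is a sentinel value: it makes it possible to tell "no default given" apart from an explicit default of None. The following sketch illustrates that pattern with plain dictionaries. The helper parameter_dict and the exact set of dictionary keys are assumptions made for illustration, not the implementation of Parameter.to_dict().

```python
import json

_UNDEFINED = object()  # sentinel: "no default given", distinct from default=None


def parameter_dict(name, description=None, schema=None, default=_UNDEFINED):
    """Build a JSON-serializable description of a process parameter (hypothetical helper)."""
    param = {"name": name, "description": description, "schema": schema or {}}
    if default is not _UNDEFINED:
        # Assumption: a parameter that has a default value is treated as optional.
        param["optional"] = True
        param["default"] = default
    return param


required = parameter_dict("bands", "Band names.", schema={"type": "array"})
optional = parameter_dict("scale", "Scale factor.", schema={"type": "number"}, default=None)

print(json.dumps(required))
print(json.dumps(optional))
```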

openeo.api.logs

class openeo.api.logs.LogEntry(*args, **kwargs)[source]

Log message and info for jobs and services

Fields:
  • id: Unique ID for the log, string, REQUIRED

  • code: Error code, string, optional

  • level: Severity level, string (error, warning, info or debug), REQUIRED

  • message: Error message, string, REQUIRED

  • time: Date and time of the error event as RFC3339 date-time, string, available since API 1.1.0

  • path: A “stack trace” for the process, array of dicts

  • links: Related links, array of dicts

  • usage: Usage metrics available as property ‘usage’, dict, available since API 1.1.0. May contain the following metrics: cpu, memory, duration, network, disk, storage and other custom ones. Each of the metrics is also a dict with the following parts: value (numeric) and unit (string).

  • data: Arbitrary data the user wants to “log” for debugging purposes. Please note that this property may not exist as there’s a difference between None and non-existing. None for example refers to no-data in many cases while the absence of the property means that the user did not provide any data for debugging.
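
As a small sketch of how these fields can be used, the snippet below filters log entries by severity and distinguishes a missing data field from one that is None. Plain dictionaries with made-up values stand in for LogEntry objects here; the helper names are hypothetical.

```python
# Plain dictionaries standing in for LogEntry objects (hypothetical values).
entries = [
    {"id": "1", "level": "info", "message": "Job started"},
    {"id": "2", "level": "error", "code": "OutOfMemory", "message": "Worker ran out of memory"},
    {"id": "3", "level": "debug", "message": "Tile 4/16 done", "data": None},
]


def select_by_level(logs, levels=("error", "warning")):
    """Keep only the entries whose severity level is in `levels`."""
    return [entry for entry in logs if entry["level"] in levels]


def has_user_data(entry):
    """True if a "data" field exists at all, even when its value is None."""
    return "data" in entry


problems = select_by_level(entries)
for entry in problems:
    # "code" is optional, so fall back to a placeholder when it is absent.
    print(entry["id"], entry.get("code", "-"), entry["message"])
```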

openeo.rest.connection

This module provides a Connection object to manage and persist settings when interacting with the OpenEO API.

class openeo.rest.connection.Connection(url, auth=None, session=None, default_timeout=None, auth_config=None, refresh_token_store=None, slow_response_threshold=None)[source]

Connection to an openEO backend.

as_curl(data, path='/result', method='POST')[source]

Build curl command to evaluate given process graph or data cube (including authorization and content-type headers).

Parameters:
  • data (Union[dict, DataCube]) – process graph dictionary or DataCube object

  • path – endpoint to send request to

  • method – HTTP method to use

Return type:

str

Returns:

curl command as a string

authenticate_basic(username=None, password=None)[source]

Authenticate a user to the backend using basic username and password.

Parameters:
  • username (Optional[str]) – User name

  • password (Optional[str]) – User passphrase

Return type:

Connection

authenticate_oidc(provider_id=None, client_id=None, client_secret=None, store_refresh_token=True, use_pkce=None)[source]

Do OpenID Connect authentication, first trying refresh tokens and falling back on device code flow.

New in version 0.6.0.

authenticate_oidc_authorization_code(client_id=None, client_secret=None, provider_id=None, timeout=None, server_address=None, webbrowser_open=None, store_refresh_token=False)[source]

OpenID Connect Authorization Code Flow (with PKCE).

Return type:

Connection

authenticate_oidc_client_credentials(client_id=None, client_secret=None, provider_id=None, store_refresh_token=False)[source]

OpenID Connect Client Credentials flow.

Return type:

Connection

authenticate_oidc_device(client_id=None, client_secret=None, provider_id=None, store_refresh_token=False, use_pkce=None, **kwargs)[source]

Authenticate with OAuth Device Authorization grant/flow

Parameters:

use_pkce (Optional[bool]) – Use PKCE instead of client secret. If not set explicitly to True (use PKCE) or False (use client secret), it will be attempted to detect the best mode automatically. Note that PKCE for device code is not widely supported among OIDC providers.

Changed in version 0.5.1: Add use_pkce argument

Return type:

Connection

authenticate_oidc_refresh_token(client_id=None, refresh_token=None, client_secret=None, provider_id=None, store_refresh_token=False)[source]

OpenID Connect Refresh Token flow.

Return type:

Connection

authenticate_oidc_resource_owner_password_credentials(username, password, client_id=None, client_secret=None, provider_id=None, store_refresh_token=False)[source]

OpenID Connect Resource Owner Password Credentials flow.

Return type:

Connection

capabilities()[source]

Loads all available capabilities.

Return type:

RESTCapabilities

collection_items(name, spatial_extent=None, temporal_extent=None, limit=None)[source]

Loads items for a specific image collection. May not be available for all collections.

This is an experimental API and is subject to change.

Parameters:
  • name – String Id of the collection

  • spatial_extent (Optional[List[float]]) – Limits the items to the given bounding box in WGS84, given as four values: (1) lower left corner, coordinate axis 1; (2) lower left corner, coordinate axis 2; (3) upper right corner, coordinate axis 1; (4) upper right corner, coordinate axis 2.

  • temporal_extent (Optional[List[Union[str, datetime]]]) – Limits the items to the specified temporal interval. The interval has to be specified as an array with exactly two elements (start, end). Open intervals are also supported by setting one of the boundaries to None, but never both.

  • limit (Optional[int]) – The number of items per request/page. If None, the back-end decides.

Return type:

Iterator[dict]

Returns:

iterator over the items (each item is a dictionary)
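
The constraints on the temporal interval (exactly two elements, at most one open boundary) can be checked before sending a request. The helper below is a hypothetical sketch of such a check, not part of the client. The coordinate values are made up, and reading axis 1 as longitude and axis 2 as latitude is an assumption.

```python
from datetime import datetime


def check_temporal_extent(extent):
    """Validate a [start, end] interval: two elements, at most one of them None."""
    if len(extent) != 2:
        raise ValueError("temporal extent needs exactly two elements (start, end)")
    start, end = extent
    if start is None and end is None:
        raise ValueError("only one boundary of the interval may be None")
    return [start, end]


# Bounding box order: lower left corner (axis 1, axis 2), upper right corner (axis 1, axis 2).
# Assumption: axis 1 is longitude and axis 2 is latitude.
spatial_extent = [5.0, 51.0, 5.1, 51.1]

closed = check_temporal_extent(["2020-01-01", "2020-02-01"])
open_ended = check_temporal_extent([datetime(2020, 1, 1), None])
```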

create_file(path)[source]

Creates virtual file

Returns:

file object.

create_job(process_graph, title=None, description=None, plan=None, budget=None, additional=None)[source]

Posts a job to the back end.

Parameters:
  • process_graph (Union[dict, str, Path]) – (flat) dict representing a process graph, or process graph as raw JSON string, or as local file path or URL

  • title (Optional[str]) – String title of the job

  • description (Optional[str]) – String description of the job

  • plan (Optional[str]) – billing plan

  • budget (Optional[float]) – maximum cost the request is allowed to produce

  • additional (Optional[dict]) – additional job options to pass to the backend

Return type:

BatchJob

Returns:

the newly created batch job

datacube_from_flat_graph(flat_graph, parameters=None)[source]

Construct a DataCube from a flat dictionary representation of a process graph.

Parameters:

flat_graph (dict) – flat dictionary representation of a process graph or a process dictionary with such a flat process graph under a “process_graph” field (and optionally parameter metadata under a “parameters” field).

Return type:

DataCube

Returns:

A DataCube corresponding with the operations encoded in the process graph
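
To make the two accepted input shapes concrete, the sketch below builds a small flat process graph by hand and wraps it in a process dictionary. The collection id and the helper functions are hypothetical and only illustrate the structure; they are not how the client parses its input.

```python
# A flat process graph: node ids map to a process id and its arguments.
# The collection id "SENTINEL2" is a made-up example value.
flat_graph = {
    "load1": {
        "process_id": "load_collection",
        "arguments": {"id": "SENTINEL2", "spatial_extent": None, "temporal_extent": None},
    },
    "save1": {
        "process_id": "save_result",
        "arguments": {"data": {"from_node": "load1"}, "format": "GTiff"},
        "result": True,
    },
}

# The same graph wrapped in a process dictionary with (empty) parameter metadata.
process = {"process_graph": flat_graph, "parameters": []}


def extract_flat_graph(data):
    """Return the flat graph, whether or not it is wrapped in a process dict."""
    return data["process_graph"] if "process_graph" in data else data


def result_nodes(graph):
    """Ids of the nodes flagged as the result of the graph."""
    return [node_id for node_id, node in graph.items() if node.get("result")]


print(result_nodes(extract_flat_graph(process)))
```

Nodes refer to each other through {"from_node": ...} references, which is what keeps the representation flat.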

datacube_from_json(src, parameters=None)[source]

Construct a DataCube from JSON resource containing (flat) process graph representation.

Parameters:

src (Union[str, Path]) – raw JSON string, URL to JSON resource or path to local JSON file

Return type:

DataCube

Returns:

A DataCube corresponding with the operations encoded in the process graph

datacube_from_process(process_id, namespace=None, **kwargs)[source]

Load a data cube from a (custom) process.

Parameters:
  • process_id (str) – The process id.

  • namespace (Optional[str]) – optional: process namespace

  • kwargs – The arguments of the custom process

Return type:

DataCube

Returns:

A DataCube, without valid metadata, as the client is not aware of this custom process.

describe_account()[source]

Describes the currently authenticated user account.

Return type:

str

describe_collection(collection_id)[source]

Get full collection metadata for given collection id.

See also

list_collection_ids() to list all collection ids provided by the back-end.

Parameters:

collection_id (str) – collection id

Return type:

dict

Returns:

collection metadata.

describe_process(id, namespace=None)[source]

Returns a single process from the back end.

Parameters:
  • id (str) – The id of the process.

  • namespace (Optional[str]) – The namespace of the process.

Return type:

dict

Returns:

The process definition.

download(graph, outputfile=None, timeout=1800)[source]

Downloads the result of a process graph synchronously, and saves the result to the given file or returns a bytes object if no outputfile is specified. This method is useful to export binary content such as images. For JSON content, the execute method is recommended.

Parameters:
  • graph (Union[dict, str, Path]) – (flat) dict representing a process graph, or process graph as raw JSON string, or as local file path or URL

  • outputfile (Union[str, Path, None]) – output file

  • timeout (int) – timeout to wait for response

execute(process_graph)[source]

Execute a process graph synchronously and return the result (assumed to be JSON).

Parameters:

process_graph (Union[dict, str, Path]) – (flat) dict representing a process graph, or process graph as raw JSON string, or as local file path or URL

Returns:

parsed JSON response

imagecollection(collection_id, spatial_extent=None, temporal_extent=None, bands=None, properties=None, fetch_metadata=True)

Use of this legacy method is deprecated, use load_collection() instead.

Return type:

DataCube

job(job_id)[source]

Get the job based on the id. The job with the given id should already exist.

Use openeo.rest.connection.Connection.create_job() to create new jobs

Parameters:

job_id (str) – the job id of an existing job

Return type:

BatchJob

Returns:

A job object.

job_logs(job_id, offset)[source]

Get batch job logs.

Deprecated since version 0.4.10: Use openeo.rest.job.BatchJob.logs() instead.

Return type:

list

job_results(job_id)[source]

Get batch job results metadata.

Deprecated since version 0.4.10: Use openeo.rest.job.BatchJob.get_results() instead.

Return type:

dict

list_collection_ids()[source]

List all collection ids provided by the back-end.

See also

describe_collection() to get the metadata of a particular collection.

Return type:

List[str]

Returns:

list of collection ids

list_collections()[source]

List basic metadata of all collections provided by the back-end.

Caution

Only the basic collection metadata will be returned. To obtain full metadata of a particular collection, it is recommended to use describe_collection() instead.

Return type:

List[dict]

Returns:

list of dictionaries with basic collection metadata.

list_file_formats()[source]

Get available input and output formats

Return type:

dict

list_file_types()

Use of this legacy method is deprecated, use list_output_formats() instead.

Return type:

dict

list_files()[source]

Lists all files that the logged in user uploaded.

Returns:

file_list: List of the user uploaded files.

list_jobs()[source]

Lists all jobs of the authenticated user.

Return type:

List[dict]

Returns:

job_list: list of dictionaries, one for each job of the user.

list_processes(namespace=None)[source]

Loads all available processes of the back end.

Parameters:

namespace (Optional[str]) – The namespace for which to list processes.

Return type:

List[dict]

Returns:

list of dictionaries, one for each available process of the back end.

list_service_types()[source]

Loads all available service types.

Return type:

dict

Returns:

data_dict: Dict All available service types

list_services()[source]

Loads all available services of the authenticated user.

Return type:

dict

Returns:

data_dict: Dict All available services

list_udf_runtimes()[source]

Loads all available UDF runtimes.

Return type:

dict

Returns:

data_dict: Dict All available UDF runtimes

list_user_defined_processes()[source]

Lists all user-defined processes of the authenticated user.

Return type:

List[dict]

load_collection(collection_id, spatial_extent=None, temporal_extent=None, bands=None, properties=None, fetch_metadata=True)[source]

Load a DataCube by collection id.

Parameters:
  • collection_id (str) – image collection identifier

  • spatial_extent (Optional[Dict[str, float]]) – limit data to specified bounding box or polygons

  • temporal_extent (Optional[List[Union[str, datetime, date]]]) – limit data to specified temporal interval

  • bands (Optional[List[str]]) – only add the specified bands

  • properties (Optional[Dict[str, Union[str, PGNode, Callable]]]) – limit data by metadata property predicates

Return type:

DataCube

Returns:

a datacube containing the requested data

load_disk_collection(format, glob_pattern, options={})[source]

Loads image data from disk as an ImageCollection.

Parameters:
  • format (str) – the file format, e.g. ‘GTiff’

  • glob_pattern (str) – a glob pattern that matches the files to load from disk

  • options (dict) – options specific to the file format

Return type:

ImageCollectionClient

Returns:

the data as an ImageCollection

load_ml_model(id)[source]

Loads a machine learning model from a STAC Item.

Parameters:

id (Union[str, BatchJob]) – STAC item reference, as URL, batch job (id) or user-uploaded file

Return type:

MlModel

New in version 0.10.0.

load_result(id, spatial_extent=None, temporal_extent=None, bands=None)[source]

Loads batch job results by job id from the server-side user workspace. The job must have been stored by the authenticated user on the back-end that the client is currently connected to.

Parameters:
  • id (str) – The id of a batch job with results.

  • spatial_extent (Optional[Dict[str, float]]) – limit data to specified bounding box or polygons

  • temporal_extent (Optional[List[Union[str, datetime, date]]]) – limit data to specified temporal interval

  • bands (Optional[List[str]]) – only add the specified bands

Return type:

DataCube

Returns:

a DataCube

remove_service(service_id)[source]

Stop and remove a secondary web service.

Parameters:

service_id (str) – service identifier

Deprecated since version 0.8.0: Use openeo.rest.service.Service.delete_service() instead.

request(method, path, headers=None, auth=None, check_error=True, expected_status=None, **kwargs)[source]

Send a generic HTTP request to the back-end.

save_user_defined_process(user_defined_process_id, process_graph, parameters=None, public=False, summary=None, description=None, returns=None, categories=None, examples=None, links=None)[source]

Store a process graph and its metadata on the backend as a user-defined process for the authenticated user.

Parameters:
  • user_defined_process_id (str) – unique identifier for the user-defined process

  • process_graph (Union[dict, ProcessBuilderBase]) – a process graph

  • parameters (Optional[List[Union[Parameter, dict]]]) – a list of parameters

  • public (bool) – visible to other users?

  • summary (Optional[str]) – A short summary of what the process does.

  • description (Optional[str]) – Detailed description to explain the entity. CommonMark 0.29 syntax MAY be used for rich text representation.

  • returns (Optional[dict]) – Description and schema of the return value.

  • categories (Optional[List[str]]) – A list of categories.

  • examples (Optional[List[dict]]) – A list of examples.

  • links (Optional[List[dict]]) – A list of links.

Return type:

RESTUserDefinedProcess

Returns:

a RESTUserDefinedProcess instance

service(service_id)[source]

Get the secondary web service based on the id. The service with the given id should already exist.

Use openeo.rest.connection.Connection.create_service() to create new services

Parameters:

service_id – the service id of an existing secondary web service

Return type:

Service

Returns:

A service object.

user_defined_process(user_defined_process_id)[source]

Get the user-defined process based on its id. The process with the given id should already exist.

Parameters:

user_defined_process_id (str) – the id of the user-defined process

Return type:

RESTUserDefinedProcess

Returns:

a RESTUserDefinedProcess instance

user_jobs()[source]

Deprecated since version 0.4.10: use list_jobs() instead

Return type:

dict

validate_process_graph(process_graph)[source]

Validate a process graph without executing it.

Parameters:

process_graph (dict) – (flat) dict representing process graph

Return type:

List[dict]

Returns:

list of errors (dictionaries with “code” and “message” fields)
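
A short sketch of how the returned list can be turned into readable output. The error entries below are made up for illustration, and summarize_errors is a hypothetical helper, not part of the client. With a real connection the list would come from validate_process_graph().

```python
# Hypothetical validation result: a list of dicts with "code" and "message" fields.
errors = [
    {"code": "ProcessArgumentRequired", "message": "Argument 'data' is missing."},
    {"code": "CollectionNotFound", "message": "Collection 'FOO' does not exist."},
]


def summarize_errors(error_list):
    """Turn the list of validation errors into printable lines."""
    if not error_list:
        return ["process graph is valid"]
    return ["[{}] {}".format(error["code"], error["message"]) for error in error_list]


for line in summarize_errors(errors):
    print(line)
```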

classmethod version_discovery(url, session=None, timeout=None)[source]

Do automatic openEO API version discovery from given url, using a “well-known URI” strategy.

Parameters:

url (str) – initial backend url (not including “/.well-known/openeo”)

Return type:

str

Returns:

root url of highest supported backend version
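
The selection step of such a discovery can be sketched as follows. The sample document layout (a "versions" list with "url" and "api_version" entries), the URLs and the helper names are assumptions for illustration; the sketch also assumes purely numeric version strings and does not reproduce the client's actual selection logic.

```python
def version_key(api_version):
    """Turn "1.0.0" into (1, 0, 0) so that versions compare numerically."""
    return tuple(int(part) for part in api_version.split("."))


def pick_highest(well_known, supported=lambda version: True):
    """Return the root url of the highest listed version accepted by `supported`."""
    candidates = [v for v in well_known["versions"] if supported(v["api_version"])]
    if not candidates:
        raise ValueError("no supported API version found")
    best = max(candidates, key=lambda v: version_key(v["api_version"]))
    return best["url"]


# Made-up content of "<url>/.well-known/openeo".
well_known = {
    "versions": [
        {"url": "https://openeo.example/api/v0.4", "api_version": "0.4.2"},
        {"url": "https://openeo.example/api/v1.0", "api_version": "1.0.0"},
        {"url": "https://openeo.example/api/v1.10", "api_version": "1.10.0"},
    ]
}

root_url = pick_highest(well_known)
legacy_url = pick_highest(well_known, supported=lambda version: version.startswith("0."))
```

Comparing versions as integer tuples rather than strings matters here: as strings, "1.10.0" would sort before "1.2.0".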

version_info()[source]

List version of the openEO client, API, back-end, etc.

openeo.rest.job

class openeo.rest.job.BatchJob(job_id, connection)[source]

Handle for an openEO batch job, allowing you to describe, start and cancel the job, inspect its results, etc.

New in version 0.11.0: This class originally had the more cryptic name RESTJob, which is still available as legacy alias, but BatchJob is recommended since version 0.11.0.

delete_job()[source]

Delete a job.

describe_job()[source]

Get all job information.

Return type:

dict

download_result(target=None)[source]

Download single job result to the target file path or into folder (current working dir by default).

Fails if there are multiple result files.

Parameters:

target (Union[str, Path, None]) – String or path where the file should be downloaded to.

Return type:

Path

download_results(target=None)[source]

Download all job result files into given folder (current working dir by default).

The names of the files are taken directly from the backend.

Parameters:

target (Union[str, Path, None]) – String/path, folder where to put the result files.

Return type:

Dict[Path, dict]

Returns:

file_list: Dict with the downloaded file paths as keys and the corresponding asset metadata as values

Deprecated since version 0.4.10: Instead use BatchJob.get_results() and the more flexible download functionality of JobResults

estimate_job()[source]

Calculate a time/cost estimate for a job.

get_result()[source]

Deprecated since version 0.4.10: Use BatchJob.get_results() instead.

get_results()[source]

Get handle to batch job results for result metadata inspection or downloading resulting assets.

New in version 0.4.10.

Return type:

JobResults

job_id

Unique identifier of the batch job (string).

list_results()[source]

Get batch job results metadata.

Deprecated since version 0.4.10: Use get_results() instead.

Return type:

dict

logs(offset=None)[source]

Retrieve job logs.

Return type:

List[LogEntry]

run_synchronous(outputfile=None, print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30)[source]

Start the job, wait for it to finish and download result

Return type:

BatchJob

start_and_wait(print=<built-in function print>, max_poll_interval=60, connection_retry_interval=30, soft_error_max=10)[source]

Start the batch job, poll its status and wait till it finishes (or fails)

Parameters:
  • print – print/logging function to show progress/status

  • max_poll_interval (int) – maximum number of seconds to sleep between status polls

  • connection_retry_interval (int) – how long to wait when status poll failed due to connection issue

  • soft_error_max – maximum number of soft errors (e.g. temporary connection glitches) to allow

Return type:

BatchJob
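
How the polling parameters interact can be illustrated with a self-contained sketch. This is not the client's implementation: FakeJob, wait_for, the starting interval of 5 seconds and the back-off factor of 1.25 are all assumptions chosen for the example.

```python
import time


class FakeJob:
    """Stand-in for a batch job that reports a scripted sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def status(self):
        value = next(self._statuses)
        if isinstance(value, Exception):
            raise value
        return value


def wait_for(job, max_poll_interval=60, connection_retry_interval=30,
             soft_error_max=10, sleep=time.sleep, print=print):
    """Poll job.status() until the job reaches a final state."""
    poll_interval = min(5, max_poll_interval)
    soft_errors = 0
    while True:
        try:
            status = job.status()
        except ConnectionError as exc:
            # Temporary glitch: tolerate up to soft_error_max of these.
            soft_errors += 1
            if soft_errors > soft_error_max:
                raise
            print("connection problem ({}), retrying".format(exc))
            sleep(connection_retry_interval)
            continue
        print("status: " + status)
        if status in ("finished", "error", "canceled"):
            return status
        sleep(poll_interval)
        # Back off gradually, but never beyond max_poll_interval.
        poll_interval = min(poll_interval * 1.25, max_poll_interval)


# Record the requested sleep durations instead of actually sleeping.
naps = []
job = FakeJob(["queued", ConnectionError("timeout"), "running", "finished"])
final = wait_for(job, max_poll_interval=6, sleep=naps.append, print=lambda msg: None)
```

Passing sleep and print as arguments keeps the loop testable without real waiting.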

start_job()[source]

Start / queue a job for processing.

status()[source]

Get the status of the batch job

Return type:

str

Returns:

batch job status, one of “created”, “queued”, “running”, “canceled”, “finished” or “error”.

stop_job()[source]

Stop / cancel job processing.

update_job(process_graph=None, output_format=None, output_parameters=None, title=None, description=None, plan=None, budget=None, additional=None)[source]

Update a job.

class openeo.rest.job.JobResults(job)[source]

Results of a batch job: listing of one or more output files (assets) and some metadata.

New in version 0.4.10.

download_file(target=None, name=None)[source]

Download single asset. Can be used when there is only one asset in the JobResults, or when the desired asset name is given explicitly.

Parameters:
  • target (Union[str, Path, None]) – path to download to. Can be an existing directory (in which case the filename advertised by backend will be used) or full file name. By default, the working directory will be used.

  • name (Optional[str]) – asset name to download (not required when there is only one asset)

Return type:

Path

Returns:

path of downloaded asset
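
The rules for the target argument (existing directory, full file name, or working directory by default) can be expressed as a small path-resolution sketch. The resolve_target helper is hypothetical and only mirrors the description above; it is not the client's code.

```python
from pathlib import Path
import tempfile


def resolve_target(target, advertised_name):
    """Decide where an asset ends up, following the rules described above."""
    if target is None:
        # Default: the current working directory.
        return Path.cwd() / advertised_name
    target = Path(target)
    if target.is_dir():
        # Existing directory: keep the file name advertised by the backend.
        return target / advertised_name
    # Otherwise the target is taken as the full file name.
    return target


with tempfile.TemporaryDirectory() as folder:
    in_folder = resolve_target(folder, "result.tiff")
    renamed = resolve_target(Path(folder) / "ndvi.tiff", "result.tiff")
    in_folder_ok = in_folder == Path(folder) / "result.tiff"

default_location = resolve_target(None, "result.json")
```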

download_files(target=None, include_stac_metadata=True)[source]

Download all assets to given folder.

Parameters:
  • target (Union[str, Path, None]) – path to folder to download to (must be a folder if it already exists)

  • include_stac_metadata (bool) – whether to download the job result metadata as a STAC (JSON) file.

Return type:

List[Path]

Returns:

list of paths to the downloaded assets.

get_asset(name=None)[source]

Get single asset by name or without name if there is only one.

Return type:

ResultAsset

get_assets()[source]

Get all assets from the job results.

Return type:

List[ResultAsset]

get_metadata(force=False)[source]

Get batch job results metadata (parsed JSON)

Return type:

dict

class openeo.rest.job.RESTJob(job_id, connection)[source]

Legacy alias for BatchJob.

Deprecated since version 0.11.0: Use BatchJob instead

class openeo.rest.job.ResultAsset(job, name, href, metadata)[source]

Result asset of a batch job (e.g. a GeoTIFF or JSON file)

New in version 0.4.10.

download(target=None, chunk_size=None)[source]

Download asset to given location

Parameters:

target (Union[str, Path, None]) – download target path. Can be an existing folder (in which case the filename advertised by backend will be used) or full file name. By default, the working directory will be used.

Return type:

Path

href

Download URL of the asset.

load_bytes()[source]

Load asset in memory as raw bytes.

Return type:

bytes

load_json()[source]

Load asset in memory and parse as JSON.

Return type:

dict

metadata

Asset metadata provided by the backend, possibly containing keys “type” (for media type), “roles”, “title”, “description”.

name

Asset name as advertised by the backend.

openeo.rest.conversions

Helpers for data conversions between Python ecosystem data types and openEO data structures.

exception openeo.rest.conversions.InvalidTimeSeriesException[source]

openeo.rest.conversions.datacube_from_file(filename, fmt='netcdf')[source]

Deprecated since version 0.7.0: Use XarrayDataCube.from_file() instead.

Return type:

XarrayDataCube

openeo.rest.conversions.datacube_plot(datacube, *args, **kwargs)[source]

Deprecated since version 0.7.0: Use XarrayDataCube.plot() instead.

openeo.rest.conversions.datacube_to_file(datacube, filename, fmt='netcdf')[source]

Deprecated since version 0.7.0: Use XarrayDataCube.save_to_file() instead.

openeo.rest.conversions.timeseries_json_to_pandas(timeseries, index='date', auto_collapse=True)[source]

Convert a timeseries JSON object as returned by the aggregate_spatial process to a pandas DataFrame object

This timeseries data has three dimensions in general: date, polygon index and band index. One of these will be used as index of the resulting dataframe (as specified by the index argument), and the other two will be used as multilevel columns. When there is just a single polygon or band in play, the dataframe will be simplified by removing the corresponding dimension if auto_collapse is enabled (on by default).

Parameters:
  • timeseries (dict) – dictionary as returned by aggregate_spatial

  • index (str) – which dimension should be used for the DataFrame index: ‘date’ or ‘polygon’

  • auto_collapse – whether single band or single polygon cases should be simplified automatically

Return type:

DataFrame

Returns:

pandas DataFrame or Series
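
The three dimensions can be made visible with a pandas-free sketch. The nesting used below (date, then a list over polygons, then a list over bands) and all values are assumptions for illustration. The final loop mimics what index='date' means: one row per date and one column per (polygon, band) pair.

```python
# Assumed layout: date -> list over polygons -> list over bands (values made up).
timeseries = {
    "2019-01-01T00:00:00Z": [[0.1, 0.2], [0.3, 0.4]],
    "2019-01-06T00:00:00Z": [[0.5, 0.6], [0.7, 0.8]],
}


def flatten(series):
    """Yield one (date, polygon index, band index, value) row per value."""
    for date, polygons in sorted(series.items()):
        for polygon_index, bands in enumerate(polygons):
            for band_index, value in enumerate(bands):
                yield date, polygon_index, band_index, value


rows = list(flatten(timeseries))

# With the date as index, each (polygon, band) pair becomes one column.
by_date = {}
for date, polygon, band, value in rows:
    by_date.setdefault(date, {})[(polygon, band)] = value
```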

openeo.rest.udp

class openeo.rest.udp.RESTUserDefinedProcess(user_defined_process_id, connection)[source]

Wrapper for a user-defined process stored (or to be stored) on an openEO back-end

delete()[source]

Remove user-defined process from back-end

Return type:

None

describe()[source]

Get metadata of this user-defined process.

Return type:

dict

store(process_graph, parameters=None, public=False, summary=None, description=None, returns=None, categories=None, examples=None, links=None)[source]

Store a process graph and its metadata on the backend as a user-defined process

update(process_graph, parameters=None, public=False, summary=None, description=None)[source]

Deprecated since version 0.4.11: Use store instead. Method update is misleading: OpenEO API does not provide (partial) updates of user-defined processes, only fully overwriting ‘store’ operations.

openeo.udf

class openeo.udf.udf_data.UdfData(proj=None, datacube_list=None, feature_collection_list=None, structured_data_list=None, user_context=None)[source]

Container for data passed to a user defined function (UDF)

property datacube_list: Optional[List[XarrayDataCube]]

Get the data cube list

Return type:

Optional[List[XarrayDataCube]]

property feature_collection_list: Optional[List[FeatureCollection]]

Get all feature collections as a list.

Return type:

Optional[List[FeatureCollection]]

classmethod from_dict(udf_dict)[source]

Create a udf data object from a python dictionary that was created from the JSON definition of the UdfData class

Parameters:

udf_dict (dict) – The dictionary that contains the udf data definition

Return type:

UdfData

get_datacube_list()[source]

Get the data cube list

Return type:

Optional[List[XarrayDataCube]]

get_feature_collection_list()[source]

Get all feature collections as a list.

Return type:

Optional[List[FeatureCollection]]

get_structured_data_list()[source]

Get all structured data entries

Return type:

Optional[List[StructuredData]]

Returns:

A list of StructuredData objects

set_datacube_list(datacube_list)[source]

Set the data cube list

Parameters:

datacube_list (Optional[List[XarrayDataCube]]) – A list of data cubes

set_structured_data_list(structured_data_list)[source]

Set the list of structured data

Parameters:

structured_data_list (Optional[List[StructuredData]]) – A list of StructuredData objects

property structured_data_list: Optional[List[StructuredData]]

Get all structured data entries

Return type:

Optional[List[StructuredData]]

Returns:

A list of StructuredData objects

to_dict()[source]

Convert this UdfData object into a dictionary that can be converted into a valid JSON representation

Return type:

dict

property user_context: dict

Return the user context that was passed to the run_udf function

Return type:

dict

class openeo.udf.xarraydatacube.XarrayDataCube(array)[source]

This is a thin wrapper around xarray.DataArray providing a basic “DataCube” interface for openEO UDF usage around multi-dimensional data.

property array: DataArray

Get the xarray.DataArray that contains the data and dimension definition

Return type:

DataArray

classmethod from_dict(xdc_dict)[source]

Create a XarrayDataCube from a Python dictionary that was created from the JSON definition of the data cube

Parameters:

xdc_dict – The dictionary that contains the data cube definition

Return type:

XarrayDataCube

classmethod from_file(path, fmt=None)[source]

Load data file as XarrayDataCube in memory

Parameters:
  • path (Union[str, Path]) – the file on disk

  • fmt – format to load from, e.g. “netcdf” or “json” (will be auto-detected when not specified)

Return type:

XarrayDataCube

Returns:

loaded data cube

get_array()[source]

Get the xarray.DataArray that contains the data and dimension definition

Return type:

DataArray

plot(title=None, limits=None, show_bandnames=True, show_dates=True, show_axeslabels=False, fontsize=10.0, oversample=1, cmap='RdYlBu_r', cbartext=None, to_file=None, to_show=True)[source]

Visualize a XarrayDataCube with matplotlib

Parameters:
  • datacube – data to plot

  • title – title text drawn in the top left corner (default: nothing)

  • limits – range of the contour plot as a tuple(min,max) (default: None, in which case the min/max is computed from the data)

  • show_bandnames – whether to plot the column names (default: True)

  • show_dates – whether to show the dates for each row (default: True)

  • show_axeslabels – whether to show the labels on the axes (default: False)

  • fontsize – font size in pixels (default: 10)

  • oversample – one value is plotted into oversample x oversample number of pixels (default: 1 which means each value is plotted as a single pixel)

  • cmap – built-in matplotlib color map name or ColorMap object (default: RdYlBu_r which is a blue-yellow-red rainbow)

  • cbartext – text on top of the legend (default: nothing)

  • to_file – filename to save the image to (default: None, which means no file is generated)

  • to_show – whether to show the image in a matplotlib window (default: True)

Returns:

None

save_to_file(path, fmt=None)[source]

Store XarrayDataCube to file

Parameters:
  • path (Union[str, Path]) – destination file on disk

  • fmt – format to save as, e.g. “netcdf” or “json” (will be auto-detected when not specified)

to_dict()[source]

Convert this hypercube into a dictionary that can be converted into a valid JSON representation

>>> example = {
...     "id": "test_data",
...     "data": [
...         [[0.0, 0.1], [0.2, 0.3]],
...         [[0.0, 0.1], [0.2, 0.3]],
...     ],
...     "dimension": [
...         {"name": "time", "coordinates": ["2001-01-01", "2001-01-02"]},
...         {"name": "X", "coordinates": [50.0, 60.0]},
...         {"name": "Y"},
...     ],
... }

Return type:

dict

class openeo.udf.structured_data.StructuredData(data, description=None, type=None)[source]

This class represents structured data that is produced by a UDF and cannot be represented as a raster or vector data cube. For example: the result of a statistical computation.

Usage example:

>>> StructuredData([3, 5, 8, 13])
>>> StructuredData({"mean": 5, "median": 8})
>>> StructuredData([('col_1', 'col_2'), (1, 2), (2, 3)], type="table")

openeo.udf.run_code.execute_local_udf(udf, datacube, fmt='netcdf')[source]

Locally executes a user-defined function on a previously downloaded datacube.

Parameters:
  • udf (str) – the code of the user defined function

  • datacube (Union[str, DataArray, XarrayDataCube]) – the path to the downloaded data on disk or a DataCube

  • fmt – format of the file if datacube is a string

Returns:

the resulting DataCube

Debug utilities for UDFs

openeo.udf.debug.inspect(data=None, message='', code='User', level='info')[source]

Implementation of the openEO inspect process for UDF contexts.

Note that it is up to the back-end implementation to properly capture this logging and include it in the batch job logs.

New in version 0.10.1.

Parameters:
  • data – data to log

  • message (str) – message to send in addition to the data

  • code (str) – A label to help identify one or more log entries

  • level (str) – The severity level of this message. Allowed values: “error”, “warning”, “info”, “debug”

openeo.util

Various utilities and helpers.

class openeo.util.BBoxDict(*, west, south, east, north, crs=None)[source]

Dictionary based helper to easily create/work with bounding box dictionaries (having keys “west”, “south”, “east”, “north”, and optionally “crs”).

New in version 0.10.1.

classmethod from_dict(data)[source]

Build from dictionary with at least keys “west”, “south”, “east”, and “north”.

Return type:

BBoxDict

classmethod from_sequence(seq, crs=None)[source]

Build from sequence of 4 bounds (west, south, east and north).

Return type:

BBoxDict
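
The two constructors can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the openeo implementation: SketchBBox is a hypothetical name, and the error handling for missing keys or a wrong number of bounds is a choice made for this example.

```python
from typing import Optional, Sequence


class SketchBBox(dict):
    """Hypothetical stand-in for a bounding box dictionary (not the openeo class)."""

    def __init__(self, *, west, south, east, north, crs: Optional[str] = None):
        super().__init__(west=west, south=south, east=east, north=north)
        if crs is not None:
            # "crs" is optional, so it is only stored when given.
            self["crs"] = crs

    @classmethod
    def from_dict(cls, data: dict) -> "SketchBBox":
        """Build from a dictionary that has at least the four bound keys."""
        missing = [k for k in ("west", "south", "east", "north") if k not in data]
        if missing:
            raise ValueError(f"Missing bounding box keys: {missing}")
        return cls(
            west=data["west"], south=data["south"],
            east=data["east"], north=data["north"],
            crs=data.get("crs"),
        )

    @classmethod
    def from_sequence(cls, seq: Sequence, crs: Optional[str] = None) -> "SketchBBox":
        """Build from four bounds in west, south, east, north order."""
        west, south, east, north = seq  # raises ValueError unless exactly 4 items
        return cls(west=west, south=south, east=east, north=north, crs=crs)
```

Subclassing dict keeps the result usable anywhere a plain dictionary is expected, for example when it is serialized as a process argument.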

openeo.util.load_json_resource(src)[source]

Helper to load some kind of JSON resource

Parameters:

src (Union[str, Path]) – a JSON resource: a raw JSON string, a path to (local) JSON file, or a URL to a remote JSON resource

Return type:

dict

Returns:

data structure parsed from JSON
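
One way to tell the three kinds of source apart is to inspect the value before loading it. The sketch below uses only the standard library; load_json_sketch is a hypothetical name and the detection rules (leading brace or bracket for raw JSON, http/https prefix for URLs) are assumptions of this example, not the documented behaviour of load_json_resource.

```python
import json
import tempfile
from pathlib import Path
from typing import Union
from urllib.request import urlopen


def load_json_sketch(src: Union[str, Path]):
    """Hypothetical loader: raw JSON string, local file path, or URL."""
    if isinstance(src, str) and src.strip().startswith(("{", "[")):
        # Looks like raw JSON text: parse it directly.
        return json.loads(src)
    if isinstance(src, str) and src.startswith(("http://", "https://")):
        # Remote resource: fetch the body and parse it.
        with urlopen(src) as response:
            return json.loads(response.read().decode("utf-8"))
    # Otherwise treat it as a path to a local JSON file.
    return json.loads(Path(src).read_text(encoding="utf-8"))


# Usage: the same call handles a raw string and a file on disk.
from_string = load_json_sketch('{"x": 3, "y": 5}')
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "graph.json"
    path.write_text('{"process_id": "add"}', encoding="utf-8")
    from_file = load_json_sketch(path)
```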

openeo.util.to_bbox_dict(x, *, crs=None)[source]

Convert given data or object to a bounding box dictionary (having keys “west”, “south”, “east”, “north”, and optionally “crs”).

Supports various input types/formats:

  • list/tuple (assumed to be in west-south-east-north order)

    >>> to_bbox_dict([3, 50, 4, 51])
    {'west': 3, 'south': 50, 'east': 4, 'north': 51}
    
  • dictionary (unnecessary items will be stripped)

    >>> to_bbox_dict({
    ...     "color": "red", "shape": "triangle",
    ...     "west": 1, "south": 2, "east": 3, "north": 4, "crs": "EPSG:4326",
    ... })
    {'west': 1, 'south': 2, 'east': 3, 'north': 4, 'crs': 'EPSG:4326'}
    
  • a shapely geometry

New in version 0.10.1.

Parameters:
  • x (Any) – input data that describes west-south-east-north bounds in some way, e.g. as a dictionary, a list, a tuple, a shapely geometry, …

  • crs (Optional[str]) – (optional) CRS field

Return type:

BBoxDict

Returns:

dictionary (subclass) with keys “west”, “south”, “east”, “north”, and optionally “crs”.
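
The input handling described above amounts to a dispatch on the input type. The following is a minimal sketch, assuming a hypothetical function bbox_from_any that returns a plain dict rather than a BBoxDict. Geometries are recognised by their bounds attribute, which shapely geometries expose as (minx, miny, maxx, maxy); a SimpleNamespace stands in for a real geometry so the example needs no extra packages. Letting an explicit crs argument override a "crs" key in the input is an assumption of this sketch.

```python
from types import SimpleNamespace
from typing import Any, Optional

BOUND_KEYS = ("west", "south", "east", "north")


def bbox_from_any(x: Any, crs: Optional[str] = None) -> dict:
    """Hypothetical normalizer for the input types listed above."""
    if isinstance(x, (list, tuple)):
        if len(x) != 4:
            raise ValueError(f"Expected 4 bounds, got {len(x)}")
        bbox = dict(zip(BOUND_KEYS, x))
    elif isinstance(x, dict):
        # Keep only the bounding box keys; unrelated items are stripped.
        bbox = {key: x[key] for key in BOUND_KEYS}
        if "crs" in x:
            bbox["crs"] = x["crs"]
    elif hasattr(x, "bounds"):
        # Geometry-like object: bounds are (minx, miny, maxx, maxy).
        bbox = dict(zip(BOUND_KEYS, x.bounds))
    else:
        raise TypeError(f"Cannot build a bounding box from {type(x).__name__}")
    if crs is not None:
        # Assumption of this sketch: an explicit crs argument wins.
        bbox["crs"] = crs
    return bbox


# Stand-in for a shapely geometry, used only for illustration.
fake_geometry = SimpleNamespace(bounds=(3.0, 50.0, 4.0, 51.0))
from_geometry = bbox_from_any(fake_geometry, crs="EPSG:4326")
```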

openeo.processes

openeo.processes.process(process_id, arguments=None, namespace=None, **kwargs)

Apply process, using given arguments

Parameters:
  • process_id (str) – process id of the process.

  • arguments (Optional[dict]) – argument dictionary for the process.

  • namespace (Optional[str]) – process namespace (only necessary to specify for non-predefined or non-user-defined processes)

Returns:

new ProcessBuilder instance

openeo.internal

Functionality for abstracting, building, manipulating and processing openEO process graphs.

class openeo.internal.graph_building.PGNode(process_id, arguments=None, namespace=None, **kwargs)[source]

A process node in a process graph: has at least a process_id and arguments.

Note that a full openEO “process graph” is essentially a directed acyclic graph of nodes pointing to each other. A full process graph is practically equivalent with its “result” node, as it points (directly or indirectly) to all the other nodes it depends on.

Warning

This class is an implementation detail meant for internal use. It is not recommended for general use in normal user code. Instead, use process graph abstraction builders such as Connection.load_collection(), Connection.datacube_from_process(), Connection.datacube_from_flat_graph(), Connection.datacube_from_json(), Connection.load_ml_model(), or openeo.processes.process().

flat_graph()[source]

Get the process graph in internal flat dict representation.

Return type:

dict
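
To illustrate what a flat representation looks like, the sketch below flattens a small graph of linked nodes into a dictionary keyed by node id, where dependencies become {"from_node": ...} references and the final node is marked with "result": True, as in openEO process graph JSON. This is not the PGNode implementation: the Node class and flatten function are hypothetical, the node id naming scheme is an assumption, and only nodes passed directly as argument values are followed (not nodes nested inside lists or dictionaries).

```python
from typing import Dict, Optional


class Node:
    """Hypothetical minimal process node: a process id plus arguments."""

    def __init__(self, process_id: str, arguments: Optional[dict] = None):
        self.process_id = process_id
        self.arguments = arguments or {}


def flatten(result_node: Node) -> Dict[str, dict]:
    """Flatten the graph that ends in result_node into a flat dictionary."""
    flat: Dict[str, dict] = {}
    assigned: Dict[int, str] = {}   # id(node) -> node id, so reused nodes appear once
    counters: Dict[str, int] = {}   # per-process counter for unique node ids

    def visit(node: Node) -> str:
        if id(node) in assigned:
            return assigned[id(node)]
        arguments = {}
        for name, value in node.arguments.items():
            if isinstance(value, Node):
                # Dependencies are visited first and replaced by a reference.
                arguments[name] = {"from_node": visit(value)}
            else:
                arguments[name] = value
        counters[node.process_id] = counters.get(node.process_id, 0) + 1
        # Assumed naming scheme: process id without underscores plus a counter.
        node_id = f"{node.process_id.replace('_', '')}{counters[node.process_id]}"
        assigned[id(node)] = node_id
        flat[node_id] = {"process_id": node.process_id, "arguments": arguments}
        return node_id

    flat[visit(result_node)]["result"] = True
    return flat


# Usage: "S2" is a placeholder collection id; the load node is reused twice.
load = Node("load_collection", {"id": "S2"})
ndvi = Node("ndvi", {"data": load})
merged = Node("merge_cubes", {"cube1": load, "cube2": ndvi})
flat = flatten(merged)
```

Because the reused load node is tracked by identity, it appears once in the flat dictionary and is referenced from both dependent nodes.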

static from_flat_graph(flat_graph, parameters=None)[source]

Unflatten a given flat dict representation of a process graph and return result node.

Return type:

PGNode

to_dict()[source]

Convert process graph to a nested dictionary structure. Uses deep copy style: nodes that are reused in graph will be deduplicated

Return type:

dict

static to_process_graph_argument(value)[source]

Normalize the given argument to a “process_graph” argument, to be used as reducer/subprocess for processes like reduce_dimension, aggregate_spatial, apply, merge_cubes, resample_cube_temporal.

Return type:

dict

update_arguments(**kwargs)[source]

Add/Update arguments of the process node.

New in version 0.10.1.