openeo_udf.api package
Submodules
openeo_udf.api.collection_base module
OpenEO Python UDF interface
class openeo_udf.api.collection_base.CollectionBase(id: str, extent: Union[openeo_udf.api.spatial_extent.SpatialExtent, NoneType] = None, start_times: Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType] = None, end_times: Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType] = None)
Bases: object
This is the base class for raster and vector collection tiles. It implements start time, end time and spatial extent handling.
Some basic tests:
>>> extent = SpatialExtent(top=100, bottom=0, right=100, left=0, height=10, width=10)
>>> coll = CollectionBase(id="test", extent=extent)
>>> print(coll)
id: test
extent: top: 100
bottom: 0
right: 100
left: 0
height: 10
width: 10
start_times: None
end_times: None

>>> import pandas
>>> extent = SpatialExtent(top=100, bottom=0, right=100, left=0, height=10, width=10)
>>> dates = [pandas.Timestamp('2012-05-01')]
>>> starts = pandas.DatetimeIndex(dates)
>>> dates = [pandas.Timestamp('2012-05-02')]
>>> ends = pandas.DatetimeIndex(dates)
>>> rdc = CollectionBase(id="test", extent=extent,
...                      start_times=starts, end_times=ends)
>>> "extent" in rdc.extent_to_dict()
True
>>> rdc.extent_to_dict()["extent"]["left"] == 0
True
>>> rdc.extent_to_dict()["extent"]["right"] == 100
True
>>> rdc.extent_to_dict()["extent"]["top"] == 100
True
>>> rdc.extent_to_dict()["extent"]["bottom"] == 0
True
>>> rdc.extent_to_dict()["extent"]["height"] == 10
True
>>> rdc.extent_to_dict()["extent"]["width"] == 10
True

>>> import json
>>> json.dumps(rdc.start_times_to_dict())
'{"start_times": ["2012-05-01T00:00:00"]}'
>>> json.dumps(rdc.end_times_to_dict())
'{"end_times": ["2012-05-02T00:00:00"]}'

>>> ct = CollectionBase(id="test")
>>> ct.set_extent_from_dict({"top": 53, "bottom": 50, "right": 30, "left": 24, "height": 0.01, "width": 0.01})
>>> ct.set_start_times_from_list(["2012-05-01T00:00:00"])
>>> ct.set_end_times_from_list(["2012-05-02T00:00:00"])
>>> print(ct)
id: test
extent: top: 53
bottom: 50
right: 30
left: 24
height: 0.01
width: 0.01
start_times: DatetimeIndex(['2012-05-01'], dtype='datetime64[ns]', freq=None)
end_times: DatetimeIndex(['2012-05-02'], dtype='datetime64[ns]', freq=None)
check_data_with_time()
Check if the start and end date vectors have the same size as the data.

end_times
Return the end time vector.
Returns: End time vector
Return type: pandas.DatetimeIndex

end_times_to_dict() → Dict
Convert the end times vector into a dictionary representation that can be converted to JSON.
Returns: The end times vector
Return type: dict

extent
Return the spatial extent.
Returns: The spatial extent
Return type: SpatialExtent

extent_to_dict() → Dict
Convert the extent into a dictionary representation that can be converted to JSON.
Returns: The spatial extent
Return type: dict

get_end_times() → Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType]
Return the end time vector.
Returns: End time vector
Return type: pandas.DatetimeIndex

get_extent() → openeo_udf.api.spatial_extent.SpatialExtent
Return the spatial extent.
Returns: The spatial extent
Return type: SpatialExtent

get_start_times() → Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType]
Return the start time vector.
Returns: Start time vector
Return type: pandas.DatetimeIndex

set_end_times(end_times: Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType])
Set the end times vector.
Parameters: end_times (pandas.DatetimeIndex) -- The end times vector

set_end_times_from_list(end_times: Dict)
Set the end times vector from a list of ISO 8601 strings, as used in the JSON end times vector definition.
Parameters: end_times (list) -- The list of end times from the JSON definition

set_extent(extent: openeo_udf.api.spatial_extent.SpatialExtent)
Set the spatial extent.
Parameters: extent (SpatialExtent) -- The spatial extent with resolution information, must be of type SpatialExtent

set_extent_from_dict(extent: Dict)
Set the spatial extent from a dictionary.
Parameters: extent (dict) -- The dictionary with the layout of the JSON SpatialExtent definition

set_start_times(start_times: Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType])
Set the start times vector.
Parameters: start_times (pandas.DatetimeIndex) -- The start times vector

set_start_times_from_list(start_times: Dict)
Set the start times vector from a list of ISO 8601 strings, as used in the JSON start times vector definition.
Parameters: start_times (list) -- The list of start times from the JSON definition

start_times
Return the start time vector.
Returns: Start time vector
Return type: pandas.DatetimeIndex
openeo_udf.api.datacube module
OpenEO Python UDF interface

class openeo_udf.api.datacube.DataCube(array: xarray.core.dataarray.DataArray)
Bases: object

This class is a hypercube representation of multi-dimensional data. It stores an xarray.DataArray and provides methods to convert it to and from the HyperCube JSON representation.
>>> array = xarray.DataArray(numpy.zeros(shape=(2, 3)), coords={'x': [1, 2], 'y': [1, 2, 3]}, dims=('x', 'y'))
>>> array.attrs["description"] = "This is an xarray with two dimensions"
>>> array.name = "testdata"
>>> h = DataCube(array=array)
>>> d = h.to_dict()
>>> d["id"]
'testdata'
>>> d["data"]
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> d["dimensions"]
[{'name': 'x', 'coordinates': [1, 2]}, {'name': 'y', 'coordinates': [1, 2, 3]}]
>>> d["description"]
'This is an xarray with two dimensions'

>>> new_h = DataCube.from_dict(d)
>>> d = new_h.to_dict()
>>> d["id"]
'testdata'
>>> d["data"]
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> d["dimensions"]
[{'name': 'x', 'coordinates': [1, 2]}, {'name': 'y', 'coordinates': [1, 2, 3]}]
>>> d["description"]
'This is an xarray with two dimensions'

>>> array = xarray.DataArray(numpy.zeros(shape=(2, 3)), coords={'x': [1, 2], 'y': [1, 2, 3]}, dims=('x', 'y'))
>>> h = DataCube(array=array)
>>> d = h.to_dict()
>>> d["id"]
>>> d["data"]
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> d["dimensions"]
[{'name': 'x', 'coordinates': [1, 2]}, {'name': 'y', 'coordinates': [1, 2, 3]}]
>>> "description" not in d
True

>>> new_h = DataCube.from_dict(d)
>>> d = new_h.to_dict()
>>> d["id"]
>>> d["data"]
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> d["dimensions"]
[{'name': 'x', 'coordinates': [1, 2]}, {'name': 'y', 'coordinates': [1, 2, 3]}]
>>> "description" not in d
True

>>> array = xarray.DataArray(numpy.zeros(shape=(2, 3)))
>>> h = DataCube(array=array)
>>> d = h.to_dict()
>>> d["id"]
>>> d["data"]
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> d["dimensions"]
[]
>>> "description" not in d
True

>>> new_h = DataCube.from_dict(d)
>>> d = new_h.to_dict()
>>> d["id"]
>>> d["data"]
[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
>>> d["dimensions"]
[]
>>> "description" not in d
True
array
Return the xarray.DataArray that contains the data and dimension definition.
Returns: The xarray.DataArray that contains the data and dimension definition
Return type: xarray.DataArray

static from_data_collection(data_collection) → List[DataCube]
Create data cubes from a data collection.
Parameters: data_collection -- The data collection to convert into data cubes
Returns: A list of data cubes

static from_dict(hc_dict: Dict) → openeo_udf.api.datacube.DataCube
Create a DataCube from a python dictionary that was created from the JSON definition of the HyperCube.
Parameters: hc_dict (dict) -- The dictionary that contains the hypercube definition
Returns: A new DataCube object
Return type: DataCube

get_array() → xarray.core.dataarray.DataArray
Return the xarray.DataArray that contains the data and dimension definition.
Returns: The xarray.DataArray that contains the data and dimension definition
Return type: xarray.DataArray

id

set_array(array: xarray.core.dataarray.DataArray)
Set the xarray.DataArray that contains the data and dimension definition.
This function will check if the provided data is an xarray.DataArray and raise an Exception otherwise.
Parameters: array -- The xarray.DataArray that contains the data and dimension definition

to_dict() → Dict
Convert this data cube into a dictionary that can be converted into a valid JSON representation.
Returns: The data cube as a dictionary
Return type: dict

>>> example = {
...     "id": "test_data",
...     "data": [
...         [
...             [0.0, 0.1],
...             [0.2, 0.3]
...         ],
...         [
...             [0.0, 0.1],
...             [0.2, 0.3]
...         ]
...     ],
...     "dimension": [
...         {"name": "time", "unit": "ISO:8601", "coordinates": ["2001-01-01", "2001-01-02"]},
...         {"name": "X", "unit": "degree", "coordinates": [50.0, 60.0]},
...         {"name": "Y", "unit": "degree"},
...     ]
... }
openeo_udf.api.feature_collection module
OpenEO Python UDF interface

class openeo_udf.api.feature_collection.FeatureCollection(id: str, data: geopandas.geodataframe.GeoDataFrame, start_times: Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType] = None, end_times: Union[pandas.core.indexes.datetimes.DatetimeIndex, NoneType] = None)
Bases: openeo_udf.api.collection_base.CollectionBase
A feature collection that represents a subset or a whole feature collection where single vector features may have time stamps assigned.
Some basic tests:
>>> from shapely.geometry import Point
>>> import geopandas
>>> p1 = Point(0,0)
>>> p2 = Point(100,100)
>>> p3 = Point(100,0)
>>> pseries = [p1, p2, p3]
>>> data = geopandas.GeoDataFrame(geometry=pseries, columns=["a", "b"])
>>> data["a"] = [1,2,3]
>>> data["b"] = ["a","b","c"]
>>> fct = FeatureCollection(id="test", data=data)
>>> print(fct)
id: test
start_times: None
end_times: None
data:    a  b         geometry
0  1  a      POINT (0 0)
1  2  b  POINT (100 100)
2  3  c    POINT (100 0)
>>> import json
>>> json.dumps(fct.to_dict())
'{"id": "test", "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}, {"id": "1", "type": "Feature", "properties": {"a": 2, "b": "b"}, "geometry": {"type": "Point", "coordinates": [100.0, 100.0]}}, {"id": "2", "type": "Feature", "properties": {"a": 3, "b": "c"}, "geometry": {"type": "Point", "coordinates": [100.0, 0.0]}}]}}'

>>> p1 = Point(0,0)
>>> pseries = [p1]
>>> data = geopandas.GeoDataFrame(geometry=pseries, columns=["a", "b"])
>>> data["a"] = [1]
>>> data["b"] = ["a"]
>>> dates = [pandas.Timestamp('2012-05-01')]
>>> starts = pandas.DatetimeIndex(dates)
>>> dates = [pandas.Timestamp('2012-05-02')]
>>> ends = pandas.DatetimeIndex(dates)
>>> fct = FeatureCollection(id="test", start_times=starts, end_times=ends, data=data)
>>> print(fct)
id: test
start_times: DatetimeIndex(['2012-05-01'], dtype='datetime64[ns]', freq=None)
end_times: DatetimeIndex(['2012-05-02'], dtype='datetime64[ns]', freq=None)
data:    a  b     geometry
0  1  a  POINT (0 0)

>>> import json
>>> json.dumps(fct.to_dict())
'{"id": "test", "start_times": ["2012-05-01T00:00:00"], "end_times": ["2012-05-02T00:00:00"], "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}]}}'

>>> fct = FeatureCollection.from_dict(fct.to_dict())
>>> json.dumps(fct.to_dict())
'{"id": "test", "start_times": ["2012-05-01T00:00:00"], "end_times": ["2012-05-02T00:00:00"], "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}]}}'
data
Return the geopandas.GeoDataFrame that contains the geometry column and any number of attribute columns.
Returns: A data frame that contains the geometry column and any number of attribute columns
Return type: geopandas.GeoDataFrame

static from_dict(fct_dict: Dict)
Create a feature collection from a python dictionary that was created from the JSON definition of the FeatureCollection.
Parameters: fct_dict (dict) -- The dictionary that contains the feature collection definition
Returns: A new FeatureCollection object
Return type: FeatureCollection

get_data() → geopandas.geodataframe.GeoDataFrame
Return the geopandas.GeoDataFrame that contains the geometry column and any number of attribute columns.
Returns: A data frame that contains the geometry column and any number of attribute columns
Return type: geopandas.GeoDataFrame

set_data(data: geopandas.geodataframe.GeoDataFrame)
Set the geopandas.GeoDataFrame that contains the geometry column and any number of attribute columns.
This function will check if the provided data is a geopandas.GeoDataFrame and raise an Exception otherwise.
Parameters: data (geopandas.GeoDataFrame) -- A GeoDataFrame with geometry column and attribute data
openeo_udf.api.machine_learn_model module
OpenEO Python UDF interface

class openeo_udf.api.machine_learn_model.MachineLearnModelConfig(framework: str, name: str, description: str, path: Union[str, NoneType] = None, md5_hash: Union[str, NoneType] = None)
Bases: object

This class represents a machine learning model. The model is loaded at construction, based on the machine learning framework.

The following frameworks are supported:
- sklearn models that are created with sklearn.externals.joblib
- pytorch models that are created with torch.save
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.externals import joblib
>>> model = RandomForestRegressor(n_estimators=10, max_depth=2, verbose=0)
>>> path = '/tmp/test.pkl.xz'
>>> dummy = joblib.dump(value=model, filename=path, compress=("xz", 3))
>>> m = MachineLearnModelConfig(framework="sklearn", name="test",
...                             description="Machine learn model", path=path)
>>> m.get_model()
RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=2,
           max_features='auto', max_leaf_nodes=None,
           min_impurity_decrease=0.0, min_impurity_split=None,
           min_samples_leaf=1, min_samples_split=2,
           min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
           oob_score=False, random_state=None, verbose=0, warm_start=False)
>>> m.to_dict()
{'description': 'Machine learn model', 'name': 'test', 'framework': 'sklearn', 'path': '/tmp/test.pkl.xz', 'md5_hash': None}
>>> d = {'description': 'Machine learn model', 'name': 'test', 'framework': 'sklearn',
...      'path': '/tmp/test.pkl.xz', "md5_hash": None}
>>> m = MachineLearnModelConfig.from_dict(d)
>>> m.get_model()
RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=2,
           max_features='auto', max_leaf_nodes=None,
           min_impurity_decrease=0.0, min_impurity_split=None,
           min_samples_leaf=1, min_samples_split=2,
           min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
           oob_score=False, random_state=None, verbose=0, warm_start=False)

>>> import torch
>>> import torch.nn as nn
>>> model = nn.Module
>>> path = '/tmp/test.pt'
>>> torch.save(model, path)
>>> m = MachineLearnModelConfig(framework="pytorch", name="test",
...                             description="Machine learn model", path=path)
>>> m.get_model()
<class 'torch.nn.modules.module.Module'>
>>> m.to_dict()
{'description': 'Machine learn model', 'name': 'test', 'framework': 'pytorch', 'path': '/tmp/test.pt', 'md5_hash': None}
>>> d = {'description': 'Machine learn model', 'name': 'test', 'framework': 'pytorch',
...      'path': '/tmp/test.pt', "md5_hash": None}
>>> m = MachineLearnModelConfig.from_dict(d)
>>> m.get_model()
<class 'torch.nn.modules.module.Module'>
get_model()
Get the loaded machine learning model. This function will return None if the model was not loaded.
Returns: The loaded model
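For illustration, a minimal sketch of using a loaded model for prediction. The path and the two-feature input below are assumptions for the example, and the regressor must have been fitted before it was saved:

import numpy
from openeo_udf.api.machine_learn_model import MachineLearnModelConfig

# Hypothetical path; a real model must have been fitted and dumped with joblib.
m = MachineLearnModelConfig(framework="sklearn", name="test",
                            description="Machine learn model",
                            path="/tmp/test.pkl.xz")
model = m.get_model()
if model is not None:  # get_model() returns None if no model was loaded
    # Hypothetical prediction on two samples with two features each
    prediction = model.predict(numpy.array([[1.0, 2.0], [3.0, 4.0]]))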
openeo_udf.api.run_code module
OpenEO Python UDF interface

openeo_udf.api.run_code.load_module_from_string(code)
Experimental -- avoid loading the same UDF module more than once, to make caching inside the UDF work.
Parameters: code -- The UDF source code as a string
Returns: The loaded module

openeo_udf.api.run_code.run_legacy_user_code(dict_data: Dict) → Dict
Run the user defined python code on legacy data.
Parameters: dict_data -- The UDF request object with code and legacy data organized in a dictionary
Returns: The processed data as a dictionary
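A minimal sketch of invoking the legacy runner. The layout of the request dictionary below (the "code"/"source" keys and the data payload) is an assumption for illustration only; consult the UDF server's request schema for the actual layout:

from openeo_udf.api.run_code import run_legacy_user_code

# Illustrative UDF source; the function body is a placeholder.
udf_code = """
def my_udf(udf_data):
    pass
"""

# Assumed request layout: code under "code"/"source", inputs in the
# UdfData dictionary form shown elsewhere in this document.
request = {
    "code": {"source": udf_code},
    "data": {"proj": {"EPSG": 4326}, "datacubes": []},
}
result = run_legacy_user_code(dict_data=request)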
openeo_udf.api.spatial_extent module
OpenEO Python UDF interface

class openeo_udf.api.spatial_extent.SpatialExtent(top: float, bottom: float, right: float, left: float, height: Union[float, NoneType] = None, width: Union[float, NoneType] = None)
Bases: object

The axis-aligned spatial extent of a collection tile.
Some basic tests:
>>> extent = SpatialExtent(top=100, bottom=0, right=100, left=0, height=10, width=10)
>>> print(extent)
top: 100
bottom: 0
right: 100
left: 0
height: 10
width: 10
>>> extent.to_index(50, 50)
(5, 5)
>>> extent.to_index(0, 0)
(0, 10)
>>> extent.to_index(100, 0)
(0, 0)

>>> extent = SpatialExtent(top=100, bottom=0, right=100, left=0)
>>> print(extent)
top: 100
bottom: 0
right: 100
left: 0
height: None
width: None
>>> p = extent.as_polygon()
>>> print(p)
POLYGON ((0 100, 100 100, 100 0, 0 0, 0 100))

>>> from shapely.wkt import loads
>>> p = loads("POLYGON ((0 100, 100 100, 100 0, 0 0, 0 100))")
>>> extent = SpatialExtent.from_polygon(p)
>>> print(extent)
top: 100.0
bottom: 0.0
right: 100.0
left: 0.0
height: None
width: None
>>> extent.contains_point(50, 50)
True
>>> extent.contains_point(150, 50)
False
>>> extent.contains_point(25, 25)
True
>>> extent.contains_point(101, 101)
False

>>> extent = SpatialExtent(top=100, bottom=0, right=100, left=0)
>>> extent.as_polygon() == extent.as_polygon()
True
>>> diff = extent.as_polygon() - extent.as_polygon()
>>> print(diff)
GEOMETRYCOLLECTION EMPTY

>>> extent_1 = SpatialExtent(top=80, bottom=10, right=80, left=10)
>>> extent_2 = SpatialExtent(top=100, bottom=0, right=100, left=0)
>>> extent_1.as_polygon() == extent_2.as_polygon()
False
>>> extent_2.as_polygon().contains(extent_1.as_polygon())
True
as_polygon() → shapely.geometry.polygon.Polygon
Return the extent as a shapely.geometry.Polygon so that comparison operations (equality, intersection, and so on) can be performed against other extents.
Returns: The polygon representing the spatial extent
Return type: shapely.geometry.Polygon

contains_point(top: float, left: float) → shapely.geometry.point.Point
Return True if the provided coordinate is located in the spatial extent, False otherwise.
Parameters:
- top (float) -- The top (y) coordinate of the point
- left (float) -- The left (x) coordinate of the point
Returns: True if the coordinates are contained in the extent, False otherwise
Return type: bool

static from_dict(extent: Dict)
Create a SpatialExtent from a python dictionary that was created from the JSON definition of the SpatialExtent.
Parameters: extent (dict) -- The dictionary that contains the spatial extent definition
Returns: A new SpatialExtent object
Return type: SpatialExtent

static from_polygon(polygon: shapely.geometry.polygon.Polygon) → openeo_udf.api.spatial_extent.SpatialExtent
Convert a polygon with rectangular shape into a spatial extent.
Parameters: polygon (shapely.geometry.Polygon) -- The polygon that should be converted into a spatial extent
Returns: The spatial extent
Return type: SpatialExtent

to_dict() → Dict
Return the spatial extent as a dict that can be easily converted into JSON.
Returns: Dictionary representation
Return type: dict
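For illustration, a sketch of a round trip between a SpatialExtent and its dictionary representation, assuming to_dict() returns the same flat layout that from_dict() and set_extent_from_dict() accept:

from openeo_udf.api.spatial_extent import SpatialExtent

extent = SpatialExtent(top=53, bottom=50, right=30, left=24, height=0.01, width=0.01)
d = extent.to_dict()
restored = SpatialExtent.from_dict(d)
assert restored.to_dict() == d  # the dictionary layout survives the round trip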
openeo_udf.api.structured_data module
OpenEO Python UDF interface

class openeo_udf.api.structured_data.StructuredData(description, data, type)
Bases: object

This class represents structured data that is produced by a UDF and cannot be represented as a RasterCollectionTile or FeatureCollectionTile, for example the result of a statistical computation. The data is self-descriptive and supports the basic types dict/map, list and table.

The data field contains the UDF-specific values (argument or return) as dict, list or table:
- A dict can be as complex as required by the UDF
- A list must contain simple data types, for example {"list": [1, 2, 3, 4]}
- A table is a list of lists with a header, for example {"table": [["id", "value"], [1, 10], [2, 23], [3, 4]]}
>>> table = [("col_1", "col_2"), (1, 2), (2, 3)]
>>> st = StructuredData(description="Table output", data=table, type="table")
>>> st.to_dict()
{'description': 'Table output', 'data': [('col_1', 'col_2'), (1, 2), (2, 3)], 'type': 'table'}

>>> values = [1,2,3,4]
>>> st = StructuredData(description="List output", data=values, type="list")
>>> st.to_dict()
{'description': 'List output', 'data': [1, 2, 3, 4], 'type': 'list'}

>>> key_value_store = dict(a=1, b=2, c=3)
>>> st = StructuredData(description="Key-value output", data=key_value_store, type="dict")
>>> st.to_dict()
{'description': 'Key-value output', 'data': {'a': 1, 'b': 2, 'c': 3}, 'type': 'dict'}
openeo_udf.api.tools module
OpenEO Python UDF interface

openeo_udf.api.udf_data module
OpenEO Python UDF interface

class openeo_udf.api.udf_data.UdfData(proj: Dict = None, datacube_list: Union[List[openeo_udf.api.datacube.DataCube], NoneType] = None, feature_collection_list: Union[List[openeo_udf.api.feature_collection.FeatureCollection], NoneType] = None, structured_data_list: Union[List[openeo_udf.api.structured_data.StructuredData], NoneType] = None, ml_model_list: Union[List[openeo_udf.api.machine_learn_model.MachineLearnModelConfig], NoneType] = None, metadata: openeo_udf.server.data_model.metadata_schema.MetadataModel = None)
Bases: object
The class that stores the arguments for a user defined function (UDF)
Some basic tests:
>>> from shapely.geometry import Point
>>> import geopandas
>>> import numpy, pandas
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.externals import joblib
>>> data = numpy.zeros(shape=(1,1,1))
>>> extent = SpatialExtent(top=100, bottom=0, right=100, left=0, height=10, width=10)
>>> starts = pandas.DatetimeIndex([pandas.Timestamp('2012-05-01')])
>>> ends = pandas.DatetimeIndex([pandas.Timestamp('2012-05-02')])
>>> p1 = Point(0,0)
>>> p2 = Point(100,100)
>>> p3 = Point(100,0)
>>> pseries = [p1, p2, p3]
>>> data = geopandas.GeoDataFrame(geometry=pseries, columns=["a", "b"])
>>> data["a"] = [1,2,3]
>>> data["b"] = ["a","b","c"]
>>> C = FeatureCollection(id="C", data=data)
>>> D = FeatureCollection(id="D", data=data)
>>> udf_data = UdfData(proj={"EPSG":4326}, feature_collection_list=[C, D])
>>> model = RandomForestRegressor(n_estimators=10, max_depth=2, verbose=0)
>>> path = '/tmp/test.pkl.xz'
>>> dummy = joblib.dump(value=model, filename=path, compress=("xz", 3))
>>> m = MachineLearnModelConfig(framework="sklearn", name="test",
...                             description="Machine learn model", path=path)
>>> udf_data.append_machine_learn_model(m)
>>> print(udf_data.get_feature_collection_by_id("C"))
id: C
start_times: None
end_times: None
data:    a  b         geometry
0  1  a      POINT (0 0)
1  2  b  POINT (100 100)
2  3  c    POINT (100 0)
>>> print(udf_data.get_feature_collection_by_id("D"))
id: D
start_times: None
end_times: None
data:    a  b         geometry
0  1  a      POINT (0 0)
1  2  b  POINT (100 100)
2  3  c    POINT (100 0)
>>> print(len(udf_data.get_feature_collection_list()) == 2)
True
>>> print(udf_data.ml_model_list[0].path)
/tmp/test.pkl.xz
>>> print(udf_data.ml_model_list[0].framework)
sklearn

>>> import json
>>> json.dumps(udf_data.to_dict())
'{"proj": {"EPSG": 4326}, "user_context": {}, "server_context": {}, "datacubes": [], "feature_collection_list": [{"id": "C", "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}, {"id": "1", "type": "Feature", "properties": {"a": 2, "b": "b"}, "geometry": {"type": "Point", "coordinates": [100.0, 100.0]}}, {"id": "2", "type": "Feature", "properties": {"a": 3, "b": "c"}, "geometry": {"type": "Point", "coordinates": [100.0, 0.0]}}]}}, {"id": "D", "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}, {"id": "1", "type": "Feature", "properties": {"a": 2, "b": "b"}, "geometry": {"type": "Point", "coordinates": [100.0, 100.0]}}, {"id": "2", "type": "Feature", "properties": {"a": 3, "b": "c"}, "geometry": {"type": "Point", "coordinates": [100.0, 0.0]}}]}}], "structured_data_list": [], "machine_learn_models": [{"description": "Machine learn model", "name": "test", "framework": "sklearn", "path": "/tmp/test.pkl.xz", "md5_hash": null}]}'

>>> udf = UdfData.from_dict(udf_data.to_dict())
>>> json.dumps(udf.to_dict())
'{"proj": {"EPSG": 4326}, "user_context": {}, "server_context": {}, "datacubes": [], "feature_collection_list": [{"id": "C", "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}, {"id": "1", "type": "Feature", "properties": {"a": 2, "b": "b"}, "geometry": {"type": "Point", "coordinates": [100.0, 100.0]}}, {"id": "2", "type": "Feature", "properties": {"a": 3, "b": "c"}, "geometry": {"type": "Point", "coordinates": [100.0, 0.0]}}]}}, {"id": "D", "data": {"type": "FeatureCollection", "features": [{"id": "0", "type": "Feature", "properties": {"a": 1, "b": "a"}, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]}}, {"id": "1", "type": "Feature", "properties": {"a": 2, "b": "b"}, "geometry": {"type": "Point", "coordinates": [100.0, 100.0]}}, {"id": "2", "type": "Feature", "properties": {"a": 3, "b": "c"}, "geometry": {"type": "Point", "coordinates": [100.0, 0.0]}}]}}], "structured_data_list": [], "machine_learn_models": [{"description": "Machine learn model", "name": "test", "framework": "sklearn", "path": "/tmp/test.pkl.xz", "md5_hash": null}]}'

>>> sd_list = StructuredData(description="Data list", data={"list":[1,2,3]}, type="list")
>>> sd_dict = StructuredData(description="Data dict", data={"A":{"B": 1}}, type="dict")
>>> udf = UdfData(proj={"EPSG":4326}, structured_data_list=[sd_list, sd_dict])
>>> json.dumps(udf.to_dict())
'{"proj": {"EPSG": 4326}, "user_context": {}, "server_context": {}, "datacubes": [], "feature_collection_list": [], "structured_data_list": [{"description": "Data list", "data": {"list": [1, 2, 3]}, "type": "list"}, {"description": "Data dict", "data": {"A": {"B": 1}}, "type": "dict"}], "machine_learn_models": []}'

>>> array = xarray.DataArray(numpy.zeros(shape=(2, 3)), coords={'x': [1, 2], 'y': [1, 2, 3]}, dims=('x', 'y'))
>>> array.attrs["description"] = "This is an xarray with two dimensions"
>>> array.name = "testdata"
>>> h = DataCube(array=array)
>>> udf_data = UdfData(proj={"EPSG":4326}, datacube_list=[h])
>>> udf_data.user_context = {"kernel": 3}
>>> udf_data.server_context = {"reduction_dimension": "t"}
>>> udf_data.user_context
{'kernel': 3}
>>> udf_data.server_context
{'reduction_dimension': 't'}
>>> print(udf_data.get_datacube_by_id("testdata").to_dict())
{'id': 'testdata', 'data': [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], 'dimensions': [{'name': 'x', 'coordinates': [1, 2]}, {'name': 'y', 'coordinates': [1, 2, 3]}], 'description': 'This is an xarray with two dimensions'}
>>> json.dumps(udf_data.to_dict())
'{"proj": {"EPSG": 4326}, "user_context": {"kernel": 3}, "server_context": {"reduction_dimension": "t"}, "datacubes": [{"id": "testdata", "data": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "dimensions": [{"name": "x", "coordinates": [1, 2]}, {"name": "y", "coordinates": [1, 2, 3]}], "description": "This is an xarray with two dimensions"}], "feature_collection_list": [], "structured_data_list": [], "machine_learn_models": []}'

>>> udf = UdfData.from_dict(udf_data.to_dict())
>>> json.dumps(udf.to_dict())
'{"proj": {"EPSG": 4326}, "user_context": {}, "server_context": {}, "datacubes": [{"id": "testdata", "data": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], "dimensions": [{"name": "x", "coordinates": [1, 2]}, {"name": "y", "coordinates": [1, 2, 3]}], "description": "This is an xarray with two dimensions"}], "feature_collection_list": [], "structured_data_list": [], "machine_learn_models": []}'
append_datacube(datacube: openeo_udf.api.datacube.DataCube)
Append a DataCube to the list.
It will be automatically added to the dictionary of all datacubes.
Parameters: datacube (DataCube) -- The DataCube to append

append_feature_collection(feature_collection_tile: openeo_udf.api.feature_collection.FeatureCollection)
Append a feature collection to the list.
It will be automatically added to the dictionary of all feature collections.
Parameters: feature_collection_tile (FeatureCollection) -- The feature collection to append

append_machine_learn_model(machine_learn_model: openeo_udf.api.machine_learn_model.MachineLearnModelConfig)
Append a machine learning model to the list.
Parameters: machine_learn_model (MachineLearnModelConfig) -- A MachineLearnModelConfig object

append_structured_data(structured_data: openeo_udf.api.structured_data.StructuredData)
Append a structured data object to the list.
Parameters: structured_data (StructuredData) -- A StructuredData object

datacube_list
Get the datacube list.

feature_collection_list
Get all feature collections as a list.
Returns: The list of feature collections
Return type: list[FeatureCollection]

static from_dict(udf_dict: Dict)
Create a UdfData object from a python dictionary that was created from the JSON definition of the UdfData class.
Parameters: udf_dict (dict) -- The dictionary that contains the udf data definition
Returns: A new UdfData object
Return type: UdfData

static from_udf_data_model(udf_model) → UdfData
TODO: Must be implemented.
Parameters: udf_model -- The UDF data model

get_datacube_by_id(id: str) → Union[openeo_udf.api.datacube.DataCube, NoneType]
Get a datacube by its id.
Parameters: id (str) -- The datacube id
Returns: The requested datacube or None if not found
Return type: DataCube

get_datacube_list() → Union[List[openeo_udf.api.datacube.DataCube], NoneType]
Get the datacube list.

get_feature_collection_by_id(id: str) → Union[openeo_udf.api.feature_collection.FeatureCollection, NoneType]
Get a feature collection by its id.
Parameters: id (str) -- The feature collection id
Returns: The requested feature collection or None if not found
Return type: FeatureCollection

get_feature_collection_list() → Union[List[openeo_udf.api.feature_collection.FeatureCollection], NoneType]
Get all feature collections as a list.
Returns: The list of feature collections
Return type: list[FeatureCollection]

get_ml_model_list() → Union[List[openeo_udf.api.machine_learn_model.MachineLearnModelConfig], NoneType]
Get all machine learning models.
Returns: A list of MachineLearnModelConfig objects
Return type: list[MachineLearnModelConfig]

get_structured_data_list() → Union[List[openeo_udf.api.structured_data.StructuredData], NoneType]
Get all structured data entries.
Returns: A list of StructuredData objects
Return type: list[StructuredData]

metadata

ml_model_list
Get all machine learning models.
Returns: A list of MachineLearnModelConfig objects
Return type: list[MachineLearnModelConfig]

server_context
Return the server context that is passed from the backend to the UDF server for runtime configuration.

set_datacube_list(datacube_list: List[openeo_udf.api.datacube.DataCube])
Set the datacube list.
If datacube_list is None, then the list will be cleared.
Parameters: datacube_list (List[DataCube]) -- A list of DataCube objects

set_feature_collection_list(feature_collection_list: Union[List[openeo_udf.api.feature_collection.FeatureCollection], NoneType])
Set the feature collection list.
If feature_collection_list is None, then the list will be cleared.
Parameters: feature_collection_list (list[FeatureCollection]) -- A list of FeatureCollection objects

set_ml_model_list(ml_model_list: Union[List[openeo_udf.api.machine_learn_model.MachineLearnModelConfig], NoneType])
Set the list of machine learning models.
If ml_model_list is None, then the list will be cleared.
Parameters: ml_model_list (list[MachineLearnModelConfig]) -- A list of MachineLearnModelConfig objects

set_structured_data_list(structured_data_list: Union[List[openeo_udf.api.structured_data.StructuredData], NoneType])
Set the list of structured data.
If structured_data_list is None, then the list will be cleared.
Parameters: structured_data_list (list[StructuredData]) -- A list of StructuredData objects

structured_data_list
Get all structured data entries.
Returns: A list of StructuredData objects
Return type: list[StructuredData]

to_dict() → Dict
Convert this UdfData object into a dictionary that can be converted into a valid JSON representation.
Returns: The UdfData object as a dictionary
Return type: dict

user_context
Return the user context that was passed to the run_udf function.
openeo_udf.api.udf_signatures module
This module defines a number of function signatures that can be implemented by UDFs. Both the name of the function and the argument types can be used by the backend to validate whether the provided UDF is compatible with the calling context of the process graph in which it is used.

openeo_udf.api.udf_signatures.apply_datacube(cube: openeo_udf.api.datacube.DataCube, context: Dict) → openeo_udf.api.datacube.DataCube
Map a DataCube to another DataCube. Depending on the context in which this function is used, the DataCube dimensions have to be retained or can be changed. For instance, in the context of a reducing operation along a dimension, that dimension will have to be reduced to a single value. In the context of a 1-to-1 mapping operation, all dimensions have to be retained.
Parameters:
- cube -- A DataCube object
- context -- A dictionary containing user context
Returns: A DataCube object
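For illustration, a minimal sketch of a UDF implementing this signature as a 1-to-1 mapping that rescales every value. The context key "factor" is an assumption for the example, not part of the interface:

from typing import Dict

from openeo_udf.api.datacube import DataCube


def apply_datacube(cube: DataCube, context: Dict) -> DataCube:
    # 1-to-1 mapping: all dimensions of the input cube are retained.
    factor = context.get("factor", 2.0)  # "factor" is a hypothetical context key
    return DataCube(array=cube.get_array() * factor)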
openeo_udf.api.udf_signatures.apply_timeseries(series: pandas.core.series.Series, context: Dict) → pandas.core.series.Series
Process a timeseries of values without changing the time instants. This can, for instance, be used for smoothing or gap-filling. TODO: do we need geospatial coordinates for the series?
Parameters:
- series -- A Pandas Series object with a date-time index
- context -- A dictionary containing user context
Returns: A Pandas Series object with the same datetime index
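For illustration, a minimal sketch of a UDF implementing this signature with gap-filling; the time instants of the series are preserved:

from typing import Dict

import pandas


def apply_timeseries(series: pandas.Series, context: Dict) -> pandas.Series:
    # Fill gaps by time-weighted interpolation; the datetime index and
    # the time instants are kept unchanged.
    return series.interpolate(method="time")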
openeo_udf.api.udf_wrapper module

openeo_udf.api.udf_wrapper.apply_timeseries(series: pandas.core.series.Series, context: Dict) → pandas.core.series.Series
Do something with the timeseries.
Parameters:
- series -- A Pandas Series object with a date-time index
- context -- A dictionary containing user context
Returns: A Pandas Series object

openeo_udf.api.udf_wrapper.apply_timeseries_generic(udf_data: openeo_udf.api.udf_data.UdfData, callback: Callable = <function apply_timeseries>)
Implements the UDF contract by calling a user-provided time series transformation function (apply_timeseries). Multiple bands are currently handled separately; another approach could provide a dataframe with a timeseries for each band. The usage sketch below shows how a custom callback plugs in.
Parameters:
- udf_data -- The UdfData object with the data to process
- callback -- The time series transformation function to apply
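For illustration, a sketch of plugging a custom transformation into apply_timeseries_generic; the smoothing window size is an arbitrary choice for the example:

from typing import Dict

import pandas

from openeo_udf.api.udf_data import UdfData
from openeo_udf.api.udf_wrapper import apply_timeseries_generic


def smooth(series: pandas.Series, context: Dict) -> pandas.Series:
    # Rolling-mean smoothing; the original datetime index is kept.
    return series.rolling(window=3, min_periods=1).mean()


def my_udf(udf_data: UdfData) -> None:
    # apply_timeseries_generic applies the callback to each individual
    # time series in the UdfData object, band by band.
    apply_timeseries_generic(udf_data, callback=smooth)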