OGC Web Processing Service (OGC WPS)¶
OGC Web Processing Service standard provides rules for standardizing how inputs and outputs (requests and responses) for geospatial processing services. The standard also defines how a client can request the execution of a process, and how the output from the process is handled. It defines an interface that facilitates the publishing of geospatial processes and clients discovery of and binding to those processes. The data required by the WPS can be delivered across a network or they can be available at the server.
This description is mainly refering to 1.0.0 version standard, since PyWPS implements this version only. There is also 2.0.0 version, which we are about to implement in near future.
WPS is intended to be state-less protocol (like any OGC services). For every request-response action, the negotiation between the server and the client has to start. There is no official way, how to make the server “remember”, what was before, there is no communication history between the server and the client.
A process p is a function that for each input returns a corresponding output:
where X denotes the domain of arguments x and Y denotes the co-domain of values y.
Within the specification, process arguments are referred to as process inputs and result values are referred to as process outputs. Processes that have no process inputs represent value generators that deliver constant or random process outputs.
Process is just some geospatial operation, which has it’s in- and outputs and which is deployed on the server. It can be something relatively simple (adding two raster maps together) or very complicated (climate change model). It can take short time (seconds) or long (days) to be calculated. Process is, what you, as PyWPS user, want to expose to other people and let their data processed.
Every process has the following properties:
- Unique process identifier
- Human readable title
- Longer description of the process, what it does, how is it supposed to be used
And a list of inputs and outputs.
Data inputs and outputs¶
OGC WPS defines 3 types of data inputs and outputs: LiteralData, ComplexData and BoundingBoxData.
All data types do need to have following properties:
- Unique input identifier
- Human readable title
- Longer description of data input or output, so that the user could get oriented.
- Minimal occurrence of the input (e.g. there can be more bands of raster file and they all can be passed as input using the same identifier)
- Maxium number of times, the input or output is present
Depending on the data type (Literal, Complex, BoundingBox), other attributes might occur too.
Literal data is any text string, usually short. It’s used for passing single parameters like numbers or text parameters. WPS enables to the server, to define allowedValues - list or intervals of allowed values, as well as data type (integer, float, string). Additional attributes can be set, such as units or encoding.
Complex data are usually raster or vector files, but basically any (usually file based) data, which are usually processed (or result of the process). The input can be specified more using mimeType, XML schema or encoding (such as base64 for raster data.
PyWPS (like every server) supports limited list mimeTypes. In case
you need some new format, just create pull request in our repository.
pywps.inout.formats.FORMATS for more details.
Usually, the minimum requirement for input data identification is mimeType. That usually is application/gml+xml for GML-encoded vector files, image/tiff; subtype=geotiff for raster files. The input or output can also be result of any OGC OWS service.
add reference to OGC OWS Common spec
BoundingBox data are specified in OGC OWS Common specification as two pairs of coordinate (for 2D and 3D space). They can either be encoded in WGS84 or EPSG code can be passed too. They are intended to be used as definition of the target region.
In real life, BoundingBox data are not that commonly used
Passing data to process instance¶
There are typically 3 approaches to pass the input data from the client to the server:
- Data are on the server already
- In the first case, the data are already stored on the server (from the point of view of the client). This is the simplest case.
- Data are send to the server along with the request
- In this case, the data are directly part of the XML encoded document send via HTTP POST. Some clients/servers are expecting the data to be inserted in CDATA section. The data can be text based (JSON), XML based (GML) or even raster based - in this case, they are usually encoded using base64.
- Reference link to target service is passed
Client does not have to pass the data itself, client can just send reference link to target data service (or file). In such case, for example OGC WFS GetFeatureType URL can be passed and server will download the data automatically.
Although this is usually used for ComplexData input type, it can be used for literal and bounding box data too.
Sychronous versus asynchronous process request¶
There are two modes of process instance execution: Synchronous and asynchronous.
- Synchronous mode
- The client sends the Execute request to the server and waits with open server connection, till the process is calculated and final response is returned back. This is useful for fast calculations which do not take longer then a couple of seconds (Apache2 httpd server uses 300 seconds as default value for ConnectionTimeout).
- Asynchronous mode
Client sends the Execute request with explicit request for asynchronous mode. If supported by the process (in PyWPS, we have a configuration for that), the server returns back ProcessAccepted response immediately with URL, where the client can regularly check for process execution status.
As you see, using WPS, the client has to apply pull method for the communication with the server. Client has to be the active element in the communication - server is just responding to clients request and is not actively pushing any information (like it would if e.g. web sockets would be implemented).
Process status is generic status of the process instance, reporting to the client, how does the calculation go. There are 4 types of process statuses
- Process was accepted by the server and the process execution will start soon.
- Process calculation has started. The status also contains report about percentDone - calculation progress and statusMessage - text reporting current calculation state (example: “Caculationg buffer” - 33%).
- Process instance performed the calculation successfully and the final Execute response is returned to the client and/or stored on final location
- There was something wrong with the process instance and the server reports
server exception (see
pywps.exceptions) along with the message, what could possibly go wrong.
Request encoding, HTTP GET and POST¶
The request can be encoded either using key-value pairs (KVP) or an XML payload.
- Key-value pairs
is usually sent via HTTP GET request method encoded directly in the URL. The keys and values are separated with = sign and each pair is separated with & sign (with ? at the beginning of the request. Example could be the get capabilities reques:
In this example, there are 3 pairs of input parameter: service, request and version with values WPS, GetCapabilities and 1.0.0 respectively.
- XML payload
is XML data sent via HTTP POST request method. The XML document can be more rich, having more parameters, better to be parsed in complex structures. The Client can also encode entire datasets to the request, including raster (encoded using base64) or vector data (usually as GML file).:
<?xml version="1.0" encoding="UTF-8"?> <wps:GetCapabilities language="cz" service="WPS" xmlns:ows="http://www.opengis.net/ows/1.1" xmlns:wps="http://www.opengis.net/wps/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/wps/1.0.0 http://schemas.opengis.net/wps/1.0.0/wpsGetCapabilities_request.xsd"> <wps:AcceptVersions> <ows:Version>1.0.0</ows:Version> </wps:AcceptVersions> </wps:GetCapabilities>
Even it might be looking more complicated to use XML over KVP, for some complex request it usually is more safe and efficient to use XML encoding. The KVP way, especially for WPS Execute request can be tricky and lead to unpredictable errors.