.. _wps:

OGC Web Processing Service (OGC WPS)
====================================

The `OGC Web Processing Service `_ standard provides rules for standardizing
inputs and outputs (requests and responses) for geospatial processing
services. The standard also defines how a client can request the execution of
a process, and how the output from the process is handled. It defines an
interface that facilitates the publishing of geospatial processes and clients'
discovery of and binding to those processes. The data required by the WPS can
be delivered across a network or they can be available at the server.

.. note:: This description mainly refers to version 1.0.0 of the standard,
    since PyWPS implements this version only. There is also version 2.0.0,
    which we plan to implement in the near future.

WPS is intended to be a stateless protocol (like any OGC service). For every
request-response action, the negotiation between the server and the client has
to start anew. There is no official way to make the server "remember" what
came before; there is no communication history between the server and the
client.

Process
-------

A process `p` is a function that for each input returns a corresponding
output:

.. math::

    p: X \rightarrow Y

where `X` denotes the domain of arguments `x` and `Y` denotes the co-domain of
values `y`. Within the specification, process arguments are referred to as
*process inputs* and result values are referred to as *process outputs*.
Processes that have no process inputs represent value generators that deliver
constant or random process outputs.

A *process* is simply a geospatial operation that has its inputs and outputs
and is deployed on the server. It can be something relatively simple (adding
two raster maps together) or very complicated (a climate change model). It can
take a short time (seconds) or a long time (days) to compute. A process is
what you, as a PyWPS user, want to expose to other people so that they can
have their data processed.

Every process has the following properties:

Identifier
    Unique process identifier

Title
    Human-readable title

Abstract
    Longer description of the process: what it does and how it is supposed to
    be used

And a list of inputs and outputs.

Data inputs and outputs
-----------------------

OGC WPS defines 3 types of data inputs and outputs: *LiteralData*,
*ComplexData* and *BoundingBoxData*.

All data types need to have the following properties:

Identifier
    Unique input identifier

Title
    Human-readable title

Abstract
    Longer description of the data input or output, so that the user can get
    oriented.

minOccurs
    Minimum number of times the input has to be present (e.g. a raster file
    can have several bands and they can all be passed as input using the same
    identifier)

maxOccurs
    Maximum number of times the input or output can be present

Depending on the data type (Literal, Complex, BoundingBox), other attributes
might occur too.

LiteralData
~~~~~~~~~~~

Literal data is any text string, usually short. It is used for passing single
parameters such as numbers or text parameters. WPS enables the server to
define `allowedValues` - a list or intervals of allowed values - as well as
the data type (integer, float, string). Additional attributes can be set, such
as `units` or `encoding`.

ComplexData
~~~~~~~~~~~

Complex data are usually raster or vector files, but basically any (usually
file-based) data that are processed by the process or that result from it. The
input can be specified further using `mimeType`, XML `schema` or `encoding`
(such as `base64` for raster data).

.. note:: PyWPS (like every server) supports a limited list of `mimeTypes`.
    In case you need a new format, just create a pull request in our
    repository. Refer to :const:`pywps.inout.formats.FORMATS` for more
    details.

Usually, the minimum requirement for input data identification is `mimeType`.
That usually is `application/gml+xml` for `GML `_-encoded vector files and
`image/tiff; subtype=geotiff` for raster files.
The input or output can also be the result of any OGC OWS service.

BoundingBoxData
~~~~~~~~~~~~~~~

.. todo:: add reference to OGC OWS Common spec

BoundingBox data are specified in the OGC OWS Common specification as two
pairs of coordinates (for 2D and 3D space). They are either expressed in
WGS84, or an EPSG code can be passed too. They are intended to be used as the
definition of the target region.

.. note:: In real life, BoundingBox data are not that commonly used.

Passing data to process instance
--------------------------------

There are typically 3 approaches to pass the input data from the client to the
server:

**Data are on the server already**
    In the first case, the data are already stored on the server (from the
    point of view of the client). This is the simplest case.

**Data are sent to the server along with the request**
    In this case, the data are directly part of the XML-encoded document sent
    via HTTP POST. Some clients/servers expect the data to be inserted in a
    `CDATA` section. The data can be text based (JSON), XML based (GML) or
    even raster based - in this case, they are usually encoded using
    `base64 `_.

**Reference link to target service is passed**
    The client does not have to pass the data itself; it can just send a
    reference link to the target data service (or file). In such a case, for
    example, an OGC WFS `GetFeature` URL can be passed and the server will
    download the data automatically.

    Although this is usually used for the `ComplexData` input type, it can be
    used for literal and bounding box data too.

Synchronous versus asynchronous process request
-----------------------------------------------

There are two modes of process instance execution: synchronous and
asynchronous.

Synchronous mode
    The client sends the `Execute` request to the server and waits with an
    open server connection until the process is calculated and the final
    response is returned.
    This is useful for fast calculations which do not take longer than a
    couple of seconds (`Apache2 httpd server uses 300 seconds `_ as default
    value for its `Timeout` directive in version 2.2, and 60 seconds in
    version 2.4).

Asynchronous mode
    The client sends the `Execute` request with an explicit request for
    asynchronous mode. If supported by the process (in PyWPS, we have a
    configuration for that), the server immediately returns a
    `ProcessAccepted` response with a URL where the client can regularly
    check the *process execution status*.

    .. note:: As you see, using WPS, the client has to apply the *pull*
        method for the communication with the server. The client has to be
        the active element in the communication - the server is just
        responding to the client's requests and is not actively *pushing* any
        information (as it would if e.g. web sockets were implemented).

Process status
--------------

`Process status` is the generic status of the process instance, reporting to
the client how the calculation is going. The process statuses covered here
are:

ProcessAccepted
    The process was accepted by the server and the process execution will
    start soon.

ProcessStarted
    The process calculation has started. The status also contains a report
    about `percentDone` - the calculation progress - and `statusMessage` - a
    text reporting the current calculation state (example: *"Calculating
    buffer"* - 33%).

ProcessSucceeded
    The process instance performed the calculation successfully and the final
    `Execute` response is returned to the client and/or stored at the final
    location.

ProcessFailed
    Something went wrong with the process instance and the server reports a
    `server exception` (see :py:mod:`pywps.exceptions`) along with a message
    describing what could possibly have gone wrong.

Request encoding, HTTP GET and POST
-----------------------------------

The request can be encoded either using key-value pairs (KVP) or an XML
payload.

Key-value pairs are usually sent via the `HTTP GET request method `_, encoded
directly in the URL.
The keys and values are separated with the `=` sign and each pair is separated
with the `&` sign (with `?` at the beginning of the query string). An example
could be the *get capabilities request*::

    http://server.domain/wps?service=WPS&request=GetCapabilities&version=1.0.0

In this example, there are 3 pairs of input parameters: `service`, `request`
and `version` with values `WPS`, `GetCapabilities` and `1.0.0` respectively.

XML payload is XML data sent via the `HTTP POST request method `_. The XML
document can be richer, having more parameters, and is better suited to being
parsed into complex structures. The client can also encode entire datasets in
the request, including raster (encoded using base64) or vector data (usually
as a GML file)::

    1.0.0

.. note:: Even though XML might look more complicated than KVP, for some
    complex requests it is usually safer and more efficient to use XML
    encoding. The KVP way, especially for a WPS Execute request, can be
    tricky and lead to unpredictable errors.
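
The two request encodings described in this section can be sketched with the
Python standard library alone. This is only an illustration, not part of
PyWPS: the helper names `build_kvp_url` and `build_xml_payload` are
hypothetical, the endpoint is the placeholder URL used above, and the XML
layout (a `wps:GetCapabilities` element holding
`wps:AcceptVersions`/`ows:Version`) is an assumption about what a WPS 1.0.0
*get capabilities* document looks like.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
import xml.etree.ElementTree as ET

# Namespace URIs assumed for WPS 1.0.0 and OWS Common 1.1.
WPS_NS = "http://www.opengis.net/wps/1.0.0"
OWS_NS = "http://www.opengis.net/ows/1.1"


def build_kvp_url(endpoint, **params):
    """Hypothetical helper: append key-value pairs as a query string."""
    # urlencode joins each pair with '=' and separates pairs with '&'.
    return endpoint + "?" + urlencode(params)


def build_xml_payload(version="1.0.0"):
    """Hypothetical helper: build a GetCapabilities document for HTTP POST."""
    ET.register_namespace("wps", WPS_NS)
    ET.register_namespace("ows", OWS_NS)
    root = ET.Element(f"{{{WPS_NS}}}GetCapabilities", {"service": "WPS"})
    accept = ET.SubElement(root, f"{{{WPS_NS}}}AcceptVersions")
    ET.SubElement(accept, f"{{{OWS_NS}}}Version").text = version
    return ET.tostring(root, encoding="unicode")


# KVP encoding: the same request as in the example URL above.
url = build_kvp_url(
    "http://server.domain/wps",
    service="WPS",
    request="GetCapabilities",
    version="1.0.0",
)
print(url)

# Splitting the query string again recovers the three pairs.
pairs = parse_qs(urlsplit(url).query)
print(pairs)

# XML encoding: the document that would be sent as the POST body.
payload = build_xml_payload()
print(payload)
```

Neither helper sends anything over the network; they only show how the same
request looks in both encodings. For an `Execute` request, the XML form would
additionally carry the process identifier and the input data.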