Skip to content

Prepare Execute Request CommandLineTool

Goal

Wrap the prepare-execute-request step as a Common Workflow Language CommandLineTool and execute it with a CWL runner. Indeed, it prepare the execute request JSON file for OGC API Processes based on search results

How to wrap a step as a CWL CommandLineTool

The CWL document below shows the prepare-execute-request step wrapped as a CWL CommandLineTool:

prepare-execute-request-cli.cwl
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
cwlVersion: v1.2
$graph:
- class: CommandLineTool
  id: prepare-execute-request
  label: Prepare Execute Request
  doc: Prepare the execute request for OGC API Processes based on search results 

  baseCommand: ["/bin/bash", "run.sh"]
  arguments: []

  requirements:
  - class: InlineJavascriptRequirement
  - class: SchemaDefRequirement
    types:
    - $import: https://raw.githubusercontent.com/eoap/schemas/main/string_format.yaml
    - $import: https://raw.githubusercontent.com/eoap/schemas/main/geojson.yaml
    - $import: |-
        https://raw.githubusercontent.com/eoap/schemas/main/experimental/api-endpoint.yaml
    - $import: https://raw.githubusercontent.com/eoap/schemas/main/experimental/process.yaml
  - class: NetworkAccess
    networkAccess: true
  - class: InitialWorkDirRequirement
    listing:
    - entryname: input_execute_request.json
      entry: |-
        ${ return JSON.stringify(inputs.in_execute_request, null, 2); }
    - entryname: process_id.json
      entry: |-
        ${ return JSON.stringify(inputs.in_execute_request.process_id, null, 2); }
    - entryname: run.sh
      entry: |-
        #!/usr/bin/env bash
        set -x
        set -euo pipefail


        jq '[.features[].links[] | select(.rel=="self") | .href]' "$(inputs.search_results.path)" > stac_items.json

        jq --argjson stac_items "`cat stac_items.json`" \
          'del(.process_id) | .inputs.stac_items = $stac_items' \
          input_execute_request.json > execute_request.json

        cat execute_request.json | jq .

  inputs:
    in_execute_request:
      label: OGC API Processes settings
      doc: OGC API Processes settings for Landsat-9 data
      type: |-
        https://raw.githubusercontent.com/eoap/schemas/main/experimental/process.yaml#OGCExecuteProcessSettings

    search_results:
      label: Search Results
      doc: Search results from the discovery step
      type: File

  outputs:
    execute_request:
      type: File
      outputBinding:
        glob: execute_request.json
    process_id:
      type: Any
      outputBinding:
        glob: process_id.json
        loadContents: true
        outputEval: ${ return JSON.parse(self[0].contents); }

Let's break down the key components of this CWL document:

  • cwlVersion: v1.2: Specifies the version of the CWL specification that this document follows.

  • class: CommandLineTool: Indicates that this CWL document defines a command-line tool.

  • id: prepare-execute-request: Provides a unique identifier for this tool, which can be referenced in workflows.

  • arguments: This section is empty, meaning there are no additional command-line arguments specified here. The tool receives its arguments via the input parameters.

  • baseCommand: Defines the base command to be executed. In this case, it's running a bash script: ["/bin/bash", "run.sh"].

  • requirements: Specifies the requirements and dependencies of the tool:

  • InlineJavascriptRequirement: Allows the use of inline JavaScript expressions in the tool.

  • SchemaDefRequirement: Ensures that specific types are defined and validated according to the referenced schemas. Custom types allow for reuse and better validation. Learn more.
  • NetworkAccess: Specifies that the tool requires outgoing network access. Only needed for cwlVersion: v1.2; in v1.0 this is supported by default.
  • InitialWorkDirRequirement: Defines files that must be staged before running the tool. Specifically:

    • Creates input_execute_request.json containing the contents of inputs.in_execute_request converted to a pretty-printed JSON string:
    listing:
    - entryname: input_execute_request.json
      entry: |-
          ${ return JSON.stringify(inputs.in_execute_request, null, 2); }
    
    • Creates process_id.json containing only the process_id from inputs.in_execute_request, also formatted as pretty-printed JSON:
    listing:
    - entryname: process_id.json
      entry: |-
          ${ return JSON.stringify(inputs.in_execute_request.process_id, null, 2); }
    
    • Creates run.sh, which performs the following actions when executed:

    • Extracts STAC item URLs from the search_results JSON file and saves them to stac_items.json.

    • Deletes the process_id from input_execute_request.json and injects the extracted STAC items into the .inputs.stac_items field.
    • Outputs the resulting execute_request.json, which can then be used as input for subsequent workflow steps.
  • inputs: Defines the input parameters for the tool.

  • in_execute_request: OGC API Processes settings (e.g., for Sentinel2/Landsat-9 data).

  • search_results: Discovery results from the prior step, provided as a file.

  • outputs: Defines the outputs produced by the tool.

  • execute_request: The final JSON request prepared for execution. An expample of this JSON file is mentioned below:

    {
        "inputs": {
            "stac_items": [
                "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2/items/LC08_L2SP_044032_20231208_02_T1",
                "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2/items/LC08_L2SP_043033_20231201_02_T1"
            ],
            "aoi": "-121.399,39.834,-120.74,40.472",
            "epsg": "EPSG:4326",
            "bands": [
                "green",
                "nir08"
            ]
        }
    }
    

  • process_id: The original process identifier, preserved for reference.