
General Recipe for WCSTImport

The general recipe is a highly flexible recipe that can handle any kind of data file (2D, 3D or n-D) and model it as a coverage of any dimensionality. It does so by letting users define their own coverage models with any number of bands and axes, and fill in the necessary coverage information through so-called ingredient sentences inside the ingredients file.

Ingredient Sentences

An ingredient sentence can be of several types:

  • Numeric - e.g. 2, 4.5
  • Strings - e.g. 'Some information'
  • Functions - e.g. datetime('2012-01-01', 'YYYY-MM-DD')
  • Expressions - allow a user to collect information from inside the ingested file using a specific driver. An expression has the form ${driverName:driverOperation}, e.g. ${gdal:minX} or ${netcdf:variable:time:min}. All possible expressions are listed in the Possible Expressions section.
  • Any valid Python expression - you can combine the types above into a Python expression; this allows mathematical operations, some string parsing, etc., e.g. ${gdal:minX} + 1/2 * ${gdal:resolutionX} or datetime(${netcdf:variable:time:min} * 24 * 3600)
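To make the idea of an ingredient sentence concrete, here is a minimal sketch of how ${driver:operation} placeholders could be substituted and the result evaluated as Python. It is not the wcst_import implementation: the function evaluate_sentence, the SAMPLE_VALUES dictionary and the numbers in it are hypothetical, and the arithmetic assumes Python 3 division.

```python
import re

# Hypothetical values a driver might report for one file (made up for illustration).
SAMPLE_VALUES = {
    "gdal:minX": 10.0,
    "gdal:resolutionX": 0.5,
}

# Matches ${...} and captures the text between the braces.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def evaluate_sentence(sentence, values):
    """Replace each ${driver:operation} with its value, then evaluate as Python."""
    def substitute(match):
        return repr(values[match.group(1)])

    expression = PLACEHOLDER.sub(substitute, sentence)
    # eval is tolerable for a trusted ingredient file; never use it on untrusted input.
    return eval(expression, {"__builtins__": {}}, {})


# Shift the lower bound by half a pixel, as in the example sentence above.
result = evaluate_sentence("${gdal:minX} + 1/2 * ${gdal:resolutionX}", SAMPLE_VALUES)
print(result)
```

An unknown placeholder raises a KeyError in this sketch, which is a convenient way to notice a misspelled driver operation early.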

Recipe options

Using the ingredient sentences we can define any coverage model directly in the ingredient. To do this we just have to add the coverage model to the options of the recipe. Each coverage model contains:

  • a CRS - the CRS of the coverage to be constructed. This is either a CRS URL, e.g. http://opengis.net/def/crs/EPSG/0/4326 or http://ows.rasdaman.org/def/crs-compound?1=http://ows.rasdaman.org/def/crs/EPSG/0/4326&2=http://ows.rasdaman.org/def/crs/OGC/0/AnsiDate, or the shorthand notation CRS1@CRS2@CRS3, e.g. EPSG/0/4326@OGC/0/AnsiDate
  • a metadata section - which specifies in which format you want the metadata (json or xml), the global metadata fields that should be saved (e.g. the licence, the creator etc) and the local metadata (an entry is saved for each file that was imported) fields that should be saved.
  • a slicer section - specifies the driver (netcdf, gdal or grib) used to read the data files and, for each axis of the CRS, how to obtain the bounds and resolution corresponding to each file.
    • NOTE: "type": "gdal" is used for TIFF, PNG, and other 2D formats.
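The relationship between the shorthand CRS notation and the compound CRS URL can be sketched as follows. This is an illustration only: the function expand_crs is hypothetical, and the resolver base URL is simply taken from the example URL above, whereas a real installation may use a different resolver.

```python
def expand_crs(shorthand, resolver="http://ows.rasdaman.org/def"):
    """Expand 'CRS1@CRS2' shorthand into a single or compound CRS URL."""
    parts = [part.strip() for part in shorthand.split("@") if part.strip()]
    urls = ["{}/crs/{}".format(resolver, part) for part in parts]
    if len(urls) == 1:
        # A single CRS needs no compounding.
        return urls[0]
    # Compound URLs number their components starting at 1.
    query = "&".join("{}={}".format(i, url) for i, url in enumerate(urls, start=1))
    return "{}/crs-compound?{}".format(resolver, query)


compound = expand_crs("EPSG/0/4326@OGC/0/AnsiDate")
print(compound)
```

Each component contributes its own axes, so the order of the parts in the shorthand determines the order of the axes in the resulting coverage CRS.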

Let's take the recipe part of an ingredient file for the GRIB format (further examples for the netCDF format can be found here and for PNG here) and examine it:

  "recipe": {
    "name": "general_coverage",
    "options": {
      "__comment__": "You need to provide the coverage description and the method of building it.",
      "coverage": {
// We create a coverage with 4 axes by combining 3 CRSes. The axes will be Lat, Long, ansi, ensemble
        "crs": "EPSG/0/4326@OGC/0/AnsiDate@OGC/0/Index1D?axis-label=\"ensemble\"",
        "metadata": {
          "type": "json",
          "global": {
// We will save the following fields for the whole coverage
            "MarsType": "'${grib:marsType}'",
            "Experiment": "'${grib:experimentVersionNumber}'"
          },
          "local": {
// and the following field for each file that will compose the final coverage
            "level": "${grib:level}"
          }
        },
        "slicer": {
// we specify that we want to use the grib driver on our files. This will give us access to grib and file expressions.
          "type": "grib",
// we specify that the grib file considers pixels to be 0D, in the middle of the cell, as opposed to e.g. GeoTiff, which considers pixels to be intervals
          "pixelIsPoint": true,
// we define the bands that we want to create from the files
          "bands": [
            {
              "name": "temp2m",
              "definition": "The temperature at 2 meters.",
              "description": "We measure temperature at 2 meters using sensors and then we process the values using a sophisticated algorithm.",
              "nilReason": "The nil value represents an error in the sensor.",
              "nilValue": "-99999"
            }
          ],
          "axes": {
// for each axis we define how to extract the spatio-temporal position of each file that we ingest
            "Lat": {
// e.g. to determine at which Latitude the nth file will be positioned, we will evaluate the given expression on the file
              "min": "${grib:latitudeOfLastGridPointInDegrees} + (${grib:jDirectionIncrementInDegrees} if bool(${grib:jScansPositively}) else -${grib:jDirectionIncrementInDegrees})",
              "max": "${grib:latitudeOfFirstGridPointInDegrees}",
              "resolution": "${grib:jDirectionIncrementInDegrees} if bool(${grib:jScansPositively}) else -${grib:jDirectionIncrementInDegrees}",
// the grid order specifies the order of the axis in the raster that will be created
              "gridOrder": 3
            },
            "Long": {
              "min": "${grib:longitudeOfFirstGridPointInDegrees}",
              "max": "${grib:longitudeOfLastGridPointInDegrees} + (-${grib:iDirectionIncrementInDegrees} if bool(${grib:iScansNegatively}) else ${grib:iDirectionIncrementInDegrees})",
              "resolution": "-${grib:iDirectionIncrementInDegrees} if bool(${grib:iScansNegatively}) else ${grib:iDirectionIncrementInDegrees}",
              "gridOrder": 2
            },
            "ansi": {
              "min": "grib_datetime(${grib:dataDate}, ${grib:dataTime})",
              "resolution": "1.0 / 4.0",
              "type": "ansidate",
              "gridOrder": 1,
              // dataBound defaults to true for every axis; set it to false when the axis values do not come from the file's data (e.g. a time axis derived from the file name)
              "dataBound": false
            },
            "ensemble": {
              "min": "${grib:localDefinitionNumber}",
              "resolution": 1,
              "gridOrder": 0
            }
          }
        }
      },
      "tiling": "REGULAR [0:0, 0:20, 0:1023, 0:1023]"
    }
  }
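The // lines in the listing above are explanatory annotations. Standard JSON has no comment syntax, and Python's json module rejects such lines, so they have to be removed before the text can be parsed. A minimal sketch, assuming the annotations always occupy a whole line; the helper load_annotated_json and the shortened sample text are hypothetical.

```python
import json


def load_annotated_json(text):
    """Drop lines that consist only of a // comment, then parse the rest as JSON."""
    kept = [line for line in text.splitlines() if not line.strip().startswith("//")]
    return json.loads("\n".join(kept))


# A shortened stand-in for the listing, wrapped in braces to form a complete object.
annotated = """
{
  "recipe": {
    // explanatory note, not valid JSON
    "name": "general_coverage",
    "options": {
      "tiling": "REGULAR [0:0, 0:20, 0:1023, 0:1023]"
    }
  }
}
"""

recipe = load_annotated_json(annotated)["recipe"]
print(recipe["name"])
```

Only whole-line comments are dropped, so values that contain // inside a string, such as CRS URLs, are left untouched.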

Possible Expressions

For each driver we list all the expressions that can be used. Parts of an expression that vary are written in capital letters. E.g. ${gdal:metadata:YOUR_FIELD} means that you can replace YOUR_FIELD with any valid gdal metadata tag (e.g. TIFFTAG_DATETIME).

Special cases

To import netCDF or GRIB coverages that have an irregular axis with aggregated values (e.g. dateTime, ensemble, levels), see the example here, which uses the directPositions option in the ingredient file.

Netcdf

Take a look at http://rasdaman.org/browser/applications/wcst_import/ingredients/general_coverage_netcdf.json for a general recipe ingredient file that uses a lot of netcdf expressions.

NOTE: netCDF global metadata is imported with this convention: http://rasdaman.org/ticket/1528

Type | Description | Examples
Metadata information | ${netcdf:metadata:YOUR_METADATA_FIELD} | ${netcdf:metadata:title}
Variable information | ${netcdf:variable:YOUR_VARIABLE_NAME:YOUR_MODIFIER}, where YOUR_VARIABLE_NAME can be any variable in the file and YOUR_MODIFIER can be one of first, last, max, min. Any other modifier returns the corresponding metadata field of the given variable. | ${netcdf:variable:time:min}
Dimension information | ${netcdf:dimension:YOUR_DIMENSION_NAME}, where YOUR_DIMENSION_NAME can be any dimension in the file. This returns the value of the selected dimension. | ${netcdf:dimension:time}
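The first, last, max and min modifiers can be illustrated on a plain list standing in for the values of a netCDF variable. This is a sketch of the semantics only: apply_modifier and the sample values are hypothetical, no netCDF file is read, and the fallback to variable metadata for other modifiers is not covered.

```python
def apply_modifier(values, modifier):
    """Reduce a variable's values according to one of the four modifiers."""
    reducers = {
        "first": lambda v: v[0],
        "last": lambda v: v[-1],
        "min": min,
        "max": max,
    }
    if modifier not in reducers:
        raise ValueError("unsupported modifier: {}".format(modifier))
    return reducers[modifier](values)


# Made-up, deliberately unsorted values so that the four modifiers differ.
time_values = [150.0, 120.0, 180.0, 165.0]

for name in ("first", "last", "min", "max"):
    print(name, apply_modifier(time_values, name))
```

The distinction matters for unsorted variables: first and last depend on storage order, while min and max depend only on the values.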

GDAL

For TIFF, PNG, JPEG, and other 2D data formats we use GDAL. Take a look at http://rasdaman.org/browser/applications/wcst_import/ingredients/general_coverage_gdal_3d.json for a general recipe ingredient file that uses a lot of gdal expressions.

Type | Description | Examples
Metadata information | ${gdal:metadata:YOUR_METADATA_FIELD} | ${gdal:metadata:TIFFTAG_NAME}
Geo Bounds | ${gdal:BOUND_NAME}, where BOUND_NAME can be one of minX, maxX, minY, maxY | ${gdal:minX}
Geo Resolution | ${gdal:RESOLUTION_NAME}, where RESOLUTION_NAME can be one of resolutionX, resolutionY | ${gdal:resolutionX}
Origin | ${gdal:ORIGIN_NAME}, where ORIGIN_NAME can be one of originX, originY | ${gdal:originY}
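As a rough guide to what these quantities mean, the sketch below derives bounds, resolution and origin from a GDAL geotransform for a north-up image without rotation. The mapping of the results onto the ${gdal:...} names is an assumption made for illustration; in particular, wcst_import may define the origin differently (for example at a pixel centre). The function gdal_expressions and the sample geotransform are hypothetical.

```python
def gdal_expressions(geotransform, width, height):
    """Derive bounds, resolution and origin from a north-up GDAL geotransform."""
    # Geotransform layout: (x of top-left corner, pixel width, row rotation,
    #                       y of top-left corner, column rotation, pixel height)
    origin_x, res_x, _, origin_y, _, res_y = geotransform
    x_far = origin_x + width * res_x
    y_far = origin_y + height * res_y
    return {
        "minX": min(origin_x, x_far),
        "maxX": max(origin_x, x_far),
        "minY": min(origin_y, y_far),
        "maxY": max(origin_y, y_far),
        "resolutionX": res_x,
        "resolutionY": res_y,
        "originX": origin_x,  # assumed: top-left corner, not pixel centre
        "originY": origin_y,
    }


# Made-up raster: 20 x 10 pixels of 0.5 degrees, top-left corner at (10, 50).
values = gdal_expressions((10.0, 0.5, 0.0, 50.0, 0.0, -0.5), width=20, height=10)
print(values)
```

Note that the pixel height is negative for north-up images, which is why the Y resolution usually carries a minus sign.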

Grib

Take a look at http://rasdaman.org/browser/applications/wcst_import/ingredients/general_coverage_grib.json for a general recipe ingredient file that uses a lot of grib expressions.

Type | Description | Examples
GRIB Key | ${grib:KEY}, where KEY can be any of the keys contained in the GRIB file | ${grib:longitudeOfFirstGridPointInDegrees}

File

Type | Description | Examples
File Information | ${file:PROPERTY}, where PROPERTY can be one of path, name | ${file:path}

Special Functions

A couple of special functions are available to deal with some more complicated formats. These are:

Function Name | Description | Examples
grib_datetime(grib_date, grib_time) | Handles the usual GRIB date and time format. Returns a datetime string in ISO format. | grib_datetime(${grib:dataDate}, ${grib:dataTime})
datetime(date, format) | Handles unusual date and time formats. Returns a datetime string in ISO format. | datetime("20120101:1200", "YYYYMMDD:HHmm")
regex_extract(input, regex, group) | Extracts information from a string using a regular expression; input is the string to parse, regex is the regular expression, group is the regex group to select. | datetime(regex_extract('${file:name}', '(.*)_(.*)_(.*)_(\\d\\d\\d\\d-\\d\\d)(.*)', 4), 'YYYY-MM')
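The behaviour described for grib_datetime and regex_extract can be approximated in plain Python as follows. These are stand-ins written for illustration, not the functions shipped with wcst_import. The sketch assumes that the GRIB date is encoded as YYYYMMDD and the time as HHMM; the file name used in the example is made up.

```python
import re
from datetime import datetime


def grib_datetime(grib_date, grib_time):
    """Combine a YYYYMMDD date and an HHMM time into an ISO 8601 string."""
    # Zero-pad so that e.g. a time of 600 is read as 06:00.
    stamp = "{:08d}{:04d}".format(int(grib_date), int(grib_time))
    return datetime.strptime(stamp, "%Y%m%d%H%M").isoformat()


def regex_extract(input_text, regex, group):
    """Return the requested capture group of the first match."""
    match = re.search(regex, input_text)
    if match is None:
        raise ValueError("no match for {!r} in {!r}".format(regex, input_text))
    return match.group(group)


iso_time = grib_datetime(20120101, 1200)
month = regex_extract("temp_eu_daily_2012-01.tif",
                      r"(.*)_(.*)_(.*)_(\d\d\d\d-\d\d)(.*)", 4)
print(iso_time, month)
```

In an ingredient file the regular expression sits inside a JSON string, which is why the backslashes are doubled there; the raw string in this sketch needs only single backslashes.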
Last modified on Apr 12, 2017 8:14:16 AM