# Processing Data Types

## Raster

Warning

If no nodata value is defined, GDAL will automatically assume 0, which can lead to artifacts at the image edges. Therefore, always specify the nodata value, band-wise if necessary. You can use the Metadata Editor for this.

## Vector

• Can be any OGR-readable vector format.
• A vector layer is a list of features, where every feature consists of a geometry and attributes.

## Mask

Any GDAL/OGR-readable raster or vector file can be interpreted as a boolean mask.

• In case of a raster, all pixels equal to the nodata value (default: 0) are interpreted as False and all other pixels as True. Multiband rasters are first evaluated band-wise; the final mask for a given pixel is True only if all band-wise masks for that pixel are True.
• In case of a vector, all pixels covered by features are interpreted as True and all other pixels as False. This means:
    • for point features: if the point falls inside the pixel, the pixel is mapped to True
    • for line features: if the pixel is on the line's render path, the pixel is mapped to True
    • for polygon features: if the center of the pixel is within the polygon, the pixel is mapped to True
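
The raster rule above can be sketched in plain Python. This is only an illustration of the logic, not EnMAP-Box code: the function name `raster_mask` and the nested-list band layout are assumptions made for this example.

```python
def raster_mask(bands, nodata=0):
    """Return a boolean mask for a multiband raster.

    `bands` is a list of 2D bands, each given as a list of rows
    (hypothetical layout chosen for this sketch).
    A pixel is True only if it differs from `nodata` in every band.
    """
    n_rows = len(bands[0])
    n_cols = len(bands[0][0])
    return [
        [all(band[r][c] != nodata for band in bands) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Two 2x2 bands; the first row contains a nodata pixel in one band each.
band1 = [[0, 5], [7, 3]]
band2 = [[4, 0], [2, 9]]
mask = raster_mask([band1, band2])
print(mask)
```

Both pixels of the first row are masked out here, because each of them equals the nodata value in at least one band.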

## Classification

• A classification is a representation of a map holding categorical information.

• 0 is implicitly assumed to be the nodata value.

• Metadata for class names and colors are saved in the ENVI metadata domain (class names, class lookup). If those parameters are not defined, classes will be numbered consecutively and random colors will be used.

• The class unclassified is always expected to be 0 and will be treated as nodata.
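
The fallback for missing class metadata can be pictured as follows. This is a minimal sketch, assuming a hypothetical helper `default_class_scheme`; the default names ("class 1", "class 2", ...), the black color for unclassified, and the way random colors are drawn are assumptions for illustration, not the EnMAP-Box implementation.

```python
import random


def default_class_scheme(n_classes, seed=None):
    """Build a class id -> (name, RGB color) mapping with fallback values.

    Class 0 is reserved for 'unclassified' (treated as nodata).
    All other classes are numbered consecutively and get random colors.
    """
    rng = random.Random(seed)
    scheme = {0: ("unclassified", (0, 0, 0))}  # color is an assumption
    for class_id in range(1, n_classes + 1):
        color = tuple(rng.randint(0, 255) for _ in range(3))
        scheme[class_id] = (f"class {class_id}", color)
    return scheme


scheme = default_class_scheme(3, seed=42)
for class_id, (name, color) in scheme.items():
    print(class_id, name, color)
```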

## Regression

• A regression is a representation of a map holding quantitative information.
• The number of bands and the band names depend on the number of response variables.
• The nodata value has to be defined.
• Band names are stored in the GDAL band descriptions.

## Fraction

• Special form of regression in which the quantitative information represents class fractions (the relative coverage of a class inside a pixel).

• Optional: Metadata for class names and colors is stored in the ENVI metadata domain (class names, class lookup). Unlike a Classification, a Fraction has no unclassified class.

## Spectral Library

The EnMAP-Box supports the ENVI standard spectral library format (.sli + .hdr file). Spectral libraries can be imported as single-line rasters using the processing algorithm Auxillary ‣ Import Library.

Todo

Support for further formats will be implemented soon (e.g. import spectral library from ASD field spectrometer)

## Labelled Spectral Library

The labelled spectral library extends the default .sli format with additional metadata (e.g., class labels, class colors). This information is stored in a .csv and a .json file that accompany the default spectral library, so that the labelled spectral library consists of:

• .sli file (ENVI standard)
• .hdr file (ENVI standard)
• .csv file (containing the additional information)
    • should be comma-separated
    • should have the same basename as the .sli file
    • the first row stores the headers, where the first element has to be the spectra names as specified in the .hdr file:

          spectra names, attribute1, attribute2

    • Example from the EnMAP-Box test dataset:
• .json file (stores class name and class color information)
    • should have the same basename as the .sli file
    • class name and color information should be provided for every attribute in the csv:

          {
            "attribute_name": {
              "categories": [
                [0, "unclassified", [0, 0, 0]],
                [1, "class1", [230, 0, 0]],
                [2, "class2", [56, 168, 0]],
                [3, "class3", [168, 112, 0]],
                [4, "class4", [0, 100, 255]]
              ],
              "no data value": 0,
              "description": "Classification"
            }
          }

    • The keys categories, no data value and description should not be altered; only change attribute_name to match your data.
    • no data value should be supplied
    • Example from the EnMAP-Box test dataset:
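
The file layout described in this section can be checked with a few lines of Python. The following is a minimal sketch using only the standard library and is not part of the EnMAP-Box. The helper names `read_attribute_table` and `check_class_definitions`, the attribute name `level_1`, and the sample CSV and JSON content are invented for illustration and are not taken from the test dataset.

```python
import csv
import io
import json

# Keys that the .json file is expected to provide for every attribute.
REQUIRED_KEYS = {"categories", "no data value", "description"}


def read_attribute_table(csv_text):
    """Parse the comma-separated attribute table and return (headers, rows)."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    headers = next(reader)
    if headers[0] != "spectra names":
        raise ValueError("the first column must be 'spectra names'")
    return headers, list(reader)


def check_class_definitions(json_text, attributes):
    """Check that every CSV attribute has a complete entry in the JSON."""
    definitions = json.loads(json_text)
    for name in attributes:
        if name not in definitions:
            raise ValueError(f"no class definition for attribute '{name}'")
        missing = REQUIRED_KEYS - set(definitions[name])
        if missing:
            raise ValueError(f"'{name}' lacks keys: {sorted(missing)}")
    return definitions


# Invented sample content (in practice, read from the .csv and .json files).
csv_text = "spectra names, level_1\nprofile 1, 1\nprofile 2, 2\n"
json_text = """
{
  "level_1": {
    "categories": [
      [0, "unclassified", [0, 0, 0]],
      [1, "class1", [230, 0, 0]],
      [2, "class2", [56, 168, 0]]
    ],
    "no data value": 0,
    "description": "Classification"
  }
}
"""

headers, rows = read_attribute_table(csv_text)
definitions = check_class_definitions(json_text, headers[1:])

# Map class ids to class names and label every spectrum.
class_names = {cid: name for cid, name, _color in definitions["level_1"]["categories"]}
labels = {row[0]: class_names[int(row[1])] for row in rows}
print(labels)
```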

## Models

• Certain algorithms (e.g. those whose names start with Fit …) produce output files that store model information as .pkl files.
• There are four kinds of model files: Classifiers, Clusterers, Regressors and Transformers.
• The content of those files can be inspected in the Data Sources panel.
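
Outside the EnMAP-Box, a .pkl file can usually be opened with Python's `pickle` module, assuming the file is a regular Python pickle (the conventional meaning of the extension). The sketch below writes a stand-in object to a temporary file and reads it back; the dictionary content and the helper name `describe_pickle` are hypothetical, and a real model file holds whatever the producing algorithm stored. Only unpickle files from a trusted source, because unpickling can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def describe_pickle(path):
    """Load a .pkl file and return (type name, object)."""
    with open(path, "rb") as f:
        obj = pickle.load(f)
    return type(obj).__name__, obj


# Stand-in for a fitted model (hypothetical content for this sketch).
stand_in = {"kind": "Classifier", "estimator": "RandomForest"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "classifier.pkl"
    with open(path, "wb") as f:
        pickle.dump(stand_in, f)
    type_name, content = describe_pickle(path)

print(type_name, content)
```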

Note

You can generate example model files based on the EnMAP-Box test dataset. The processing algorithms under EnMAP-Box ‣ Auxilliary include one algorithm for each kind of model file:

• Create Test Classifier (RandomForest)
• Create Clusterer (KMeans)
• Create Regressor (RandomForest)
• Create Transformer (PCA)