Fit RobustScaler
Scale features using statistics that are robust to outliers. This scaler removes the median and scales the data according to the quantile range (by default the interquartile range, IQR). The IQR is the range between the 1st quartile (25th percentile) and the 3rd quartile (75th percentile). Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. The median and interquartile range are then stored and applied to later data by the transform method.
Standardization of a dataset is a common requirement for many machine learning estimators. Typically this is done by removing the mean and scaling to unit variance. However, outliers can strongly distort the sample mean and variance. In such cases, the median and the interquartile range often give better results.
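The centering and scaling described above can be sketched for a single feature in plain Python. This is only an illustration of the arithmetic, not scikit-learn's implementation: the helper names `fit_robust_stats` and `transform_column` are hypothetical, and the quartiles use linear interpolation (`method="inclusive"`), which is assumed to match the default percentile method that scikit-learn relies on.

```python
from statistics import median, quantiles


def fit_robust_stats(column):
    """Return (center, scale) for one feature: its median and its IQR."""
    # n=4 yields the three quartile cut points; "inclusive" interpolates
    # linearly between the sorted sample values.
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    return median(column), q3 - q1


def transform_column(column, center, scale):
    """Apply the stored statistics to new values of the same feature."""
    return [(x - center) / scale for x in column]


# Toy training values for one feature; 100.0 plays the role of an outlier.
train = [1.0, 2.0, 3.0, 4.0, 100.0]
center, scale = fit_robust_stats(train)
scaled = transform_column(train, center, scale)

print(center, scale)  # 3.0 2.0
print(scaled)         # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

The outlier barely affects the median (3.0) and the IQR (2.0), so the four regular values stay within a small range around zero, whereas the mean of the same data (22.0) would be pulled far towards the outlier.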
Usage:
Start the algorithm from the Processing Toolbox panel.
Select a raster layer to process and click run.
Parameters
- Transformer [string]
Scikit-learn Python code that defines the transformer. See the scikit-learn RobustScaler documentation for the available parameters.
Default:
from sklearn.preprocessing import RobustScaler
transformer = RobustScaler(quantile_range=(25, 75))
- Raster layer with features [raster]
Raster layer with feature data X used for fitting the transformer. Mutually exclusive with the parameter Training dataset.
- Sample size [number]
Approximate number of samples drawn from the raster. If 0, the whole raster is used. Note that this value is only a hint for limiting the number of rows and columns.
Default: 1000
- Training dataset [file]
Training dataset pickle file used for fitting the transformer. Mutually exclusive with the parameter Raster layer with features.
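As an illustration of the Transformer parameter, the default code could be adapted to use a wider quantile range. This is a sketch only: it assumes that the algorithm picks up the variable named `transformer`, as the default code suggests, and the range (10, 90) is an arbitrary example value rather than a recommendation.

```python
from sklearn.preprocessing import RobustScaler

# Scale by the range between the 10th and 90th percentiles instead of the
# default interquartile range (25, 75). The variable name `transformer`
# mirrors the default code of the Transformer parameter.
transformer = RobustScaler(quantile_range=(10, 90))
```

A wider range lets more of the distribution's tails influence the scale, so it is less robust to outliers than the default but less sensitive to features whose central 50% of values are nearly constant.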
Outputs
- Output transformer [fileDestination]
Pickle file destination.
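The output file could be inspected from Python roughly as follows. This sketch assumes that the file is a standard Python pickle; the structure of the stored object is not described here, so the code only reports its type. The helper `load_pickle` and the stand-in file are hypothetical and exist only to make the example run.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load and return whatever object a pickle file contains."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a real output file: a hypothetical dict written to a
# temporary directory, only so that the sketch runs end to end.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "transformer.pkl"
    with open(demo_path, "wb") as f:
        pickle.dump({"note": "stand-in content"}, f)

    loaded = load_pickle(demo_path)
    print(type(loaded).__name__)
```

Unpickling can execute arbitrary code, so only files from a trusted source should be loaded this way.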
Command-line usage
>qgis_process help enmapbox:FitRobustscaler
----------------
Arguments
----------------
transformer: Transformer
Default value: from sklearn.preprocessing import RobustScaler
transformer = RobustScaler(quantile_range=(25, 75))
Argument type: string
Acceptable values:
- String value
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
featureRaster: Raster layer with features (optional)
Argument type: raster
Acceptable values:
- Path to a raster layer
sampleSize: Sample size (optional)
Default value: 1000
Argument type: number
Acceptable values:
- A numeric value
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
dataset: Training dataset (optional)
Argument type: file
Acceptable values:
- Path to a file
outputTransformer: Output transformer
Argument type: fileDestination
Acceptable values:
- Path for new file
----------------
Outputs
----------------
outputTransformer: <outputFile>
Output transformer
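Based on the arguments listed above, a call to the algorithm could be assembled from Python as sketched below. The file names are hypothetical, the helper `build_fit_command` is not part of any library, and the `--name=value` argument form is assumed to be supported by the installed QGIS version (older releases may expect `name=value` pairs after a `--` separator). The transformer argument is left out on the assumption that its default value is then used.

```python
import shlex


def build_fit_command(feature_raster, output_transformer, sample_size=1000):
    """Build the argument list for a qgis_process call (nothing is executed)."""
    return [
        "qgis_process", "run", "enmapbox:FitRobustscaler",
        f"--featureRaster={feature_raster}",
        f"--sampleSize={sample_size}",
        f"--outputTransformer={output_transformer}",
    ]


# Hypothetical input and output paths.
command = build_fit_command("features.tif", "robust_scaler.pkl")

# Show the equivalent shell command line.
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a system where `qgis_process` and the EnMAP-Box plugin are available.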