Classifier feature ranking (permutation importance)
Permutation feature importance is a model inspection technique that is especially useful for non-linear or opaque estimators. It is defined as the decrease in a model score when the values of a single feature are randomly shuffled. Shuffling breaks the relationship between the feature and the target, so the drop in the model score indicates how much the model depends on that feature. The technique is model agnostic, and it can be computed multiple times with different permutations of the feature.
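The definition above can be illustrated with a minimal, dependency-free sketch. It is not the algorithm's actual implementation: the function names (`accuracy`, `permutation_importance_sketch`, `toy_model`), the accuracy score, and the synthetic data are all assumptions chosen for illustration.

```python
import random


def accuracy(model, X, y):
    """Fraction of samples the model labels correctly."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance_sketch(model, X, y, n_repeats=10, seed=None):
    """Mean drop in accuracy per feature after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j; all other columns stay untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy model: predicts class 1 when feature 0 exceeds 0.5
# and ignores feature 1 entirely.
def toy_model(row):
    return int(row[0] > 0.5)


data_rng = random.Random(0)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = permutation_importance_sketch(toy_model, X, y, n_repeats=10, seed=42)
print(importances)
```

Because the toy model never reads feature 1, shuffling that column cannot change the score, so its importance is zero, while shuffling feature 0 degrades accuracy substantially.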
Usage:
Open the algorithm from the processing toolbox.
Select a classifier, then click run.
The output report will automatically open in your web browser.
Parameters
- Classifier [file]
Classifier pickle file. If the classifier is unfitted, also specify a training dataset.
- Training dataset [file]
Training dataset pickle file used for (re-)fitting the classifier. Can be skipped if the classifier is already fitted.
- Test dataset [file]
Test dataset pickle file used for performance evaluation. If skipped, the training dataset is used.
- Evaluation metric [enum]
The evaluation metric used to score the classifier. See the scikit-learn user guide section "Metrics and scoring: quantifying the quality of predictions" for further information. Default: 7 (f1_macro)
- Number of repetitions [number]
Number of times to permute a feature. Default: 10
- Random seed [number]
Optional seed for the random number generator. Set it to make the permutations reproducible.
- Open output report in webbrowser after running algorithm [boolean]
Whether to open the output report in the web browser. Default: True
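As a rough guide to what these parameters control, the sketch below calls scikit-learn's `permutation_importance` directly. That the algorithm maps its parameters this way is an assumption, not something this page states. The synthetic dataset, the random forest, and all variable names are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Illustrative stand-ins for the training and test datasets.
X, y = make_classification(n_samples=300, n_features=5, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stand-in for the (fitted) classifier.
classifier = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

result = permutation_importance(
    classifier,
    X_test,
    y_test,
    scoring="f1_macro",  # assumed counterpart of evaluation metric 7
    n_repeats=10,        # assumed counterpart of "Number of repetitions"
    random_state=42,     # assumed counterpart of "Random seed"
)

# Rank features from most to least important by mean score drop.
ranking = result.importances_mean.argsort()[::-1]
for index in ranking:
    mean = result.importances_mean[index]
    std = result.importances_std[index]
    print(f"feature {index}: {mean:.3f} +/- {std:.3f}")
```

Evaluating on a held-out test set, as done here, measures how much the model relies on each feature for generalization. Evaluating on the training data instead can overstate features the model has overfitted to.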
Outputs
- Output report [fileDestination]
Report file destination.
Command-line usage
>qgis_process help enmapbox:ClassifierFeatureRankingPermutationImportance
----------------
Arguments
----------------
classifier: Classifier
Argument type: file
Acceptable values:
- Path to a file
trainDataset: Training dataset (optional)
Argument type: file
Acceptable values:
- Path to a file
testDataset: Test dataset (optional)
Argument type: file
Acceptable values:
- Path to a file
evaluationMetric: Evaluation metric
Default value: 7
Argument type: enum
Available values:
- 0: accuracy
- 1: balanced_accuracy
- 2: top_k_accuracy
- 3: average_precision
- 4: neg_brier_score
- 5: f1
- 6: f1_micro
- 7: f1_macro
- 8: f1_weighted
- 9: f1_samples
- 10: neg_log_loss
- 11: precision
- 12: recall
- 13: jaccard
- 14: roc_auc
- 15: roc_auc_ovr
- 16: roc_auc_ovo
- 17: roc_auc_ovr_weighted
- 18: roc_auc_ovo_weighted
Acceptable values:
- Number of selected option, e.g. '1'
- Comma separated list of options, e.g. '1,3'
repeats: Number of repetitions
Default value: 10
Argument type: number
Acceptable values:
- A numeric value
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
seed: Random seed (optional)
Argument type: number
Acceptable values:
- A numeric value
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
openReport: Open output report in webbrowser after running algorithm
Default value: true
Argument type: boolean
Acceptable values:
- 1 for true/yes
- 0 for false/no
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
outputPermutationImportanceRanking: Output report
Argument type: fileDestination
Acceptable values:
- Path for new file
----------------
Outputs
----------------
outputPermutationImportanceRanking: <outputHtml>
Output report
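The arguments listed above can be assembled into a `qgis_process run` call. The following sketch only builds and prints the command. It assumes a QGIS version whose `qgis_process` accepts `--NAME=VALUE` arguments. The helper `build_command` and the file names (`classifier.pkl`, `test.pkl`, `report.html`) are hypothetical.

```python
import shlex

ALGORITHM_ID = "enmapbox:ClassifierFeatureRankingPermutationImportance"


def build_command(classifier, report, train_dataset=None, test_dataset=None,
                  evaluation_metric=7, repeats=10, seed=None, open_report=False):
    """Return the argument list for a qgis_process run of the algorithm."""
    args = ["qgis_process", "run", ALGORITHM_ID, f"--classifier={classifier}"]
    # Optional arguments are only added when a value is given.
    if train_dataset is not None:
        args.append(f"--trainDataset={train_dataset}")
    if test_dataset is not None:
        args.append(f"--testDataset={test_dataset}")
    args.append(f"--evaluationMetric={evaluation_metric}")
    args.append(f"--repeats={repeats}")
    if seed is not None:
        args.append(f"--seed={seed}")
    # Booleans are passed as 1 (true) or 0 (false), as listed in the help output.
    args.append(f"--openReport={1 if open_report else 0}")
    args.append(f"--outputPermutationImportanceRanking={report}")
    return args


command = build_command("classifier.pkl", "report.html", test_dataset="test.pkl", seed=42)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `qgis_process` is on the PATH and the EnMAP-Box plugin is enabled.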