Random samples from classification dataset
Split a dataset by randomly drawing samples.
Usage:
Open the algorithm from the processing toolbox.
Select or create a classification dataset, specify the number of samples per class, then click run.
Parameters
- Classification dataset [file]
Classification dataset pickle file with feature data X and target data y to draw from.
- Number of samples per category [string]
Number of samples to draw from each category. Set a single value N to draw N samples for each category. Set a list of values N1, N2, … Ni, … to draw Ni samples for category i.
- Draw with replacement [boolean]
Whether to draw samples with replacement. Default: False
- Draw proportional [boolean]
Whether to interpret the number of samples N or Ni as a percentage to be drawn from each category. Default: False
- Random seed [number]
Optional seed for the random number generator. Set it to make the draw reproducible.
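The interaction of these parameters can be illustrated with a minimal sketch using only the Python standard library. It is not the EnMAP-Box implementation: the function names `parse_counts` and `draw_per_category` are hypothetical, categories are assumed to be ordered by sorted label, proportional values are assumed to be percentages of each category's size, and requests larger than a category are assumed to be capped when drawing without replacement.

```python
import random
from collections import defaultdict


def parse_counts(n_text, n_categories):
    """Turn 'N' or 'N1, N2, ...' into one number per category."""
    values = [float(v) for v in n_text.split(",") if v.strip()]
    if len(values) == 1:
        # A single value N applies to every category.
        values = values * n_categories
    if len(values) != n_categories:
        raise ValueError("expected 1 or %d values" % n_categories)
    return values


def draw_per_category(y, n_text, replace=False, proportional=False, seed=None):
    """Return (sampled, complement) index lists for the labels in y."""
    rng = random.Random(seed)  # same seed, same draw
    by_category = defaultdict(list)
    for index, label in enumerate(y):
        by_category[label].append(index)
    categories = sorted(by_category)  # assumption: category i = i-th sorted label
    counts = parse_counts(n_text, len(categories))

    sampled = []
    for category, n in zip(categories, counts):
        pool = by_category[category]
        if proportional:
            # Assumption: N is a percentage of the category size.
            k = round(len(pool) * n / 100)
        else:
            k = int(n)
        if replace:
            sampled.extend(rng.choices(pool, k=k))
        else:
            # Assumption: requests larger than the category are capped.
            sampled.extend(rng.sample(pool, min(k, len(pool))))

    drawn = set(sampled)
    complement = [i for i in range(len(y)) if i not in drawn]
    return sorted(sampled), complement


# Ten labels in two categories; draw 2 from the first and 3 from the second.
y = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
sampled, complement = draw_per_category(y, "2, 3", seed=42)
print(sampled, complement)
```

Without replacement, the sampled and complement indices together cover the dataset exactly once. With replacement, an index may appear several times in the sample.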
Outputs
- Output dataset [fileDestination]
Pickle file destination. Stores sampled data.
- Output dataset complement [fileDestination]
Pickle file destination. Stores remaining data that was not sampled.
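How the two outputs relate to each other can be sketched as follows. This is an illustration only: it assumes the dataset is a plain dictionary with the keys "X" and "y", which may differ from the actual layout of the EnMAP-Box pickle file, and the function name `write_split` and the file names are hypothetical.

```python
import pickle


def write_split(dataset, sampled, sample_path, complement_path=None):
    """Write the sampled rows and, optionally, the remaining rows as pickles."""
    def subset(indices):
        # Assumption: the dataset is a dict holding parallel lists X and y.
        return {"X": [dataset["X"][i] for i in indices],
                "y": [dataset["y"][i] for i in indices]}

    with open(sample_path, "wb") as handle:
        pickle.dump(subset(sampled), handle)

    if complement_path is not None:
        # The complement holds every row that was not drawn.
        drawn = set(sampled)
        rest = [i for i in range(len(dataset["y"])) if i not in drawn]
        with open(complement_path, "wb") as handle:
            pickle.dump(subset(rest), handle)


# Hypothetical usage with a four-row dataset and rows 0 and 3 drawn:
# dataset = {"X": [[0.1], [0.2], [0.3], [0.4]], "y": [1, 1, 2, 2]}
# write_split(dataset, [0, 3], "sample.pkl", "complement.pkl")
```

The complement output is optional, so the sketch skips it when no path is given.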
Command-line usage
>qgis_process help enmapbox:RandomSamplesFromClassificationDataset
----------------
Arguments
----------------
dataset: Classification dataset
Argument type: file
Acceptable values:
- Path to a file
n: Number of samples per category
Argument type: string
Acceptable values:
- String value
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
replace: Draw with replacement
Default value: false
Argument type: boolean
Acceptable values:
- 1 for true/yes
- 0 for false/no
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
proportional: Draw proportional
Default value: false
Argument type: boolean
Acceptable values:
- 1 for true/yes
- 0 for false/no
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
seed: Random seed (optional)
Argument type: number
Acceptable values:
- A numeric value
- field:FIELD_NAME to use a data defined value taken from the FIELD_NAME field
- expression:SOME EXPRESSION to use a data defined value calculated using a custom QGIS expression
outputDatasetRandomSample: Output dataset
Argument type: fileDestination
Acceptable values:
- Path for new file
outputDatasetRandomSampleComplement: Output dataset complement (optional)
Argument type: fileDestination
Acceptable values:
- Path for new file
----------------
Outputs
----------------
outputDatasetRandomSample: <outputFile>
Output dataset
outputDatasetRandomSampleComplement: <outputFile>
Output dataset complement
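Running the algorithm from a script could look roughly like the sketch below, which assembles a `qgis_process run` call from the arguments listed above using the `-- name=value` form. The helper name `build_command` and all file paths are hypothetical, and the command is only built and printed here, not executed. Pass the resulting list to `subprocess.run` on a system where `qgis_process` and the EnMAP-Box plugin are available.

```python
import shlex


def build_command(dataset, n, sample_out, complement_out=None,
                  replace=False, proportional=False, seed=None):
    """Assemble a qgis_process call from the arguments listed above."""
    args = {
        "dataset": dataset,
        "n": n,
        "replace": int(replace),            # 1 for true, 0 for false
        "proportional": int(proportional),  # 1 for true, 0 for false
        "outputDatasetRandomSample": sample_out,
    }
    # Optional arguments are only added when they are set.
    if seed is not None:
        args["seed"] = seed
    if complement_out is not None:
        args["outputDatasetRandomSampleComplement"] = complement_out

    command = ["qgis_process", "run",
               "enmapbox:RandomSamplesFromClassificationDataset", "--"]
    command += ["%s=%s" % (key, value) for key, value in args.items()]
    return command


# Hypothetical paths; draw 100 samples per category with a fixed seed.
command = build_command("dataset.pkl", "100", "sample.pkl",
                        complement_out="complement.pkl", seed=42)
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting problems when a value contains spaces, for example a per-category list such as "10, 20, 30".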