GradientBoostingClassifier
Gradient boosting classification.
Parent type: Classifier
Parameters:
Name | Type | Description | Default |
---|---|---|---|
treeCount | Int | The number of boosting stages (trees) to perform. Gradient boosting is fairly robust to overfitting, so a larger number usually results in better performance. | 100 |
learningRate | Float | The larger the value, the more each additional tree influences the model. If the learning rate is too low, the model might underfit. If the learning rate is too high, the model might overfit. | 0.1 |
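The interplay between `treeCount` and `learningRate` can be illustrated with a deliberately simplified Python sketch. It is a toy model of shrinkage, not the algorithm this class uses: every stage is assumed to predict the remaining error perfectly, and only a `learning_rate` fraction of that correction is added. The function name `boosted_estimate` and its parameters are hypothetical.

```python
def boosted_estimate(target: float, tree_count: int, learning_rate: float) -> float:
    """Toy additive model: each stage corrects a fraction of the remaining error."""
    estimate = 0.0
    for _ in range(tree_count):
        residual = target - estimate          # error left after the previous stages
        estimate += learning_rate * residual  # idealized weak learner, shrunk by the rate
    return estimate


if __name__ == "__main__":
    # Compare a few (stage count, learning rate) combinations on the same target.
    for tree_count, learning_rate in [(10, 0.1), (100, 0.1), (10, 0.5)]:
        value = boosted_estimate(1.0, tree_count, learning_rate)
        print(f"tree_count={tree_count:>3}, learning_rate={learning_rate}: {value:.4f}")
```

In this toy setting a small learning rate needs many stages to approach the target, while a large one gets there in a few stages. On real data, those large steps are also what allow the model to fit noise.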
Examples:
pipeline example {
    val training = Table.fromCsvFile("training.csv").toTabularDataset("target");
    val test = Table.fromCsvFile("test.csv").toTabularDataset("target");
    val classifier = GradientBoostingClassifier(treeCount = 50).fit(training);
    val accuracy = classifier.accuracy(test);
}
Stub code in GradientBoostingClassifier.sdsstub
isFitted
Whether the model is fitted.
Type: Boolean
learningRate
The learning rate.
Type: Float
treeCount
The number of trees (estimators) in the ensemble.
Type: Int
accuracy
Compute the accuracy of the classifier on the given data.
The accuracy is the proportion of predicted target values that were correct. The higher the accuracy, the better. Results range from 0.0 to 1.0.
Note: The model must be fitted.
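The definition above can be written out as a short, library-independent Python sketch. It is not the implementation behind this method. The function name `accuracy` and its list-based inputs are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def accuracy(expected: Sequence[Hashable], predicted: Sequence[Hashable]) -> float:
    """Proportion of predictions that equal the expected target value."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if len(expected) == 0:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for e, p in zip(expected, predicted) if e == p)
    return correct / len(expected)


if __name__ == "__main__":
    # Three of the four predictions match the expected labels.
    print(accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"]))
```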
Parameters:
Name | Type | Description | Default |
---|---|---|---|
validationOrTestSet | union<Table, TabularDataset> | The validation or test set. | - |
Results:
Name | Type | Description |
---|---|---|
accuracy | Float | The classifier's accuracy. |
Stub code in Classifier.sdsstub
f1Score
Compute the classifier's F₁ score on the given data.
The F₁ score is the harmonic mean of precision and recall. The higher the F₁ score, the better the classifier. Results range from 0.0 to 1.0.
Note: The model must be fitted.
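As a rough illustration of the harmonic-mean definition, the following Python sketch computes the F₁ score from true positives, false positives, and false negatives. It is not the library's implementation. The name `f1_score`, the list-based inputs, and the value returned when there are no positives at all are assumptions of this sketch.

```python
from typing import Hashable, Sequence


def f1_score(
    expected: Sequence[Hashable],
    predicted: Sequence[Hashable],
    positive_class: Hashable,
) -> float:
    """Harmonic mean of precision and recall for one positive class."""
    tp = fp = fn = 0
    for e, p in zip(expected, predicted):
        if p == positive_class and e == positive_class:
            tp += 1  # predicted positive, actually positive
        elif p == positive_class:
            fp += 1  # predicted positive, actually negative
        elif e == positive_class:
            fn += 1  # predicted negative, actually positive
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 1.0  # assumption of this sketch: no positives anywhere counts as perfect
    # 2 * P * R / (P + R) simplifies to 2 * tp / (2 * tp + fp + fn).
    return 2 * tp / denominator


if __name__ == "__main__":
    print(f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0], positive_class=1))
```

Working from the counts rather than from precision and recall avoids a division by zero when both are 0.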
Parameters:
Name | Type | Description | Default |
---|---|---|---|
validationOrTestSet | union<Table, TabularDataset> | The validation or test set. | - |
positiveClass | Any | The class to be considered positive. All other classes are considered negative. | - |
Results:
Name | Type | Description |
---|---|---|
f1Score | Float | The classifier's F₁ score. |
Stub code in Classifier.sdsstub
fit
Create a copy of this classifier and fit it with the given training data.
This classifier is not modified.
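The copy-on-fit contract can be pictured with a small Python sketch. `ToyClassifier` is a hypothetical stand-in, not this class or its implementation. It only shows that `fit` returns a new, fitted object and leaves the original untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyClassifier:
    """Hypothetical immutable model used only to illustrate copy-on-fit."""

    tree_count: int = 100
    learning_rate: float = 0.1
    is_fitted: bool = False

    def fit(self, training_rows: list) -> "ToyClassifier":
        # Real training would happen here; this sketch only flips the flag on a copy.
        return replace(self, is_fitted=True)


if __name__ == "__main__":
    unfitted = ToyClassifier(tree_count=50)
    fitted = unfitted.fit([])
    print(unfitted.is_fitted, fitted.is_fitted)
```

Because the original object never changes, one unfitted configuration can be fitted on several training sets without the results interfering with each other.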
Parameters:
Name | Type | Description | Default |
---|---|---|---|
trainingSet | TabularDataset | The training data containing the feature and target vectors. | - |
Results:
Name | Type | Description |
---|---|---|
fittedClassifier | GradientBoostingClassifier | The fitted classifier. |
Stub code in GradientBoostingClassifier.sdsstub
getFeatureNames
Return the names of the feature columns.
Note: The model must be fitted.
Results:
Name | Type | Description |
---|---|---|
featureNames | List<String> | The names of the feature columns. |
Stub code in SupervisedModel.sdsstub
getFeaturesSchema
Return the schema of the feature columns.
Note: The model must be fitted.
Results:
Name | Type | Description |
---|---|---|
featureSchema | Schema | The schema of the feature columns. |
Stub code in SupervisedModel.sdsstub
getTargetName
Return the name of the target column.
Note: The model must be fitted.
Results:
Name | Type | Description |
---|---|---|
targetName | String | The name of the target column. |
Stub code in SupervisedModel.sdsstub
getTargetType
Return the type of the target column.
Note: The model must be fitted.
Results:
Name | Type | Description |
---|---|---|
targetType | ColumnType | The type of the target column. |
Stub code in SupervisedModel.sdsstub
precision
Compute the classifier's precision on the given data.
The precision is the proportion of positive predictions that were correct. The higher the precision, the better the classifier. Results range from 0.0 to 1.0.
Note: The model must be fitted.
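A minimal Python sketch of this definition follows. It is not the library's implementation. The function name `precision`, the list-based inputs, and the value returned when nothing is predicted positive are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def precision(
    expected: Sequence[Hashable],
    predicted: Sequence[Hashable],
    positive_class: Hashable,
) -> float:
    """Share of positive predictions whose expected value is also positive."""
    true_positives = 0
    predicted_positives = 0
    for e, p in zip(expected, predicted):
        if p == positive_class:
            predicted_positives += 1
            if e == positive_class:
                true_positives += 1
    if predicted_positives == 0:
        return 1.0  # assumption of this sketch: no positive predictions, so none were wrong
    return true_positives / predicted_positives


if __name__ == "__main__":
    expected = ["cat", "dog", "bird", "cat"]
    predicted = ["cat", "cat", "bird", "dog"]
    # "cat" is the positive class; "dog" and "bird" both count as negative.
    print(precision(expected, predicted, positive_class="cat"))
```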
Parameters:
Name | Type | Description | Default |
---|---|---|---|
validationOrTestSet | union<Table, TabularDataset> | The validation or test set. | - |
positiveClass | Any | The class to be considered positive. All other classes are considered negative. | - |
Results:
Name | Type | Description |
---|---|---|
precision | Float | The classifier's precision. |
Stub code in Classifier.sdsstub
predict
Predict the target values on the given dataset.
Note: The model must be fitted.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | union<Table, TabularDataset> | The dataset containing at least the features. | - |
Results:
Name | Type | Description |
---|---|---|
prediction | TabularDataset | The given dataset with an additional column for the predicted target values. |
Stub code in SupervisedModel.sdsstub
recall
Compute the classifier's recall on the given data.
The recall is the proportion of actual positives that were predicted correctly. The higher the recall, the better the classifier. Results range from 0.0 to 1.0.
Note: The model must be fitted.
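The same idea, written as a hedged Python sketch rather than the library's implementation. The function name `recall`, the list-based inputs, and the value returned when the data contains no actual positives are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def recall(
    expected: Sequence[Hashable],
    predicted: Sequence[Hashable],
    positive_class: Hashable,
) -> float:
    """Share of actual positives that were also predicted as positive."""
    true_positives = 0
    actual_positives = 0
    for e, p in zip(expected, predicted):
        if e == positive_class:
            actual_positives += 1
            if p == positive_class:
                true_positives += 1
    if actual_positives == 0:
        return 1.0  # assumption of this sketch: nothing to find, so nothing was missed
    return true_positives / actual_positives


if __name__ == "__main__":
    expected = ["spam", "spam", "ham", "spam"]
    predicted = ["spam", "ham", "ham", "spam"]
    # Two of the three actual "spam" rows were predicted as "spam".
    print(recall(expected, predicted, positive_class="spam"))
```

Unlike precision, recall ignores false positives: predicting the positive class for every row always yields a recall of 1.0.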
Parameters:
Name | Type | Description | Default |
---|---|---|---|
validationOrTestSet | union<Table, TabularDataset> | The validation or test set. | - |
positiveClass | Any | The class to be considered positive. All other classes are considered negative. | - |
Results:
Name | Type | Description |
---|---|---|
recall | Float | The classifier's recall. |
Stub code in Classifier.sdsstub
summarizeMetrics
Summarize the classifier's metrics on the given data.
Note: The model must be fitted.
API Stability: Do not rely on the exact output of this method. In future versions, we may change the displayed metrics without prior notice.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
validationOrTestSet | union<Table, TabularDataset> | The validation or test set. | - |
positiveClass | Any | The class to be considered positive. All other classes are considered negative. | - |
Results:
Name | Type | Description |
---|---|---|
metrics | Table | A table containing the classifier's metrics. |