Skip to content

Regressor

A model for regression tasks.

Parent type: SupervisedModel

Inheritors:

Stub code in Regressor.sdsstub
class Regressor sub SupervisedModel {
    /**
     * Create a copy of this model and fit it with the given training data.
     *
     * **Note:** This model is not modified.
     *
     * @param trainingSet The training data containing the features and target.
     *
     * @result fittedModel The fitted model.
     */
    @Pure
    fun fit(
        @PythonName("training_set") trainingSet: TabularDataset
    ) -> fittedModel: Regressor

    /**
     * Summarize the regressor's metrics on the given data.
     *
     * **Note:** The model must be fitted.
     *
     * @param validationOrTestSet The validation or test set.
     *
     * @result metrics A table containing the regressor's metrics.
     */
    @Pure
    @PythonName("summarize_metrics")
    fun summarizeMetrics(
        @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
    ) -> metrics: Table

    /**
     * Compute the coefficient of determination (R²) of the regressor on the given data.
     *
     * The coefficient of determination compares the regressor's predictions to another model that always predicts the
     * mean of the target values. It is a measure of how well the regressor explains the variance in the target values.
     *
     * The **higher** the coefficient of determination, the better the regressor. Results range from negative infinity
     * to 1.0. You can interpret the coefficient of determination as follows:
     *
     * | R²         | Interpretation                                                                             |
     * | ---------- | ------------------------------------------------------------------------------------------ |
     * | 1.0        | The model perfectly predicts the target values. Did you overfit?                           |
     * | (0.0, 1.0) | The model is better than predicting the mean of the target values. You should be here.     |
     * | 0.0        | The model is as good as predicting the mean of the target values. Try something else.      |
     * | (-∞, 0.0)  | The model is worse than predicting the mean of the target values. Something is very wrong. |
     *
     * **Notes:**
     *
     * - The model must be fitted.
     * - Some other libraries call this metric `r2_score`.
     *
     * @param validationOrTestSet The validation or test set.
     *
     * @result coefficientOfDetermination The coefficient of determination of the regressor.
     */
    @Pure
    @PythonName("coefficient_of_determination")
    fun coefficientOfDetermination(
        @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
    ) -> coefficientOfDetermination: Float

    /**
     * Compute the mean absolute error (MAE) of the regressor on the given data.
     *
     * The mean absolute error is the average of the absolute differences between the predicted and expected target
     * values. The **lower** the mean absolute error, the better the regressor. Results range from 0.0 to positive
     * infinity.
     *
     *
     * **Note:** The model must be fitted.
     *
     * @param validationOrTestSet The validation or test set.
     *
     * @result meanAbsoluteError The mean absolute error of the regressor.
     */
    @Pure
    @PythonName("mean_absolute_error")
    fun meanAbsoluteError(
        @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
    ) -> meanAbsoluteError: Float

    /**
     * Compute the mean directional accuracy (MDA) of the regressor on the given data.
     *
     * This metric compares two consecutive target values and checks if the predicted direction (down/unchanged/up)
     * matches the expected direction. The mean directional accuracy is the proportion of correctly predicted
     * directions. The **higher** the mean directional accuracy, the better the regressor. Results range from 0.0 to
     * 1.0.
     *
     * This metric is useful for time series data, where the order of the target values has a meaning. It is not useful
     * for other types of data. Because of this, it is not included in the `summarize_metrics` method.
     *
     *
     * **Note:** The model must be fitted.
     *
     * @param validationOrTestSet The validation or test set.
     *
     * @result meanDirectionalAccuracy The mean directional accuracy of the regressor.
     */
    @Pure
    @PythonName("mean_directional_accuracy")
    fun meanDirectionalAccuracy(
        @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
    ) -> meanDirectionalAccuracy: Float

    /**
     * Compute the mean squared error (MSE) of the regressor on the given data.
     *
     * The mean squared error is the average of the squared differences between the predicted and expected target
     * values. The **lower** the mean squared error, the better the regressor. Results range from 0.0 to positive
     * infinity.
     *
     * **Notes:**
     *
     * - The model must be fitted.
     * - To get the root mean squared error (RMSE), take the square root of the result.
     *
     * @param validationOrTestSet The validation or test set.
     *
     * @result meanSquaredError The mean squared error of the regressor.
     */
    @Pure
    @PythonName("mean_squared_error")
    fun meanSquaredError(
        @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
    ) -> meanSquaredError: Float

    /**
     * Compute the median absolute deviation (MAD) of the regressor on the given data.
     *
     * The median absolute deviation is the median of the absolute differences between the predicted and expected
     * target values. The **lower** the median absolute deviation, the better the regressor. Results range from 0.0 to
     * positive infinity.
     *
     *
     * **Note:** The model must be fitted.
     *
     * @param validationOrTestSet The validation or test set.
     *
     * @result medianAbsoluteDeviation The median absolute deviation of the regressor.
     */
    @Pure
    @PythonName("median_absolute_deviation")
    fun medianAbsoluteDeviation(
        @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
    ) -> medianAbsoluteDeviation: Float
}

isFitted

Whether the model is fitted.

Type: Boolean

coefficientOfDetermination

Compute the coefficient of determination (R²) of the regressor on the given data.

The coefficient of determination compares the regressor's predictions to another model that always predicts the mean of the target values. It is a measure of how well the regressor explains the variance in the target values.

The higher the coefficient of determination, the better the regressor. Results range from negative infinity to 1.0. You can interpret the coefficient of determination as follows:

Interpretation
1.0 The model perfectly predicts the target values. Did you overfit?
(0.0, 1.0) The model is better than predicting the mean of the target values. You should be here.
0.0 The model is as good as predicting the mean of the target values. Try something else.
(-∞, 0.0) The model is worse than predicting the mean of the target values. Something is very wrong.

Notes:

  • The model must be fitted.
  • Some other libraries call this metric r2_score.

Parameters:

Name Type Description Default
validationOrTestSet union<Table, TabularDataset> The validation or test set. -

Results:

Name Type Description
coefficientOfDetermination Float The coefficient of determination of the regressor.
Stub code in Regressor.sdsstub
@Pure
@PythonName("coefficient_of_determination")
fun coefficientOfDetermination(
    @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
) -> coefficientOfDetermination: Float

fit

Create a copy of this model and fit it with the given training data.

Note: This model is not modified.

Parameters:

Name Type Description Default
trainingSet TabularDataset The training data containing the features and target. -

Results:

Name Type Description
fittedModel Regressor The fitted model.
Stub code in Regressor.sdsstub
@Pure
fun fit(
    @PythonName("training_set") trainingSet: TabularDataset
) -> fittedModel: Regressor

getFeatureNames

Return the names of the feature columns.

Note: The model must be fitted.

Results:

Name Type Description
featureNames List<String> The names of the feature columns.
Stub code in SupervisedModel.sdsstub
@Pure
@PythonName("get_feature_names")
fun getFeatureNames() -> featureNames: List<String>

getFeaturesSchema

Return the schema of the feature columns.

Note: The model must be fitted.

Results:

Name Type Description
featureSchema Schema The schema of the feature columns.
Stub code in SupervisedModel.sdsstub
@Pure
@PythonName("get_features_schema")
fun getFeaturesSchema() -> featureSchema: Schema

getTargetName

Return the name of the target column.

Note: The model must be fitted.

Results:

Name Type Description
targetName String The name of the target column.
Stub code in SupervisedModel.sdsstub
@Pure
@PythonName("get_target_name")
fun getTargetName() -> targetName: String

getTargetType

Return the type of the target column.

Note: The model must be fitted.

Results:

Name Type Description
targetType DataType The type of the target column.
Stub code in SupervisedModel.sdsstub
@Pure
@PythonName("get_target_type")
fun getTargetType() -> targetType: DataType

meanAbsoluteError

Compute the mean absolute error (MAE) of the regressor on the given data.

The mean absolute error is the average of the absolute differences between the predicted and expected target values. The lower the mean absolute error, the better the regressor. Results range from 0.0 to positive infinity.

Note: The model must be fitted.

Parameters:

Name Type Description Default
validationOrTestSet union<Table, TabularDataset> The validation or test set. -

Results:

Name Type Description
meanAbsoluteError Float The mean absolute error of the regressor.
Stub code in Regressor.sdsstub
@Pure
@PythonName("mean_absolute_error")
fun meanAbsoluteError(
    @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
) -> meanAbsoluteError: Float

meanDirectionalAccuracy

Compute the mean directional accuracy (MDA) of the regressor on the given data.

This metric compares two consecutive target values and checks if the predicted direction (down/unchanged/up) matches the expected direction. The mean directional accuracy is the proportion of correctly predicted directions. The higher the mean directional accuracy, the better the regressor. Results range from 0.0 to 1.0.

This metric is useful for time series data, where the order of the target values has a meaning. It is not useful for other types of data. Because of this, it is not included in the summarize_metrics method.

Note: The model must be fitted.

Parameters:

Name Type Description Default
validationOrTestSet union<Table, TabularDataset> The validation or test set. -

Results:

Name Type Description
meanDirectionalAccuracy Float The mean directional accuracy of the regressor.
Stub code in Regressor.sdsstub
@Pure
@PythonName("mean_directional_accuracy")
fun meanDirectionalAccuracy(
    @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
) -> meanDirectionalAccuracy: Float

meanSquaredError

Compute the mean squared error (MSE) of the regressor on the given data.

The mean squared error is the average of the squared differences between the predicted and expected target values. The lower the mean squared error, the better the regressor. Results range from 0.0 to positive infinity.

Notes:

  • The model must be fitted.
  • To get the root mean squared error (RMSE), take the square root of the result.

Parameters:

Name Type Description Default
validationOrTestSet union<Table, TabularDataset> The validation or test set. -

Results:

Name Type Description
meanSquaredError Float The mean squared error of the regressor.
Stub code in Regressor.sdsstub
@Pure
@PythonName("mean_squared_error")
fun meanSquaredError(
    @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
) -> meanSquaredError: Float

medianAbsoluteDeviation

Compute the median absolute deviation (MAD) of the regressor on the given data.

The median absolute deviation is the median of the absolute differences between the predicted and expected target values. The lower the median absolute deviation, the better the regressor. Results range from 0.0 to positive infinity.

Note: The model must be fitted.

Parameters:

Name Type Description Default
validationOrTestSet union<Table, TabularDataset> The validation or test set. -

Results:

Name Type Description
medianAbsoluteDeviation Float The median absolute deviation of the regressor.
Stub code in Regressor.sdsstub
@Pure
@PythonName("median_absolute_deviation")
fun medianAbsoluteDeviation(
    @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
) -> medianAbsoluteDeviation: Float

predict

Predict the target values on the given dataset.

Note: The model must be fitted.

Parameters:

Name Type Description Default
dataset union<Table, TabularDataset> The dataset containing at least the features. -

Results:

Name Type Description
prediction TabularDataset The given dataset with an additional column for the predicted target values.
Stub code in SupervisedModel.sdsstub
@Pure
fun predict(
    dataset: union<Table, TabularDataset>
) -> prediction: TabularDataset

summarizeMetrics

Summarize the regressor's metrics on the given data.

Note: The model must be fitted.

Parameters:

Name Type Description Default
validationOrTestSet union<Table, TabularDataset> The validation or test set. -

Results:

Name Type Description
metrics Table A table containing the regressor's metrics.
Stub code in Regressor.sdsstub
@Pure
@PythonName("summarize_metrics")
fun summarizeMetrics(
    @PythonName("validation_or_test_set") validationOrTestSet: union<Table, TabularDataset>
) -> metrics: Table