Skip to content

TabularDataset

A dataset containing tabular data. It can be used to train machine learning models.

Columns in a tabular dataset are divided into three categories:

  • The target column is the column that a model should predict.
  • Feature columns are columns that a model should use to make predictions.
  • Extra columns are columns that are neither feature nor target. They are ignored by models and can be used to provide additional context. An ID or name column is a common example.

Feature columns are implicitly defined as all columns except the target and extra columns. If no extra columns are specified, all columns except the target column are used as features.

Parent type: Dataset<Table, Column<Any?>>

Parameters:

Name Type Description Default
data union<Map<String, List<Any>>, Table> The data. -
targetName String The name of the target column. -
extraNames union<List<String>, String> Names of the columns that are neither feature nor target. If null, no extra columns are used, i.e. all but the target column are used as features. []

Examples:

pipeline example {
    val table = Table(
        {
            "id": [1, 2, 3],
            "feature": [4, 5, 6],
            "target": [7, 8, 9],
        },
    );
    val dataset = table.toTabularDataset("target", extraNames="id");
}
Stub code in TabularDataset.sdsstub

class TabularDataset(
    data: union<Map<String, List<Any>>, Table>,
    @PythonName("target_name") targetName: String,
    @PythonName("extra_names") extraNames: union<List<String>, String> = []
) sub Dataset<Table, Column> {
    /**
     * The feature columns of the tabular dataset.
     */
    attr features: Table
    /**
     * The target column of the tabular dataset.
     */
    attr target: Column
    /**
     * Additional columns of the tabular dataset that are neither features nor target.
     *
     * These can be used to store additional information about instances, such as IDs.
     */
    attr extras: Table

    /**
     * Return a table containing all columns of the tabular dataset.
     *
     * @result table A table containing all columns of the tabular dataset.
     *
     * @example
     * pipeline example {
     *     val table = Table(
     *         {
     *             "id": [1, 2, 3],
     *             "feature": [4, 5, 6],
     *             "target": [7, 8, 9],
     *         },
     *     );
     *     val tabularDataset = table.toTabularDataset(targetName="target", extraNames=["id"]);
     *     val result = tabularDataset.toTable();
     * }
     */
    @Pure
    @PythonName("to_table")
    fun toTable() -> table: Table
}

extras

Additional columns of the tabular dataset that are neither features nor target.

These can be used to store additional information about instances, such as IDs.

Type: Table

features

The feature columns of the tabular dataset.

Type: Table

target

The target column of the tabular dataset.

Type: Column<Any?>

toTable

Return a table containing all columns of the tabular dataset.

Results:

Name Type Description
table Table A table containing all columns of the tabular dataset.

Examples:

pipeline example {
    val table = Table(
        {
            "id": [1, 2, 3],
            "feature": [4, 5, 6],
            "target": [7, 8, 9],
        },
    );
    val tabularDataset = table.toTabularDataset(targetName="target", extraNames=["id"]);
    val result = tabularDataset.toTable();
}
Stub code in TabularDataset.sdsstub

@Pure
@PythonName("to_table")
fun toTable() -> table: Table