TabularDataset
A dataset containing tabular data. It can be used to train machine learning models.
Columns in a tabular dataset are divided into three categories:
- The target column is the column that a model should predict.
- Feature columns are columns that a model should use to make predictions.
- Extra columns are columns that are neither features nor the target. They are ignored by models and can be used to provide additional context. An ID or name column is a common example.
Feature columns are implicitly defined as all columns except the target and extra columns. If no extra columns are specified, all columns except the target column are used as features.
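The partition rule described above can be sketched in plain Python. This is an illustration only, not Safe-DS code and not the library's implementation: the helper name `split_columns` and the representation of a table as a list of column names are assumptions made for this example.

```python
def split_columns(column_names, target_name, extra_names=()):
    """Partition column names into (features, target, extras).

    Hypothetical helper for illustration; not part of Safe-DS.
    """
    # Accept a single name as well as a list of names, mirroring the
    # union<List<String>, String> type of the extraNames parameter.
    if isinstance(extra_names, str):
        extra_names = [extra_names]

    # Extras keep the order in which they appear in the table.
    extras = [name for name in column_names if name in extra_names]

    # Features are defined implicitly: every column that is
    # neither the target nor an extra column.
    features = [
        name
        for name in column_names
        if name != target_name and name not in extra_names
    ]
    return features, target_name, extras


columns = ["id", "feature", "target"]
print(split_columns(columns, "target", extra_names="id"))
print(split_columns(columns, "target"))
```

With `extra_names="id"`, only `feature` remains as a feature column. Without extra names, both `id` and `feature` are treated as features, which is rarely what you want for an ID column.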
Parent type: Dataset<Table, Column<Any?>>
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `union<Map<String, List<Any>>, Table>` | The data. | - |
| `targetName` | `String` | The name of the target column. | - |
| `extraNames` | `union<List<String>, String>` | Names of the columns that are neither features nor the target. If omitted or empty, no extra columns are used, i.e. all columns except the target column are used as features. | `[]` |
Examples:
pipeline example {
    val table = Table(
        {
            "id": [1, 2, 3],
            "feature": [4, 5, 6],
            "target": [7, 8, 9],
        },
    );
    val dataset = table.toTabularDataset("target", extraNames="id");
}
Stub code in TabularDataset.sdsstub
extras
Additional columns of the tabular dataset that are neither features nor target.
These can be used to store additional information about instances, such as IDs.
Type: Table
features
The feature columns of the tabular dataset.
Type: Table
target
The target column of the tabular dataset.
Type: Column<Any?>
toTable
Return a table containing all columns of the tabular dataset.
Results:
| Name | Type | Description |
|---|---|---|
| `table` | `Table` | A table containing all columns of the tabular dataset. |
Examples: