OneHotEncoder
Encodes categorical features as binary indicator columns. This is particularly useful for unordered (i.e. nominal) data.
It replaces a column with a set of columns, one for each unique value in the original column. In each row, a new column holds 1 if the original column had that value in that row, and 0 otherwise. Take the following table as an example:
| col1 |
|---|
| "a" |
| "b" |
| "c" |
| "a" |
The one-hot encoding of this table is:
| col1__a | col1__b | col1__c |
|---|---|---|
| 1 | 0 | 0 |
| 0 | 1 | 0 |
| 0 | 0 | 1 |
| 1 | 0 | 0 |
The name "one-hot" comes from the fact that each row has exactly one 1 in it, and the rest of the values are 0s. One-hot encoding is closely related to dummy variables (also called indicator variables), which are used in statistics.
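The encoding described above can be sketched in a few lines of Python. This is an illustration of the idea only, not the library's implementation; the function name `one_hot_encode` and the use of plain dictionaries as tables are assumptions made for the example.

```python
def one_hot_encode(column_name, values, separator="__"):
    """Return a dict mapping new column names to lists of 0/1 indicators.

    Hypothetical helper for illustration; a "table" here is just a dict
    of column name -> list of values.
    """
    # Collect the unique values in order of first appearance.
    categories = list(dict.fromkeys(values))
    return {
        # New column name: original name + separator + value.
        f"{column_name}{separator}{category}": [
            1 if value == category else 0 for value in values
        ]
        for category in categories
    }


encoded = one_hot_encode("col1", ["a", "b", "c", "a"])
```

Reading the result row by row reproduces the encoded table above: every row contains exactly one 1.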
Parent type: InvertibleTableTransformer
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| selector | union<List<String>, String?> | The list of columns used to fit the transformer. If null, all non-numeric columns are used. | null |
| separator | String | The separator used to separate the original column name from the value in the new column names. | "__" |
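As a rough illustration of how the `selector` default could be resolved, the Python sketch below picks all non-numeric columns when no selector is given. The helper `resolve_selector` is hypothetical, and the rule used to decide what counts as "numeric" (every value is an `int` or `float`) is an assumption; the library may decide this differently.

```python
def resolve_selector(table, selector=None):
    """Return the list of column names an encoder would be fitted on.

    Hypothetical helper; `table` is a dict of column name -> list of values.
    """
    if selector is None:
        # Assumption: a column is numeric if all of its values are int or float.
        return [
            name
            for name, values in table.items()
            if not all(isinstance(value, (int, float)) for value in values)
        ]
    if isinstance(selector, str):
        # A single column name is treated as a one-element list.
        return [selector]
    return list(selector)


table = {"a": ["z", "y"], "b": [3, 4]}
default_columns = resolve_selector(table)
explicit_columns = resolve_selector(table, "b")
```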
Examples:
```
pipeline example {
    val table = Table({"a": ["z", "y"], "b": [3, 4]});
    val encoder = OneHotEncoder(selector=["a"]).fit(table);
    val transformedTable = encoder.transform(table);
    // Table({"a__z": [1, 0], "a__y": [0, 1], "b": [3, 4]})
    val originalTable = encoder.inverseTransform(transformedTable);
    // Table({"a": ["z", "y"], "b": [3, 4]})
}
```
Stub code in OneHotEncoder.sdsstub
isFitted
Whether the transformer is fitted.
Type: Boolean
separator
The separator used to separate the original column name from the value in the new column names.
Type: String
fit
Learn a transformation for a set of columns in a table.
Note: This transformer is not modified. A new, fitted transformer is returned instead.
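The "not modified" contract means that fitting returns a new object and leaves the receiver untouched. A minimal Python sketch of that pattern follows, assuming a frozen dataclass; `SketchEncoder` and its fields are hypothetical names and do not mirror the library's internals.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SketchEncoder:
    """Hypothetical immutable encoder used only to illustrate fit()."""

    selector: Tuple[str, ...]
    categories: Optional[Dict[str, Tuple[str, ...]]] = None

    @property
    def is_fitted(self) -> bool:
        return self.categories is not None

    def fit(self, table: Dict[str, list]) -> "SketchEncoder":
        # Learn the unique values of each selected column (first-seen order).
        learned = {
            name: tuple(dict.fromkeys(table[name])) for name in self.selector
        }
        # Return a fitted copy; `self` stays unfitted.
        return replace(self, categories=learned)


unfitted = SketchEncoder(selector=("a",))
fitted = unfitted.fit({"a": ["z", "y"], "b": [3, 4]})
```

After the call, `unfitted.is_fitted` is still false, while `fitted` carries the learned categories.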
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table | Table | The table used to fit the transformer. | - |
Results:
| Name | Type | Description |
|---|---|---|
| fittedTransformer | OneHotEncoder | The fitted transformer. |
Stub code in OneHotEncoder.sdsstub
fitAndTransform
Learn a transformation for a set of columns in a table and apply the learned transformation to the same table.
Note: Neither this transformer nor the given table is modified.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table | Table | The table used to fit the transformer. The transformer is then applied to this table. | - |
Results:
| Name | Type | Description |
|---|---|---|
| fittedTransformer | OneHotEncoder | The fitted transformer. |
| transformedTable | Table | The transformed table. |
Stub code in OneHotEncoder.sdsstub
inverseTransform
Undo the learned transformation as well as possible.
Column order and types may differ from the original table. Likewise, some values might not be restored.
Note: The given table is not modified.
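To see why column order and types may differ after inversion, consider the following Python sketch. It is not the library's implementation: `one_hot_decode` is a hypothetical helper, tables are plain dictionaries, and it assumes that only the indicator columns start with the column name followed by the separator.

```python
def one_hot_decode(encoded, column_name, separator="__"):
    """Rebuild one categorical column from its 0/1 indicator columns.

    Hypothetical helper; returns a new dict and leaves `encoded` untouched.
    """
    prefix = column_name + separator
    indicators = {
        name: column for name, column in encoded.items() if name.startswith(prefix)
    }
    if not indicators:
        raise ValueError(f"no indicator columns found for {column_name!r}")

    row_count = len(next(iter(indicators.values())))
    decoded = []
    for row in range(row_count):
        value = None  # Stays None if no indicator is set, so the value is lost.
        for name, column in indicators.items():
            if column[row] == 1:
                # The value is recovered from the column name, so it is a string.
                value = name[len(prefix):]
                break
        decoded.append(value)

    # Untouched columns come first; the restored column is appended last.
    remaining = {
        name: column for name, column in encoded.items() if name not in indicators
    }
    return {**remaining, column_name: decoded}


restored = one_hot_decode({"a__z": [1, 0], "a__y": [0, 1], "b": [3, 4]}, "a")
```

In this sketch the restored column ends up after `b`, and its values are always strings because they are read back from column names, which illustrates both caveats above.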
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| transformedTable | Table | The table to be transformed back to the original version. | - |
Results:
| Name | Type | Description |
|---|---|---|
| originalTable | Table | The original table. |
Stub code in InvertibleTableTransformer.sdsstub
transform
Apply the learned transformation to a table.
Note: The given table is not modified.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| table | Table | The table to which the learned transformation is applied. | - |
Results:
| Name | Type | Description |
|---|---|---|
| transformedTable | Table | The transformed table. |