ToKey Class
Converts input values (words, numbers, etc.) to their indices (keys) in a dictionary.
Inheritance
- nimbusml.internal.core.preprocessing._tokey.ToKey
- nimbusml.base_transform.BaseTransform
- sklearn.base.TransformerMixin
Constructor
ToKey(max_num_terms=1000000, term=None, sort='ByOccurrence', text_key_values=False, columns=None, **params)
Parameters
Name | Description |
---|---|
columns | A dictionary of key-value pairs, where the key is the output column name and the value is the input column name. The << operator can also be used to set this value (see Column Operator). For more details, see Columns. |
max_num_terms | Maximum number of keys to keep per column when auto-training. |
term | List of terms. |
sort | How items should be ordered when vectorized. With the default, 'ByOccurrence', items are kept in the order encountered. If sorted by value, items are ordered according to their default comparison; for example, text sorting is case sensitive ('A', then 'Z', then 'a'). |
text_key_values | Whether key value metadata should be text, regardless of the actual input type. |
params | Additional arguments sent to the compute engine. |
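As a rough illustration of how sort and max_num_terms shape the dictionary, the following pure-Python sketch builds a term list in which a term's position serves as its key. It does not call nimbusml and is not the library's implementation: build_terms is a hypothetical helper, the option name 'ByValue' is assumed as the counterpart of 'ByOccurrence', and the choice to cap the dictionary before sorting is an assumption.

```python
def build_terms(values, sort="ByOccurrence", max_num_terms=1000000):
    """Return the distinct terms; a term's position in the list is its key.

    Hypothetical helper for illustration only, not part of nimbusml.
    """
    terms = []
    seen = set()
    for v in values:
        if v in seen:
            continue
        if len(terms) >= max_num_terms:
            continue  # assumed behavior: new terms are ignored once the cap is hit
        seen.add(v)
        terms.append(v)
    if sort == "ByValue":  # assumed option name
        terms.sort()       # default comparison; case sensitive for text
    return terms


words = ["a", "Z", "A", "a", "Z"]
by_occurrence = build_terms(words)
by_value = build_terms(words, sort="ByValue")
capped = build_terms(words, max_num_terms=2)

print(by_occurrence)  # ['a', 'Z', 'A']
print(by_value)       # ['A', 'Z', 'a']
print(capped)         # ['a', 'Z']
```

The by-value result reproduces the case-sensitive ordering described for sort: uppercase letters compare lower than lowercase ones.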
Examples
###############################################################################
# ToKey
import numpy
from nimbusml import FileDataStream
from nimbusml.datasets import get_dataset
from nimbusml.preprocessing import ToKey
# data input (as a FileDataStream)
path = get_dataset('infert').as_filepath()
data = FileDataStream.read_csv(path, sep=',', numeric_dtype=numpy.float32,
                               names={0: 'id'})
print(data.head())
# age case education id induced parity pooled.stratum spontaneous ...
# 0 26.0 1.0 0-5yrs 1.0 1.0 6.0 3.0 2.0 ...
# 1 42.0 1.0 0-5yrs 2.0 1.0 1.0 1.0 0.0 ...
# 2 39.0 1.0 0-5yrs 3.0 2.0 6.0 4.0 0.0 ...
# 3 34.0 1.0 0-5yrs 4.0 2.0 4.0 2.0 0.0 ..
# 4 35.0 1.0 6-11yrs 5.0 1.0 3.0 32.0 1.0 ..
# transform usage
xf = ToKey(columns={'id_1': 'id', 'edu_1': 'education'})
# fit and transform
features = xf.fit_transform(data)
print(features.head())
# age case edu_1 education id id_1 induced parity ...
# 0 26.0 1.0 0-5yrs 0-5yrs 1.0 0 1.0 6.0 ...
# 1 42.0 1.0 0-5yrs 0-5yrs 2.0 1 1.0 1.0 ...
# 2 39.0 1.0 0-5yrs 0-5yrs 3.0 2 2.0 6.0 ...
# 3 34.0 1.0 0-5yrs 0-5yrs 4.0 3 2.0 4.0 ...
# 4 35.0 1.0 6-11yrs 6-11yrs 5.0 4 1.0 3.0 ...
Remarks
The ToKey transform converts a column of text to key values using a dictionary. This operation can be reversed by using FromKey to obtain the original values.
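The round trip described above can be sketched conceptually in plain Python. This is not nimbusml code: fit_dictionary, to_key and from_key are hypothetical helpers, keys are taken to be zero-based positions in order of first occurrence, and mapping unseen values to None is an assumption made for the illustration.

```python
def fit_dictionary(values):
    """Collect distinct terms in order of first occurrence (hypothetical helper)."""
    terms = list(dict.fromkeys(values))           # preserves insertion order
    index = {term: key for key, term in enumerate(terms)}
    return terms, index


def to_key(values, index):
    """Map each value to its key; unseen values become None (assumption)."""
    return [index.get(v) for v in values]


def from_key(keys, terms):
    """Reverse the mapping: look each key up in the term list."""
    return [None if k is None else terms[k] for k in keys]


education = ["0-5yrs", "0-5yrs", "6-11yrs", "12+ yrs", "6-11yrs"]
terms, index = fit_dictionary(education)

keys = to_key(education, index)
restored = from_key(keys, terms)

print(keys)                        # [0, 0, 1, 2, 1]
print(restored == education)       # True
print(to_key(["unknown"], index))  # [None]
```

Because the dictionary stores each term once, the inverse lookup recovers the original values exactly for every value seen during fitting.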
Methods
get_params | Get the parameters for this operator. |
get_params
Get the parameters for this operator.
get_params(deep=False)
Parameters
Name | Description |
---|---|
deep | Default value: False |