public class OneHotEncoder extends Transformer implements DefaultParamsWritable
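A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index.
For example, with 5 categories, an input value of 2.0 maps to an output vector of [0.0, 0.0, 1.0, 0.0].
The last category is not included by default (configurable via dropLast),
because including it would make the vector entries sum to one, and hence be linearly dependent.
So an input value of 4.0 maps to [0.0, 0.0, 0.0, 0.0].
Note that this differs from scikit-learn's OneHotEncoder, which keeps all categories.
The output vectors are sparse.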
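The encoding rule described above can be illustrated with a small plain-Java sketch. It is not Spark's implementation: the class `OneHotSketch` and its `encode` method are hypothetical names introduced here, and the sketch returns dense arrays for readability, whereas the encoder itself produces sparse vectors. It assumes 5 categories, matching the example values in the description.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Spark): dense one-hot encoding with an optional dropped last category. */
final class OneHotSketch {

    /**
     * Encodes a category index as a dense one-hot vector.
     *
     * @param index         category index in [0, numCategories)
     * @param numCategories total number of categories
     * @param dropLast      if true, the last category is encoded as an all-zero vector
     */
    static double[] encode(int index, int numCategories, boolean dropLast) {
        if (index < 0 || index >= numCategories) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        // Dropping the last category shortens the vector by one entry.
        int size = dropLast ? numCategories - 1 : numCategories;
        double[] vector = new double[size]; // Java initialises every entry to 0.0
        if (index < size) {
            vector[index] = 1.0; // the dropped category leaves the vector all zeros
        }
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(2, 5, true)));  // [0.0, 0.0, 1.0, 0.0]
        System.out.println(Arrays.toString(encode(4, 5, true)));  // [0.0, 0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(encode(4, 5, false))); // [0.0, 0.0, 0.0, 0.0, 1.0]
    }
}
```

With `dropLast` set to false, every vector has exactly one entry equal to 1.0, so the entries always sum to one. That is the linear dependence the default setting avoids.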
See also: StringIndexer for converting categorical values into category indices.

| Constructor and Description |
|---|
| OneHotEncoder() |
| OneHotEncoder(String uid) |

| Modifier and Type | Method and Description |
|---|---|
| static Params | clear(Param<?> param) |
| OneHotEncoder | copy(ParamMap extra) - Creates a copy of this instance with the same UID and some extra params. |
| BooleanParam | dropLast() - Whether to drop the last category in the encoded vector (default: true) |
| static String | explainParam(Param<?> param) |
| static String | explainParams() |
| static ParamMap | extractParamMap() |
| static ParamMap | extractParamMap(ParamMap extra) |
| static <T> scala.Option<T> | get(Param<T> param) |
| static <T> scala.Option<T> | getDefault(Param<T> param) |
| boolean | getDropLast() |
| static String | getInputCol() |
| static <T> T | getOrDefault(Param<T> param) |
| static String | getOutputCol() |
| static Param<Object> | getParam(String paramName) |
| static <T> boolean | hasDefault(Param<T> param) |
| static boolean | hasParam(String paramName) |
| static Param<String> | inputCol() |
| static boolean | isDefined(Param<?> param) |
| static boolean | isSet(Param<?> param) |
| static OneHotEncoder | load(String path) |
| static Param<String> | outputCol() |
| static Param<?>[] | params() |
| static void | save(String path) |
| static <T> Params | set(Param<T> param, T value) |
| OneHotEncoder | setDropLast(boolean value) |
| OneHotEncoder | setInputCol(String value) |
| OneHotEncoder | setOutputCol(String value) |
| static String | toString() |
| Dataset<Row> | transform(Dataset<?> dataset) - Transforms the input dataset. |
| StructType | transformSchema(StructType schema) - :: DeveloperApi :: |
| String | uid() - An immutable unique ID for the object and its derivatives. |
| static void | validateParams() |
| static MLWriter | write() |

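Methods inherited from class Transformer: transform, transform, transform
Methods inherited from class java.lang.Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface DefaultParamsWritable: write
Methods inherited from interface MLWritable: save
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString

public OneHotEncoder(String uid)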
public OneHotEncoder()
public static OneHotEncoder load(String path)
public static String toString()
public static Param<?>[] params()
public static void validateParams()
public static String explainParam(Param<?> param)
public static String explainParams()
public static final boolean isSet(Param<?> param)
public static final boolean isDefined(Param<?> param)
public static boolean hasParam(String paramName)
public static Param<Object> getParam(String paramName)
public static final <T> scala.Option<T> get(Param<T> param)
public static final <T> T getOrDefault(Param<T> param)
public static final <T> scala.Option<T> getDefault(Param<T> param)
public static final <T> boolean hasDefault(Param<T> param)
public static final ParamMap extractParamMap()
public static final Param<String> inputCol()
public static final String getInputCol()
public static final Param<String> outputCol()
public static final String getOutputCol()
public static void save(String path) throws java.io.IOException
Throws: java.io.IOException
public static MLWriter write()
public String uid()
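Description copied from interface Identifiable: An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable
public final BooleanParam dropLast()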
public boolean getDropLast()
public OneHotEncoder setDropLast(boolean value)
public OneHotEncoder setInputCol(String value)
public OneHotEncoder setOutputCol(String value)
public StructType transformSchema(StructType schema)
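Description copied from class PipelineStage: Check transform validity and derive the output schema from the input schema.
A typical implementation should first verify schema changes and parameter validity, including complex parameter interaction checks.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)
public Dataset<Row> transform(Dataset<?> dataset)
Description copied from class Transformer: Transforms the input dataset.
Specified by: transform in class Transformer
Parameters: dataset - (undocumented)
public OneHotEncoder copy(ParamMap extra)
Description copied from interface Params: Creates a copy of this instance with the same UID and some extra params. See defaultCopy().
Specified by: copy in interface Params
Specified by: copy in class Transformer
Parameters: extra - (undocumented)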