| Class | Description |
|---|---|
| Binarizer | :: Experimental :: Binarize a column of continuous features given a threshold. |
| Bucketizer | :: Experimental :: Bucketizer maps a column of continuous features to a column of feature buckets. |
| ColumnPruner | Utility transformer for removing temporary columns from a DataFrame. |
| CountVectorizer | :: Experimental :: Extracts a vocabulary from document collections and generates a CountVectorizerModel. |
| CountVectorizerModel | :: Experimental :: Converts a text document to a sparse vector of token counts. |
| DCT | :: Experimental :: A feature transformer that takes the 1D discrete cosine transform of a real vector. |
| ElementwiseProduct | :: Experimental :: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector. |
| HashingTF | :: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick. |
| IDF | :: Experimental :: Computes the Inverse Document Frequency (IDF) given a collection of documents. |
| IDFModel | Model fitted by IDF. |
| IndexToString | :: Experimental :: A Transformer that maps a column of string indices back to a new column of corresponding string values, using the labels supplied by the user if provided, or otherwise the ML attributes of the input column. |
| MinMaxScaler | :: Experimental :: Rescales each feature individually and linearly to a common range [min, max] using column summary statistics; this is also known as min-max normalization or rescaling. |
| MinMaxScalerModel | Model fitted by MinMaxScaler. |
| NGram | :: Experimental :: A feature transformer that converts the input array of strings into an array of n-grams. |
| Normalizer | :: Experimental :: Normalize a vector to have unit norm using the given p-norm. |
| OneHotEncoder | :: Experimental :: A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. |
| PCA | :: Experimental :: Trains a model that projects vectors to a low-dimensional space using principal component analysis (PCA). |
| PCAModel | Model fitted by PCA. |
| PolynomialExpansion | :: Experimental :: Perform feature expansion in a polynomial space. |
| RegexTokenizer | :: Experimental :: A regex-based tokenizer that extracts tokens either by using the provided regex pattern to split the text (the default) or by repeatedly matching the regex (if `gaps` is false). |
| RFormula | :: Experimental :: Implements the transforms required for fitting a dataset against an R model formula. |
| RFormulaModel | :: Experimental :: A fitted RFormula. |
| StandardScaler | :: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set. |
| StandardScalerModel | Model fitted by StandardScaler. |
| StopWords | Stop words list. |
| StopWordsRemover | :: Experimental :: A feature transformer that filters out stop words from input. |
| StringIndexer | :: Experimental :: A label indexer that maps a string column of labels to an ML column of label indices. |
| StringIndexerModel | :: Experimental :: Model fitted by StringIndexer. |
| Tokenizer | :: Experimental :: A tokenizer that converts the input string to lowercase and then splits it on whitespace. |
| VectorAssembler | :: Experimental :: A feature transformer that merges multiple columns into a vector column. |
| VectorIndexer | :: Experimental :: Class for indexing categorical feature columns in a dataset of Vector. |
| VectorIndexer.CategoryStats | Helper class for tracking unique values for each feature. |
| VectorIndexerModel | :: Experimental :: Transform categorical features to use 0-based indices instead of their original values. |
| VectorSlicer | :: Experimental :: This class takes a feature vector and outputs a new feature vector with a subarray of the original features. |
| Word2Vec | :: Experimental :: Word2Vec trains a model of `Map(String, Vector)`. |
| Word2VecModel | :: Experimental :: Model fitted by Word2Vec. |
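
Several of the one-line descriptions above are easier to picture with concrete numbers. The following is a minimal plain-Python sketch of what four of the transformations compute, as the table describes them. It does not use or reproduce the library's API: the function names (`binarize`, `min_max_rescale`, `ngrams`, `normalize`) are hypothetical, and the treatment of values equal to the threshold, constant columns, and zero vectors are assumptions made for this illustration.

```python
from typing import List, Sequence


def binarize(values: Sequence[float], threshold: float) -> List[float]:
    """Binarizer idea: 1.0 above the threshold, else 0.0 (tie handling is assumed)."""
    return [1.0 if v > threshold else 0.0 for v in values]


def min_max_rescale(column: Sequence[float], lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """MinMaxScaler idea: linearly rescale a column to [lo, hi] using its min and max."""
    c_min, c_max = min(column), max(column)
    if c_max == c_min:
        # Constant column: returning the midpoint is an assumed fallback for this sketch.
        return [0.5 * (lo + hi) for _ in column]
    scale = (hi - lo) / (c_max - c_min)
    return [(v - c_min) * scale + lo for v in column]


def ngrams(tokens: Sequence[str], n: int) -> List[str]:
    """NGram idea: join each run of n consecutive tokens with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def normalize(vector: Sequence[float], p: float = 2.0) -> List[float]:
    """Normalizer idea: scale a vector to unit p-norm (zero vectors are left unchanged)."""
    norm = sum(abs(x) ** p for x in vector) ** (1.0 / p)
    return list(vector) if norm == 0 else [x / norm for x in vector]


if __name__ == "__main__":
    print(binarize([0.1, 0.5, 0.9], threshold=0.5))
    print(min_max_rescale([1.0, 2.0, 3.0]))
    print(ngrams(["the", "quick", "brown", "fox"], 2))
    print(normalize([3.0, 4.0]))
```

In the library itself these operations are applied to DataFrame columns through the classes listed in the table; the sketch only shows the per-value or per-vector arithmetic.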