org.apache.spark.ml.feature

Class Summary
Class	Description
Binarizer	:: Experimental :: Binarize a column of continuous features given a threshold.
Bucketizer	:: Experimental :: `Bucketizer` maps a column of continuous features to a column of feature buckets.
ColumnPruner	Utility transformer for removing temporary columns from a DataFrame.
CountVectorizer	:: Experimental :: Extracts a vocabulary from document collections and generates a `CountVectorizerModel`.
CountVectorizerModel	:: Experimental :: Converts a text document to a sparse vector of token counts.
DCT	:: Experimental :: A feature transformer that takes the 1D discrete cosine transform of a real vector.
ElementwiseProduct	:: Experimental :: Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided "weight" vector.
HashingTF	:: Experimental :: Maps a sequence of terms to their term frequencies using the hashing trick.
IDF	:: Experimental :: Compute the Inverse Document Frequency (IDF) given a collection of documents.
IDFModel	Model fitted by `IDF`.
IndexToString	:: Experimental :: A `Transformer` that maps a column of string indices back to a new column of corresponding string values, using either the ML attributes of the input column or, if provided, the labels supplied by the user.
MinMaxScaler	:: Experimental :: Linearly rescales each feature individually to a common range [min, max] using column summary statistics; this is also known as min-max normalization or rescaling.
MinMaxScalerModel	Model fitted by `MinMaxScaler`.
NGram	:: Experimental :: A feature transformer that converts the input array of strings into an array of n-grams.
Normalizer	:: Experimental :: Normalize a vector to have unit norm using the given p-norm.
OneHotEncoder	:: Experimental :: A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index.
PCA	:: Experimental :: Trains a model that projects vectors to a low-dimensional space using principal component analysis (PCA).
PCAModel	Model fitted by `PCA`.
PolynomialExpansion	:: Experimental :: Perform feature expansion in a polynomial space.
RegexTokenizer	:: Experimental :: A regex-based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or by repeatedly matching the regex (if `gaps` is false).
RFormula	:: Experimental :: Implements the transforms required for fitting a dataset against an R model formula.
RFormulaModel	:: Experimental :: A fitted RFormula.
StandardScaler	:: Experimental :: Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set.
StandardScalerModel	Model fitted by `StandardScaler`.
StopWords	Stop words list.
StopWordsRemover	:: Experimental :: A feature transformer that filters out stop words from input.
StringIndexer	:: Experimental :: A label indexer that maps a string column of labels to an ML column of label indices.
StringIndexerModel	:: Experimental :: Model fitted by `StringIndexer`.
Tokenizer	:: Experimental :: A tokenizer that converts the input string to lowercase and then splits it on whitespace.
VectorAssembler	:: Experimental :: A feature transformer that merges multiple columns into a vector column.
VectorIndexer	:: Experimental :: Class for indexing categorical feature columns in a dataset of `Vector`.
VectorIndexer.CategoryStats	Helper class for tracking unique values for each feature.
VectorIndexerModel	:: Experimental :: Transform categorical features to use 0-based indices instead of their original values.
VectorSlicer	:: Experimental :: This class takes a feature vector and outputs a new feature vector with a subarray of the original features.
Word2Vec	:: Experimental :: Word2Vec trains a model of `Map(String, Vector)`.
Word2VecModel	:: Experimental :: Model fitted by `Word2Vec`.
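
The numeric transformers summarized above (`Binarizer`, `Bucketizer`, `MinMaxScaler`) each apply a simple per-value rule. The following is a minimal plain-Java sketch of those rules, not the Spark implementation: the class `NumericFeatureSketch` and its method names are hypothetical, and the boundary conventions (strict `>` for the threshold, half-open buckets with a closed last bucket, midpoint output for a constant column) are assumptions made for illustration.

```java
public class NumericFeatureSketch {

    /** Binarizer idea: 1.0 if the value is strictly above the threshold, else 0.0 (assumed convention). */
    public static double binarize(double value, double threshold) {
        return value > threshold ? 1.0 : 0.0;
    }

    /**
     * Bucketizer idea: n + 1 increasing split points define n buckets.
     * Assumed convention: bucket i covers [splits[i], splits[i + 1]),
     * except the last bucket, which also includes its upper bound.
     */
    public static int bucketize(double[] splits, double value) {
        if (splits.length < 2) {
            throw new IllegalArgumentException("need at least two split points");
        }
        int last = splits.length - 1;
        if (value == splits[last]) {
            return last - 1; // upper bound belongs to the final bucket
        }
        for (int i = 0; i < last; i++) {
            if (value >= splits[i] && value < splits[i + 1]) {
                return i;
            }
        }
        throw new IllegalArgumentException("value " + value + " is outside the splits");
    }

    /**
     * MinMaxScaler idea: linearly map x from [colMin, colMax] to [min, max].
     * Assumed convention: a constant column maps to the midpoint of [min, max].
     */
    public static double minMaxScale(double x, double colMin, double colMax,
                                     double min, double max) {
        if (colMax == colMin) {
            return 0.5 * (min + max);
        }
        return (x - colMin) / (colMax - colMin) * (max - min) + min;
    }

    public static void main(String[] args) {
        double[] splits = {0.0, 10.0, 20.0};
        System.out.println(binarize(0.6, 0.5));
        System.out.println(bucketize(splits, 12.5));
        System.out.println(minMaxScale(5.0, 0.0, 10.0, 0.0, 1.0));
    }
}
```

In Spark itself these rules are applied to whole DataFrame columns; `MinMaxScaler` is an estimator that first computes the column minimum and maximum, which is why it has a separate `MinMaxScalerModel`.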

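The text transformers in the table (`Tokenizer`, `NGram`, `HashingTF`) are typically chained: a string is tokenized, tokens are optionally grouped into n-grams, and the resulting terms are hashed into a fixed number of buckets and counted. The sketch below illustrates that chain in plain Java. It is not Spark's code: the class `TextFeatureSketch` and its methods are hypothetical, and the use of `String.hashCode()` as the hash function is an assumption chosen for simplicity, so the indices will not match those Spark produces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class TextFeatureSketch {

    /** Tokenizer idea: lowercase the input, then split it on whitespace. */
    public static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /** NGram idea: slide a window of n tokens and join each window with a space. */
    public static List<String> ngrams(List<String> tokens, int n) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return result; // empty when there are fewer than n tokens
    }

    /**
     * Hashing trick: map each term to an index in [0, numFeatures) via its hash
     * and count occurrences per index. Returned as a sparse index -> count map.
     * Assumption: String.hashCode() stands in for the hash function.
     */
    public static Map<Integer, Double> hashingTF(List<String> terms, int numFeatures) {
        Map<Integer, Double> counts = new TreeMap<>();
        for (String term : terms) {
            int index = Math.floorMod(term.hashCode(), numFeatures);
            counts.merge(index, 1.0, Double::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("Hi  there Spark");
        System.out.println(tokens);
        System.out.println(ngrams(tokens, 2));
        System.out.println(hashingTF(tokens, 16));
    }
}
```

Because distinct terms can hash to the same index, the counts are approximate for small `numFeatures`; a larger value reduces collisions at the cost of a longer vector.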