public class Word2VecModel extends Model<Word2VecModel>
Model fitted by Word2Vec.

| Modifier and Type | Method and Description |
|---|---|
| Word2VecModel | copy(ParamMap extra) Creates a copy of this instance with the same UID and some extra params. |
| DataFrame | findSynonyms(java.lang.String word, int num) Finds the "num" words closest in similarity to the given word. |
| DataFrame | findSynonyms(Vector word, int num) Finds the "num" words closest in similarity to the given vector representation of a word. |
| int | getMinCount() |
| int | getNumPartitions() |
| DataFrame | getVectors() Returns a DataFrame with two fields, "word" and "vector", where "word" is a String and "vector" is the DenseVector that the word is mapped to. |
| int | getVectorSize() |
| IntParam | minCount() The minimum number of times a token must appear to be included in the word2vec model's vocabulary. |
| IntParam | numPartitions() Number of partitions for sentences of words. |
| Word2VecModel | setInputCol(java.lang.String value) |
| Word2VecModel | setOutputCol(java.lang.String value) |
| DataFrame | transform(DataFrame dataset) Transforms a sentence column into a vector column that represents each whole sentence. |
| StructType | transformSchema(StructType schema) :: DeveloperApi :: |
| java.lang.String | uid() An immutable unique ID for the object and its derivatives. |
| StructType | validateAndTransformSchema(StructType schema) Validates and transforms the input schema. |
| IntParam | vectorSize() The dimension of the vector (code) that each word is mapped to. |
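
The two findSynonyms entries above both describe a nearest-neighbour lookup over word vectors. The following is a minimal sketch of that idea in plain Java, not Spark's implementation: it assumes cosine similarity as the closeness measure and assumes the query word itself is left out of the result. The class `SynonymSketch` and its in-memory `Map` of vectors are hypothetical stand-ins for the model's "word" and "vector" columns.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: nearest-word lookup by cosine similarity.
final class SynonymSketch {

    // Cosine similarity of two equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns up to num words most similar to the query word, best match first.
    // Assumption of this sketch: the query word itself is excluded.
    static List<String> findSynonyms(Map<String, double[]> vectors, String word, int num) {
        double[] query = vectors.get(word);
        if (query == null) {
            throw new IllegalArgumentException(word + " is not in the vocabulary");
        }
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, double[]> e : vectors.entrySet()) {
            if (!e.getKey().equals(word)) {
                scored.add(new AbstractMap.SimpleEntry<>(e.getKey(), cosine(query, e.getValue())));
            }
        }
        // Sort by similarity, highest first.
        scored.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(num, scored.size()); i++) {
            result.add(scored.get(i).getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors = new LinkedHashMap<>();
        vectors.put("king", new double[] {1.0, 0.0});
        vectors.put("queen", new double[] {0.9, 0.1});
        vectors.put("apple", new double[] {0.0, 1.0});
        System.out.println(findSynonyms(vectors, "king", 2)); // prints [queen, apple]
    }
}
```

The real methods return a DataFrame rather than a list, so treat this only as an illustration of the ranking step.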
Methods inherited from class Transformer: transform, transform, transform
Methods inherited from class PipelineStage: transformSchema
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Params: clear, copyValues, defaultCopy, defaultParamMap, explainParam, explainParams, extractParamMap, extractParamMap, get, getDefault, getOrDefault, getParam, hasDefault, hasParam, isDefined, isSet, paramMap, params, set, set, set, setDefault, setDefault, shouldOwn, validateParams
Methods inherited from interface Identifiable: toString
Methods inherited from interface Logging: initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning

public java.lang.String uid()
Description copied from interface: Identifiable
An immutable unique ID for the object and its derivatives.
Specified by: uid in interface Identifiable

public DataFrame getVectors()

public DataFrame findSynonyms(java.lang.String word, int num)
Parameters: word - (undocumented), num - (undocumented)

public DataFrame findSynonyms(Vector word, int num)
Parameters: word - (undocumented), num - (undocumented)

public Word2VecModel setInputCol(java.lang.String value)
public Word2VecModel setOutputCol(java.lang.String value)
public DataFrame transform(DataFrame dataset)
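Specified by: transform in class Transformer
Parameters: dataset - (undocumented)

public StructType transformSchema(StructType schema)
Description copied from class: PipelineStage
Derives the output schema from the input schema.
Specified by: transformSchema in class PipelineStage
Parameters: schema - (undocumented)

public Word2VecModel copy(ParamMap extra)
Description copied from interface: Params
Creates a copy of this instance with the same UID and some extra params.
Specified by: copy in interface Params
Specified by: copy in class Model<Word2VecModel>
Parameters: extra - (undocumented)
See Also: defaultCopy()

public IntParam vectorSize()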
public int getVectorSize()
public IntParam numPartitions()
public int getNumPartitions()
public IntParam minCount()
public int getMinCount()
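
To make the minCount parameter concrete, here is a small plain-Java sketch of a frequency cutoff: tokens are counted across all sentences and only those seen at least minCount times are kept. It illustrates the documented rule only; the class `MinCountSketch` and its method are hypothetical and are not part of the Spark API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Spark: builds a vocabulary with a minimum-count cutoff.
final class MinCountSketch {

    // Keeps every token that appears at least minCount times across all sentences.
    static Set<String> vocabulary(List<List<String>> sentences, int minCount) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> sentence : sentences) {
            for (String token : sentence) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        Set<String> vocab = new TreeSet<>(); // sorted for stable output
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minCount) {
                vocab.add(e.getKey());
            }
        }
        return vocab;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("a", "c", "b"));
        System.out.println(vocabulary(sentences, 2)); // prints [a, b]
    }
}
```

A token below the cutoff, such as "c" here, has no vector in the fitted model.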
public StructType validateAndTransformSchema(StructType schema)
Parameters: schema - (undocumented)
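
The transform method above turns each sentence into a single vector. One common way to do this, sketched below in plain Java, is to average the vectors of the words in the sentence. This is an illustration under stated assumptions, not Spark's code: the class `SentenceVectorSketch` is hypothetical, words without a vector contribute zeros, and the sum is divided by the full sentence length.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: sentence vector as the mean of its word vectors.
final class SentenceVectorSketch {

    // Averages the vectors of the words in the sentence.
    // Assumptions of this sketch: unknown words add nothing to the sum,
    // the divisor is the full sentence length, and an empty sentence maps to zeros.
    static double[] average(Map<String, double[]> vectors, List<String> sentence, int vectorSize) {
        double[] sum = new double[vectorSize];
        if (sentence.isEmpty()) {
            return sum;
        }
        for (String word : sentence) {
            double[] v = vectors.get(word);
            if (v != null) {
                for (int i = 0; i < vectorSize; i++) {
                    sum[i] += v[i];
                }
            }
        }
        for (int i = 0; i < vectorSize; i++) {
            sum[i] /= sentence.size();
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors = new HashMap<>();
        vectors.put("a", new double[] {1.0, 2.0});
        vectors.put("b", new double[] {3.0, 4.0});
        double[] result = average(vectors, Arrays.asList("a", "b"), 2);
        System.out.println(Arrays.toString(result)); // prints [2.0, 3.0]
    }
}
```

In this sketch the output length always equals vectorSize, which mirrors the role of the vectorSize parameter documented above.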