public class RowMatrix extends Object implements DistributedMatrix, Logging
| Constructor | Description |
|---|---|
| `RowMatrix(RDD<Vector> rows)` | Alternative constructor leaving matrix dimensions to be determined automatically. |
| `RowMatrix(RDD<Vector> rows, long nRows, int nCols)` | |
| Modifier and Type | Method and Description |
|---|---|
| `MultivariateStatisticalSummary` | `computeColumnSummaryStatistics()`: Computes column-wise summary statistics. |
| `Matrix` | `computeCovariance()`: Computes the covariance matrix, treating each row as an observation. |
| `Matrix` | `computeGramianMatrix()`: Computes the Gramian matrix A^T A. |
| `Matrix` | `computePrincipalComponents(int k)`: Computes the top k principal components. |
| `SingularValueDecomposition<RowMatrix,Matrix>` | `computeSVD(int k, boolean computeU, double rCond)`: Computes the singular value decomposition of this matrix. |
| `RowMatrix` | `multiply(Matrix B)`: Multiply this matrix by a local matrix on the right. |
| `long` | `numCols()`: Gets or computes the number of columns. |
| `long` | `numRows()`: Gets or computes the number of rows. |
| `RDD<Vector>` | `rows()` |
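Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Logging: initialized, initializeIfNecessary, initializeLogging, initLock, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logTrace, logTrace, logWarning, logWarning
public long numCols()
Gets or computes the number of columns.
Specified by: numCols in interface DistributedMatrix
public long numRows()
Gets or computes the number of rows.
Specified by: numRows in interface DistributedMatrix
public Matrix computeGramianMatrix()
Computes the Gramian matrix A^T A.
public SingularValueDecomposition<RowMatrix,Matrix> computeSVD(int k, boolean computeU, double rCond)
Computes the singular value decomposition of this matrix.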
Let A denote this matrix, of size m x n. There is no restriction on m, but n^2 doubles must fit in memory, and n should be less than m.
The decomposition is computed by first forming the Gramian A'A = V S^2 V' and computing its SVD locally (the n x n Gramian is small), which yields S and V.
U is then obtained by matrix multiplication as U = A * (V * S^-1).
Note that this approach requires O(n^3) time on the master node.
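The identities used above follow directly from the definition of the SVD. The short derivation below is a sketch in the same notation (A, U, S, V), assuming S is diagonal and that only non-zero singular values are retained, so that S is invertible:

```latex
\begin{aligned}
A &= U S V^{\top}, \qquad U^{\top} U = I, \quad V^{\top} V = I \\
A^{\top} A &= V S \, U^{\top} U \, S V^{\top} = V S^{2} V^{\top} \\
A \, (V S^{-1}) &= U S \, V^{\top} V \, S^{-1} = U
\end{aligned}
```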
At most the k largest non-zero singular values and their associated vectors are returned. If there are k such values, the returned factors have the following dimensions:
U is a RowMatrix of size m x k that satisfies U'U = eye(k); s is a Vector of size k holding the singular values in descending order; and V is a Matrix of size n x k that satisfies V'V = eye(k).
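The procedure described for computeSVD can be illustrated on a small dense array. The following is a minimal single-machine sketch, not Spark's implementation: the class GramianSvdSketch, its methods, and the use of a cyclic Jacobi iteration for the local n x n step are all assumptions made for illustration. It forms A'A row by row, diagonalizes it locally, drops singular values smaller than rCond * sigma(0), and recovers U as A * (V * S^-1).

```java
import java.util.Arrays;

/** Hypothetical single-machine sketch; names and structure are illustrative, not Spark's code. */
final class GramianSvdSketch {

    /** Truncated factors: u is m x r, s has length r, v is n x r, with r <= k. */
    static final class Result {
        final double[][] u, v;
        final double[] s;
        Result(double[][] u, double[] s, double[][] v) { this.u = u; this.s = s; this.v = v; }
    }

    /** G = A'A, accumulated row by row as a sum of outer products. */
    static double[][] gramian(double[][] a) {
        int n = a[0].length;
        double[][] g = new double[n][n];
        for (double[] row : a)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) g[i][j] += row[i] * row[j];
        return g;
    }

    /** Cyclic Jacobi on symmetric g: leaves eigenvalues on g's diagonal, returns eigenvectors as columns. */
    static double[][] jacobi(double[][] g) {
        int n = g.length;
        double[][] v = new double[n][n];
        for (int i = 0; i < n; i++) v[i][i] = 1.0;
        for (int sweep = 0; sweep < 100; sweep++) {
            double off = 0.0;
            for (int p = 0; p < n; p++)
                for (int q = p + 1; q < n; q++) off += g[p][q] * g[p][q];
            if (off < 1e-30) break; // absolute tolerance, adequate for a sketch
            for (int p = 0; p < n; p++) {
                for (int q = p + 1; q < n; q++) {
                    if (g[p][q] == 0.0) continue;
                    // Angle that zeroes g[p][q]; atan maps +/-Infinity to +/-pi/2 when the diagonals match.
                    double theta = 0.5 * Math.atan(2.0 * g[p][q] / (g[q][q] - g[p][p]));
                    double c = Math.cos(theta), s = Math.sin(theta);
                    for (int t = 0; t < n; t++) { // column rotation: G <- G J and V <- V J
                        double gp = g[t][p], gq = g[t][q], vp = v[t][p], vq = v[t][q];
                        g[t][p] = c * gp - s * gq; g[t][q] = s * gp + c * gq;
                        v[t][p] = c * vp - s * vq; v[t][q] = s * vp + c * vq;
                    }
                    for (int t = 0; t < n; t++) { // row rotation: G <- J' G
                        double gp = g[p][t], gq = g[q][t];
                        g[p][t] = c * gp - s * gq; g[q][t] = s * gp + c * gq;
                    }
                }
            }
        }
        return v;
    }

    /** SVD of a (m x n) via A'A = V S^2 V', keeping at most k singular values above the rCond cutoff. */
    static Result svd(double[][] a, int k, double rCond) {
        int m = a.length, n = a[0].length;
        double[][] g = gramian(a);
        double[][] vecs = jacobi(g);
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (x, y) -> Double.compare(g[y][y], g[x][x])); // eigenvalues descending
        double sigma0 = Math.sqrt(Math.max(g[order[0]][order[0]], 0.0));
        int r = 0;
        while (r < Math.min(k, n)) {
            double sigma = Math.sqrt(Math.max(g[order[r]][order[r]], 0.0));
            if (sigma == 0.0 || sigma < rCond * sigma0) break; // treated as zero
            r++;
        }
        double[] s = new double[r];
        double[][] v = new double[n][r], u = new double[m][r];
        for (int j = 0; j < r; j++) {
            int col = order[j];
            s[j] = Math.sqrt(g[col][col]);
            for (int i = 0; i < n; i++) v[i][j] = vecs[i][col];
            for (int i = 0; i < m; i++) { // U = A * (V * S^-1), one column at a time
                double dot = 0.0;
                for (int t = 0; t < n; t++) dot += a[i][t] * v[t][j];
                u[i][j] = dot / s[j];
            }
        }
        return new Result(u, s, v);
    }

    public static void main(String[] args) {
        double[][] a = {{1, 1}, {0, 1}, {1, 0}};
        Result r = svd(a, 2, 1e-9);
        System.out.println("singular values: " + Arrays.toString(r.s));
    }
}
```

In this sketch, only the Gramian accumulation touches every row of A, which is why the text places no restriction on m while requiring the n x n work to fit on one machine.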
Parameters:
k - number of singular values to keep. Fewer than k may be returned if some singular values are numerically zero. See rCond.
computeU - whether to compute U.
rCond - the reciprocal condition number. All singular values smaller than rCond * sigma(0) are treated as zero, where sigma(0) is the largest singular value.
public Matrix computeCovariance()
Computes the covariance matrix, treating each row as an observation.
public Matrix computePrincipalComponents(int k)
Computes the top k principal components.
Parameters:
k - number of top principal components.
public MultivariateStatisticalSummary computeColumnSummaryStatistics()
Computes column-wise summary statistics.
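To make "column-wise" concrete, the sketch below computes two statistics per column of a small dense array, treating each row as one observation. It is an illustration only: the class ColumnSummarySketch and its methods are hypothetical, and the choice of mean and unbiased sample variance is an assumption, since the text above does not list which statistics the returned summary holds.

```java
import java.util.Arrays;

/** Hypothetical sketch of per-column statistics; not Spark's implementation. */
final class ColumnSummarySketch {

    /** Mean of each column, treating each row as one observation. */
    static double[] mean(double[][] rows) {
        int n = rows[0].length;
        double[] mu = new double[n];
        for (double[] row : rows)
            for (int j = 0; j < n; j++) mu[j] += row[j] / rows.length;
        return mu;
    }

    /** Unbiased sample variance of each column (divides by count - 1; needs at least two rows). */
    static double[] variance(double[][] rows) {
        int n = rows[0].length;
        double[] mu = mean(rows);
        double[] var = new double[n];
        for (double[] row : rows)
            for (int j = 0; j < n; j++) {
                double d = row[j] - mu[j];
                var[j] += d * d / (rows.length - 1);
            }
        return var;
    }

    public static void main(String[] args) {
        double[][] rows = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println("mean: " + Arrays.toString(mean(rows)));
        System.out.println("variance: " + Arrays.toString(variance(rows)));
    }
}
```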