JavaSchemaRDD (Spark 1.0.0 JavaDoc)

Object
- org.apache.spark.sql.api.java.JavaSchemaRDD

All Implemented Interfaces:

java.io.Serializable, JavaRDDLike<Row,JavaRDD<Row>>
```
public class JavaSchemaRDD
extends Object
implements JavaRDDLike<Row,JavaRDD<Row>>
```
An RDD of Row objects that is returned as the result of a Spark SQL query. In addition to standard RDD operations, a JavaSchemaRDD can also be registered as a table in the JavaSQLContext that was used to create it. Registering a JavaSchemaRDD allows its contents to be queried in future SQL statements.

See Also:
Serialized Form

Constructor Summary

Constructors
Constructor and Description
`JavaSchemaRDD(SQLContext sqlContext, org.apache.spark.sql.catalyst.plans.logical.LogicalPlan logicalPlan)`

Method Summary

Methods
| Modifier and Type | Method and Description |
| --- | --- |
| `SchemaRDD` | `baseSchemaRDD()` |
| `JavaSchemaRDD` | `cache()` Persist this RDD with the default storage level (`MEMORY_ONLY`). |
| `scala.reflect.ClassTag<Row>` | `classTag()` |
| `JavaSchemaRDD` | `coalesce(int numPartitions, boolean shuffle)` Return a new RDD that is reduced into `numPartitions` partitions. |
| `JavaSchemaRDD` | `distinct()` Return a new RDD containing the distinct elements in this RDD. |
| `JavaSchemaRDD` | `distinct(int numPartitions)` Return a new RDD containing the distinct elements in this RDD. |
| `JavaSchemaRDD` | `filter(Function<Row,Boolean> f)` Return a new RDD containing only the elements that satisfy a predicate. |
| `JavaSchemaRDD` | `intersection(JavaSchemaRDD other)` Return the intersection of this RDD and another one. |
| `JavaSchemaRDD` | `intersection(JavaSchemaRDD other, int numPartitions)` Return the intersection of this RDD and another one. |
| `JavaSchemaRDD` | `intersection(JavaSchemaRDD other, Partitioner partitioner)` Return the intersection of this RDD and another one. |
| `JavaSchemaRDD` | `persist()` Persist this RDD with the default storage level (`MEMORY_ONLY`). |
| `JavaSchemaRDD` | `persist(StorageLevel newLevel)` Set this RDD's storage level to persist its values across operations after the first time it is computed. |
| `RDD<Row>` | `rdd()` |
| `JavaSchemaRDD` | `repartition(int numPartitions)` Return a new RDD that has exactly `numPartitions` partitions. |
| `JavaSchemaRDD` | `setName(String name)` Assign a name to this RDD. |
| `SQLContext` | `sqlContext()` |
| `JavaSchemaRDD` | `subtract(JavaSchemaRDD other)` Return an RDD with the elements from `this` that are not in `other`. |
| `JavaSchemaRDD` | `subtract(JavaSchemaRDD other, int numPartitions)` Return an RDD with the elements from `this` that are not in `other`. |
| `JavaSchemaRDD` | `subtract(JavaSchemaRDD other, Partitioner p)` Return an RDD with the elements from `this` that are not in `other`. |
| `String` | `toString()` |
| `JavaSchemaRDD` | `unpersist(boolean blocking)` Mark the RDD as non-persistent, and remove all blocks for it from memory and disk. |
| `JavaRDD<Row>` | `wrapRDD(RDD<Row> rdd)` |

Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.api.java.JavaRDDLike
aggregate, cartesian, checkpoint, collect, collectPartitions, context, count, countApprox, countApprox, countApproxDistinct, countByValue, countByValueApprox, countByValueApprox, first, flatMap, flatMapToDouble, flatMapToPair, fold, foreach, foreachPartition, getCheckpointFile, getStorageLevel, glom, groupBy, groupBy, id, isCheckpointed, iterator, keyBy, map, mapPartitions, mapPartitions, mapPartitionsToDouble, mapPartitionsToDouble, mapPartitionsToPair, mapPartitionsToPair, mapPartitionsWithIndex, mapToDouble, mapToPair, max, min, name, pipe, pipe, pipe, reduce, saveAsObjectFile, saveAsTextFile, saveAsTextFile, splits, take, takeOrdered, takeOrdered, takeSample, takeSample, toArray, toDebugString, toLocalIterator, top, top, zip, zipPartitions, zipWithIndex, zipWithUniqueId

- Constructor Detail
  - JavaSchemaRDD
```
public JavaSchemaRDD(SQLContext sqlContext,
             org.apache.spark.sql.catalyst.plans.logical.LogicalPlan logicalPlan)
```
- Method Detail
  - sqlContext
```
public SQLContext sqlContext()
```
  - baseSchemaRDD
```
public SchemaRDD baseSchemaRDD()
```
  - classTag
```
public scala.reflect.ClassTag<Row> classTag()
```
    Specified by:
    classTag in interface JavaRDDLike<Row,JavaRDD<Row>>
  - wrapRDD
```
public JavaRDD<Row> wrapRDD(RDD<Row> rdd)
```
    Specified by:
    wrapRDD in interface JavaRDDLike<Row,JavaRDD<Row>>
  - rdd
```
public RDD<Row> rdd()
```
    Specified by:
    rdd in interface JavaRDDLike<Row,JavaRDD<Row>>
  - toString
```
public String toString()
```
    Overrides:
    toString in class Object
  - cache
```
public JavaSchemaRDD cache()
```
    Persist this RDD with the default storage level (`MEMORY_ONLY`).
  - persist
```
public JavaSchemaRDD persist()
```
    Persist this RDD with the default storage level (`MEMORY_ONLY`).
  - persist
```
public JavaSchemaRDD persist(StorageLevel newLevel)
```
    Set this RDD's storage level to persist its values across operations after the first time it is computed. A new storage level can only be assigned if the RDD does not have a storage level set yet.
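    The one-time assignment rule above can be modelled without Spark. The following is a minimal sketch in plain Java: `StorageLevelSketch` and its string-valued levels are hypothetical stand-ins, not Spark classes, and accepting a repeated call with the same level is an assumption of the sketch rather than documented behaviour.

```java
import java.util.Objects;

// Hypothetical stand-in for an RDD's storage-level bookkeeping; not a Spark class.
public class StorageLevelSketch {
    // "NONE" means that no storage level has been assigned yet.
    private String level = "NONE";

    // Assigns a level once; a later call with a different level is rejected.
    public StorageLevelSketch persist(String newLevel) {
        Objects.requireNonNull(newLevel, "newLevel");
        if (!level.equals("NONE") && !level.equals(newLevel)) {
            throw new UnsupportedOperationException(
                "storage level already set to " + level);
        }
        level = newLevel;
        return this; // returning this allows call chaining
    }

    public String level() {
        return level;
    }

    public static void main(String[] args) {
        StorageLevelSketch rdd = new StorageLevelSketch().persist("MEMORY_ONLY");
        System.out.println(rdd.level());
        try {
            rdd.persist("DISK_ONLY"); // a second, different level
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

    Running `main` prints the assigned level, followed by a line reporting that the second, different level was rejected.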
  - unpersist
```
public JavaSchemaRDD unpersist(boolean blocking)
```
    Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
    
    Parameters:
    blocking - Whether to block until all blocks are deleted.
    
    Returns:
    This RDD.
  - setName
```
public JavaSchemaRDD setName(String name)
```
    Assign a name to this RDD.
  - coalesce
```
public JavaSchemaRDD coalesce(int numPartitions,
                     boolean shuffle)
```
    Return a new RDD that is reduced into `numPartitions` partitions.
  - distinct
```
public JavaSchemaRDD distinct()
```
    Return a new RDD containing the distinct elements in this RDD.
  - distinct
```
public JavaSchemaRDD distinct(int numPartitions)
```
    Return a new RDD containing the distinct elements in this RDD.
  - filter
```
public JavaSchemaRDD filter(Function<Row,Boolean> f)
```
    Return a new RDD containing only the elements that satisfy a predicate.
  - intersection
```
public JavaSchemaRDD intersection(JavaSchemaRDD other)
```
    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did.
    Note that this method performs a shuffle internally.
  - intersection
```
public JavaSchemaRDD intersection(JavaSchemaRDD other,
                         Partitioner partitioner)
```
    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did.
    Note that this method performs a shuffle internally.
    
    Parameters:
    partitioner - Partitioner to use for the resulting RDD
  - intersection
```
public JavaSchemaRDD intersection(JavaSchemaRDD other,
                         int numPartitions)
```
    Return the intersection of this RDD and another one. The output will not contain any duplicate elements, even if the input RDDs did. Performs a hash partition across the cluster.
    Note that this method performs a shuffle internally.
    
    Parameters:
    numPartitions - How many partitions to use in the resulting RDD
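    The deduplication described for the three `intersection` overloads can be illustrated on ordinary collections. This is a sketch of the semantics only: `IntersectionSketch` is a hypothetical helper built on `java.util`, it performs no shuffle or partitioning, and the first-seen ordering it produces is a property of the sketch, not something the text above promises for an RDD.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that models intersection semantics on in-memory lists.
public class IntersectionSketch {

    // Returns the elements present in both lists, each at most once.
    public static <T> List<T> intersection(List<T> left, List<T> right) {
        Set<T> rightSet = new HashSet<>(right);
        // LinkedHashSet drops duplicates and keeps first-seen order.
        Set<T> result = new LinkedHashSet<>();
        for (T item : left) {
            if (rightSet.contains(item)) {
                result.add(item);
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 1, 2, 3);
        List<Integer> b = Arrays.asList(1, 3, 3, 4);
        // Both inputs contain duplicates; the output does not.
        System.out.println(intersection(a, b)); // prints [1, 3]
    }
}
```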
  - repartition
```
public JavaSchemaRDD repartition(int numPartitions)
```
    Return a new RDD that has exactly `numPartitions` partitions.
    Can increase or decrease the level of parallelism in this RDD. Internally, this uses a shuffle to redistribute data.
    If you are decreasing the number of partitions in this RDD, consider using `coalesce`, which can avoid performing a shuffle.
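    The difference between redistributing every element and merging whole partitions can be sketched on nested lists. Everything here is an assumption made for illustration: `RepartitionSketch` is a hypothetical class, a dataset is modelled as a list of partitions, round-robin placement stands in for whatever a real shuffle does, and the merging rule in `coalesce` is only one plausible way to combine neighbouring partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model: a dataset is a list of partitions, each a list of elements.
public class RepartitionSketch {

    private static <T> List<List<T>> emptyPartitions(int n) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            out.add(new ArrayList<T>());
        }
        return out;
    }

    // Merges whole neighbouring partitions; no element is moved on its own.
    // In this sketch the partition count can only shrink.
    public static <T> List<List<T>> coalesce(List<List<T>> parts, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        int n = Math.min(numPartitions, parts.size());
        List<List<T>> out = emptyPartitions(n);
        for (int i = 0; i < parts.size(); i++) {
            out.get(i * n / parts.size()).addAll(parts.get(i));
        }
        return out;
    }

    // Moves every element individually (round-robin), standing in for a shuffle.
    // Always yields exactly numPartitions partitions.
    public static <T> List<List<T>> repartition(List<List<T>> parts, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        List<List<T>> out = emptyPartitions(numPartitions);
        int next = 0;
        for (List<T> part : parts) {
            for (T item : part) {
                out.get(next).add(item);
                next = (next + 1) % numPartitions;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> parts = Arrays.asList(
            Arrays.asList(1, 2, 3),
            Arrays.asList(4),
            Arrays.asList(5, 6),
            Collections.<Integer>emptyList());
        System.out.println(coalesce(parts, 2));    // prints [[1, 2, 3, 4], [5, 6]]
        System.out.println(repartition(parts, 3)); // prints [[1, 4], [2, 5], [3, 6]]
    }
}
```

    In the sketch, `coalesce` leaves the merged partitions uneven because it never splits a partition, while `repartition` balances them at the cost of touching every element.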
  - subtract
```
public JavaSchemaRDD subtract(JavaSchemaRDD other)
```
    Return an RDD with the elements from `this` that are not in `other`.
    Uses this RDD's partitioner and partition size because, even if `other` is huge, the resulting RDD will be no larger than this RDD.
  - subtract
```
public JavaSchemaRDD subtract(JavaSchemaRDD other,
                     int numPartitions)
```
    Return an RDD with the elements from `this` that are not in `other`.
  - subtract
```
public JavaSchemaRDD subtract(JavaSchemaRDD other,
                     Partitioner p)
```
    Return an RDD with the elements from `this` that are not in `other`.
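    The `subtract` overloads all describe the same element-level rule, which can be sketched on plain lists. `SubtractSketch` is a hypothetical helper, not part of Spark; it ignores partitioning entirely, and the way it treats repeated elements of the first list (they are kept) is a property of the sketch rather than a claim about the RDD method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that models subtract semantics on in-memory lists.
public class SubtractSketch {

    // Keeps the elements of self that do not occur in other.
    public static <T> List<T> subtract(List<T> self, List<T> other) {
        Set<T> exclude = new HashSet<>(other);
        List<T> result = new ArrayList<>();
        for (T item : self) {
            if (!exclude.contains(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> self = Arrays.asList(1, 2, 3, 4);
        List<Integer> other = Arrays.asList(2, 4, 5, 6, 7, 8, 9);
        // other is larger than self, yet the result cannot exceed self in size.
        System.out.println(subtract(self, other)); // prints [1, 3]
    }
}
```

    Because the result is a filtered view of the first input, its size is bounded by that input regardless of how large the second one is.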
