OrcFileFormat (Spark 2.0.2 JavaDoc)

Object
- org.apache.spark.sql.hive.orc.OrcFileFormat

All Implemented Interfaces:

java.io.Serializable, org.apache.spark.sql.execution.datasources.FileFormat, DataSourceRegister
```
public class OrcFileFormat
extends Object
implements org.apache.spark.sql.execution.datasources.FileFormat, DataSourceRegister, scala.Serializable
```
FileFormat for reading ORC files. If this is moved or renamed, please update DataSource's backwardCompatibilityMap.
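
The note about `backwardCompatibilityMap` implies a lookup table that rewrites provider names that were valid in earlier releases to the class that implements them now. A minimal Java sketch of that idea follows. `BackwardCompatibilitySketch`, `resolveProvider`, and the legacy key `org.example.legacy.OrcSource` are hypothetical names chosen for illustration; they are not taken from Spark's source, and the real map may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;

class BackwardCompatibilitySketch {
    // Hypothetical table: legacy provider name -> current implementing class.
    // The legacy key is an illustrative assumption, not an entry copied from Spark.
    static final Map<String, String> BACKWARD_COMPATIBILITY_MAP = new HashMap<>();
    static {
        BACKWARD_COMPATIBILITY_MAP.put(
            "org.example.legacy.OrcSource",
            "org.apache.spark.sql.hive.orc.OrcFileFormat");
    }

    // Returns the current class name for a legacy name,
    // or the requested name unchanged when no mapping exists.
    static String resolveProvider(String requested) {
        return BACKWARD_COMPATIBILITY_MAP.getOrDefault(requested, requested);
    }

    public static void main(String[] args) {
        System.out.println(resolveProvider("org.example.legacy.OrcSource"));
        System.out.println(resolveProvider("com.example.Other"));
    }
}
```

If the class were moved or renamed without updating such a table, a lookup under the old name would fall through unchanged and fail to load, which is presumably why the note asks for the map to be kept in sync.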

See Also:
Serialized Form

Constructor Summary

Constructors
Constructor and Description
`OrcFileFormat()`

Method Summary

Methods
Modifier and Type	Method and Description
`scala.Function1<org.apache.spark.sql.execution.datasources.PartitionedFile,scala.collection.Iterator<org.apache.spark.sql.catalyst.InternalRow>>`	`buildReader(SparkSession sparkSession, StructType dataSchema, StructType partitionSchema, StructType requiredSchema, scala.collection.Seq<Filter> filters, scala.collection.immutable.Map<String,String> options, org.apache.hadoop.conf.Configuration hadoopConf)`
`scala.Option<StructType>`	`inferSchema(SparkSession sparkSession, scala.collection.immutable.Map<String,String> options, scala.collection.Seq<org.apache.hadoop.fs.FileStatus> files)`
`boolean`	`isSplitable(SparkSession sparkSession, scala.collection.immutable.Map<String,String> options, org.apache.hadoop.fs.Path path)`
`org.apache.spark.sql.execution.datasources.OutputWriterFactory`	`prepareWrite(SparkSession sparkSession, org.apache.hadoop.mapreduce.Job job, scala.collection.immutable.Map<String,String> options, StructType dataSchema)`
`String`	`shortName()` The string that represents the format that this data source provider uses.
`String`	`toString()`

Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.apache.spark.sql.execution.datasources.FileFormat
buildReaderWithPartitionValues, buildWriter, supportBatch

Constructor Detail
- OrcFileFormat
```
public OrcFileFormat()
```

Method Detail

shortName
```
public String shortName()
```
Description copied from interface: DataSourceRegister
The string that represents the format that this data source provider uses. This is overridden by children to provide a nice alias for the data source. For example:
```
override def shortName(): String = "parquet"
```
Specified by:

shortName in interface DataSourceRegister

Returns:
(undocumented)
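
To illustrate how a short alias could be used to pick a provider, here is a minimal Java sketch that searches a list of registered providers by `shortName()`. The `Register` interface is a hypothetical stand-in for `DataSourceRegister`, and `OrcLike`, `ParquetLike`, `PROVIDERS`, and `findByAlias` are names invented for this example. The case-insensitive match is an assumption of the sketch, not a documented guarantee.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class ShortNameLookupSketch {
    // Hypothetical stand-in for DataSourceRegister; only shortName() is modeled.
    interface Register {
        String shortName();
    }

    static final class OrcLike implements Register {
        @Override public String shortName() { return "orc"; }
    }

    static final class ParquetLike implements Register {
        @Override public String shortName() { return "parquet"; }
    }

    // Hypothetical registry of available providers.
    static final List<Register> PROVIDERS =
        Arrays.asList(new OrcLike(), new ParquetLike());

    // Finds the first provider whose alias matches, ignoring case (an assumption).
    static Optional<Register> findByAlias(String alias) {
        return PROVIDERS.stream()
            .filter(p -> p.shortName().equalsIgnoreCase(alias))
            .findFirst();
    }

    public static void main(String[] args) {
        System.out.println(findByAlias("ORC").map(Register::shortName).orElse("not found"));
        System.out.println(findByAlias("csv").isPresent());
    }
}
```

The alias keeps callers from having to spell out a fully qualified class name, so the implementing class can change without breaking code that refers to the format by its short name.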

toString
```
public String toString()
```
Overrides:

toString in class Object

inferSchema

```
public scala.Option<StructType> inferSchema(SparkSession sparkSession,
    scala.collection.immutable.Map<String,String> options,
    scala.collection.Seq<org.apache.hadoop.fs.FileStatus> files)
```
Specified by:

inferSchema in interface org.apache.spark.sql.execution.datasources.FileFormat

prepareWrite

```
public org.apache.spark.sql.execution.datasources.OutputWriterFactory prepareWrite(SparkSession sparkSession,
    org.apache.hadoop.mapreduce.Job job,
    scala.collection.immutable.Map<String,String> options,
    StructType dataSchema)
```
Specified by:

prepareWrite in interface org.apache.spark.sql.execution.datasources.FileFormat

isSplitable

```
public boolean isSplitable(SparkSession sparkSession,
    scala.collection.immutable.Map<String,String> options,
    org.apache.hadoop.fs.Path path)
```
Specified by:

isSplitable in interface org.apache.spark.sql.execution.datasources.FileFormat

buildReader

```
public scala.Function1<org.apache.spark.sql.execution.datasources.PartitionedFile,scala.collection.Iterator<org.apache.spark.sql.catalyst.InternalRow>> buildReader(SparkSession sparkSession,
    StructType dataSchema,
    StructType partitionSchema,
    StructType requiredSchema,
    scala.collection.Seq<Filter> filters,
    scala.collection.immutable.Map<String,String> options,
    org.apache.hadoop.conf.Configuration hadoopConf)
```
Specified by:

buildReader in interface org.apache.spark.sql.execution.datasources.FileFormat
