@InterfaceStability.Evolving
public interface DataReaderFactory<T>
extends java.io.Serializable
A reader factory returned by DataSourceReader.createDataReaderFactories() that is
responsible for creating the actual data reader. The relationship between
DataReaderFactory and DataReader
is similar to the relationship between Iterable and Iterator.
Note that the reader factory is serialized and sent to executors, where the data reader is then
created and does the actual reading. DataReaderFactory must therefore be
serializable and DataReader doesn't need to be.

| Modifier and Type | Method and Description |
|---|---|
| DataReader<T> | createDataReader(): Returns a data reader to do the actual reading work. |
| default String[] | preferredLocations(): The preferred locations where the data reader returned by this reader factory can run faster, but Spark does not guarantee to run the data reader on these locations. |
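
The serialization contract described above can be illustrated with a minimal sketch. To keep it runnable without Spark on the classpath, it declares local stand-in interfaces that only mirror the shape of DataReaderFactory and DataReader; in a real data source you would implement the Spark interfaces instead. RangeReaderExample, RangeReaderFactory, RangeReader, roundTrip, and readAll are hypothetical names introduced for illustration, and the Java serialization round trip merely imitates shipping the factory to an executor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class RangeReaderExample {

    // Local stand-in (assumption): same shape as Spark's DataReader, declared here
    // only so the sketch compiles without Spark.
    interface DataReader<T> extends Closeable {
        boolean next() throws IOException;
        T get();
    }

    // Local stand-in (assumption): same shape as the interface documented on this page.
    interface DataReaderFactory<T> extends Serializable {
        default String[] preferredLocations() { return new String[0]; }
        DataReader<T> createDataReader();
    }

    // Hypothetical factory: holds only serializable state (the range bounds).
    static final class RangeReaderFactory implements DataReaderFactory<Integer> {
        private static final long serialVersionUID = 1L;
        private final int start;
        private final int end;

        RangeReaderFactory(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public DataReader<Integer> createDataReader() {
            // Each call returns a fresh reader, as Iterable.iterator() would.
            return new RangeReader(start, end);
        }
    }

    // Hypothetical reader: not Serializable, because it is only created after
    // the factory has been deserialized.
    static final class RangeReader implements DataReader<Integer> {
        private final int end;
        private int current;

        RangeReader(int start, int end) {
            this.current = start - 1;
            this.end = end;
        }

        @Override public boolean next() { current++; return current < end; }
        @Override public Integer get() { return current; }
        @Override public void close() { }
    }

    // Imitates sending the factory to an executor: serialize, then deserialize.
    @SuppressWarnings("unchecked")
    static <T> DataReaderFactory<T> roundTrip(DataReaderFactory<T> factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(factory);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (DataReaderFactory<T>) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Creates a reader from the factory and drains it with the next()/get() protocol.
    static <T> List<T> readAll(DataReaderFactory<T> factory) {
        List<T> rows = new ArrayList<>();
        try (DataReader<T> reader = factory.createDataReader()) {
            while (reader.next()) {
                rows.add(reader.get());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        DataReaderFactory<Integer> shipped = roundTrip(new RangeReaderFactory(0, 3));
        System.out.println(readAll(shipped)); // prints [0, 1, 2]
    }
}
```

Keeping the factory's fields down to plain values (here, two ints) is what makes it cheap to serialize; anything that cannot or should not be serialized, such as an open connection or stream, would be created inside createDataReader() instead.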