
org.apache.spark.sql.connector.catalog

SupportsV1OverwriteWithSaveAsTable

trait SupportsV1OverwriteWithSaveAsTable extends TableProvider

A marker interface that can be mixed into a TableProvider to indicate that the data source needs to distinguish between DataFrameWriter V1 saveAsTable operations and DataFrameWriter V2 createOrReplace/replace operations.

Background: DataFrameWriter V1's saveAsTable with SaveMode.Overwrite creates a ReplaceTableAsSelect logical plan, which is identical to the plan created by DataFrameWriter V2's createOrReplace. However, the documented semantics can have different interpretations:

  • V1 saveAsTable with Overwrite: "if data/table already exists, existing data is expected to be overwritten by the contents of the DataFrame" - does not define behavior for metadata (schema) overwriting
  • V2 createOrReplace: "The output table's schema, partition layout, properties, and other configuration will be based on the contents of the data frame... If the table exists, its configuration and data will be replaced"

Data sources that migrated from V1 to V2 may have adopted different behaviors based on these documented semantics. For example, Delta Lake interprets V1 saveAsTable as not replacing the table schema unless the overwriteSchema option is explicitly set.
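To illustrate the behavioral gap, a sketch of the two write paths against the same table (`df` and the table name `t` are placeholders; `delta` is used here only as an example of a source with V1-specific overwrite semantics):

```scala
import org.apache.spark.sql.SaveMode

// DataFrameWriter V1: documented to overwrite existing data; whether the
// schema is also replaced is source-specific (e.g. Delta Lake keeps the
// existing schema unless overwriteSchema is set).
df.write
  .format("delta")
  .mode(SaveMode.Overwrite)
  .saveAsTable("t")

// DataFrameWriter V2: documented to replace the table's schema, partition
// layout, properties and data with those of the DataFrame.
df.writeTo("t")
  .using("delta")
  .createOrReplace()
```

Both paths produce a ReplaceTableAsSelect plan, which is why the source needs an extra signal to tell them apart.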

When a TableProvider implements this interface and addV1OverwriteWithSaveAsTableOption() returns true, DataFrameWriter V1 adds an internal write option to indicate that the command originated from the saveAsTable API. The option key is defined by OPTION_NAME and the value is set to "true". This allows the data source to distinguish between the two APIs and apply the appropriate semantics.
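A minimal sketch of how a source might opt in and consume the injected option. `MyBaseTableProvider` and `isV1SaveAsTableOverwrite` are hypothetical names; the option key "v1_save_as_table_overwrite" is taken from the documentation of addV1OverwriteWithSaveAsTableOption() below, and where exactly the option surfaces in the source's write path is source-specific:

```scala
import org.apache.spark.sql.connector.catalog.SupportsV1OverwriteWithSaveAsTable
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Mix the marker trait into an existing TableProvider implementation.
class MyProvider extends MyBaseTableProvider with SupportsV1OverwriteWithSaveAsTable {
  // true is already the default; overridden here only for illustration.
  override def addV1OverwriteWithSaveAsTableOption(): Boolean = true
}

// Hypothetical helper for the source's write path: when the option is
// present and true, the write came from V1 saveAsTable(Overwrite), so the
// source can apply V1 semantics (e.g. keep the existing schema).
def isV1SaveAsTableOverwrite(writeOptions: CaseInsensitiveStringMap): Boolean =
  writeOptions.getBoolean("v1_save_as_table_overwrite", false)
```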

Annotations
@Evolving()
Source
SupportsV1OverwriteWithSaveAsTable.java
Since

4.1.0

Linear Supertypes
TableProvider, AnyRef, Any

Abstract Value Members

  1. abstract def getTable(schema: StructType, partitioning: Array[Transform], properties: Map[String, String]): Table

Return a Table instance with the specified table schema, partitioning and properties to do read/write. The returned table should report the same schema and partitioning as the specified ones, or Spark may fail the operation.

    schema

    The specified table schema.

    partitioning

    The specified table partitioning.

    properties

    The specified table properties. It's case preserving (contains exactly what users specified) and implementations are free to use it case sensitively or insensitively. It should be able to identify a table, e.g. file path, Kafka topic name, etc.

    Definition Classes
    TableProvider
  2. abstract def inferSchema(options: CaseInsensitiveStringMap): StructType

Infer the schema of the table identified by the given options.

    options

    an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.

    Definition Classes
    TableProvider
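The two abstract members above can be sketched as follows, under hypothetical assumptions: `ExampleProvider` uses a fixed one-column schema (a real source would derive it from the options, e.g. a file path or topic name), and `ExampleTable` stands in for the source's Table implementation:

```scala
import java.util.{Map => JMap}
import org.apache.spark.sql.connector.catalog.{Table, TableProvider}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.types.{StringType, StructType}
import org.apache.spark.sql.util.CaseInsensitiveStringMap

class ExampleProvider extends TableProvider {
  // Hypothetical fixed schema; real sources infer it from the options.
  override def inferSchema(options: CaseInsensitiveStringMap): StructType =
    new StructType().add("value", StringType)

  // Must return a table reporting the same schema/partitioning it was given.
  override def getTable(schema: StructType,
                        partitioning: Array[Transform],
                        properties: JMap[String, String]): Table =
    new ExampleTable(schema) // ExampleTable: the source's Table implementation
}
```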

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def addV1OverwriteWithSaveAsTableOption(): Boolean

Returns whether to add the "v1_save_as_table_overwrite" option to write operations originating from DataFrameWriter V1 saveAsTable with mode Overwrite. Implementations can override this method to control when the option is added.

    returns

    true if the option should be added (default), false otherwise

  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  9. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @IntrinsicCandidate() @native()
  11. def inferPartitioning(options: CaseInsensitiveStringMap): Array[Transform]

Infer the partitioning of the table identified by the given options.

By default this method returns empty partitioning; please override it if this source supports partitioning.

    options

    an immutable case-insensitive string-to-string map that can identify a table, e.g. file path, Kafka topic name, etc.

    Definition Classes
    TableProvider
  12. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  13. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  14. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  15. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @IntrinsicCandidate() @native()
  16. def supportsExternalMetadata(): Boolean

Returns true if the source has the ability of accepting external table metadata when getting tables. The external table metadata includes:

    • For table reader: user-specified schema from DataFrameReader/DataStreamReader and schema/partitioning stored in the Spark catalog.
    • For table writer: the schema of the input DataFrame of DataFrameWriter/DataStreamWriter.

    By default this method returns false, which means the schema and partitioning passed to getTable(StructType, Transform[], Map) are from the infer methods. Please override it if this source has expensive schema/partitioning inference and wants external table metadata to avoid inference.

    Definition Classes
    TableProvider
  17. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  18. def toString(): String
    Definition Classes
    AnyRef → Any
  19. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  20. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  21. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated

    (Since version 9)
