trait Changelog extends AnyRef
The central connector interface for Change Data Capture (CDC).
Connectors implement this minimal interface to expose change data. Spark handles post-processing (carry-over removal, update detection, net change computation) based on the properties declared by the connector.
The columns returned by #columns() must include the following metadata columns:
_change_type(STRING) — the kind of change:insert,delete,update_preimage, orupdate_postimage_commit_version(connector-defined type, e.g. LONG) — the version containing this change_commit_timestamp(TIMESTAMP) — the timestamp of the commit
- Annotations
- @Evolving()
- Source
- Changelog.java
- Since
4.2.0
- Alphabetic
- By Inheritance
- Changelog
- AnyRef
- Any
- Hide All
- Show All
- Public
- Protected
Abstract Value Members
- abstract def columns(): Array[Column]
Returns the columns of this changelog, including data columns and the required metadata columns (
_change_type,_commit_version,_commit_timestamp). - abstract def containsCarryoverRows(): Boolean
Whether the raw change data may contain identical insert/delete carry-over pairs produced by copy-on-write file rewrites.
Whether the raw change data may contain identical insert/delete carry-over pairs produced by copy-on-write file rewrites.
When
trueand the CDC query'sdeduplicationModeis notnone, Spark will remove carry-over pairs from the raw change data. Iffalse, the connector guarantees that no carry-over pairs are present in the raw change data and Spark will skip carry-over removal entirely. - abstract def containsIntermediateChanges(): Boolean
Whether the raw change data may contain multiple intermediate states per row identity within the requested changelog range (across all commit versions in the range).
Whether the raw change data may contain multiple intermediate states per row identity within the requested changelog range (across all commit versions in the range).
When
trueand the CDC query'sdeduplicationModeisnetChanges, Spark will collapse multiple changes per row identity into the net effect. Iffalse, the connector guarantees at most one change per row identity across the entire changelog range, and Spark will skip net change computation. - abstract def name(): String
A name to identify this changelog.
- abstract def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder
Returns a new
ScanBuilderfor reading the change data.Returns a new
ScanBuilderfor reading the change data.- options
read options (case-insensitive string map)
- abstract def representsUpdateAsDeleteAndInsert(): Boolean
Whether updates in the raw change data are represented as delete+insert pairs rather than fully materialized
update_preimageandupdate_postimageentries.Whether updates in the raw change data are represented as delete+insert pairs rather than fully materialized
update_preimageandupdate_postimageentries.When
trueand the CDC query'scomputeUpdatesoption is enabled, Spark will deriveupdate_preimage/update_postimagefrom insert/delete pairs in the raw change data. Iffalse, the connector guarantees that update pre/post-images are already present in the raw change data.
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- def rowId(): Array[NamedReference]
Returns the columns that uniquely identify a row, used for update detection and net change computation.
Returns the columns that uniquely identify a row, used for update detection and net change computation.
The default implementation throws
UnsupportedOperationException. Connectors that support update detection or net change computation must override this method. - def rowVersion(): NamedReference
Returns the column used for ordering changes within the same row identity, used for update detection.
Returns the column used for ordering changes within the same row identity, used for update detection.
The default implementation throws
UnsupportedOperationException. Connectors that support update detection must override this method. - final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)