Interface TableCatalog
- All Superinterfaces:
CatalogPlugin
- All Known Subinterfaces:
CatalogExtension, StagingTableCatalog
- All Known Implementing Classes:
DelegatingCatalogExtension
TableCatalog implementations may be case-sensitive or case-insensitive. Spark will pass
table identifiers without modification. Field names passed to
alterTable(Identifier, TableChange...) will be normalized to match the case used in the
table schema when updating, renaming, or dropping existing columns, if Catalyst analysis
is case-insensitive.
- Since:
- 3.0.0
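As an illustration of the case-insensitive field-name handling described above, here is a minimal sketch in plain Java of how a user-supplied column name can be normalized to the spelling stored in the table schema before a change is applied. This is not the Spark API; the class and method names are hypothetical.

```java
import java.util.List;

public class FieldNameNormalizer {
    // Return the schema's spelling of userName (case-insensitive match),
    // or null if the schema has no such field.
    public static String normalize(List<String> schemaFields, String userName) {
        for (String field : schemaFields) {
            if (field.equalsIgnoreCase(userName)) {
                return field; // keep the case used in the table schema
            }
        }
        return null;
    }
}
```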
-
Field Summary
Fields
- static final String OPTION_PREFIX: A prefix used to pass OPTIONS in table properties.
- static final String PROP_COLLATION: A reserved property to specify the collation of the table.
- static final String PROP_COMMENT: A reserved property to specify the description of the table.
- static final String PROP_EXTERNAL: A reserved property to specify that a table was created with EXTERNAL.
- static final String PROP_IS_MANAGED_LOCATION: A reserved property to indicate that the table location is managed, not user-specified.
- static final String PROP_LOCATION: A reserved property to specify the location of the table.
- static final String PROP_OWNER: A reserved property to specify the owner of the table.
- static final String PROP_PROVIDER: A reserved property to specify the provider of the table.
- static final String PROP_TABLE_TYPE: A reserved property that indicates table entity type (external, managed, view, etc.).
-
Method Summary
- Table alterTable(Identifier ident, TableChange... changes): Apply a set of changes to a table in the catalog.
- default Set<TableCatalogCapability> capabilities(): Return the set of capabilities for this TableCatalog.
- default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String, String> properties): Deprecated.
- default Table createTable(Identifier ident, TableInfo tableInfo): Create a table in the catalog.
- default Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String, String> properties): Deprecated.
- default Table createTableLike(Identifier ident, TableInfo tableInfo, Table sourceTable): Create a table in the catalog by copying metadata from an existing source table.
- boolean dropTable(Identifier ident): Drop a table in the catalog.
- default void invalidateTable(Identifier ident): Invalidate cached table metadata for an identifier.
- Identifier[] listTables(String[] namespace): List the tables in a namespace from the catalog.
- default TableSummary[] listTableSummaries(String[] namespace): List the table summaries in a namespace from the catalog.
- default Changelog loadChangelog(Identifier ident, ChangelogInfo changelogInfo): Load a Changelog for the given table, representing the row-level changes within the range specified by changelogInfo.
- Table loadTable(Identifier ident): Load table metadata by identifier from the catalog.
- default Table loadTable(Identifier ident, long timestamp): Load table metadata at a specific time by identifier from the catalog.
- default Table loadTable(Identifier ident, String version): Load table metadata of a specific version by identifier from the catalog.
- default Table loadTable(Identifier ident, Set<TableWritePrivilege> writePrivileges): Load table metadata by identifier from the catalog.
- default boolean purgeTable(Identifier ident): Drop a table in the catalog and completely remove its data by skipping the trash even if it is supported.
- void renameTable(Identifier oldIdent, Identifier newIdent): Renames a table in the catalog.
- default boolean tableExists(Identifier ident): Test whether a table exists using an identifier from the catalog.
- default boolean useNullableQuerySchema(): If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.
Methods inherited from interface org.apache.spark.sql.connector.catalog.CatalogPlugin
defaultNamespace, initialize, name
-
Field Details
-
PROP_LOCATION
A reserved property to specify the location of the table. The files of the table should be under this location. The location is a Hadoop Path string.
-
PROP_IS_MANAGED_LOCATION
A reserved property to indicate that the table location is managed, not user-specified. If this property is "true", the table is a managed table even if it has a location; for example, SHOW CREATE TABLE will not generate the LOCATION clause.
-
PROP_EXTERNAL
A reserved property to specify that a table was created with EXTERNAL.
-
PROP_TABLE_TYPE
A reserved property that indicates table entity type (external, managed, view, etc.).
-
PROP_COMMENT
A reserved property to specify the description of the table.
-
PROP_COLLATION
A reserved property to specify the collation of the table.
-
PROP_PROVIDER
A reserved property to specify the provider of the table.
-
PROP_OWNER
A reserved property to specify the owner of the table.
-
OPTION_PREFIX
A prefix used to pass OPTIONS in table properties.
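To illustrate how the prefix is used, here is a sketch in plain Java that splits OPTIONS out of a table-properties map. The "option." prefix matches the documented value of OPTION_PREFIX; the helper class itself is hypothetical, not part of the Spark API.

```java
import java.util.HashMap;
import java.util.Map;

public class OptionExtractor {
    // Mirrors TableCatalog.OPTION_PREFIX.
    public static final String OPTION_PREFIX = "option.";

    // Collect entries whose keys carry the prefix, stripping the prefix
    // so the result is a plain OPTIONS map.
    public static Map<String, String> extractOptions(Map<String, String> properties) {
        Map<String, String> options = new HashMap<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (e.getKey().startsWith(OPTION_PREFIX)) {
                options.put(e.getKey().substring(OPTION_PREFIX.length()), e.getValue());
            }
        }
        return options;
    }
}
```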
-
-
Method Details
-
capabilities
default Set<TableCatalogCapability> capabilities()
- Returns:
- the set of capabilities for this TableCatalog
-
listTables
Identifier[] listTables(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
List the tables in a namespace from the catalog. If the catalog supports views, this must return identifiers for only tables and not views.
- Parameters:
namespace - a multi-part namespace
- Returns:
- an array of Identifiers for tables
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the namespace does not exist (optional).
-
listTableSummaries
default TableSummary[] listTableSummaries(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException, org.apache.spark.sql.catalyst.analysis.NoSuchTableException
List the table summaries in a namespace from the catalog. This method should return all table entities from the catalog regardless of type (i.e., views should be listed as well).
- Parameters:
namespace - a multi-part namespace
- Returns:
- an array of table summaries
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the namespace does not exist (optional).
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If a table listed by the listTables API does not exist.
-
loadTable
Table loadTable(Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
loadTable
default Table loadTable(Identifier ident, Set<TableWritePrivilege> writePrivileges) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata by identifier from the catalog. Spark will write data into this table later. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
writePrivileges -
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
- Since:
- 3.5.3
-
loadTable
default Table loadTable(Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata of a specific version by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
version - version of the table
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
loadTable
default Table loadTable(Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata at a specific time by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
timestamp - timestamp of the table, which is microseconds since 1970-01-01 00:00:00 UTC
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
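Since the timestamp parameter is microseconds since the Unix epoch, a catalog implementation will typically convert it to its own time-travel representation. A small self-contained sketch of that conversion (the helper class is hypothetical):

```java
import java.time.Instant;

public class MicrosToInstant {
    // Convert microseconds since 1970-01-01 00:00:00 UTC to an Instant,
    // handling negative values (pre-epoch timestamps) correctly.
    public static Instant fromMicros(long micros) {
        long seconds = Math.floorDiv(micros, 1_000_000L);
        long microOfSecond = Math.floorMod(micros, 1_000_000L);
        return Instant.ofEpochSecond(seconds, microOfSecond * 1_000L);
    }
}
```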
-
loadChangelog
default Changelog loadChangelog(Identifier ident, ChangelogInfo changelogInfo) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load a Changelog for the given table, representing the row-level changes within the range specified by changelogInfo. The default implementation throws an analysis exception indicating that the catalog does not support CDC. Catalogs that support CDC must override this method.
- Parameters:
ident - a table identifier
changelogInfo - the CDC query parameters (range, deduplication mode, etc.)
- Returns:
- a Changelog instance for the requested table and range
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist
- Since:
- 4.2.0
-
invalidateTable
default void invalidateTable(Identifier ident)
Invalidate cached table metadata for an identifier. If the table is already loaded or cached, drop cached data. If the table does not exist or is not cached, do nothing. Calling this method should not query remote services.
- Parameters:
ident - a table identifier
-
tableExists
default boolean tableExists(Identifier ident)
Test whether a table exists using an identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must return false.
- Parameters:
ident - a table identifier
- Returns:
- true if the table exists, false otherwise
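One common way to provide an existence check is to attempt a load and treat a missing-table exception as "does not exist". The sketch below shows that pattern with simplified stand-in types; it is not the Spark classes themselves.

```java
public class ExistsCheck {
    // Stand-in for org.apache.spark.sql.catalyst.analysis.NoSuchTableException.
    public static class NoSuchTableException extends Exception {}

    // Stand-in for the catalog's loadTable entry point.
    public interface Loader {
        Object loadTable(String ident) throws NoSuchTableException;
    }

    // Exists-via-load: a successful load means the table exists; a
    // NoSuchTableException means it does not.
    public static boolean tableExists(Loader catalog, String ident) {
        try {
            catalog.loadTable(ident);
            return true;
        } catch (NoSuchTableException e) {
            return false;
        }
    }
}
```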
-
createTable
@Deprecated(since="3.4.0")
default Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Deprecated. Please override createTable(Identifier, Column[], Transform[], Map) instead.
Create a table in the catalog.
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createTable
@Deprecated(since="4.1.0")
default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Deprecated. Please override createTable(Identifier, TableInfo) instead.
Create a table in the catalog.
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createTable
default Table createTable(Identifier ident, TableInfo tableInfo) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Create a table in the catalog.
- Parameters:
ident - a table identifier
tableInfo - information about the table
- Returns:
- metadata for the new table. This can be null if getting the metadata for the new table is expensive. Spark will call loadTable(Identifier) if needed (e.g. CTAS).
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If a table or view already exists for the identifier
UnsupportedOperationException - If a requested partition transform is not supported
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
- Since:
- 4.1.0
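An in-memory sketch of the createTable contract in plain Java: fail if the identifier is already taken, otherwise register the new table. The types here are simplified stand-ins for the Spark connector classes, and the map-backed "catalog" is hypothetical.

```java
import java.util.Map;

public class CreateSketch {
    // Register tableInfo under ident; reject duplicates. A real catalog
    // would throw TableAlreadyExistsException; IllegalStateException is a
    // stand-in here.
    public static Object createTable(Map<String, Object> tables, String ident, Object tableInfo) {
        if (tables.containsKey(ident)) {
            throw new IllegalStateException("TableAlreadyExists: " + ident);
        }
        tables.put(ident, tableInfo);
        // May also return null if building metadata is expensive; Spark
        // would then call loadTable(Identifier) when it needs the table.
        return tableInfo;
    }
}
```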
-
createTableLike
default Table createTableLike(Identifier ident, TableInfo tableInfo, Table sourceTable) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Create a table in the catalog by copying metadata from an existing source table. This method is called for CREATE TABLE ... LIKE ... statements targeting this catalog. The tableInfo parameter contains all the explicit information for the new table: columns and partitioning copied from the source, any constraints copied from the source, user-specified TBLPROPERTIES / LOCATION / USING provider (if given), and PROP_OWNER set to the current user. Source table properties are intentionally excluded from tableInfo; connectors may read sourceTable.properties() to clone additional format-specific or custom state as appropriate for their implementation.
The default implementation throws UnsupportedOperationException. Connectors that support CREATE TABLE ... LIKE ... must override this method.
- Parameters:
ident - a table identifier for the new table
tableInfo - complete description of the new table: columns, partitioning, constraints, explicit properties (user overrides + owner); source table properties are NOT included
sourceTable - the resolved source table; connectors may read format-specific properties or other custom state from this object to clone additional metadata
- Returns:
- metadata for the new table
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If a table or view already exists for the identifier
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
UnsupportedOperationException - If the catalog does not support CREATE TABLE LIKE
- Since:
- 4.2.0
-
useNullableQuerySchema
default boolean useNullableQuerySchema()
If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.
-
alterTable
Table alterTable(Identifier ident, TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Apply a set of changes to a table in the catalog. Implementations may reject the requested changes. If any change is rejected, none of the changes should be applied to the table.
The requested changes must be applied in the order given.
If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
changes - changes to apply to the table
- Returns:
- updated metadata for the table. This can be null if getting the metadata for the updated table is expensive. Spark always discards the returned table here.
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
IllegalArgumentException - If any change is rejected by the implementation.
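The all-or-nothing requirement above (if any change is rejected, none may be applied) can be sketched in plain Java: validate every requested change before mutating anything, then apply them in order. The "change" here is simplified to an add-column name; TableChange and the helper class are stand-ins, not the Spark API.

```java
import java.util.ArrayList;
import java.util.List;

public class AtomicAlter {
    // Apply a list of add-column changes to a column list, atomically:
    // validate all changes first, then build the updated state in order.
    public static List<String> applyChanges(List<String> columns, List<String> addColumns) {
        // Validation pass: reject the whole batch if any change is invalid.
        for (String col : addColumns) {
            if (columns.contains(col)) {
                throw new IllegalArgumentException("Cannot add existing column: " + col);
            }
        }
        // Apply pass: the original list is never mutated, so a rejected
        // batch leaves the table untouched.
        List<String> updated = new ArrayList<>(columns);
        updated.addAll(addColumns);
        return updated;
    }
}
```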
-
dropTable
boolean dropTable(Identifier ident)
Drop a table in the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.
- Parameters:
ident - a table identifier
- Returns:
- true if a table was deleted, false if no table exists for the identifier
-
purgeTable
default boolean purgeTable(Identifier ident)
Drop a table in the catalog and completely remove its data by skipping the trash, even if trash is supported. If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.
If the catalog supports purging a table, this method should be overridden. The default implementation throws UnsupportedOperationException.
- Parameters:
ident - a table identifier
- Returns:
- true if a table was deleted, false if no table exists for the identifier
- Throws:
UnsupportedOperationException - If table purging is not supported
- Since:
- 3.1.0
-
renameTable
void renameTable(Identifier oldIdent, Identifier newIdent) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Renames a table in the catalog. If the catalog supports views and contains a view for the old identifier and not a table, this throws NoSuchTableException. Additionally, if the new identifier is a table or a view, this throws TableAlreadyExistsException.
If the catalog does not support table renames between namespaces, it throws UnsupportedOperationException.
- Parameters:
oldIdent - the table identifier of the existing table to rename
newIdent - the new table identifier of the table
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table to rename doesn't exist or is a view
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If the new table name already exists or is a view
UnsupportedOperationException - If the namespaces of old and new identifiers do not match (optional)
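The rename contract (old identifier must exist, new identifier must be free) can be sketched with an in-memory map in plain Java. IllegalStateException stands in for NoSuchTableException / TableAlreadyExistsException; the class is hypothetical.

```java
import java.util.Map;

public class RenameSketch {
    // Move the entry for oldIdent to newIdent, enforcing the renameTable
    // preconditions before mutating anything.
    public static void renameTable(Map<String, Object> tables, String oldIdent, String newIdent) {
        if (!tables.containsKey(oldIdent)) {
            throw new IllegalStateException("NoSuchTable: " + oldIdent);
        }
        if (tables.containsKey(newIdent)) {
            throw new IllegalStateException("TableAlreadyExists: " + newIdent);
        }
        tables.put(newIdent, tables.remove(oldIdent));
    }
}
```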
-