Interface TableCatalog
- All Superinterfaces:
CatalogPlugin
- All Known Subinterfaces:
CatalogExtension, StagingTableCatalog
- All Known Implementing Classes:
DelegatingCatalogExtension
TableCatalog implementations may be case sensitive or case insensitive. Spark will pass
table identifiers without modification. Field names passed to
alterTable(Identifier, TableChange...) will be normalized to match the case used in the
table schema when updating, renaming, or dropping existing columns when catalyst analysis is case
insensitive.
- Since:
- 3.0.0
-
Field Summary
Fields (all static final String):
- OPTION_PREFIX: A prefix used to pass OPTIONS in table properties.
- PROP_COLLATION: A reserved property to specify the collation of the table.
- PROP_COMMENT: A reserved property to specify the description of the table.
- PROP_EXTERNAL: A reserved property to specify a table was created with EXTERNAL.
- PROP_IS_MANAGED_LOCATION: A reserved property to indicate that the table location is managed, not user-specified.
- PROP_LOCATION: A reserved property to specify the location of the table.
- PROP_OWNER: A reserved property to specify the owner of the table.
- PROP_PROVIDER: A reserved property to specify the provider of the table.
-
Method Summary
Table alterTable(Identifier ident, TableChange... changes): Apply a set of changes to a table in the catalog.
default Set<TableCatalogCapability> capabilities()
default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String, String> properties): Create a table in the catalog.
default Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String, String> properties): Deprecated.
boolean dropTable(Identifier ident): Drop a table in the catalog.
default void invalidateTable(Identifier ident): Invalidate cached table metadata for an identifier.
Identifier[] listTables(String[] namespace): List the tables in a namespace from the catalog.
Table loadTable(Identifier ident): Load table metadata by identifier from the catalog.
default Table loadTable(Identifier ident, long timestamp): Load table metadata at a specific time by identifier from the catalog.
default Table loadTable(Identifier ident, String version): Load table metadata of a specific version by identifier from the catalog.
default Table loadTable(Identifier ident, Set<TableWritePrivilege> writePrivileges): Load table metadata by identifier from the catalog.
default boolean purgeTable(Identifier ident): Drop a table in the catalog and completely remove its data by skipping the trash even if it is supported.
void renameTable(Identifier oldIdent, Identifier newIdent): Renames a table in the catalog.
default boolean tableExists(Identifier ident): Test whether a table exists using an identifier from the catalog.
default boolean useNullableQuerySchema(): If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.

Methods inherited from interface org.apache.spark.sql.connector.catalog.CatalogPlugin:
defaultNamespace, initialize, name
-
Field Details
-
PROP_LOCATION
A reserved property to specify the location of the table. The files of the table should be under this location. The location is a Hadoop Path string.
-
PROP_IS_MANAGED_LOCATION
A reserved property to indicate that the table location is managed, not user-specified. If this property is "true", it means it's a managed table even if it has a location. As an example, SHOW CREATE TABLE will not generate the LOCATION clause.
-
PROP_EXTERNAL
A reserved property to specify a table was created with EXTERNAL.
-
PROP_COMMENT
A reserved property to specify the description of the table.
-
PROP_COLLATION
A reserved property to specify the collation of the table.
-
PROP_PROVIDER
A reserved property to specify the provider of the table.
-
PROP_OWNER
A reserved property to specify the owner of the table.
-
OPTION_PREFIX
A prefix used to pass OPTIONS in table properties.
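Entries written under this prefix travel inside the ordinary table-properties map. As a minimal, hypothetical illustration in plain Java (not Spark's API; the "option." literal here only mirrors the documented prefix), a catalog implementation might split the prefixed OPTIONS back out of the properties it receives:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: separating prefixed OPTIONS entries from other
// table properties. "option." stands in for TableCatalog.OPTION_PREFIX.
public class OptionPrefixDemo {
    static final String OPTION_PREFIX = "option.";

    static Map<String, String> extractOptions(Map<String, String> properties) {
        Map<String, String> options = new HashMap<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (e.getKey().startsWith(OPTION_PREFIX)) {
                // Strip the prefix so the catalog sees the bare option name.
                options.put(e.getKey().substring(OPTION_PREFIX.length()), e.getValue());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("option.path", "/data/t1");
        props.put("owner", "alice");
        Map<String, String> options = extractOptions(props);
        System.out.println(options.get("path"));          // prints /data/t1
        System.out.println(options.containsKey("owner")); // prints false
    }
}
```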
-
-
Method Details
-
capabilities
default Set<TableCatalogCapability> capabilities()
- Returns:
- the set of capabilities for this TableCatalog
-
listTables
Identifier[] listTables(String[] namespace) throws org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
List the tables in a namespace from the catalog. If the catalog supports views, this must return identifiers for only tables and not views.
- Parameters:
namespace - a multi-part namespace
- Returns:
- an array of Identifiers for tables
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the namespace does not exist (optional).
-
loadTable
Table loadTable(Identifier ident) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
loadTable
default Table loadTable(Identifier ident, Set<TableWritePrivilege> writePrivileges) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata by identifier from the catalog. Spark will write data into this table later. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
writePrivileges - the write privileges the table will be written with
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
- Since:
- 3.5.3
-
loadTable
default Table loadTable(Identifier ident, String version) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata of a specific version by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
version - version of the table
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
-
loadTable
default Table loadTable(Identifier ident, long timestamp) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Load table metadata at a specific time by identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
timestamp - timestamp of the table, which is microseconds since 1970-01-01 00:00:00 UTC
- Returns:
- the table's metadata
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
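The timestamp parameter is microseconds since the Unix epoch, not milliseconds, so callers converting from java.time need an explicit microsecond conversion. A small sketch (toMicros is a hypothetical helper, not part of Spark's API):

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper: convert a wall-clock instant to the microsecond
// epoch timestamp expected by loadTable(Identifier, long).
public class TimestampDemo {
    static long toMicros(Instant instant) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, instant);
    }

    public static void main(String[] args) {
        long micros = toMicros(Instant.parse("1970-01-01T00:00:01Z"));
        System.out.println(micros); // prints 1000000
    }
}
```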
-
invalidateTable
default void invalidateTable(Identifier ident)
Invalidate cached table metadata for an identifier. If the table is already loaded or cached, drop cached data. If the table does not exist or is not cached, do nothing. Calling this method should not query remote services.
- Parameters:
ident - a table identifier
-
tableExists
default boolean tableExists(Identifier ident)
Test whether a table exists using an identifier from the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must return false.
- Parameters:
ident - a table identifier
- Returns:
- true if the table exists, false otherwise
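A common way to satisfy this contract is to delegate to loadTable and map NoSuchTableException to false. The sketch below shows that pattern with plain-Java stand-ins (String identifiers, a HashMap-backed catalog) rather than Spark's real Identifier and Table types:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of the common tableExists pattern: delegate to loadTable
// and treat NoSuchTableException as "does not exist".
public class ExistsDemo {
    static class NoSuchTableException extends Exception {}

    // Stand-in for the catalog's backing store.
    static final Map<String, String> TABLES = new HashMap<>();

    static String loadTable(String ident) throws NoSuchTableException {
        String table = TABLES.get(ident);
        if (table == null) throw new NoSuchTableException();
        return table;
    }

    static boolean tableExists(String ident) {
        try {
            return loadTable(ident) != null;
        } catch (NoSuchTableException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        TABLES.put("db.t1", "metadata");
        System.out.println(tableExists("db.t1")); // prints true
        System.out.println(tableExists("db.t2")); // prints false
    }
}
```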
-
createTable
@Deprecated(since="3.4.0")
default Table createTable(Identifier ident, StructType schema, Transform[] partitions, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Deprecated. Please override createTable(Identifier, Column[], Transform[], Map) instead.
Create a table in the catalog.
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
-
createTable
default Table createTable(Identifier ident, Column[] columns, Transform[] partitions, Map<String, String> properties) throws org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException, org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException
Create a table in the catalog.
- Parameters:
ident - a table identifier
columns - the columns of the new table
partitions - transforms to use for partitioning data in the table
properties - a string map of table properties
- Returns:
- metadata for the new table. This can be null if getting the metadata for the new table is expensive. Spark will call loadTable(Identifier) if needed (e.g. CTAS).
- Throws:
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If a table or view already exists for the identifier
UnsupportedOperationException - If a requested partition transform is not supported
org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException - If the identifier namespace does not exist (optional)
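The create contract above can be sketched with an in-memory toy. Everything here is a plain-Java stand-in for Spark's types, and the exception is unchecked for brevity, unlike Spark's checked TableAlreadyExistsException:

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory sketch of createTable's contract: fail if the identifier
// is taken, otherwise register the table and return its metadata.
public class CreateDemo {
    // Unchecked here for brevity; Spark's equivalent is a checked exception.
    static class TableAlreadyExistsException extends RuntimeException {}

    static final Map<String, Map<String, String>> TABLES = new HashMap<>();

    static Map<String, String> createTable(String ident, Map<String, String> properties) {
        if (TABLES.containsKey(ident)) throw new TableAlreadyExistsException();
        TABLES.put(ident, properties);
        // A real catalog may return null here if building metadata is expensive;
        // Spark would then call loadTable(Identifier) when it needs it.
        return properties;
    }

    public static void main(String[] args) {
        createTable("db.t1", Map.of("provider", "parquet"));
        try {
            createTable("db.t1", Map.of());
        } catch (TableAlreadyExistsException e) {
            System.out.println("already exists"); // prints already exists
        }
    }
}
```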
-
useNullableQuerySchema
default boolean useNullableQuerySchema()
If true, mark all the fields of the query schema as nullable when executing CREATE/REPLACE TABLE ... AS SELECT ... and creating the table.
-
alterTable
Table alterTable(Identifier ident, TableChange... changes) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException
Apply a set of changes to a table in the catalog. Implementations may reject the requested changes. If any change is rejected, none of the changes should be applied to the table.
The requested changes must be applied in the order given.
If the catalog supports views and contains a view for the identifier and not a table, this must throw NoSuchTableException.
- Parameters:
ident - a table identifier
changes - changes to apply to the table
- Returns:
- updated metadata for the table. This can be null if getting the metadata for the updated table is expensive. Spark always discards the table returned here.
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table doesn't exist or is a view
IllegalArgumentException - If any change is rejected by the implementation.
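The all-or-nothing requirement means validating every change before mutating anything. A toy sketch, with a table modeled as a list of column names and only drop-column changes (a hypothetical simplification of TableChange):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of alterTable's all-or-nothing contract: validate every
// change first, then apply; one rejected change leaves the table untouched.
public class AlterDemo {
    static List<String> alterTable(List<String> columns, List<String> drops) {
        // Validation pass: reject the whole batch if any change is invalid.
        for (String drop : drops) {
            if (!columns.contains(drop)) {
                throw new IllegalArgumentException("no such column: " + drop);
            }
        }
        // Apply pass: only reached when every change is valid.
        List<String> updated = new ArrayList<>(columns);
        updated.removeAll(drops);
        return updated;
    }

    public static void main(String[] args) {
        List<String> cols = List.of("id", "name", "age");
        System.out.println(alterTable(cols, List.of("age"))); // prints [id, name]
        try {
            alterTable(cols, List.of("name", "missing"));
        } catch (IllegalArgumentException e) {
            System.out.println(cols); // unchanged: prints [id, name, age]
        }
    }
}
```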
-
dropTable
boolean dropTable(Identifier ident)
Drop a table in the catalog. If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.
- Parameters:
ident - a table identifier
- Returns:
- true if a table was deleted, false if no table exists for the identifier
-
purgeTable
default boolean purgeTable(Identifier ident)
Drop a table in the catalog and completely remove its data by skipping the trash even if it is supported. If the catalog supports views and contains a view for the identifier and not a table, this must not drop the view and must return false.
If the catalog supports purging a table, this method should be overridden. The default implementation throws UnsupportedOperationException.
- Parameters:
ident - a table identifier
- Returns:
- true if a table was deleted, false if no table exists for the identifier
- Throws:
UnsupportedOperationException- If table purging is not supported- Since:
- 3.1.0
-
renameTable
void renameTable(Identifier oldIdent, Identifier newIdent) throws org.apache.spark.sql.catalyst.analysis.NoSuchTableException, org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException
Renames a table in the catalog. If the catalog supports views and contains a view for the old identifier and not a table, this throws NoSuchTableException. Additionally, if the new identifier is a table or a view, this throws TableAlreadyExistsException.
If the catalog does not support table renames between namespaces, it throws UnsupportedOperationException.
- Parameters:
oldIdent - the table identifier of the existing table to rename
newIdent - the new table identifier of the table
- Throws:
org.apache.spark.sql.catalyst.analysis.NoSuchTableException - If the table to rename doesn't exist or is a view
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException - If the new table name already exists or is a view
UnsupportedOperationException - If the namespaces of old and new identifiers do not match (optional)
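The rename contract (the old identifier must exist, the new one must be free) can be sketched with an in-memory map; IllegalStateException stands in for Spark's checked NoSuchTableException and TableAlreadyExistsException:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of renameTable's checks: verify both identifiers before
// moving the entry, so a failed rename changes nothing.
public class RenameDemo {
    static final Map<String, String> TABLES = new HashMap<>();

    static void renameTable(String oldIdent, String newIdent) {
        if (!TABLES.containsKey(oldIdent)) {
            throw new IllegalStateException("no such table: " + oldIdent);
        }
        if (TABLES.containsKey(newIdent)) {
            throw new IllegalStateException("table already exists: " + newIdent);
        }
        TABLES.put(newIdent, TABLES.remove(oldIdent));
    }

    public static void main(String[] args) {
        TABLES.put("db.old", "metadata");
        renameTable("db.old", "db.new");
        System.out.println(TABLES.containsKey("db.new")); // prints true
        System.out.println(TABLES.containsKey("db.old")); // prints false
    }
}
```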
-