Package org.apache.spark.sql
Class Observation
Object
org.apache.spark.sql.ObservationBase
org.apache.spark.sql.Observation
Helper class to simplify usage of
Dataset.observe(String, Column, Column*):
// Observe row count (rows) and highest id (maxid) in the Dataset while writing it
val observation = Observation("my metrics")
val observed_ds = ds.observe(observation, count(lit(1)).as("rows"), max($"id").as("maxid"))
observed_ds.write.parquet("ds.parquet")
val metrics = observation.get
The metrics are collected while the first action is executed on the observed dataset. Subsequent
actions do not modify the metrics returned by ObservationBase.get(). Retrieving the metrics via ObservationBase.get()
blocks until the first action has finished and the metrics become available.
This class does not support streaming datasets.
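The collection semantics described above can be sketched end to end as follows. This is a minimal illustration, assuming a local SparkSession and a small Dataset built with spark.range(10). The session settings, the dataset, and the variable names are assumptions made for the sketch and are not part of the original example.

```scala
import org.apache.spark.sql.{Observation, SparkSession}
import org.apache.spark.sql.functions.{col, count, lit, max}

// Assumption: a local session for illustration; in spark-shell, getOrCreate() reuses the existing session.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("observation-sketch")
  .getOrCreate()

// Assumption: a small batch Dataset with ids 0 through 9.
val ds = spark.range(10)

val observation = Observation("my metrics")
val observed = ds.observe(observation, count(lit(1)).as("rows"), max(col("id")).as("maxid"))

// First action on the observed dataset: the metrics are collected here.
observed.collect()

// get returns a Map[String, Any] keyed by the aliases passed to observe.
val metrics = observation.get
println(metrics("rows"))   // number of rows seen by the first action
println(metrics("maxid"))  // highest id seen by the first action

// A second action on the same observed dataset leaves the collected metrics unchanged.
observed.count()
val metricsAfter = observation.get
```

Calling observation.get before any action has run on the observed dataset would block, so the sketch runs the action first.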
- Parameters:
- name - name of the metric
- Since:
- 3.3.0
Constructor Summary
Constructors
Constructor | Description
Observation() | Create an Observation instance without providing a name.
Observation(String name) |
Method Summary
Modifier and Type | Method | Description
static Observation | apply() | Observation constructor for creating an anonymous observation.
static Observation | apply(String name) | Observation constructor for creating a named observation.

Methods inherited from class org.apache.spark.sql.ObservationBase
get, getAsJava, name
Constructor Details
Observation
public Observation(String name)

Observation
public Observation()
Create an Observation instance without providing a name. This generates a random name.
Method Details
apply
public static Observation apply()
Observation constructor for creating an anonymous observation.
- Returns:
- (undocumented)
apply
public static Observation apply(String name)
Observation constructor for creating a named observation.
- Parameters:
- name - (undocumented)
- Returns:
- (undocumented)
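The construction paths listed on this page can be compared in a short sketch. The observation names used here are illustrative only, and the exact form of the generated name is not specified on this page, so the sketch prints it without assuming its value.

```scala
import org.apache.spark.sql.Observation

// Named observation through the companion apply(name).
val named = Observation("row stats")

// Anonymous observation through the companion apply(); a random name is generated.
val anonymous = Observation()

// The same two forms through the public constructors.
val namedViaNew = new Observation("more row stats")
val anonymousViaNew = new Observation()

// name is inherited from ObservationBase.
println(named.name)
println(anonymous.name)
```

A named observation is probably easier to recognise when several observations are attached to the same query; the anonymous form is convenient when only the metric values matter.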