Class VariantUtil

Object
org.apache.spark.types.variant.VariantUtil

public class VariantUtil extends Object
This class defines constants related to the variant format and provides functions for manipulating variant binaries.

A variant is made up of two binaries: value and metadata. A variant value consists of a one-byte header followed by zero or more content bytes. The header byte is divided into the upper 6 bits (called "type info") and the lower 2 bits (called "basic type"). The content format for every possible combination of basic type and type info is explained in the constants below.

The variant metadata includes a version id and a dictionary of distinct, case-sensitive strings. Its binary format is:
- Version: 1-byte unsigned integer. The only acceptable value is currently 1.
- Dictionary size: 4-byte little-endian unsigned integer. The number of keys in the dictionary.
- Offsets: (size + 1) 4-byte little-endian unsigned integers. `offsets[i]` is the starting position of string i, counted from the address of `offsets[0]`. Strings must be stored contiguously, so the string size is not stored; it is computed as `offsets[i + 1] - offsets[i]`.
- UTF-8 string data.
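The metadata layout described above can be sketched as a small encoder and key lookup. This is only an illustration under stated assumptions: the class `VariantMetadataSketch` and its methods `encode` and `key` are hypothetical names, not part of Spark, and the sketch reads "counted from the address of `offsets[0]`" as meaning that offsets are relative to the byte position where the offset list begins.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Spark: mirrors the documented metadata layout.
final class VariantMetadataSketch {
    private static final int OFFSETS_START = 1 + 4; // version byte + dictionary size

    // Layout: version (1 byte) | size (4 bytes LE) | (size + 1) offsets (4 bytes LE each) | UTF-8 data.
    static byte[] encode(String... keys) {
        byte[][] utf8 = new byte[keys.length][];
        int dataLen = 0;
        for (int i = 0; i < keys.length; i++) {
            utf8[i] = keys[i].getBytes(StandardCharsets.UTF_8);
            dataLen += utf8[i].length;
        }
        int offsetsLen = (keys.length + 1) * 4;
        ByteBuffer buf = ByteBuffer.allocate(OFFSETS_START + offsetsLen + dataLen)
            .order(ByteOrder.LITTLE_ENDIAN);
        buf.put((byte) 1);        // version: the only accepted value
        buf.putInt(keys.length);  // dictionary size
        // Assumption: offsets are relative to offsets[0], so string 0 starts
        // immediately after the offset list.
        int offset = offsetsLen;
        for (byte[] s : utf8) {
            buf.putInt(offset);
            offset += s.length;
        }
        buf.putInt(offset);       // final offset marks the end of the last string
        for (byte[] s : utf8) {
            buf.put(s);
        }
        return buf.array();
    }

    // Looks up dictionary string `id`; its length is offsets[id + 1] - offsets[id].
    // Java ints are signed, so this sketch only handles sizes below 2^31.
    static String key(byte[] metadata, int id) {
        ByteBuffer buf = ByteBuffer.wrap(metadata).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.get(0) != 1) {
            throw new IllegalArgumentException("unsupported metadata version");
        }
        int size = buf.getInt(1);
        if (id < 0 || id >= size) {
            throw new IllegalArgumentException("id out of range: " + id);
        }
        int start = buf.getInt(OFFSETS_START + 4 * id);
        int end = buf.getInt(OFFSETS_START + 4 * (id + 1));
        return new String(metadata, OFFSETS_START + start, end - start, StandardCharsets.UTF_8);
    }
}
```

Because the strings are contiguous, the extra trailing offset is what makes the last string's length computable without storing any sizes.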
  • Field Details

  • Constructor Details

    • VariantUtil

      public VariantUtil()
  • Method Details

    • writeLong

      public static void writeLong(byte[] bytes, int pos, long value, int numBytes)
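The signature of `writeLong` does not state a byte order. A minimal sketch of a fixed-width integer write, assuming the same little-endian order that the class description specifies for the metadata fields, might look like the following. `LittleEndianSketch`, `writeLittleEndian`, and `readLittleEndian` are hypothetical names used for illustration only; they are not Spark's implementation.

```java
// Hypothetical helper, not part of Spark. Byte order is an assumption (little-endian).
final class LittleEndianSketch {
    // Writes the low `numBytes` bytes of `value` into bytes[pos .. pos + numBytes),
    // least significant byte first.
    static void writeLittleEndian(byte[] bytes, int pos, long value, int numBytes) {
        if (numBytes < 1 || numBytes > 8) {
            throw new IllegalArgumentException("numBytes must be in 1..8");
        }
        for (int i = 0; i < numBytes; i++) {
            bytes[pos + i] = (byte) (value >>> (8 * i));
        }
    }

    // Reads `numBytes` bytes back and sign-extends them to a long.
    static long readLittleEndian(byte[] bytes, int pos, int numBytes) {
        if (numBytes < 1 || numBytes > 8) {
            throw new IllegalArgumentException("numBytes must be in 1..8");
        }
        long result = 0;
        for (int i = 0; i < numBytes; i++) {
            result |= (bytes[pos + i] & 0xFFL) << (8 * i);
        }
        int shift = 64 - 8 * numBytes;
        return (result << shift) >> shift; // sign-extend from the highest stored byte
    }
}
```

Writing only `numBytes` bytes lets a caller choose the narrowest width that still holds the value, at the cost of having to record that width elsewhere.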
    • primitiveHeader

      public static byte primitiveHeader(int type)
    • shortStrHeader

      public static byte shortStrHeader(int size)
    • objectHeader

      public static byte objectHeader(boolean largeSize, int idSize, int offsetSize)
    • arrayHeader

      public static byte arrayHeader(boolean largeSize, int offsetSize)
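The header helpers above all return a single header byte. Following the class description (upper 6 bits are the type info, lower 2 bits are the basic type), the packing and unpacking can be sketched as below. `HeaderSketch` and its methods are hypothetical names for illustration; the actual numeric values of the basic types and the way each helper derives its type info are not shown on this page and are not assumed here.

```java
// Hypothetical helper, not part of Spark: packs and unpacks a variant header byte.
final class HeaderSketch {
    // Combines a 2-bit basic type and a 6-bit type info into one header byte.
    static byte packHeader(int basicType, int typeInfo) {
        if (basicType < 0 || basicType > 0x3) {
            throw new IllegalArgumentException("basic type must fit in 2 bits");
        }
        if (typeInfo < 0 || typeInfo > 0x3F) {
            throw new IllegalArgumentException("type info must fit in 6 bits");
        }
        return (byte) ((typeInfo << 2) | basicType);
    }

    // Lower 2 bits of the header.
    static int basicType(byte header) {
        return header & 0x3;
    }

    // Upper 6 bits of the header; the mask discards sign-extension bits.
    static int typeInfo(byte header) {
        return (header >> 2) & 0x3F;
    }
}
```

The mask in `typeInfo` matters because Java bytes are signed: a header with its top bit set is promoted to a negative int before the shift.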
    • getType

      public static VariantUtil.Type getType(byte[] value, int pos)
    • valueSize

      public static int valueSize(byte[] value, int pos)
    • getBoolean

      public static boolean getBoolean(byte[] value, int pos)
    • getLong

      public static long getLong(byte[] value, int pos)
    • getDouble

      public static double getDouble(byte[] value, int pos)
    • getDecimal

      public static BigDecimal getDecimal(byte[] value, int pos)
    • getFloat

      public static float getFloat(byte[] value, int pos)
    • getBinary

      public static byte[] getBinary(byte[] value, int pos)
    • getString

      public static String getString(byte[] value, int pos)
    • handleObject

      public static <T> T handleObject(byte[] value, int pos, VariantUtil.ObjectHandler<T> handler)
    • handleArray

      public static <T> T handleArray(byte[] value, int pos, VariantUtil.ArrayHandler<T> handler)
    • getMetadataKey

      public static String getMetadataKey(byte[] metadata, int id)