The different DataTypes available for Spark SQL can be found here. Can anyone please tell me what the corresponding Java/Scala data type would be for each of Spark SQL's DataTypes?
Answer 1:
Directly from the Spark SQL and DataFrame Guide:
Data type     | Value type in Scala
------------------------------------------------
ByteType      | Byte
ShortType     | Short
IntegerType   | Int
LongType      | Long
FloatType     | Float
DoubleType    | Double
DecimalType   | java.math.BigDecimal
StringType    | String
BinaryType    | Array[Byte]
BooleanType   | Boolean
TimestampType | java.sql.Timestamp
DateType      | java.sql.Date
ArrayType     | scala.collection.Seq
MapType       | scala.collection.Map
StructType    | org.apache.spark.sql.Row
Answer 2:
For those trying to find the Java types, they're now also hosted at the link from zero323's answer. To document the current revision here:
Data type     | Value type in Java              | API to access or create a data type
-------------------------------------------------------------------------------------------
ByteType      | byte or Byte                    | DataTypes.ByteType
ShortType     | short or Short                  | DataTypes.ShortType
IntegerType   | int or Integer                  | DataTypes.IntegerType
LongType      | long or Long                    | DataTypes.LongType
FloatType     | float or Float                  | DataTypes.FloatType
DoubleType    | double or Double                | DataTypes.DoubleType
DecimalType   | java.math.BigDecimal            | DataTypes.createDecimalType() or DataTypes.createDecimalType(precision, scale)
StringType    | String                          | DataTypes.StringType
BinaryType    | byte[]                          | DataTypes.BinaryType
BooleanType   | boolean or Boolean              | DataTypes.BooleanType
TimestampType | java.sql.Timestamp              | DataTypes.TimestampType
DateType      | java.sql.Date                   | DataTypes.DateType
ArrayType     | java.util.List                  | DataTypes.createArrayType(elementType) or DataTypes.createArrayType(elementType, containsNull)
MapType       | java.util.Map                   | DataTypes.createMapType(keyType, valueType) or DataTypes.createMapType(keyType, valueType, valueContainsNull)
StructType    | org.apache.spark.sql.Row        | DataTypes.createStructType(fields)
StructField   | The value type in Java of the   | DataTypes.createStructField(name, dataType, nullable)
              | data type of this field (for    |
              | example, int for a StructField  |
              | with the data type IntegerType) |
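To make the "Value type in Java" column concrete, here is a minimal sketch that builds one sample value of each listed Java type. It uses only the JDK and no Spark classes, so StructType (which maps to org.apache.spark.sql.Row) is left out. The class name SparkValueSamples and the method sampleValues are hypothetical names made up for this illustration, and the map keys are just the type names from the table, not Spark objects.

```java
import java.math.BigDecimal;
import java.nio.charset.StandardCharsets;
import java.sql.Date;
import java.sql.Timestamp;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Spark): one sample Java value per Spark SQL type name.
final class SparkValueSamples {

    // Keys are the Spark SQL type names from the table above; each value is an
    // instance of the Java value type listed next to that name.
    static Map<String, Object> sampleValues() {
        Map<String, Object> samples = new LinkedHashMap<>();
        samples.put("ByteType", (byte) 1);       // boxed to java.lang.Byte
        samples.put("ShortType", (short) 2);     // boxed to java.lang.Short
        samples.put("IntegerType", 3);
        samples.put("LongType", 4L);
        samples.put("FloatType", 5.0f);
        samples.put("DoubleType", 6.0d);
        samples.put("DecimalType", new BigDecimal("12.34"));
        samples.put("StringType", "spark");
        samples.put("BinaryType", "spark".getBytes(StandardCharsets.UTF_8));
        samples.put("BooleanType", true);
        samples.put("TimestampType", Timestamp.valueOf("2015-01-01 12:30:00"));
        samples.put("DateType", Date.valueOf("2015-01-01"));

        List<Integer> numbers = Arrays.asList(1, 2, 3);
        samples.put("ArrayType", numbers);       // any java.util.List

        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 1);
        samples.put("MapType", counts);          // any java.util.Map
        return samples;
    }

    public static void main(String[] args) {
        // Print each type name next to the runtime class of its sample value.
        sampleValues().forEach((sparkType, value) ->
            System.out.println(sparkType + " -> " + value.getClass().getName()));
    }
}
```

Values of these types are what you would place, in schema order, into a Row for a schema built from the matching DataTypes entries in the third column.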
One thing of note when working with StructTypes in particular: it appears that, if you wish to declare an empty StructType inside another as a placeholder value, you must use new StructType() rather than the suggested DataTypes.createStructType((StructField)null) to avoid null pointer exceptions. Remember to populate the nested StructType with StructFields before using it.