I'm very new to Spark and Scala, and I'm trying to load a CSV similar to:
A,09:33:57.570
B,09:43:02.577
...
The only time-related type I see in scala.sql.types is TimestampType, so I load the CSV with:
val schema = StructType(Array( StructField("A", StringType, true), StructField("time", TimestampType, true)))
val table = spark.read.option("header","false").option("inferSchema","false").schema(schema).csv("../table.csv")
This seems to work fine until I call table.show() or table.take(5), etc., at which point I get the following exception:
scala> table.show()
16/10/07 16:32:25 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.IllegalArgumentException
at java.sql.Date.valueOf(Date.java:143)
at org.apache.spark.sql.catalyst.util.DateTimeUtils$.stringToTime(DateTimeUtils.scala:137)
at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:287)
at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:115)
at org.apache.spark.sql.execution.datasources.csv.CSVRelation$$anonfun$csvParser$3.apply(CSVRelation.scala:84)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$1.apply(CSVFileFormat.scala:125)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat$$anonfun$buildReader$1$$anonfun$apply$1.apply(CSVFileFormat.scala:124)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
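Is there a preferred way of storing time-of-day data in Spark? I have also tried leaving the column as a string and mapping java.time's LocalTime.parse() over each value, but that fails with an error saying there is no Encoder for that type.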
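For reference, here is a minimal sketch of the string-based route in plain Scala, outside Spark. It assumes every value is in ISO HH:mm:ss.SSS form, and timeToMillisOfDay is a hypothetical helper name of my own, not anything from Spark. Returning a Long rather than a LocalTime avoids the missing-Encoder problem, since Spark does provide encoders for primitive types, but I don't know whether that is the preferred representation.

```scala
import java.time.LocalTime

// Hypothetical helper (not part of Spark): parse an ISO local time such as
// "09:33:57.570" and return the number of milliseconds since midnight.
def timeToMillisOfDay(s: String): Long =
  LocalTime.parse(s).toNanoOfDay / 1000000L

// The two sample values from the CSV above.
val millis: Seq[Long] = Seq("09:33:57.570", "09:43:02.577").map(timeToMillisOfDay)
```

The same function could presumably be mapped over the string column once it is read with StringType instead of TimestampType.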