How to read JSON data from a Kafka topic using Scala

Posted 2019-04-29 06:36

I am new to Spark. Could you please let me know how to read JSON data from a Kafka topic using Scala in Apache Spark?

Thanks.

Tags: scala apache-spark apache-kafka spark-streaming

2 answers

Bombasti

#2 · 2019-04-29 07:06

The simplest method would be to make use of the DataFrame abstraction shipped with Spark.

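import kafka.serializer.StringDecoder
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.kafka.KafkaUtils

// sc (SparkContext), ssc (StreamingContext) and kafkaParams are assumed to be defined already
val sqlContext = new SQLContext(sc)
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("myTopicName"))

stream.foreachRDD { rdd =>
  // Each record is a (key, value) pair; the value holds the JSON string
  val dataFrame = sqlContext.read.json(rdd.map(_._2)) // converts JSON to a DataFrame
  // Do your operations on this DataFrame. You won't even need a model class.
}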
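
The snippet above leaves sc, ssc and kafkaParams undefined. A minimal sketch of the surrounding setup might look like the following. It assumes the spark-streaming-kafka-0-8 integration (the one that provides this createDirectStream signature), a local master, a 10-second batch interval and a broker at localhost:9092; the object name JsonFromKafka and the helper kafkaParams are hypothetical names chosen for illustration.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object JsonFromKafka {
  // Hypothetical helper: the direct (0.8) API needs the broker list under this key.
  def kafkaParams(brokers: String): Map[String, String] =
    Map("metadata.broker.list" -> brokers)

  def main(args: Array[String]): Unit = {
    // Assumed settings: local master with two threads, 10-second batches.
    val conf = new SparkConf().setAppName("JsonFromKafka").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(10))
    val sqlContext = new SQLContext(ssc.sparkContext)

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams("localhost:9092"), Set("myTopicName"))

    stream.foreachRDD { rdd =>
      // Skip empty batches so schema inference is not run on nothing.
      if (!rdd.isEmpty()) {
        val dataFrame = sqlContext.read.json(rdd.map(_._2))
        dataFrame.printSchema() // schema inferred from the JSON in this batch
        dataFrame.show()
      }
    }

    // Nothing is consumed until the streaming context is started.
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because the schema is inferred per batch, two batches with different JSON shapes can yield DataFrames with different columns.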


爷的心禁止访问

#3 · 2019-04-29 07:20

I use the Play Framework's JSON library. You can add it to your project as a standalone module. Usage is as follows:

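import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils
import play.api.libs.json._

case class MyClass(field1: String,
                   field2: Int)

// Generates the JSON reader and writer for MyClass at compile time
implicit val myClassFormat: Format[MyClass] = Json.format[MyClass]

// ssc is an existing StreamingContext
val kafkaParams = Map[String, String](...here are your params...)
KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("myTopicName"))
  .map(m => Json.parse(m._2).as[MyClass]) // m._2 is the message value (the JSON string)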
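
Note that Json.parse throws on malformed input and as[MyClass] throws a JsResultException when fields are missing or have the wrong type, so a single bad message can fail the batch. One possible way to drop such records instead is sketched below; parseRecord and the enclosing object SafeJsonParsing are hypothetical names, not part of Play JSON.

```scala
import play.api.libs.json._
import scala.util.Try

object SafeJsonParsing {
  case class MyClass(field1: String, field2: Int)

  implicit val myClassFormat: Format[MyClass] = Json.format[MyClass]

  // Hypothetical helper: returns None for malformed JSON or for JSON
  // that does not match MyClass, instead of throwing.
  def parseRecord(raw: String): Option[MyClass] =
    Try(Json.parse(raw)).toOption       // None if the text is not valid JSON
      .flatMap(_.validate[MyClass].asOpt) // None if fields are missing or mistyped
}
```

In the stream, the .map(...) step could then be replaced with .flatMap(m => SafeJsonParsing.parseRecord(m._2).toList), which silently discards records that fail to parse. Whether dropping or logging them is preferable depends on the application.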

