Convert multiple pretty-printed JSON records to one-line JSON records

Posted 2019-07-28 19:12

Question:

I have a String containing many records in JSON format. I need to convert each JSON record to a one-line JSON record.

Example input:

{
  "field1" : "aa11",
  "field2" : "aa22",
  "structField" : {
    "sf1" : "aaa11",
    "sf2" : "aaa22"
  }
}, {
  "field1" : "bb11",
  "field2" : "bb22",
  "structField" : {
    "sf1" : "bbb11",
    "sf2" : "bbb22"
  }
}, {
  "field1" : "cc11",
  "field2" : "cc22",
  "structField" : {
    "sf1" : "ccc11",
    "sf2" : "ccc22"
  }
}

Output:

{"field1":"aa11","field2":"aa22", "structField":{"sf1" : "aaa11","sf2" : "aaa22"}},
{"field1":"bb11","field2":"bb22","structField":{"sf1" : "bbb11","sf2" : "bbb22"}}, 
{"field1" : "cc11","field2" : "cc22","structField" : {"sf1" : "ccc11","sf2" : "ccc22"}}

I am using Scala, and my current attempt splits the String on "}, {" and reformats each piece:

myMultiJSONString.
  substring(2,myMultiJSONString.length-2).
  split("\\}, \\{").
  map(reg => "{" + reg.trim.replaceAll("\\n","") + "}")

I think this is a dirty way.

Is there a library which can help with this?

For example, by deserializing the JSON String to "something" and later serializing it as a one-line JSON String.

Any idea?

Thanks!

Answer 1:

If the input is not too large, one way to do this without "dirty" techniques is to parse the input with a JSON library and write each record back out with the "pretty print" feature disabled.

The structure of the individual records does not matter, so this can be done almost directly.

For example, using Json4s:

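// imports assume the json4s-native backend; with json4s-jackson, use
// org.json4s.jackson.JsonMethods and org.json4s.jackson.Serialization instead
import org.json4s._
import org.json4s.native.JsonMethods.parse
import org.json4s.native.Serialization.write

// `json` is the input String holding the comma-separated records.
// Since the input is not wrapped as a JSON array, we need to wrap it to parse it properly.
val wrappedAsJsonArray = new StringBuilder("[").append(json).append("]").toString()

val parsed = parse(wrappedAsJsonArray)

// required implicitly by Serialization.write
implicit val formats: Formats = DefaultFormats

// each child of the parsed array is one record; write renders it compactly on one line
parsed.children.foreach(obj => {
  val oneLineJson = write(obj) + ","
  println(oneLineJson) // or write to output file
})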

// the output:
{"field1":"aa11","field2":"aa22","structField":{"sf1":"aaa11","sf2":"aaa22"}},
{"field1":"bb11","field2":"bb22","structField":{"sf1":"bbb11","sf2":"bbb22"}},
{"field1":"cc11","field2":"cc22","structField":{"sf1":"ccc11","sf2":"ccc22"}},
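
When the input is too large to parse in one go, or adding a library is not an option, roughly the same result can be obtained with a small scanner that tracks string literals and brace depth instead of splitting on "}, {". The following is only a minimal sketch using the Scala standard library: the names `OneLineJson` and `toLines` are hypothetical, it assumes every top-level record is an object or array, and it does not validate the JSON.

```scala
object OneLineJson {
  /** Drops whitespace outside string literals and returns one entry per top-level record. */
  def toLines(input: String): Vector[String] = {
    val out = Vector.newBuilder[String]
    val current = new StringBuilder
    var depth = 0        // nesting level of {} and []
    var inString = false // true while inside a "..." literal
    var escaped = false  // true right after a backslash inside a literal

    input.foreach { c =>
      if (inString) {
        // inside a string literal everything is kept verbatim
        current.append(c)
        if (escaped) escaped = false
        else if (c == '\\') escaped = true
        else if (c == '"') inString = false
      } else c match {
        case '"' =>
          inString = true
          current.append(c)
        case '{' | '[' =>
          depth += 1
          current.append(c)
        case '}' | ']' =>
          depth -= 1
          current.append(c)
          if (depth == 0) { // a top-level record just closed
            out += current.toString
            current.clear()
          }
        case ',' if depth == 0 => ()   // separator between records: drop it
        case w if w.isWhitespace => () // formatting whitespace: drop it
        case other => current.append(other)
      }
    }
    out.result()
  }
}

// Usage: string values containing ", " or "}, {" survive untouched.
val sample = """{ "a" : "x, y", "b" : { "c" : 1 } }, { "a" : "}, {" }"""
OneLineJson.toLines(sample).foreach(println)
// prints:
// {"a":"x, y","b":{"c":1}}
// {"a":"}, {"}
```

Unlike the split-based attempt in the question, this does not break when a string value happens to contain "}, {" or a newline, but a real JSON library remains the safer choice whenever it is available.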


Answer 2:

It is generally better to use a proper JSON API if that fits your use case. There are many JSON libraries to choose from; see "What JSON library to use in Scala?"

I would suggest circe, which is a functional Scala JSON library. It has pretty good documentation: https://circe.github.io/circe/parsing.html

Example:

import io.circe._, io.circe.parser._

object CirceAgainSerialisers {

  def main(args: Array[String]): Unit = {

    val rawFakeJson: String =
      """
        |  {
        |    "field1": "aa11",
        |    "field2": "aa22",
        |    "structField": {
        |      "sf1": "aaa11",
        |      "sf2": "aaa22"
        |    }
        |  },
        |  {
        |    "field1": "bb11",
        |    "field2": "bb22",
        |    "structField": {
        |      "sf1": "bbb11",
        |      "sf2": "bbb22"
        |    }
        |  },
        |  {
        |    "field1": "cc11",
        |    "field2": "cc22",
        |    "structField": {
        |      "sf1": "ccc11",
        |      "sf2": "ccc22"
        |    }
        |  }
      """.stripMargin

    val deserialised: Either[ParsingFailure, Json] = parse(s"[$rawFakeJson]")

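    // Json#toString pretty-prints, so render each record explicitly with noSpaces
    val fakeSerialise = deserialised.map { json =>
      json.asArray.getOrElse(Vector.empty).map(_.noSpaces).mkString(",\n")
    }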

    fakeSerialise match {
      case Right(json) => println(json)
      case Left(failed) => println(failed)
    }
  }
}

Your build.sbt would look like this:

name := "serialisers-deserialisers"

version := "0.1"

scalaVersion := "2.12.2"

val circeVersion = "0.9.3"

libraryDependencies ++= Seq(
  "io.circe" %% "circe-core",
  "io.circe" %% "circe-generic",
  "io.circe" %% "circe-parser"
).map(_ % circeVersion)


Tags: json scala