I have the following list of strings as input:
val input = List(
"17 SD1-MONT_FF_13 7(14)",
"17 QXRI1-SEDDS_13 S(01)",
"17 XFDRI1-MONDT_TT_14 7(18)",
"17 SQXI1-SSENS_14 S(01)",
"12 CRI1-MSONT_TT_15 7(18)",
"13 QSDRI1-SEDNS_15 S(01)",
"14 WSQSRI1-DEVSISE S(05)")
I wrote a function (shown further down) that calculates the data type from the third element of each line of the list.
But I don't know how to apply this function to each line so that it adds the data type as a fourth element in each line. The expected result should be a list as follows:
val input = List(
"17 SD1-MONT_FF_13 7(14) IntegerType",
"17 QXRI1-SEDDS_13 S(01) StringType",
"17 XFDRI1-MONDT_TT_14 7(18) IntegerType",
"17 SQXI1-SSENS_14 S(01) StringType",
"12 CRI1-MSONT_TT_15 7(18) IntegerType",
"13 QSDRI1-SEDNS_15 S(01) StringType",
"14 WSQSRI1-DEVSISE S(05) StringType")
The function that calculates the data type is:
def dataType (input:String) : String = (input.charAt(0),
input.contains('F')) match {
case ('S', _) => "StringType"
case ('7', false) => "IntegerType"
case ('7', true) => "FloatType"
case _ => "UnknowType"
}
All you need to do is pass the third field of each line, not the whole line, to the dataType function, as below:
input.map(line => line +" "+ dataType(line.split(" ")(2)))
which should give you
17 SD1-MONT_FF_13 7(14) IntegerType
17 QXRI1-SEDDS_13 S(01) StringType
17 XFDRI1-MONDT_TT_14 7(18) IntegerType
17 SQXI1-SSENS_14 S(01) StringType
12 CRI1-MSONT_TT_15 7(18) IntegerType
13 QSDRI1-SEDNS_15 S(01) StringType
14 WSQSRI1-DEVSISE S(05) StringType
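One caveat with split(" ")(2): it throws an ArrayIndexOutOfBoundsException if a line has fewer than three fields. If that can happen in your data, something like the following might help. This is only a minimal sketch: the helper name annotate is made up for illustration, and labelling short lines as "UnknowType" is an assumption about what you would want.

```scala
// dataType exactly as defined in the question
def dataType(input: String): String =
  (input.charAt(0), input.contains('F')) match {
    case ('S', _)     => "StringType"
    case ('7', false) => "IntegerType"
    case ('7', true)  => "FloatType"
    case _            => "UnknowType"
  }

// Hypothetical helper (not part of the question): appends the data type
// of the third whitespace-separated field to the line.
def annotate(line: String): String = {
  val fields = line.trim.split("\\s+")
  // Assumption: lines with fewer than three fields are labelled "UnknowType"
  val tpe = if (fields.length > 2) dataType(fields(2)) else "UnknowType"
  s"$line $tpe"
}

val annotated = List(
  "17 SD1-MONT_FF_13 7(14)",
  "17 QXRI1-SEDDS_13 S(01)",
  "17 BROKEN"
).map(annotate)
```

Splitting on "\\s+" rather than a single space also tolerates repeated spaces or tabs between fields.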
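Just mapping over your input should work, as long as you pass only the third field to dataType (passing the whole line would match on its first character, '1', and always return "UnknowType"):
val result = input.map(line => s"$line ${dataType(line.split(" ")(2))}")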
To use the dataType function you have written, you can simply map over the input list as shown below:
val result = input.map{x =>
x + " "+dataType(x.split(" ")(2))
}
You will get your desired output in the following format:
List(17 SD1-MONT_FF_13 7(14) IntegerType,
17 QXRI1-SEDDS_13 S(01) StringType,
17 XFDRI1-MONDT_TT_14 7(18) IntegerType,
17 SQXI1-SSENS_14 S(01) StringType,
12 CRI1-MSONT_TT_15 7(18) IntegerType,
13 QSDRI1-SEDNS_15 S(01) StringType,
14 WSQSRI1-DEVSISE S(05) StringType)
A couple of syntax suggestions. Regexes can be handy but also confusing. You can also use an extractor built on a more familiar API such as split, or on code that is more efficient. The REPL session below shows the regex version first, then the extractor:
scala> val s = "17 SD1-MONT_FF_13 7(14)"
s: String = 17 SD1-MONT_FF_13 7(14)
scala> val r = raw"\S+ \S+ (\S+)".r
r: scala.util.matching.Regex = \S+ \S+ (\S+)
scala> val r(field) = s
field: String = 7(14)
scala> val input = List(s, "17 QXRI1-SEDDS_13 S(01)")
input: List[String] = List(17 SD1-MONT_FF_13 7(14), 17 QXRI1-SEDDS_13 S(01))
scala> input map { case s @ r(field) => s"$s $field" }
res0: List[String] = List(17 SD1-MONT_FF_13 7(14) 7(14), 17 QXRI1-SEDDS_13 S(01) S(01))
scala> def decoder(s: String) = s.head match { case '7' => if (s contains 'F') "float" else "int" case _ => "other" }
decoder: (s: String)String
scala> input map { case s @ r(field) => s"$s ${decoder(field)}" }
res1: List[String] = List(17 SD1-MONT_FF_13 7(14) int, 17 QXRI1-SEDDS_13 S(01) other)
scala> object Field { def unapply(s: String) = Option(s.split(' ')(2)) }
defined object Field
scala> input map { case s @ Field(field) => s"$s ${decoder(field)}" }
res2: List[String] = List(17 SD1-MONT_FF_13 7(14) int, 17 QXRI1-SEDDS_13 S(01) other)
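Note that the Field extractor above still throws if a line has fewer than three fields, and map with a partial case would raise a MatchError on any line the pattern rejects. A possible variation, sketched under the assumption that malformed lines should simply be dropped, is to make the extractor return None and use collect instead of map. ThirdField is a hypothetical name, not something from the session above.

```scala
// Hypothetical extractor: yields the third whitespace-separated field,
// or fails to match when the line has fewer than three fields.
object ThirdField {
  def unapply(line: String): Option[String] = {
    val fields = line.trim.split("\\s+")
    if (fields.length > 2) Some(fields(2)) else None
  }
}

// decoder as defined in the session above
def decoder(s: String): String = s.head match {
  case '7' => if (s contains 'F') "float" else "int"
  case _   => "other"
}

val lines = List("17 SD1-MONT_FF_13 7(14)", "17 QXRI1-SEDDS_13 S(01)", "malformed")

// collect keeps only the lines the extractor matches (assumption: dropping
// malformed lines is acceptable; use map with a fallback case to keep them)
val decoded = lines.collect { case s @ ThirdField(field) => s"$s ${decoder(field)}" }
```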