How can I parse a string into an (int * int) tuple in SML?

Posted 2019-07-03 13:46

Question:

I have a string like "3,4\r\n", and I want to convert it into a tuple, i.e. (3,4).

How can I achieve this in SML?

I'm getting a string value because I'm reading a file, which returns strings like that.

Answer 1:

You need a simple parser to achieve that. An appropriate function to parse integers is already available in the library as Int.scan (along with friends for other types), but you have to write the rest yourself. For example:

(* scanLine : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanLine getc stream =
    case Int.scan StringCvt.DEC getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case getc stream'
      of NONE => NONE
       | SOME (c1, stream'') =>
    if c1 <> #"," then NONE else
    case Int.scan StringCvt.DEC getc stream''
      of NONE => NONE
       | SOME (x2, stream''') => 
    case getc stream'''
      of NONE => NONE
       | SOME (c2, stream'''') =>
    if c2 <> #"\n" then NONE else
    SOME ((x1, x2), stream'''')

And then, to parse all lines:

(* scanList : ((char, 's) StringCvt.reader -> ('a, 's) StringCvt.reader) -> (char, 's) StringCvt.reader -> ('a list, 's) StringCvt.reader *)
fun scanList scanElem getc stream =
    case scanElem getc stream
      of NONE => SOME ([], stream)
       | SOME (x, stream') =>
    case scanList scanElem getc stream'
      of NONE => NONE
       | SOME (xs, stream'') => SOME (x::xs, stream'')

To use it, for example:

val test = "4,5\n2,3\n"
val result = StringCvt.scanString (scanList scanLine) test
(* val result : (int * int) list option = SOME [(4, 5), (2, 3)] *)
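The question's input ends in "\r\n", while the scanLine above expects a bare "\n" after the second integer. A minimal sketch of a variant that accepts either line ending could look like this; scanEol and scanLineCrlf are hypothetical names introduced here, not part of the Basis library or of the original answer.

```sml
(* scanEol: hypothetical helper that accepts "\n" or "\r\n" *)
fun scanEol getc stream =
    case getc stream
      of SOME (#"\n", stream') => SOME ((), stream')
       | SOME (#"\r", stream') =>
           (case getc stream'
              of SOME (#"\n", stream'') => SOME ((), stream'')
               | _ => NONE)
       | _ => NONE

(* scanLineCrlf: like scanLine, but ends the line with scanEol *)
fun scanLineCrlf getc stream =
    case Int.scan StringCvt.DEC getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case getc stream'
      of SOME (#",", stream'') =>
           (case Int.scan StringCvt.DEC getc stream''
              of NONE => NONE
               | SOME (x2, stream''') =>
                   (case scanEol getc stream'''
                      of NONE => NONE
                       | SOME ((), stream'''') => SOME ((x1, x2), stream'''')))
       | _ => NONE

val parsed = StringCvt.scanString scanLineCrlf "3,4\r\n"
(* val parsed : (int * int) option = SOME (3, 4) *)
```

Since StringCvt.scanString returns an option, a malformed line such as "3;4\r\n" yields NONE rather than raising an exception.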

As you can see, the code is a bit repetitive. To get rid of all the matching of option types you could write a few basic parser combinators:

(* scanCharExpect : char -> (char, 's) StringCvt.reader -> (char, 's) StringCvt.reader *)
fun scanCharExpect expect getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c, stream') =>
         if c = expect then SOME (c, stream') else NONE

(* scanSeq : ((char, 's) StringCvt.reader -> ('a, 's) StringCvt.reader) * ((char, 's) StringCvt.reader -> ('b, 's) StringCvt.reader) -> (char, 's) StringCvt.reader -> ('a * 'b, 's) StringCvt.reader *)
fun scanSeq (scan1, scan2) getc stream =
    case scan1 getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case scan2 getc stream'
      of NONE => NONE
       | SOME (x2, stream'') => SOME ((x1, x2), stream'')

fun scanSeqL (scan1, scan2) getc stream =
    Option.map (fn ((x, _), stream) => (x, stream)) (scanSeq (scan1, scan2) getc stream)
fun scanSeqR (scan1, scan2) getc stream =
    Option.map (fn ((_, x), stream) => (x, stream)) (scanSeq (scan1, scan2) getc stream)

(* scanLine : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanLine getc stream =
    scanSeq (
        scanSeqL (Int.scan StringCvt.DEC, scanCharExpect #","),
        scanSeqL (Int.scan StringCvt.DEC, scanCharExpect #"\n")
    ) getc stream

There are a lot more cool abstractions you can build along these lines, especially when defining your own infix operators. But I'll leave it at that.
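As one possible illustration of the infix-operator idea, here is a sketch that restates the sequencing combinators as operators. The operator names && and <& and the helpers scanInt and scanChar are assumptions chosen for this example; any other symbolic identifiers would work just as well.

```sml
infix 3 && <&

(* p && q: run p, then q, and pair the two results *)
fun (p && q) getc stream =
    case p getc stream
      of NONE => NONE
       | SOME (x, stream') =>
           (case q getc stream'
              of NONE => NONE
               | SOME (y, stream'') => SOME ((x, y), stream''))

(* p <& q: run p, then q, and keep only the result of p *)
fun (p <& q) getc stream =
    Option.map (fn ((x, _), stream') => (x, stream')) ((p && q) getc stream)

(* scanChar: succeeds only if the next character equals expect *)
fun scanChar expect getc stream =
    case getc stream
      of SOME (c, stream') => if c = expect then SOME (c, stream') else NONE
       | NONE => NONE

(* Written with an explicit getc argument so the function stays polymorphic
   in the stream type (a plain val binding would hit the value restriction) *)
fun scanInt getc = Int.scan StringCvt.DEC getc

fun scanLine getc stream =
    ((scanInt <& scanChar #",") && (scanInt <& scanChar #"\n")) getc stream

val result = StringCvt.scanString scanLine "3,4\n"
(* val result : (int * int) option = SOME (3, 4) *)
```

The parentheses in scanLine make the grouping explicit, so the example does not rely on the chosen precedence level.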

You might also want to handle whitespace between tokens. The StringCvt.skipWS function is readily available in the library for that; just insert it in the right places.
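A rough sketch of where skipWS could go: Int.scan already skips leading whitespace on its own, so the only gap to cover here is the one before the comma. Note that skipWS uses Char.isSpace and therefore also consumes "\r" and "\n", so it should not be placed in front of a check for the line ending. The names scanToken and scanPair are hypothetical, and this version deliberately does not check the end of the line.

```sml
(* scanToken: hypothetical helper that skips leading whitespace, then runs scan *)
fun scanToken scan getc stream = scan getc (StringCvt.skipWS getc stream)

(* scanCharExpect: succeeds only if the next character equals expect *)
fun scanCharExpect expect getc stream =
    case getc stream
      of SOME (c, stream') => if c = expect then SOME (c, stream') else NONE
       | NONE => NONE

(* scanPair: integer, optional whitespace, comma, integer *)
fun scanPair getc stream =
    case Int.scan StringCvt.DEC getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case scanToken (scanCharExpect #",") getc stream'
      of NONE => NONE
       | SOME (_, stream'') =>
    case Int.scan StringCvt.DEC getc stream''
      of NONE => NONE
       | SOME (x2, stream''') => SOME ((x1, x2), stream''')

val spaced = StringCvt.scanString scanPair "  3 ,  4\r\n"
(* val spaced : (int * int) option = SOME (3, 4) *)
```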



Answer 2:

The following is a crude example of how this can be done:

fun toPair s =
    let
      val s' = String.substring(s, 0, size s-2)
    in
      List.mapPartial Int.fromString (String.tokens (fn c => c = #",") s')
    end

Note, however, that toPair returns an int list rather than a tuple, that mapPartial discards anything that can't be converted to an integer (when Int.fromString returns NONE), and that the string is assumed to always end in \r\n, since the last two characters are removed by taking the substring.
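If an actual tuple is wanted, one possible variant is sketched below. The name toPairOpt is hypothetical. It relies on Int.fromString ignoring leading whitespace and any characters after the digits, so the trailing "\r\n" does not need to be stripped first; the flip side is that input such as "3,4abc" is also accepted.

```sml
(* toPairOpt: hypothetical variant returning SOME (x, y), or NONE on bad input *)
fun toPairOpt s =
    case String.fields (fn c => c = #",") s
      of [a, b] =>
           (case (Int.fromString a, Int.fromString b)
              of (SOME x, SOME y) => SOME (x, y)
               | _ => NONE)
       | _ => NONE

val p1 = toPairOpt "3,4\r\n"   (* SOME (3, 4) *)
val p2 = toPairOpt "3,x\r\n"   (* NONE *)
```

Returning an option instead of silently dropping fields makes malformed lines visible to the caller.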

Update

Obviously, the answer by Rossberg is the correct way of doing it. However, depending on the task at hand, this may still serve as an example of a quick and stupid way of doing it.



Answer 3:

Here is a straightforward way to extract all of the unsigned integers from a string and return them in a list (converting the list to a tuple is left as an exercise for the reader).

fun ints_from_str str =
  List.mapPartial
    Int.fromString
    (String.tokens (not o Char.isDigit) str);

ints_from_str " foo 1, bar:22? and 333___  ";

(* val it = [1,22,333] : int list *)
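For completeness, the list-to-tuple step could be sketched as follows. The helper pairFromList is a hypothetical name, and ints_from_str is repeated so that the snippet stands on its own.

```sml
fun ints_from_str str =
  List.mapPartial Int.fromString (String.tokens (not o Char.isDigit) str)

(* pairFromList: hypothetical helper; succeeds only for exactly two elements *)
fun pairFromList [x, y] = SOME (x, y)
  | pairFromList _ = NONE

val ok = pairFromList (ints_from_str "3,4\r\n")     (* SOME (3, 4) *)
val bad = pairFromList (ints_from_str "3,4,5\r\n")  (* NONE *)
```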


Answer 4:

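The following should achieve this, as long as both numbers are single digits.

 exception MyError

 (* Convert a digit character to its integer value *)
 fun digit c =
   if Char.isDigit c then Char.ord c - Char.ord #"0" else raise MyError

 (* Matches only the shape <digit>,<digit>...; anything after that is ignored *)
 fun convert s =
   case String.explode s of
        x :: #"," :: y :: _ => (digit x, digit y)
      | _ => raise MyError

PS: I did not have access to an SML interpreter at work, so this may need slight changes.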



Tags: sml smlnj ml