How can I split a String into a (int * int) tuple in SML?
This requires a simple parser. The function for parsing integers is already available in the library as Int.scan (along with friends for other types), but you have to write the rest yourself. For example:
(* scanLine : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanLine getc stream =
    case Int.scan StringCvt.DEC getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case getc stream'
      of NONE => NONE
       | SOME (c1, stream'') =>
    if c1 <> #"," then NONE else
    case Int.scan StringCvt.DEC getc stream''
      of NONE => NONE
       | SOME (x2, stream''') =>
    case getc stream'''
      of NONE => NONE
       | SOME (c2, stream'''') =>
    if c2 <> #"\n" then NONE else
    SOME ((x1, x2), stream'''')
And then, to parse all the lines:
(* scanList : ((char, 's) StringCvt.reader -> ('a, 's) StringCvt.reader) -> (char, 's) StringCvt.reader -> ('a list, 's) StringCvt.reader *)
fun scanList scanElem getc stream =
    case scanElem getc stream
      of NONE => SOME ([], stream)
       | SOME (x, stream') =>
    case scanList scanElem getc stream'
      of NONE => NONE
       | SOME (xs, stream'') => SOME (x::xs, stream'')
For example, to use it:
val test = "4,5\n2,3\n"
val result = StringCvt.scanString (scanList scanLine) test
(* val result : (int * int) list option = SOME [(4, 5), (2, 3)] *)
As you can see, the code is a bit repetitive. To get rid of all option type matches, you could write a few basic parser combinators:
(* scanCharExpect : char -> (char, 's) StringCvt.reader -> (char, 's) StringCvt.reader *)
fun scanCharExpect expect getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c, stream') =>
         if c = expect then SOME (c, stream') else NONE

(* scanSeq : ((char, 's) StringCvt.reader -> ('a, 's) StringCvt.reader) * ((char, 's) StringCvt.reader -> ('b, 's) StringCvt.reader) -> (char, 's) StringCvt.reader -> ('a * 'b, 's) StringCvt.reader *)
fun scanSeq (scan1, scan2) getc stream =
    case scan1 getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
    case scan2 getc stream'
      of NONE => NONE
       | SOME (x2, stream'') => SOME ((x1, x2), stream'')

fun scanSeqL (scan1, scan2) getc stream =
    Option.map (fn ((x, _), stream) => (x, stream)) (scanSeq (scan1, scan2) getc stream)

fun scanSeqR (scan1, scan2) getc stream =
    Option.map (fn ((_, x), stream) => (x, stream)) (scanSeq (scan1, scan2) getc stream)

(* scanLine : (char, 's) StringCvt.reader -> (int * int, 's) StringCvt.reader *)
fun scanLine getc stream =
    scanSeq (
        scanSeqL (Int.scan StringCvt.DEC, scanCharExpect #","),
        scanSeqL (Int.scan StringCvt.DEC, scanCharExpect #"\n")
    ) getc stream
There are much cooler abstractions you can build along these lines, especially when defining your own infix operators. But I'll leave it at that.
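As a rough idea of what the infix-operator style could look like, here is a minimal sketch. The operator names `<&>` and `<&` and the helpers `expect` and `int` are made up for this illustration; they are not part of the Basis Library, and the block redefines everything it needs so that it stands alone.

```sml
(* Hypothetical operators: <&> keeps both results, <& keeps only the left one. *)
infix 3 <&> <&

fun op <&> (scan1, scan2) getc stream =
    case scan1 getc stream
      of NONE => NONE
       | SOME (x1, stream') =>
         (case scan2 getc stream'
            of NONE => NONE
             | SOME (x2, stream'') => SOME ((x1, x2), stream''))

fun op <& (scan1, scan2) getc stream =
    Option.map (fn ((x, _), s) => (x, s)) ((scan1 <&> scan2) getc stream)

(* expect : succeed only if the next character is c *)
fun expect c getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c', stream') => if c' = c then SOME (c', stream') else NONE

(* Eta-expanded so the definition stays polymorphic in the stream type. *)
fun int getc = Int.scan StringCvt.DEC getc

fun scanLine getc stream =
    ((int <& expect #",") <&> (int <& expect #"\n")) getc stream

val parsed = StringCvt.scanString scanLine "4,5\n"
(* parsed = SOME (4, 5) *)
```

The explicit parentheses in scanLine are there for readability; with a single precedence level for both operators it is easy to misread how an unparenthesized chain groups.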
You can also handle whitespace between tokens. The function StringCvt.skipWS is readily available in the library for this; just insert it wherever whitespace should be allowed.
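A small sketch of how that might look, assuming whitespace should be tolerated around the comma. The names `skipBefore`, `expectChar` and `scanPair` are hypothetical helpers for this example. Note that Int.scan already skips leading whitespace on its own, and that StringCvt.skipWS uses Char.isSpace, so it would also consume a newline if you placed it in front of a line terminator.

```sml
(* skipBefore : run a scanner after dropping leading whitespace (hypothetical helper) *)
fun skipBefore scan getc stream = scan getc (StringCvt.skipWS getc stream)

fun expectChar c getc stream =
    case getc stream
      of NONE => NONE
       | SOME (c', stream') => if c' = c then SOME (c', stream') else NONE

(* scanPair : int, optional whitespace, comma, int (no line terminator required) *)
fun scanPair getc stream =
    case Int.scan StringCvt.DEC getc stream
      of NONE => NONE
       | SOME (x1, s1) =>
         (case skipBefore (expectChar #",") getc s1
            of NONE => NONE
             | SOME (_, s2) =>
               (case Int.scan StringCvt.DEC getc s2
                  of NONE => NONE
                   | SOME (x2, s3) => SOME ((x1, x2), s3)))

val spaced = StringCvt.scanString scanPair "  4 ,  5"
(* spaced = SOME (4, 5) *)
```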
Below is a rough example of how this can be done:
(* toPair : string -> int list; despite the name, the result is a list *)
fun toPair s =
    let
        val s' = String.substring (s, 0, size s - 2)
    in
        List.mapPartial Int.fromString (String.tokens (fn c => c = #",") s')
    end
Note, however, that mapPartial discards anything that cannot be converted to an integer (that is, whenever Int.fromString returns NONE), and that the string is assumed to always end in \r\n, since the last two characters are removed using substring.
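If silently dropping bad fields is not acceptable, one possible variant is sketched below. The name `toPairOpt` is hypothetical. It returns an option of an actual tuple and relies on the fact that Int.fromString ignores trailing characters after the number, so the \r\n does not need to be cut off first. The same fact means trailing garbage such as "5abc" is still accepted as 5.

```sml
(* toPairOpt : string -> (int * int) option  (hypothetical variant of toPair) *)
fun toPairOpt s =
    case String.tokens (fn c => c = #",") s
      of [a, b] =>
         (case (Int.fromString a, Int.fromString b)
            of (SOME x, SOME y) => SOME (x, y)
             | _ => NONE)
       | _ => NONE

val ok  = toPairOpt "4,5\r\n"   (* SOME (4, 5) *)
val bad = toPairOpt "4,x\r\n"   (* NONE: second field is not a number *)
```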
Update
Obviously Rossberg's answer is the correct way to do it. However, depending on the task at hand, this can still serve as an example of a quick and silly way to do it.
Here's a simple way to extract all unsigned integers from a string and collect them in a list (the list-to-tuple conversion is left as an exercise for the reader).
fun ints_from_str str =
    List.mapPartial
        Int.fromString
        (String.tokens (not o Char.isDigit) str);

ints_from_str " foo 1, bar:22? and 333___ ";
(* val it = [1,22,333] : int list *)
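For completeness, one way the list-to-tuple step could be done is to pair up consecutive elements. This is only a sketch: `pairUp` is a hypothetical helper, and it quietly drops a trailing element when the list has odd length, which may or may not be what you want.

```sml
fun ints_from_str str =
    List.mapPartial Int.fromString (String.tokens (not o Char.isDigit) str)

(* pairUp : 'a list -> ('a * 'a) list  (hypothetical helper) *)
fun pairUp (x :: y :: rest) = (x, y) :: pairUp rest
  | pairUp _ = []

val pairs = pairUp (ints_from_str "4,5\n2,3\n")
(* pairs = [(4, 5), (2, 3)] *)
```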