The StringCvt structure
The StringCvt structure provides types and functions for handling the conversion between strings and values of various basic types.
signature STRING_CVT
structure StringCvt : STRING_CVT
datatype radix = BIN | OCT | DEC | HEX
datatype realfmt
= SCI of int option
| FIX of int option
| GEN of int option
| EXACT
type ('a, 'b) reader = 'b -> ('a * 'b) option
val padLeft : char -> int -> string -> string
val padRight : char -> int -> string -> string
val splitl : (char -> bool) -> (char, 'a) reader -> 'a -> (string * 'a)
val takel : (char -> bool) -> (char, 'a) reader -> 'a -> string
val dropl : (char -> bool) -> (char, 'a) reader -> 'a -> 'a
val skipWS : (char, 'a) reader -> 'a -> 'a
type cs
val scanString : ((char, cs) reader -> ('a, cs) reader) -> string -> 'a option
datatype radix
datatype realfmt
The third constructor GEN allows a formatting function to use either the scientific or fixed-point notation, typically guided by the magnitude of the number. The optional integer value specifies the maximum number of significant digits, with 12 being the default.
The fourth constructor EXACT specifies that the string should represent the real using an exact decimal representation. The string contains enough information to reconstruct a semantically equivalent real value using REAL.fromDecimal o valOf o IEEEReal.fromString. Refer to the description of IEEEReal.toString for more precise information concerning this format.
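As a brief illustration of how radix and realfmt values are consumed, the sketch below passes them to Int.fmt and Real.fmt. Those two functions belong to the Int and Real structures of the Basis Library rather than to StringCvt, and the sample numbers are arbitrary. The reading of the SCI and FIX argument as a count of digits after the decimal point is stated here as an assumption, since this page only spells out the GEN and EXACT cases.

```sml
(* Integer formatting driven by a radix value. *)
val hex = Int.fmt StringCvt.HEX 255                  (* "FF" *)
val bin = Int.fmt StringCvt.BIN 5                    (* "101" *)

(* Real formatting driven by a realfmt value. *)
val fix = Real.fmt (StringCvt.FIX (SOME 2)) 3.14159  (* "3.14" *)
val sci = Real.fmt (StringCvt.SCI (SOME 3)) 1234.5   (* scientific notation *)
val gen = Real.fmt (StringCvt.GEN NONE) 0.25         (* notation chosen by magnitude *)
```

The exact exponent layout produced for SCI is left uncommented on purpose; check it against the implementation in use.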
type ('a, 'b) reader
The type of a reader producing values of type 'a from a stream of type 'b. A return value of SOME(a,b) corresponds to a value a scanned from the stream, plus the remainder b of the stream. A return value of NONE indicates that no value of the correct type could be scanned from the stream.
The reader type is designed for use with a stream or functional view of I/O. Scanning functions using the reader type, such as skipWS, splitl and Int.scan, will often use lookahead characters to determine when to stop scanning. If the character source ('b in an ('a,'b) reader) is imperative, the lookahead characters will be lost to any subsequent scanning of the source. One mechanism for combining imperative I/O with the standard scanning functions is provided by the TextIO.scanStream function.
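To make the functional view concrete, here is a minimal sketch of a hand-written character reader whose stream state is an integer index into a string. The helper name getcAt is hypothetical and not part of the library; Int.scan is the standard scanning function from the Int structure.

```sml
(* Hypothetical helper: a character reader over string s, where the
   stream state is the index of the next unread character. *)
fun getcAt (s : string) (i : int) : (char * int) option =
  if i < String.size s then SOME (String.sub (s, i), i + 1) else NONE

val rdr : (char, int) StringCvt.reader = getcAt "12 apples"

(* Int.scan turns the character reader into an integer reader.
   The result carries the value and the state where scanning stopped. *)
val r = Int.scan StringCvt.DEC rdr 0   (* SOME (12, 2) *)
```

Because the state is an ordinary value, the lookahead performed by Int.scan is not lost: scanning can resume from index 2.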
padLeft c i s
padRight c i s
These functions return s padded, on the left or right respectively, with i - size s copies of the character c. If size s >= i, they just return the string s. In other words, these functions right- and left-justify s in a field i characters wide, never trimming off any part of s. Note that if i <= 0, s is returned. These functions raise Size if the size of the resulting string would be greater than String.maxSize.
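A short sketch of the padding behavior described above, with arbitrary sample strings:

```sml
val a = StringCvt.padLeft  #"0" 5 "42"      (* "00042": right-justified *)
val b = StringCvt.padRight #"." 5 "ab"      (* "ab...": left-justified *)
val c = StringCvt.padLeft  #" " 3 "hello"   (* "hello": never trimmed *)
val d = StringCvt.padRight #" " ~1 "x"      (* "x": i <= 0 returns s *)
```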
splitl p f src
returns (pref, src'), where pref is the longest prefix (left substring) of src, as produced from src by the character reader f, all of whose characters satisfy p, and src' is the remainder of src. Thus, the first character retrievable from src' is the leftmost character not satisfying p.
splitl can be used with scanning functions such as scanString by composing it with SOME; e.g., scanString (fn rdr => SOME o ((splitl p) rdr)).
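The following sketch applies splitl to a substring source and then repeats the same split through scanString. Substring.full and Substring.getc are assumed to be available from the Substring structure of the current Basis Library; the input string is arbitrary.

```sml
val src = Substring.full "123abc"

(* Split off the longest prefix of digits. *)
val (digits, rest) = StringCvt.splitl Char.isDigit Substring.getc src
val restStr = Substring.string rest        (* "abc" *)

(* The same split, composed with SOME so it can be used as a scanner. *)
val viaScan =
  StringCvt.scanString
    (fn rdr => SOME o (StringCvt.splitl Char.isDigit rdr))
    "123abc"                               (* SOME "123" *)
```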
takel p f src
dropl p f src
These functions can be defined in terms of splitl:
    takel p f s = #1(splitl p f s)
    dropl p f s = #2(splitl p f s)
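A small sketch of takel and dropl on the same source, assuming Substring.full and Substring.getc as the character source:

```sml
val src = Substring.full "foo bar"

(* takel keeps the prefix satisfying the predicate. *)
val word = StringCvt.takel Char.isAlpha Substring.getc src    (* "foo" *)

(* dropl discards that prefix and returns the remaining source. *)
val tail =
  Substring.string (StringCvt.dropl Char.isAlpha Substring.getc src)  (* " bar" *)
```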
skipWS f s
strips leading whitespace characters from the source s. It is equivalent to dropl Char.isSpace f s.
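For instance, with a substring source (Substring.full and Substring.getc are assumed here), leading blanks and tabs are removed while interior whitespace is kept:

```sml
val trimmed =
  Substring.string
    (StringCvt.skipWS Substring.getc (Substring.full "  \t x y"))   (* "x y" *)
```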
type cs
The abstract type of the character stream used by scanString. Typically, cs will be an integer index into a string.
scanString f s
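As a rough illustration, scanString f s applies the scanning function f to the characters of s and returns only the scanned value. The sketch below uses Int.scan and Bool.scan from the Basis Library with arbitrary inputs.

```sml
(* Leading whitespace is skipped by Int.scan; trailing text is ignored. *)
val n = StringCvt.scanString (Int.scan StringCvt.DEC) "  42abc"   (* SOME 42 *)

(* A radix other than DEC changes how the digits are interpreted. *)
val h = StringCvt.scanString (Int.scan StringCvt.HEX) "ff"        (* SOME 255 *)

val b = StringCvt.scanString Bool.scan "true"                     (* SOME true *)

(* NONE when no value of the requested type can be scanned. *)
val none = StringCvt.scanString (Int.scan StringCvt.DEC) "abc"    (* NONE *)
```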
The basis library emphasizes a functional view for scanning values from text. This provides a natural and elegant way to write simple scanners and parsers, especially as these typically involve some form of reading ahead and backtracking. The model involves two types of components: ways to produce character readers and functions to convert character readers into value readers. For the latter, most types T have a corresponding scanning function of type (char, 'a) reader -> (T, 'a) reader.
Character readers are provided for the common sources of characters, either explicitly, such as the SUBSTRING.getc and STREAM_IO.input1 functions, or implicitly, such as the TEXT_IO.scanStream function. As an example, suppose we expect to read a decimal integer followed by a date from TextIO.stdIn. This could be handled by the following code:
    let
      val scanInt = Int.scan StringCvt.DEC TextIO.StreamIO.input1
      val scanDate = Date.scan TextIO.StreamIO.input1
    in
      case scanInt (TextIO.getInstream TextIO.stdIn) of
        NONE => (* error *)
      | SOME (intVal, ins') =>
          case scanDate ins' of
            NONE => (* error *)
          | SOME (dateVal, ins'') => (* ... *)
    end
In this example, we used the underlying stream I/O component of TextIO.stdIn, which is cleaner and more efficient. If, at some later point, we wish to return to the imperative model and do input directly using TextIO.stdIn, we need to reset it with the current stream I/O value using TextIO.setInstream. Alternatively, we could rewrite the code using imperative I/O:
    case TextIO.scanStream (Int.scan StringCvt.DEC) TextIO.stdIn of
      NONE => (* error *)
    | SOME intVal =>
        case TextIO.scanStream Date.scan TextIO.stdIn of
          NONE => (* error *)
        | SOME dateVal => (* ... *)
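The imperative variant can be tried without reading from standard input. The sketch below assumes TextIO.openString, which builds an imperative input stream from a string; it is not mentioned on this page and is used here only to obtain a self-contained source.

```sml
(* An imperative stream over a fixed string, standing in for TextIO.stdIn. *)
val ins = TextIO.openString "42 rest"

(* scanStream runs the functional scanner and then advances the
   imperative stream to the point where scanning stopped. *)
val n = TextIO.scanStream (Int.scan StringCvt.DEC) ins   (* SOME 42 *)

(* The character that ended the scan is still available. *)
val remaining = TextIO.inputAll ins                      (* " rest" *)
```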
The scanString function was designed specifically to be combined with the scan function of some type T, producing a function val fromString : string -> T option for the type. For this reason, scanString only returns a scanned value, and not some indication of where scanning stopped in the string. For the user who wants to receive a scanned value and the unscanned portion of a string, the recommended technique is to convert the string into a substring and combine scanning functions with Substring.getc, e.g., Bool.scan Substring.getc.
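A minimal sketch of both techniques follows. The name boolFromString is hypothetical, chosen to avoid shadowing the library's own Bool.fromString, and Substring.full is assumed as the way to convert a string into a substring.

```sml
(* Hypothetical fromString-style function built from scanString. *)
val boolFromString : string -> bool option =
  StringCvt.scanString Bool.scan

val v = boolFromString "true rest"          (* SOME true; remainder discarded *)

(* Scanning a substring instead keeps the unscanned portion. *)
val withRest =
  case Bool.scan Substring.getc (Substring.full "true rest") of
    SOME (b, rest) => SOME (b, Substring.string rest)
  | NONE => NONE                            (* SOME (true, " rest") *)
```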
When the input source is a list of characters, scanning values can be accomplished by applying the appropriate scan function to the function List.getItem. Thus, Bool.scan List.getItem has the type (bool, char list) reader, which will scan a boolean value and return that value and the remainder of the list.
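For example, using String.explode to obtain a character list from an arbitrary sample string:

```sml
val scanBool : (bool, char list) StringCvt.reader = Bool.scan List.getItem

val r = scanBool (String.explode "false!")   (* SOME (false, [#"!"]) *)
val bad = scanBool (String.explode "maybe")  (* NONE *)
```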
See also: String, Char
Last Modified October 4, 1997
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies