This document describes the Standard ML Basis Library. This library provides an extensive initial basis for Standard ML, which complements the language described by the Definition of Standard ML. The goals of the Basis Library are to:
In this chapter, we discuss the principles and conventions used in the design of the Library, and present a high-level view of the library structure.
By design, the Basis Library is meant to provide a fairly rich collection of general-purpose modules that can serve as the basis for applications programming or for more domain-specific libraries. One criterion for inclusion in the Basis Library is that a type or value requires compiler or run-time system support. In addition, the Library defines a standard minimal environment that anyone using SML can expect to find. The Library also attempts to provide similar functions in similar contexts. Thus, the traditional `app` function for lists, which applies a function to each member of a list, has also been provided for arrays and vectors.
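As a minimal sketch of this uniformity, the following fragment applies one side-effecting function through `List.app`, `Array.app` and `Vector.app`. The names `total` and `add` are illustrative and not part of the Library.

```sml
(* Accumulate a sum through a reference cell; only the structure
   name changes between the three container types. *)
val total = ref 0
fun add x = total := !total + x

val () = List.app   add [1, 2, 3]
val () = Array.app  add (Array.fromList [4, 5, 6])
val () = Vector.app add (Vector.fromList [7, 8, 9])

val result = !total   (* 45 *)
```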
An opposing design force has been the desire to keep the Basis Library small. In general, a function has been included only if it has clear or proven utility, with additional emphasis on those that are complicated to implement, require compiler support, or are more concise or efficient than an equivalent combination of other functions. Some exceptions were made for historical reasons.
The Basis Library is contained in a set of structures. Almost every type, exception constructor and value belongs to some structure. Although some identifiers are also bound in the initial top-level environment, we have attempted to keep the number of top-level identifiers small. Infix declarations and overloading are specified for the top-level environment.
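The fragment below is a small sketch of how a top-level identifier relates to its home structure. It assumes that `map` and the arithmetic operators are among the identifiers bound at top level, as they are in common implementations.

```sml
(* A top-level name and its qualified form denote the same function. *)
val a = map (fn x => x + 1) [1, 2, 3]
val b = List.map (fn x => x + 1) [1, 2, 3]

(* At top level, + is infix and overloaded; the qualified forms are
   ordinary prefix functions fixed to one type. *)
val i  = 1 + 2
val r  = 1.5 + 2.5
val i' = Int.+ (1, 2)
val r' = Real.+ (1.5, 2.5)

val same = (a = b) andalso (i = i') andalso Real.== (r, r')
```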
We view the signature and structure names used below as being reserved. For an implementation to be conforming, any module it provides that is named in the SML Basis Library must exactly match the description specified in the library. For example, the `Int` structure provided by an implementation should not match a superset of the `INTEGER` signature. If an implementation provides any types, values or modules not described in the SML Basis Library, they must be encapsulated in additional structures whose names are not used by the SML Basis Library. In particular, an implementation must not introduce any new non-module identifiers into the top-level environment.
We have divided the modules into required and optional categories. Any conforming implementation of the SML Basis Library must provide implementations of all of the required modules.
Many of the structures are variations on some generic module (e.g., single and double-precision floating-point numbers). The following table gives a list of the required generic signatures.
Signature | Description |
---|---|
CHAR | Generic character interface |
INTEGER | Generic integer interface |
MATH | Generic math library interface |
IMPERATIVE_IO | Imperative I/O interface |
MONO_ARRAY | Mutable monomorphic arrays |
MONO_VECTOR | Immutable monomorphic vectors |
PRIM_IO | System-call operations for IO |
REAL | Generic real number interface |
STREAM_IO | Stream I/O interface |
STRING | Generic string interface |
SUBSTRING | Generic substring interface |
TEXT_IO | Text I/O interface |
TEXT_STREAM_IO | Text stream I/O interface |
WORD | Generic word (i.e., unsigned modular integer) interface |
The remaining required signatures are listed in the following table.
Signature | Description |
---|---|
ARRAY | Mutable polymorphic arrays |
BIN_IO | Binary input/output types and operations |
BOOL | Boolean type and values |
BYTE | Conversions between Word8 and Char values |
COMMAND_LINE | Program name and arguments |
DATE | Calendar operations |
GENERAL | General-purpose types, exceptions and values |
IEEE_REAL | Floating-point classes and hardware control |
IO | Basic I/O types and exceptions |
LIST | List type and utility functions |
LIST_PAIR | List of pairs and utility functions |
OPTION | Optional values and partial functions |
OS | Basic operating system services |
OS_FILE_SYS | File status and directory operations |
OS_IO | Support for polling I/O devices |
OS_PATH | Pathname operations |
OS_PROCESS | Simple process operations |
SML90 | Structure for backward compatibility |
STRING_CVT | Support for conversions between strings and values |
TIME | Representation of time values |
TIMER | Timing operations |
VECTOR | Immutable polymorphic vectors |
The required structures, with the signatures they match, are listed in the following table.
Structure | Signature | Description |
---|---|---|
Array | ARRAY | Mutable polymorphic arrays |
BinIO | BIN_IO | Binary input/output types and operations |
BinPrimIO | PRIM_IO | Low-level binary IO |
Bool | BOOL | Boolean type and values |
Byte | BYTE | Conversions between Word8 and Char values |
Char | CHAR | Default characters |
CharArray | MONO_ARRAY | Mutable arrays of characters |
CharVector | MONO_VECTOR | Immutable vectors of characters |
CommandLine | COMMAND_LINE | Program name and arguments |
Date | DATE | Calendar operations |
General | GENERAL | General-purpose types, exceptions and values |
IEEEReal | IEEE_REAL | Floating-point classes and hardware control |
Int | INTEGER | Default integer type |
IO | IO | Basic I/O types and exceptions |
LargeInt | INTEGER | Largest integer representation |
LargeReal | REAL | Largest floating-point representation |
LargeWord | WORD | Largest word representation |
List | LIST | List type and utility functions |
ListPair | LIST_PAIR | List of pairs and utility functions |
Math | MATH | Default math structure |
Option | OPTION | Optional values and partial functions |
OS | OS | Basic operating system services |
OS.FileSys | OS_FILE_SYS | File status and directory operations |
OS.IO | OS_IO | Support for polling I/O devices |
OS.Path | OS_PATH | Pathname operations |
OS.Process | OS_PROCESS | Simple process operations |
Position | INTEGER | File system positions |
Real | REAL | Default floating-point type |
SML90 | SML90 | Structure for backward compatibility |
String | STRING | Default strings |
StringCvt | STRING_CVT | Conversions between strings and various types |
Substring | SUBSTRING | Substrings |
TextIO | TEXT_IO | Text input/output types and operations |
TextPrimIO | PRIM_IO | Low-level text IO |
Time | TIME | Representation of time values |
Timer | TIMER | Timing operations |
Vector | VECTOR | Immutable polymorphic vectors |
Word | WORD | Default word type |
Word8 | WORD | 8-bit words |
Word8Array | MONO_ARRAY | Arrays of 8-bit words |
Word8Vector | MONO_VECTOR | Vectors of 8-bit words |
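Because several required structures match one generic signature, code can be written once against the signature and instantiated for each structure. The functor `SumFn` below is a hypothetical illustration, not part of the Library; it relies only on `+` and `fromInt` from `INTEGER`.

```sml
(* Hypothetical functor: works for any structure matching INTEGER. *)
functor SumFn (I : INTEGER) =
struct
  fun sum (xs : I.int list) : I.int =
    List.foldl (fn (x, acc) => I.+ (x, acc)) (I.fromInt 0) xs
end

(* Instantiate it for two of the required integer structures. *)
structure IntSum   = SumFn (Int)
structure LargeSum = SumFn (LargeInt)

val s1 = IntSum.sum [1, 2, 3]
val s2 = LargeSum.sum (List.map LargeInt.fromInt [10, 20, 30])
val s2AsInt = LargeInt.toInt s2
```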
The library specifies a large collection of signatures and structures that are considered optional in a conforming implementation. They provide features that, although useful, are not considered fundamental to a workable SML implementation. These modules include additional representations of integers, words, characters and reals; more efficient array and vector representations; and a subsystem providing POSIX compatibility.
Although an implementation may or may not provide one of these modules, if it provides one, the module must exactly match the specification given in this document. The names specified here for optional signatures and structures must be used at top-level only to denote implementations of the specified library module. On the other hand, if an implementation offers features related to an optional module, it should also provide the optional module.
The library specifies the following optional signatures.
Signature | Description |
---|---|
ARRAY2 | Mutable polymorphic 2-dimensional arrays |
INT_INF | Arbitrary-precision integers |
LOCALE | Support for locale-dependent applications |
MONO_ARRAY2 | Mutable monomorphic 2-dimensional arrays |
MULTIBYTE | Support for multibyte characters |
PACK_REAL | Support for packing floats into vectors of 8-bit words |
PACK_WORD | Support for packing words into vectors of 8-bit words |
POSIX | Root POSIX structure |
POSIX_ERROR | POSIX error values |
POSIX_FILE_SYS | POSIX file system operations |
POSIX_FLAGS | Support for sets of system flags |
POSIX_IO | POSIX I/O operations |
POSIX_PROC_ENV | POSIX process environment operations |
POSIX_PROCESS | POSIX process operations |
POSIX_SIGNAL | POSIX signal types and values |
POSIX_SYS_DB | POSIX system database types and values |
POSIX_TTY | Control of POSIX TTY drivers |
UNIX | Various Unix specific operations |
The library specifies the following optional structures.
Structure | Signature | Description |
---|---|---|
Array2 | ARRAY2 | Mutable polymorphic 2-dimensional arrays |
BoolArray | MONO_ARRAY | Mutable arrays of booleans |
BoolArray2 | MONO_ARRAY2 | 2-dimensional arrays of booleans |
BoolVector | MONO_VECTOR | Immutable vectors of booleans |
CharArray2 | MONO_ARRAY2 | 2-dimensional arrays of characters |
FixedInt | INTEGER | Largest fixed precision integers |
ImperativeIO | IMPERATIVE_IO | Functor to convert stream I/O into imperative IO |
IntInf | INT_INF | Arbitrary-precision integers |
IntN | INTEGER | N-bit, fixed precision integers |
IntArray | MONO_ARRAY | Mutable arrays of default integers |
IntNArray | MONO_ARRAY | Mutable arrays of N-bit integers |
IntArray2 | MONO_ARRAY2 | 2-dimensional arrays of integers |
IntNArray2 | MONO_ARRAY2 | 2-dimensional arrays of N-bit integers |
IntVector | MONO_VECTOR | Immutable vectors of default integers |
IntNVector | MONO_VECTOR | Immutable vectors of N-bit integers |
Locale | LOCALE | Support for locale-dependent applications |
MultiByte | MULTIBYTE | Support for multibyte characters |
PackRealNBig | PACK_REAL | Big-endian packing for N-bit floats |
PackRealNLittle | PACK_REAL | Little-endian packing for N-bit floats |
PackRealBig | PACK_REAL | Big-endian packing for default floats |
PackRealLittle | PACK_REAL | Little-endian packing for default floats |
PackNBig | PACK_WORD | Big-endian packing for N-byte words |
PackNLittle | PACK_WORD | Little-endian packing for N-byte words |
Posix | POSIX | Root POSIX structure |
Posix.Error | POSIX_ERROR | POSIX error values |
Posix.FileSys | POSIX_FILE_SYS | POSIX file system operations |
Posix.IO | POSIX_IO | POSIX I/O operations |
Posix.ProcEnv | POSIX_PROC_ENV | POSIX process environment operations |
Posix.Process | POSIX_PROCESS | POSIX process operations |
Posix.Signal | POSIX_SIGNAL | POSIX signal types and values |
Posix.SysDB | POSIX_SYS_DB | POSIX system database types and values |
Posix.TTY | POSIX_TTY | Control of POSIX TTY drivers |
PrimIO | PRIM_IO | Functor to build PRIM_IO structure |
RealArray | MONO_ARRAY | Mutable arrays for default floats |
RealVector | MONO_VECTOR | Immutable vectors for default floats |
RealN | REAL | N-bit floating-point numbers |
RealNArray | MONO_ARRAY | Mutable arrays of N-bit floating-point numbers |
RealNVector | MONO_VECTOR | Immutable vectors of N-bit floating-point numbers |
RealArray2 | MONO_ARRAY2 | 2-dimensional arrays of floating-point numbers |
RealNArray2 | MONO_ARRAY2 | 2-dimensional arrays of N-bit floating-point numbers |
StreamIO | STREAM_IO | Functor to convert primitive I/O into stream I/O |
SysWord | WORD | Words sufficient for OS operations |
WideChar | CHAR | Support for wide characters |
WideCharArray | MONO_ARRAY | Mutable arrays of wide characters |
WideCharArray2 | MONO_ARRAY2 | 2-dimensional arrays of wide characters |
WideCharVector | MONO_VECTOR | Immutable vectors of wide characters |
WideString | STRING | Support for wide strings |
WideSubstring | SUBSTRING | Support for wide substrings |
WideTextPrimIO | PRIM_IO | Low-level wide char IO |
WideTextIO | TEXT_IO | Text I/O on wide characters |
WordN | WORD | N-bit words |
Word8Array2 | MONO_ARRAY2 | 2-dimensional arrays of 8-bit words |
Unix | UNIX | Unix-like process invocation |
We specify certain relationships among the modules.
To permit users to compile programs written under the old basis, we require that each implementation provide the structure `SML90`. This structure contains the top-level bindings specified in the 1990 version of the Definition, along with one or more substructures that define the top-level bindings of various implementations. For example, a user might write:
    local open SML90 SML90.SMLNJ in (* user's program *) end
to compile a user's program under the old SML/NJ basis.
We expect that at some future point, the `SML90` module will be deemed obsolete, and will be dropped from the standard basis.
In designing the library, we have tried to follow a set of stylistic rules to make library usage consistent and predictable, and to preclude certain errors. These rules are not meant to be prescriptive for the programmer using or extending the library. On the other hand, although the library itself departs from the conventions on occasion, we feel the rules are reasonable and helpful, and would encourage their use.
We use a new set of spelling and capitalization conventions. Some of these conventions, e.g., the capitalization of value constructors, seem to be widely accepted in the user community. Other decisions were based less on dominant style or compelling reason than on compromise and the need for consistency and some sense of good taste.
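The conventions we use are:
- Alphanumeric value identifiers are in mixed case, with a leading lowercase letter; e.g., `map`, `openIn`.
- Type identifiers are all lowercase, with words separated by underscores; e.g., `word`, `file_desc`.
- Signature identifiers are in all capitals, with words separated by underscores; e.g., `PACK_WORD`, `OS_PATH`. We refer to this as the signature convention.
- Structure and functor identifiers are in mixed case, with the initial letter of each word capitalized; e.g., `General`, `WideChar`. We refer to this as the structure convention.
- Alphanumeric datatype constructors follow the signature convention; e.g., `SOME`, `A_READ`, `FOLLOW_ALL`. In certain cases, where external usage or aesthetics dictates otherwise, the structure convention is followed; e.g., `Jan`, `Mon`. Within the Basis Library, the only use of the latter convention occurs with the months and weekdays in `Date`. The only exceptions to these rules are the identifiers `nil`, `true` and `false`, where we bow to tradition.
- Exception identifiers follow the structure convention; e.g., `Domain`, `TerminatedStream`.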
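The conventions can be seen together in a small user-level module. Every name in this sketch (`INT_STACK`, `IntStack`, `int_stack`, `TOP_ELEM`, `EmptyStack`, `dropTop`, and so on) is hypothetical and chosen only to show the spelling styles.

```sml
signature INT_STACK =                  (* signature convention *)
sig
  type int_stack                       (* type: lowercase with underscores *)
  datatype peek_result = EMPTY | TOP_ELEM of int   (* constructors: capitals *)
  exception EmptyStack                 (* exception: structure convention *)
  val empty   : int_stack              (* values: leading lowercase letter *)
  val push    : int * int_stack -> int_stack
  val peek    : int_stack -> peek_result
  val dropTop : int_stack -> int_stack
end

structure IntStack :> INT_STACK =      (* structure convention *)
struct
  type int_stack = int list
  datatype peek_result = EMPTY | TOP_ELEM of int
  exception EmptyStack
  val empty = []
  fun push (x, s) = x :: s
  fun peek [] = EMPTY
    | peek (x :: _) = TOP_ELEM x
  fun dropTop [] = raise EmptyStack
    | dropTop (_ :: s) = s
end

val top = IntStack.peek (IntStack.push (1, IntStack.empty))
val failed = (IntStack.dropTop IntStack.empty; false)
             handle IntStack.EmptyStack => true
```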
Similar values should have similar names, with similar type shapes, following the conventions outlined above. For example, the function `Array.app` has the type:
    val app : ('a -> unit) -> 'a array -> unit
which has the same shape as `List.app`. Names should be meaningful, but concise. We have broken this rule, however, in certain instances where previous usage seemed compelling. For example, we have kept the name `app` rather than adopt `apply`. More dramatically, we have purposely kept most of the traditional Unix names in the optional Posix modules, to capitalize on the familiarity of these names and the available documentation.
Many structures define a type `ty` along with a comparison function
    val compare : ty * ty -> order
plus the expected relational operators `>`, `>=`, `<` and `<=`. In all cases, the standard relationships hold between these functions. For example, we have `x > y = true` if and only if `compare(x, y) = GREATER`. If, in addition, `ty` is an equality type, we assume that the operators `=` and `<>` satisfy the usual relationships with `compare` and the relational operators. For example, if `x = y`, then `compare(x,y) = EQUAL`. Note that these assumptions are not quite true for real values; see the REAL signature for more details.
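The relationships can be checked directly for an ordered type such as `int`. The sketch below also illustrates the caveat for reals, assuming IEEE semantics in which `0.0 / 0.0` yields a NaN and `Real.compare` signals unordered arguments with `IEEEReal.Unordered`; the helper `agrees` is illustrative only.

```sml
val c1 = Int.compare (3, 5)           (* LESS *)
val c2 = String.compare ("b", "a")    (* GREATER *)

(* The relational operators and equality agree with compare. *)
fun agrees (x, y) =
  ((x > y) = (Int.compare (x, y) = GREATER)) andalso
  ((x < y) = (Int.compare (x, y) = LESS)) andalso
  ((x = y) = (Int.compare (x, y) = EQUAL))

val ok = List.all agrees [(1, 2), (2, 2), (3, 2)]

(* Reals: a NaN is unordered with respect to every value, so the
   relational operators return false and compare cannot answer. *)
val nan = 0.0 / 0.0
val nanLess = nan < 1.0
val nanCmp =
  (Real.compare (nan, 1.0); "ordered")
  handle IEEEReal.Unordered => "unordered"
```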
Types that have a standard or obvious linear order come with the full set of relational operators plus a `compare` function. Certain abstract types, e.g., `OS.FileSys.file_id`, provide a `compare` function for use with, for example, ordered binary trees.
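A `compare` function is all an ordered container needs to know about its element type. As a minimal sketch, the hypothetical helpers `insert` and `sortUnique` below keep a sorted, duplicate-free list using only the supplied comparison; a `compare` on an abstract type such as `OS.FileSys.file_id` could be passed in the same way.

```sml
(* Insert x into a sorted list, dropping it if an equal element exists. *)
fun insert cmp (x, []) = [x]
  | insert cmp (x, y :: ys) =
      (case cmp (x, y) of
         LESS    => x :: y :: ys
       | EQUAL   => y :: ys
       | GREATER => y :: insert cmp (x, ys))

(* Build a sorted list of distinct elements by repeated insertion. *)
fun sortUnique cmp xs = List.foldl (insert cmp) [] xs

val ints    = sortUnique Int.compare [3, 1, 2, 3, 1]
val strings = sortUnique String.compare ["pear", "apple", "pear"]
```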
Most structures defining a type provide conversion functions to and from other types. When unambiguous, we use the naming convention toT and fromT, where T is some version of the name of the other type. For example, in WORD, we have
    val fromInt : Int.int -> word
    val toInt : word -> Int.int
If this naming is ambiguous (e.g., a structure defines multiple types that have conversions from integers), we use the convention TFromTT and TToTT. For example, in POSIX_PROC_ENV, we have
    val uidToWord : uid -> SysWord.word
    val gidToWord : gid -> SysWord.word
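A short sketch of the toT/fromT convention in use, restricted to required structures:

```sml
val w    = Word.fromInt 42     (* Int.int -> word *)
val n    = Word.toInt w        (* word -> Int.int *)
val b    = Word8.fromInt 255
val m    = Word8.toInt b
val big  = Int.toLarge 7       (* Int.int -> LargeInt.int *)
val back = Int.fromLarge big   (* LargeInt.int -> Int.int *)
val r    = Real.fromInt 3      (* Int.int -> real *)
```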
There should be conversions to and from strings for most types. Following the convention above, these functions are typically called `toString` and `fromString`. Usually, modules provide additional string conversion functions that allow more control over format and operate on an abstract character stream. These functions are called `fmt` and `scan`. The input accepted by `fromString` and `scan` consists of printable ASCII characters. The output generated by `toString` and `fmt` consists of printable ASCII characters.
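The following sketch shows `fmt` and `scan` for integers and reals. It assumes the `StringCvt` radix and real-format constructors as found in current implementations; `Substring.full` is the name used there for building a substring from a whole string.

```sml
(* fmt: choose the output format explicitly. *)
val hex    = Int.fmt StringCvt.HEX 255                    (* "FF" *)
val bin    = Int.fmt StringCvt.BIN 5                      (* "101" *)
val padded = StringCvt.padLeft #"0" 4 hex                 (* "00FF" *)
val fixed  = Real.fmt (StringCvt.FIX (SOME 2)) 3.14159    (* "3.14" *)

(* scan: works over any character reader; scanString adapts it to strings. *)
val parsed = StringCvt.scanString (Int.scan StringCvt.HEX) "1f"

(* The same scanner over a substring reader, which also returns the
   unconsumed remainder of the input. *)
val viaReader =
  case Int.scan StringCvt.DEC Substring.getc (Substring.full "12 apples") of
    SOME (n, rest) => SOME (n, Substring.string rest)
  | NONE => NONE
```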
We adopt the convention that conversions from strings should be forgiving, allowing initial white space and multiple formats, and ignoring additional terminating characters. On the other hand, we have tried to specify conversions to strings precisely. In addition, for basic types, scanning functions should accept legal SML literals, and formatting functions should, whenever possible, produce the value part of a valid SML literal but, for flexibility, may omit certain annotations. For example, `String.toString` produces a valid SML string constant, but without the enclosing quotes, and `Word.toString` produces a word constant without the `"0wx"` prefix.
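A few concrete cases of the forgiving-input, precise-output convention, as a sketch using only required structures:

```sml
(* Forgiving input: leading white space and trailing junk are tolerated,
   and more than one sign character is accepted. *)
val a = Int.fromString "  42abc"     (* SOME 42 *)
val b = Int.fromString "-7"          (* SOME ~7 *)
val c = Int.fromString "abc"         (* NONE *)

(* Precise output: the value part of an SML literal. *)
val d = Int.toString ~5              (* "~5" *)
val e = Word.toString 0wxFF          (* "FF", without the 0wx prefix *)
val f = String.toString "a\tb"       (* the tab is escaped; no enclosing quotes *)
```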
The old basis did not provide a character type, only a string type. To manipulate characters, programmers used integers corresponding to the character's code. This was unsatisfactory for several reasons:
The revised SML Definition introduces a new `char` type and literal syntax along with the old `string` type. The SML Standard Basis provides support for both `string` and `char` types, where the `string` type is a vector of characters. In addition, we define the optional types `WideString.string` and `WideChar.char`, in which the former is again a vector of the latter, for handling character sets more extensive than Latin-1.
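The relationship between the two types can be sketched as follows. The last binding assumes, as in common implementations, that `CharVector.vector` is the same type as `string`.

```sml
val s = "basis"

val c     = String.sub (s, 0)              (* #"b" *)
val code  = Char.ord c                     (* 98 *)
val upper = String.map Char.toUpper s      (* "BASIS" *)
val chars = String.explode s               (* list of the five characters *)
val again = String.implode chars

(* Vector operations apply to strings directly. *)
val c' = CharVector.sub (s, 0)
```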
Functional arguments that are evaluated solely for their side-effects should have a return type of `unit`. For example, the list application function should have the type:
    val app : ('a -> unit) -> 'a list -> unit
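One consequence is that a function returning a useful value must have that value discarded explicitly before it can be passed to `app`. In this sketch, `log` and `record` are hypothetical; `ignore` is the top-level function of type `'a -> unit`.

```sml
(* Hypothetical logger that returns the number of characters recorded. *)
val log : string list ref = ref []
fun record msg = (log := msg :: !log; String.size msg)

(* List.app record [...] would be rejected by the type checker,
   because record returns int rather than unit. *)
val () = List.app (ignore o record) ["a", "bc"]
val () = List.app (fn msg => ignore (record msg)) ["def"]

val recorded = List.rev (!log)   (* ["a", "bc", "def"] *)
```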
Last Modified August 5, 1997
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies