The Standard ML Basis Library

The `STREAM_IO` signature

The STREAM_IO signature defines the interface of the Stream IO layer in the I/O stack. This layer provides buffering over the primitive readers and writers of the primitive IO layer.

Input streams are treated in the lazy functional style: that is, input from a stream f yields a finite vector of elements, plus a new stream f'. Input from f again will yield the same elements; to advance within the stream in the usual way it is necessary to do further input from f'. This interface allows arbitrary lookahead to be done very cleanly, which should be useful both for ad hoc lexical analysis and for table-driven, regular-expression-based lexing.
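The lazy style can be illustrated with a short sketch. It assumes an implementation that exposes this signature as TextIO.StreamIO, and it uses TextIO.openString and TextIO.getInstream (names from the TextIO structure, not from this page) to obtain an instream over a string.

```sml
structure S = TextIO.StreamIO

(* Assumption: a functional instream obtained from a string via TextIO. *)
val f : S.instream = TextIO.getInstream (TextIO.openString "abc")

(* Reading from f twice yields the same element both times. *)
val first = Option.map #1 (S.input1 f)
val again = Option.map #1 (S.input1 f)

(* To advance, read from the stream returned by the previous call. *)
val second =
  case S.input1 f of
      SOME (_, f') => Option.map #1 (S.input1 f')
    | NONE => NONE
```

Here first and again both hold the first character, while second holds the character after it; f itself is never changed by these reads.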

Output streams are handled more conventionally, since the lazy functional style doesn't seem to make sense for output.

Stream I/O functions may raise the Size exception if a resulting vector of elements would exceed the maximum vector size, or the IO.Io exception. In general, when IO.Io is raised as a result of a failure in a lower-level module, the underlying exception is propagated up as the cause component of the IO.Io exception value. This will usually be a Subscript, OS.SysErr or Fail exception, but the stream I/O module will rarely (perhaps never) need to inspect it.

Synopsis

signature STREAM_IO

Interface

type elem
type vector
type reader
type writer
type instream
type outstream
type in_pos
type out_pos
type pos

val input : instream -> (vector * instream)
val input1 : instream -> (elem * instream) option
val inputN : (instream * int) -> (vector * instream)
val inputAll : instream -> vector
val canInput : (instream * int) -> int option
val closeIn : instream -> unit
val endOfStream : instream -> bool
val mkInstream : (reader * vector) -> instream
val getReader : instream -> (reader * vector)
val getPosIn : instream -> in_pos
val setPosIn : in_pos -> instream
val filePosIn : in_pos -> pos

val output : (outstream * vector) -> unit
val output1 : (outstream * elem) -> unit
val flushOut : outstream -> unit
val closeOut : outstream -> unit
val setBufferMode : (outstream * IO.buffer_mode) -> unit
val getBufferMode : outstream -> IO.buffer_mode
val mkOutstream : (writer * IO.buffer_mode) -> outstream
val getWriter : outstream -> (writer * IO.buffer_mode)
val getPosOut : outstream -> out_pos
val setPosOut : out_pos -> outstream
val filePosOut : out_pos -> pos

Description

type elem

type vector

These are the abstract types of stream elements and vectors of elements. For text streams, these are Char.char and String.string, while for binary streams, these are Word8.word and Word8Vector.vector.

type reader

type writer

These are the types of the readers and writers that underlie the input and output streams.

type instream

These are buffered functional input streams.

type outstream

These are buffered output streams. Unlike input streams, these are imperative objects.

type in_pos

type out_pos

These are the abstract types of positions in input and output streams.

type pos

This is the type of positions in the underlying readers and writers.

input f

if elements are available, returns a vector of one or more elements from the stream and the remainder of the stream. If the end-of-stream has been reached, then the empty vector is returned. May block until one of these conditions is satisfied. This function raises the Io exception if there is an error in the underlying system calls.
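As a sketch of how input is usually threaded through a loop, the hypothetical function drain below collects chunks until input returns the empty vector. It assumes a text stream (so vectors are strings) and an implementation that provides the signature as TextIO.StreamIO.

```sml
structure S = TextIO.StreamIO

(* drain is a hypothetical helper, not part of STREAM_IO. *)
fun drain (f : S.instream) : string =
  let
    fun loop (f, acc) =
      let
        val (v, f') = S.input f
      in
        if String.size v = 0
          then String.concat (rev acc)   (* empty vector: end-of-stream *)
          else loop (f', v :: acc)       (* continue from the new stream *)
      end
  in
    loop (f, [])
  end
```

Each call may return a chunk of any positive size, so the loop makes no assumption about how the data is split.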

input1 f

returns the next element in the stream f and the remainder of the stream. If the stream is at the end, then NONE is returned. May block until one of these conditions is satisfied. This function raises the Io exception if there is an error in the underlying system calls.
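A minimal sketch of element-at-a-time processing with input1 follows. The function countElems is a hypothetical name, and the code assumes the signature is available as TextIO.StreamIO.

```sml
structure S = TextIO.StreamIO

(* countElems is a hypothetical helper: it counts the elements that
   remain in f by following the stream returned by each input1 call. *)
fun countElems (f : S.instream) : int =
  let
    fun loop (f, n) =
      case S.input1 f of
          SOME (_, f') => loop (f', n + 1)
        | NONE => n
  in
    loop (f, 0)
  end
```

Because f is not modified, the caller can still read the same elements from f after countElems returns.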

inputN (f, n)

returns a vector of the next n elements from f and the rest of the stream. If fewer than n elements are available, then it returns all of the elements up to the end-of-stream (the empty vector means that there is no more input). May block until it can determine if additional characters are available or the end-of-stream condition holds. This function raises the Io exception if there is an error in the underlying system calls. Raises Size if n < 0 or the number of elements to be returned is greater than maxLen.

Using instreams, one can synthesize a non-blocking version of inputN from inputN and canInput, as inputN is guaranteed not to block if a previous call to canInput returned SOME _.
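One possible shape for such a non-blocking variant is sketched below. The name inputNNB is hypothetical, the code assumes the signature is available as TextIO.StreamIO, and any Io exception from a stream that does not support canInput is simply propagated.

```sml
structure S = TextIO.StreamIO

(* inputNNB is a hypothetical helper: it returns NONE where inputN
   would have to block, and otherwise reads what canInput reported. *)
fun inputNNB (f : S.instream, n : int) : (S.vector * S.instream) option =
  case S.canInput (f, n) of
      NONE => NONE                        (* input would block *)
    | SOME k => SOME (S.inputN (f, k))    (* k elements are ready *)
```

When canInput reports SOME 0, the result is the empty vector paired with the original stream, which the caller can treat as end-of-stream.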

inputAll f

returns the vector of the rest of the elements in the stream f (i.e., up to end-of-stream). Care should be taken when using this function, since it can block indefinitely on interactive streams. This function raises the Io exception if there is an error in the underlying system calls. Raises Size if the number of elements to be returned is greater than maxLen.

canInput (f, n)

returns NONE if any attempt at input would block. Returns SOME k, where 0 <= k <= n, if a call to input would return immediately with k characters. Note that k = 0 corresponds to the stream being at end-of-stream.

Some streams may not support this operation, in which case the Io exception will be raised. This function also raises the Io exception if there is an error in the underlying system calls. It raises the Size exception if n < 0.

Implementation note:

Implementations of canInput should attempt to return as large a k as possible. For example, if the buffer contains 10 characters and the user calls canInput (f, 15), canInput should call readVecNB 5 to see if an additional 5 characters are available.

closeIn f

truncates the instream f, and releases any associated system resources. Applying closeIn on a closed stream has no effect.

endOfStream f

tests if f satisfies the end-of-stream condition. If there is no further input in the stream, then this returns true; otherwise it returns false. This function raises the Io exception if there is an error in the underlying system calls.

This function may block when checking for more input. It is equivalent to

            (length(#1(input f)) = 0)

where length is the vector length operation.

Note that even if this returns true, subsequent input operations may succeed if more data becomes available. We always have

            endOfStream f = endOfStream f

In addition, if endOfStream f returns true, then input f returns ("",f') and endOfStream f' may or may not be true.

mkInstream (rd, v)

returns a new instream built on top of the reader rd with initial buffer contents v.

Question:

We should explain the mapping between optional fields of the reader and supported operations (as a table?).

Note that building more than one instream on top of a single reader has unpredictable effects, since readers are imperative objects.

getReader f

truncates the instream f and returns the underlying reader along with any unconsumed data from its buffer. This raises the exception Io if f is closed or truncated.
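The following sketch shows getReader and mkInstream used together to rebuild a stream from a reader and its unconsumed buffer. It assumes the signature is available as TextIO.StreamIO and uses TextIO.openString and TextIO.getInstream to obtain the initial stream.

```sml
structure S = TextIO.StreamIO

val f = TextIO.getInstream (TextIO.openString "hello world")

(* Consume "hello " and keep the remainder as f1. *)
val (_, f1) = S.inputN (f, 6)

(* Truncate f1, recovering the reader and any buffered, unread data. *)
val (rd, buf) = S.getReader f1

(* Rebuild a stream that starts with the recovered buffer contents. *)
val g = S.mkInstream (rd, buf)
val (rest, _) = S.inputN (g, 5)
```

Passing the recovered buffer back to mkInstream is what keeps the already-buffered elements from being lost.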

getPosIn strm

returns the current position in the stream strm.

setPosIn pos

returns a stream based on the position and stream recorded in pos.

filePosIn pos

returns the primitive-level reader position that corresponds to the abstract input stream position pos.

output (f, vec)

writes the vector of elements vec to the stream f. This raises the exception Io if f is terminated. This function also raises the Io exception if there is an error in the underlying system calls.

output1 (f, elem)

writes the element elem to the stream f. This raises the exception Io if f is terminated. This function also raises the Io exception if there is an error in the underlying system calls.

flushOut f

flushes any output in the outstream's buffer to the underlying writer; it is a no-op on terminated streams. This function raises the Io exception if there is an error in the underlying system calls.

closeOut f

flushes f's buffers, marks the stream closed, and closes the underlying writer. This operation has no effect if f is already closed. If f is terminated, it should close the underlying writer. This function raises the Io exception if there is an error in the underlying system calls.

setBufferMode (ostr, mode)

getBufferMode ostr

set and get the buffering mode of the output stream ostr. Setting the buffer mode to IO.NO_BUF causes any buffered output to be flushed.
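A short sketch of saving, changing, and restoring the buffering mode follows. It assumes the signature is available as TextIO.StreamIO and uses TextIO.getOutstream on TextIO.stdOut purely as a convenient source of an outstream.

```sml
structure S = TextIO.StreamIO

val out = TextIO.getOutstream TextIO.stdOut

val saved = S.getBufferMode out            (* remember the current mode *)
val () = S.setBufferMode (out, IO.NO_BUF)  (* also flushes buffered output *)
val now = S.getBufferMode out
val () = S.setBufferMode (out, saved)      (* restore the original mode *)
```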

mkOutstream (wr, mode)

returns a new outstream built on top of the writer wr, with initial buffering mode mode.

Question:

We should explain the mapping between optional fields of the writer and supported operations (as a table?).

Note that building more than one outstream on top of a single writer has unpredictable effects, since buffering may change the order of output.
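As a sketch of building and using an outstream, the code below wraps a hypothetical in-memory writer. It assumes the signature is available as TextIO.StreamIO and that the writer type is TextPrimIO.WR with the record fields shown; the sink reference and the contents function are illustrative names only.

```sml
structure S = TextIO.StreamIO

(* Hypothetical in-memory sink: every chunk written is pushed here. *)
val sink : string list ref = ref []
fun contents () = String.concat (rev (!sink))

(* Assumed writer record layout; only blocking writes are provided. *)
val wr = TextPrimIO.WR {
      name = "<sink>",
      chunkSize = 1024,
      writeVec = SOME (fn sl => (sink := CharVectorSlice.vector sl :: !sink;
                                 CharVectorSlice.length sl)),
      writeArr = SOME (fn sl => (sink := CharArraySlice.vector sl :: !sink;
                                 CharArraySlice.length sl)),
      writeVecNB = NONE, writeArrNB = NONE,
      block = NONE, canOutput = NONE,
      getPos = NONE, setPos = NONE, endPos = NONE, verifyPos = NONE,
      close = fn () => (), ioDesc = NONE
    }

val out = S.mkOutstream (wr, IO.BLOCK_BUF)
val () = S.output (out, "hello, ")
val () = S.output1 (out, #"w")
val () = S.flushOut out          (* push buffered output to the writer *)
val afterFlush = contents ()
```

With block buffering, the data may not reach the writer until flushOut is called, which is why the sink is inspected only after the flush.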

getWriter f

flushes and terminates the outstream f, and returns the underlying writer along with the stream's buffering mode. This raises the exception Io if f is closed.

getPosOut strm

returns the current position in the stream strm.

setPosOut pos

sets the current position of the stream underlying pos to the position recorded in pos, and returns the stream.

filePosOut pos

returns the primitive-level writer position that corresponds to the abstract output stream position pos.

Discussion

The following expressions are all guaranteed true, if they complete without exception.

Input is semi-deterministic: input may read any number of elements from f the "first" time, but then it is committed to its choice, and must return the same number of elements on subsequent reads from the same point.

let val (a,_) = input f
    val (b,_) = input f
 in  a=b
end

Closing a stream just causes the not-yet-determined part of the stream to be empty:

let val (a,f') = input f
    val _ = closeIn f
    val (b,_) = input f
 in  a=b andalso endOfStream f'
end

Closing an already closed stream is legal and harmless:

  (closeIn f; closeIn f; true)

If a stream has already been at least partly determined, then input cannot possibly block:

let val (a,_) = input f
 in isSome (canInput (f, length a))
end (* must be true *)

Note that a successful canInput does not imply that more characters remain before end-of-stream, just that reading won't block.

A freshly opened stream is still undetermined (no "read" has yet been done on the underlying reader):

let val a = mkInstream (r, empty)  (* empty is the empty vector *)
 in closeIn a;
    length(#1(input a)) = 0
end

This has the useful consequence that if one opens a stream, then extracts the underlying reader, the reader has not yet been advanced in its file.

Closing a stream guarantees that the underlying reader will never again be accessed; so input can't possibly block.

The endOfStream test is equivalent to input returning an empty sequence:

let val (a,_) = input f  
  in (length(a)=0) = (endOfStream f)   
end

Unbuffered I/O: if chunkSize = 1 in the underlying reader, then input operations must be unbuffered:

let
  val f = mkInstream (reader, empty)  (* empty is the empty vector *)
  val (a,f') = input f
  val (PrimIO.Rd{chunkSize,...}, _) = getReader f
in
  (chunkSize > 1) orelse endOfStream f'
end

Although input may perform a read(k) operation on the reader (for k >= 1), it must immediately return all the elements it receives. However, this does not hold for partly determined instreams:

 let val f = mkInstream (reader, empty)  (* empty is the empty vector *)
     val _ = doInputOperationsOn(f)
     val (a,f') = input f
     val (PrimIO.Rd{chunkSize,...}, _) = getReader f
  in chunkSize>1 orelse endOfStream f'  (* could be false *)
 end

because in this case, the stream f may have accumulated a history of several responses, and input is required to repeat them one at a time.

Output buffering is controlled by the getBufferMode and setBufferMode functions.

Don't bother the reader: input must be done without any operation on the underlying reader, whenever it is possible to do so by using elements from the buffer. This is necessary so that repeated calls to endOfStream will not make repeated system calls.
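This requirement can be observed with a reader that counts how often it is called. The sketch below assumes the signature is available as TextIO.StreamIO and that the reader type is TextPrimIO.RD with the record fields shown; calls, src, pos, and read are hypothetical names.

```sml
structure S = TextIO.StreamIO

val calls = ref 0          (* number of reads seen by the reader *)
val src = "abc"
val pos = ref 0

(* Blocking read of at most n characters from src. *)
fun read n =
  let
    val k = Int.min (n, String.size src - !pos)
    val s = String.substring (src, !pos, k)
  in
    calls := !calls + 1; pos := !pos + k; s
  end

(* Assumed reader record layout; only readVec is provided. *)
val rd = TextPrimIO.RD {
      name = "<counting>", chunkSize = 16,
      readVec = SOME read,
      readArr = NONE, readVecNB = NONE, readArrNB = NONE,
      block = NONE, canInput = NONE,
      avail = fn () => NONE,
      getPos = NONE, setPos = NONE, endPos = NONE, verifyPos = NONE,
      close = fn () => (), ioDesc = NONE
    }

val f = S.mkInstream (rd, "")
val n0 = !calls                 (* fresh stream: reader not yet touched *)
val e1 = S.endOfStream f        (* first test reads from the reader *)
val e2 = S.endOfStream f        (* answered from the buffer *)
val (v, _) = S.input f          (* also answered from the buffer *)
val n = !calls
```

Under the rules above, the reader is expected to be called exactly once in this sequence, however many times endOfStream and input are applied to f.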