The STREAM_IO signature
The STREAM_IO signature defines the interface of the Stream IO layer in the I/O stack. This layer provides buffering over the primitive readers and writers of the primitive IO layer.
Input streams are treated in the lazy functional style: that is, input from a stream f yields a finite vector of elements, plus a new stream f'. Input from f again will yield the same elements; to advance within the stream in the usual way it is necessary to do further input from f'. This interface allows arbitrary lookahead to be done very cleanly, which should be useful both for ad hoc lexical analysis and for table-driven, regular-expression-based lexing.
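The lazy functional style described above can be sketched as follows. This is an illustration only: it assumes the TextIO.StreamIO structure, TextIO.openString and TextIO.getInstream of the SML Basis Library as a concrete instance of this signature, and the helper name peek is hypothetical.

```sml
(* Assumption: TextIO.StreamIO is used as a concrete instance of STREAM_IO. *)
structure S = TextIO.StreamIO

(* A functional instream over an in-memory string. *)
val f : S.instream = TextIO.getInstream (TextIO.openString "abc")

(* Input from f twice yields the same elements; only f' is advanced. *)
val (a, f') = S.input f
val (b, _)  = S.input f
val sameAgain = (a = b)

(* Lookahead (hypothetical helper): inspect the next element and simply
   discard the advanced stream, so the argument stream is unaffected. *)
fun peek (s : S.instream) : char option =
  case S.input1 s of
      SOME (c, _) => SOME c
    | NONE => NONE

val first  = peek f
val first' = peek f   (* same result: f itself has not moved *)
```

Because peek throws away the advanced stream, a lexer can look arbitrarily far ahead and then resume from any earlier stream value it kept.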
Output streams are handled more conventionally, since the lazy functional style doesn't seem to make sense for output.
Stream I/O functions may raise the Size exception if a resulting vector of elements would exceed the maximum vector size, or the IO.Io exception. In general, when IO.Io is raised as a result of a failure in a lower-level module, the underlying exception is propagated up as the cause component of the IO.Io exception value. This will usually be a Subscript, OS.SysErr or Fail exception, but the stream I/O module will rarely (perhaps never) need to inspect it.
signature STREAM_IO
type elem
type vector
type reader
type writer
type instream
type outstream
type in_pos
type out_pos
type pos
val input : instream -> (vector * instream)
val input1 : instream -> (elem * instream) option
val inputN : (instream * int) -> (vector * instream)
val inputAll : instream -> vector
val canInput : (instream * int) -> int option
val closeIn : instream -> unit
val endOfStream : instream -> bool
val mkInstream : (reader * vector) -> instream
val getReader : instream -> (reader * vector)
val getPosIn : instream -> in_pos
val setPosIn : in_pos -> instream
val filePosIn : in_pos -> pos
val output : (outstream * vector) -> unit
val output1 : (outstream * elem) -> unit
val flushOut : outstream -> unit
val closeOut : outstream -> unit
val setBufferMode : (outstream * IO.buffer_mode) -> unit
val getBufferMode : outstream -> IO.buffer_mode
val mkOutstream : (writer * IO.buffer_mode) -> outstream
val getWriter : outstream -> (writer * IO.buffer_mode)
val getPosOut : outstream -> out_pos
val setPosOut : out_pos -> outstream
val filePosOut : out_pos -> pos
type elem
type vector
type reader
type writer
type instream
type outstream
type in_pos
type out_pos
type pos
input f
input1 f
inputN (f, n)
Using instreams, one can synthesize a non-blocking version of inputN from inputN and canInput, as inputN is guaranteed not to block if a previous call to canInput returned SOME _.
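A minimal sketch of that synthesis, assuming TextIO.StreamIO from the SML Basis Library as the instance of this signature. The name inputNNB is hypothetical and not part of the interface.

```sml
(* Assumption: TextIO.StreamIO is used as a concrete instance of STREAM_IO. *)
structure S = TextIO.StreamIO

(* Hypothetical non-blocking variant of inputN.  It returns NONE when
   canInput reports that input would block; otherwise it reads the k
   elements that canInput reported as immediately available. *)
fun inputNNB (f : S.instream, n : int) : (S.vector * S.instream) option =
  case S.canInput (f, n) of
      SOME k => SOME (S.inputN (f, k))
    | NONE   => NONE
```

The sketch asks inputN for k elements rather than n, on the assumption that only the k elements reported by canInput are known to be available without blocking. Exceptions raised by canInput (Io, Size) are left to propagate to the caller.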
inputAll f
canInput (f, n)
returns SOME k, where 0 <= k <= n, if a call to input would return immediately with k characters. Note that k = 0 corresponds to the stream being at end-of-stream.
Some streams may not support this operation, in which case the Io exception will be raised. This function also raises the Io exception if there is an error in the underlying system calls. It raises the Size exception if n < 0.
Implementation note:
Implementations of canInput should attempt to return as large a k as possible. For example, if the buffer contains 10 characters and the user calls canInput (f, 15), canInput should call readVecNB 5 to see if an additional 5 characters are available.
closeIn f
endOfStream f
returns true if the stream f is at end-of-stream; otherwise it returns false. This function raises the Io exception if there is an error in the underlying system calls.
This function may block when checking for more input. It is equivalent to
(length(#1(input f)) = 0)
where length is the vector length operation.
Note that even if this returns true, subsequent input operations may succeed if more data becomes available. We always have
endOfStream f = endOfStream f
In addition, if endOfStream f returns true, then input f returns ("",f') and endOfStream f' may or may not be true.
mkInstream (rd, v)
Note that building more than one instream on top of a single reader has unpredictable effects, since readers are imperative objects.
Question: We should explain the mapping between optional fields of the reader and supported operations (as a table?).
getReader f
getPosIn strm
setPosIn pos
filePosIn pos
output (f, vec)
output1 (f, elem)
flushOut f
closeOut f
setBufferMode (ostr, mode)
getBufferMode ostr
mkOutstream (wr, mode)
Note that building more than one outstream on top of a single writer has unpredictable effects, since buffering may change the order of output.
Question: We should explain the mapping between optional fields of the writer and supported operations (as a table?).
getWriter f
getPosOut strm
setPosOut pos
filePosOut pos
The following expressions are all guaranteed true, if they complete without exception.
Input is semi-deterministic: input may read any number of elements from f the "first" time, but then it is committed to its choice, and must return the same number of elements on subsequent reads from the same point.
let val (a,_) = input f val (b,_) = input f in a=b end
Closing a stream just causes the not-yet-determined part of the stream to be empty:
let val (a,f') = input f val _ = closeIn f val (b,_) = input f in a=b andalso endOfStream f' end
Closing a terminated stream is legal and harmless:
(closeIn f; closeIn f; true)
If a stream has already been at least partly determined, then input cannot possibly block:
let val (a,_) = input f in isSome (canInput (f, length a)) end (* must be true *)
Note that a successful canInput does not imply that more characters remain before end-of-stream, just that reading won't block.
A freshly opened stream is still undetermined (no "read" has yet been done on the underlying reader):
let val a = mkInstream r in closeIn a; size(#1(input a)) = 0 end
This has the useful consequence that if one opens a stream, then extracts the underlying reader, the reader has not yet been advanced in its file.
Closing a stream guarantees that the underlying reader will never again be accessed; so input can't possibly block.
The endOfStream test is equivalent to input returning an empty sequence:
let val (a,_) = input f in (length(a)=0) = (endOfStream f) end
Unbuffered I/O. If chunkSize = 1 in the underlying reader, then input operations must be unbuffered:
let val f = mkInstream(reader) val (a,f') = input f val PrimIO.Rd{chunkSize,...} = getReader f in (chunkSize > 1) orelse endOfStream f' end
Although input may perform a read(k) operation on the reader (for k >= 1), it must immediately return all the elements it receives. However, this does not hold for partly determined instreams:
let val f = mkInstream(reader) val _ = doInputOperationsOn(f) val (a,f') = input f val PrimIO.Rd{chunkSize,...} = getReader f in chunkSize>1 orelse endOfStream f' (* could be false *) end
because in this case, the stream f may have accumulated a history of several responses, and input is required to repeat them one at a time.
Output buffering is controlled by the getBufferMode and setBufferMode functions.
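As an illustration of these two functions, the sketch below temporarily switches an output stream to unbuffered mode and then restores the previous mode. It assumes TextIO.StreamIO and TextIO.getOutstream from the SML Basis Library as the concrete instance, and the IO.NO_BUF constructor of IO.buffer_mode.

```sml
(* Assumption: TextIO.StreamIO is used as a concrete instance of STREAM_IO. *)
structure S = TextIO.StreamIO

(* The stream-level outstream underlying standard output. *)
val out : S.outstream = TextIO.getOutstream TextIO.stdOut

(* Remember the current mode so it can be restored afterwards. *)
val old : IO.buffer_mode = S.getBufferMode out

(* Switch to unbuffered output and write a line. *)
val () = S.setBufferMode (out, IO.NO_BUF)
val () = S.output (out, "written without buffering\n")

(* Restore the previous mode and flush anything still pending. *)
val () = S.setBufferMode (out, old)
val () = S.flushOut out
```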
Don't bother the reader. Input must be done without any operation on the underlying reader, whenever it is possible to do so by using elements from the buffer. This is necessary so that repeated calls to endOfStream will not make repeated system calls.
See also: PRIM_IO, IMPERATIVE_IO, TEXT_STREAM_IO, StreamIO
Last Modified May 10, 1996
Comments to John Reppy.
Copyright © 1997 Bell Labs, Lucent Technologies