STREAM_IO (basis)
Streaming IO operations.
The STREAM_IO signature defines the interface of the Stream I/O layer in the I/O stack. This layer provides buffering over the readers and writers of the Primitive I/O layer.

Input streams are treated in the lazy functional style: that is, input from a stream f yields a finite vector of elements, plus a new stream f'. Input from f again will yield the same elements; to advance within the stream in the usual way, it is necessary to do further input from f'. This interface allows arbitrary lookahead to be done very cleanly, which should be useful both for ad hoc lexical analysis and for table-driven, regular-expression-based lexing.

Output streams are handled more conventionally, since the lazy functional style does not seem to make sense for output.

Stream I/O functions may raise the Size exception if a resulting vector of elements would exceed the maximum vector size, or the IO.Io exception. In general, when IO.Io is raised as a result of a failure in a lower-level module, the underlying exception is caught and propagated up as the cause component of the IO.Io exception value. This will usually be a Subscript, IO.ClosedStream, OS.SysErr, or Fail exception (the last possible because of user-supplied readers or writers), but the stream I/O module will rarely (perhaps never) need to inspect it.

signature STREAM_IO =
sig
type elem
type vector
type instream
type outstream
type out_pos
type reader
type writer
type pos
val input : instream -> vector * instream
val input1 : instream -> (elem * instream) option
val inputN : instream * int -> vector * instream
val inputAll : instream -> vector * instream
val canInput : instream * int -> int option
val closeIn : instream -> unit
val endOfStream : instream -> bool
val output : outstream * vector -> unit
val output1 : outstream * elem -> unit
val flushOut : outstream -> unit
val closeOut : outstream -> unit
val mkInstream : reader * vector -> instream
val getReader : instream -> reader * vector
val filePosIn : instream -> pos
val setBufferMode : outstream * IO.buffer_mode -> unit
val getBufferMode : outstream -> IO.buffer_mode
val mkOutstream : writer * IO.buffer_mode -> outstream
val getWriter : outstream -> writer * IO.buffer_mode
val getPosOut : outstream -> out_pos
val setPosOut : out_pos -> outstream
val filePosOut : out_pos -> pos
end
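The lazy functional treatment of input described above can be sketched with TextIO.StreamIO, used here on the assumption that it is an available instance of STREAM_IO with elem = char. The stream contents "abc" and the names first, again and second are illustrative only.

```sml
structure S = TextIO.StreamIO

(* Assumption: an in-memory stream obtained via TextIO.openString. *)
val f : S.instream = TextIO.getInstream (TextIO.openString "abc")

(* Reading from f twice yields the same element both times. *)
val first = Option.map (fn (c, _) => c) (S.input1 f)
val again = Option.map (fn (c, _) => c) (S.input1 f)

(* To advance, read from the stream f' returned by the first read. *)
val second =
  case S.input1 f of
      SOME (_, f') => Option.map (fn (c, _) => c) (S.input1 f')
    | NONE => NONE
```

Because f itself is never advanced, a lexer can keep f as a backtracking point while it explores ahead through the derived streams.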
Implementation note:
It is suggested that implementations of canInput should attempt to
return as large a k as possible. For example, if the buffer
contains 10 characters and the user calls canInput (f, 15),
canInput should call readVecNB(5) to see if an additional 5
characters are available.
Such a lookahead commits the stream to the characters read by
readVecNB but it does not commit the stream to return those
characters on the next call to input. Indeed, a typical
implementation will simply return the remainder of the current
buffer, in this case, consisting of 10 characters, if input is
called. On the other hand, an implementation can decide to always
respond to input with all the elements currently available,
provided an earlier call to input has not committed the stream to
a particular response. The only requirement is that any future
call of input on the same input stream must return the same vector
of elements.
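The requirements in this note can be checked with a small sketch, again assuming TextIO.StreamIO as the STREAM_IO instance and an in-memory stream over the illustrative string "hello". How many elements canInput reports, and how many input returns, is implementation dependent, so the sketch checks only the properties stated above.

```sml
structure S = TextIO.StreamIO

val f : S.instream = TextIO.getInstream (TextIO.openString "hello")

(* Ask whether up to 15 elements are available without blocking. *)
val avail : int option = S.canInput (f, 15)

(* Any reported k must lie between 0 and the requested count. *)
val inRange =
  case avail of
      SOME k => 0 <= k andalso k <= 15
    | NONE => true

(* The lookahead does not change what input returns, and two calls
   of input on the same stream must return the same vector. *)
val (v1, _) = S.input f
val (v2, _) = S.input f
val stable = (v1 = v2) andalso String.isPrefix v1 "hello"
```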
The end-of-stream condition for a stream f is equivalent to
(length(#1(input f)) = 0)
where length is the vector length operation. Note that even if
endOfStream returns true, subsequent input operations may succeed if
more data becomes available. A stream can have multiple end-of-streams
interspersed with normal elements. This can happen on Unix, for
example, if a user types control-D (#"\^D") on a terminal device, and
then keeps typing characters; it may also occur on file descriptors
connected to sockets.
Multiple end-of-streams is a property of the underlying reader. Thus,
readVec on a reader may return an empty string, then another call to
readVec on the same reader may return a nonempty string, then a third
call may return an empty string. It is always true, however, that
endOfStream f = endOfStream f
In addition, if endOfStream f returns true, then input f returns
("",f') and endOfStream f' may or may not be true.
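The equivalence between endOfStream and an empty result from input can be sketched as follows, assuming TextIO.StreamIO (where vector = string, so String.size plays the role of length). The helper atEnd and the two sample streams are hypothetical names introduced for illustration.

```sml
structure S = TextIO.StreamIO

(* Hypothetical helper: the end-of-stream condition spelled out via input. *)
fun atEnd (f : S.instream) : bool = String.size (#1 (S.input f)) = 0

val emptyStrm = TextIO.getInstream (TextIO.openString "")
val nonEmptyStrm = TextIO.getInstream (TextIO.openString "x")

(* Both formulations should agree on each stream. *)
val agreeEmpty = (S.endOfStream emptyStrm = atEnd emptyStrm)
val agreeNonEmpty = (S.endOfStream nonEmptyStrm = atEnd nonEmptyStrm)
```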
instream supports:       if reader implements:
input, inputN, etc.      readVec
canInput                 readVecNB
endOfStream              readVec
filePosIn                getPos and setPos
If the reader provides more operations, the resulting stream may use
them. mkInstream should construct the input stream using the reader
provided. If the user wishes to employ synthesized functions in the
reader, the user may call mkInstream with an augmented reader
augmentReader(rd). See PRIM_IO for a description of the functions
generated by augmentReader.
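A minimal sketch of building an input stream on a user-supplied reader, assuming TextPrimIO and TextIO.StreamIO as the concrete PRIM_IO and STREAM_IO instances. The function stringReader is a hypothetical helper that implements only readVec over an in-memory string; the remaining operations are left to augmentReader or are absent.

```sml
structure S = TextIO.StreamIO
structure P = TextPrimIO

(* Hypothetical reader over a string; only readVec is provided. *)
fun stringReader (s : string) : P.reader =
  let
    val pos = ref 0
    fun readVec n =
      let
        val k = Int.min (n, String.size s - !pos)
        val v = String.substring (s, !pos, k)
      in
        pos := !pos + k;
        v   (* an empty vector signals end-of-stream *)
      end
  in
    P.RD { name = "stringReader", chunkSize = 16,
           readVec = SOME readVec, readArr = NONE,
           readVecNB = NONE, readArrNB = NONE,
           block = NONE, canInput = NONE,
           avail = fn () => NONE,
           getPos = NONE, setPos = NONE,
           endPos = NONE, verifyPos = NONE,
           close = fn () => (), ioDesc = NONE }
  end

(* One reader, one stream; the initial buffer contents are empty. *)
val f = S.mkInstream (P.augmentReader (stringReader "hello"), "")
val (all, _) = S.inputAll f
```

Since this reader supplies neither getPos nor setPos, filePosIn is not expected to work on the resulting stream, in line with the table above.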
Building more than one input stream on top of a single reader has
unpredictable effects, since readers are imperative objects. In
general, there should be a 1-1 correspondence between a reader and a
sequence of input streams. Also note that creating an input stream
this way means that the stream could be unaware that the reader has
been closed until the stream actually attempts to read from it.
It should be true that, if #1(inputAll f) returns vector v, then
(setPos (filePosIn f); readVec (length v))
should also return v, assuming all operations are defined and terminate.
Implementation note:
If the pos type is a concrete integer corresponding to a byte
offset, and the translation function (between bytes and elements)
is known, the value can be computed directly. If not, the value is
given by
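    fun pos (bufp, n, r as RD rdr) = let
          val readVec = valOf (#readVec rdr)
          val getPos = valOf (#getPos rdr)
          val setPos = valOf (#setPos rdr)
          (* remember where the reader currently is *)
          val savep = getPos()
          in
            setPos bufp;   (* go back to the start of the current buffer *)
            readVec n;     (* skip the n elements already read from it *)
            (* the position now reached is the result; restore the reader afterwards *)
            getPos () before setPos savep
          end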
where bufp is the file position corresponding to the beginning of
the current buffer, n is the number of elements already read from
the current buffer, and r is the stream's underlying reader.
outstream supports:      if augmented writer implements:
output, output1, etc.    writeArr
flushOut                 writeArr
setBufferMode            writeArr
getPosOut                writeArr and getPos
setPosOut                writeArr and setPos
If the writer provides more operations, the resulting stream may use
them. mkOutstream should construct the output stream using the writer
provided. If the user wishes to employ synthesized functions in the
writer, the user may call mkOutstream with an augmented writer
augmentWriter(wr). See PRIM_IO for a description of the functions
generated by augmentWriter.
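A minimal sketch of building an output stream on a user-supplied writer, assuming TextPrimIO and TextIO.StreamIO as the concrete instances and assuming the slice-based writeVec field of PRIM_IO. The reference sink and the function writeVec are hypothetical names; only writeVec is implemented, and writeArr is left for augmentWriter to synthesize.

```sml
structure S = TextIO.StreamIO
structure P = TextPrimIO

(* Hypothetical in-memory destination for everything written. *)
val sink = ref ""

(* Append the slice to the sink and report that all of it was written. *)
fun writeVec (sl : CharVectorSlice.slice) : int =
  (sink := !sink ^ CharVectorSlice.vector sl;
   CharVectorSlice.length sl)

val wr =
  P.WR { name = "sink", chunkSize = 16,
         writeVec = SOME writeVec, writeArr = NONE,
         writeVecNB = NONE, writeArrNB = NONE,
         block = NONE, canOutput = NONE,
         getPos = NONE, setPos = NONE,
         endPos = NONE, verifyPos = NONE,
         close = fn () => (), ioDesc = NONE }

(* One writer, one stream; block buffering delays the write until a flush. *)
val out = S.mkOutstream (P.augmentWriter wr, IO.BLOCK_BUF)
val () = S.output (out, "hello")
val () = S.flushOut out
val written = !sink
```

Whether anything reaches the sink before flushOut depends on the buffer mode and the implementation's buffer size, so the sketch inspects the sink only after flushing.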
Building more than one outstream on top of a single writer has
unpredictable effects, since buffering may change the order of
output. In general, there should be a 1-1 correspondence between a
writer and an output stream. Also note that creating an output stream
this way means that the stream could be unaware that the writer has
been closed until the stream actually attempts to write to it.
Implementation note:
A typical implementation of this function will require calculating
a value of type pos, capturing where the next element written to f
will be written in the underlying file. If the pos type is a
concrete integer corresponding to a byte offset, and the
translation function (between bytes and elements) is known, the
value can be computed directly using getPos. If not, the value is
given by
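    fun pos (f, w as WR wtr) = let
          val getPos = valOf (#getPos wtr)
          in
            flushOut f;   (* push buffered elements to the writer first *)
            getPos ()     (* the writer's position is now where the next element goes *)
          end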
where f is the output stream and w is the stream's underlying writer.
Given an output stream f and a vector of elements v, the code
(setPos opos; writeVec{buf=v,i=0,sz=NONE})
should have the same effect as the last line of the function
fun put (outs,x) = (flushOut outs;
output(outs,x);flushOut outs)
when called with (f,v) assuming all operations are defined and
terminate, and that the call to writeVec returns length v.