Stdout gets truncated

When pola-run is forwarding stdout, the output is often truncated, but not always.

$ wc --bytes input-file.txt
19813 input-file.txt
$ pola-run -B --prog cat -fa input-file.txt | wc --bytes
2048

The problem is that there is a race condition between pola-run reading from the pipe that is connected to the subprocess's stdout, and receiving the subprocess's exit status. This was anticipated, and python/plash/filedesc.py tries to read any buffered data from the pipe, in order to flush stdout when the subprocess exits. This flushing code is broken because the buffer size it uses is smaller than the buffer size of pipes, and it does not try to read more than one buffer's worth.
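
The fix for the first problem amounts to reading repeatedly until the pipe is empty, rather than issuing a single read. A minimal sketch follows, assuming Python 3 and a pipe file descriptor already set to non-blocking mode. The name drain_pipe and the chunk_size parameter are hypothetical and are not taken from filedesc.py.

```python
import os


def drain_pipe(fd, chunk_size=1024):
    """Read everything currently buffered in a non-blocking pipe.

    Hypothetical helper: loops until the pipe reports "no data right now"
    or end-of-file, so it is not limited to one chunk_size worth of data.
    """
    chunks = []
    while True:
        try:
            data = os.read(fd, chunk_size)
        except BlockingIOError:
            break  # pipe is empty for now, but the writer is still open
        if not data:
            break  # end-of-file: every write end has been closed
        chunks.append(data)
    return b"".join(chunks)
```

With a single os.read() call, at most chunk_size bytes would be recovered, which is the kind of truncation shown in the transcript above.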

There is a second problem, which can occur if the forwarder's input stream (i.e. the subprocess's stdout/stderr) is closed before the subprocess exits. When the forwarder reads an end-of-file condition on its input stream, it discards the contents of its buffer, and does not attempt to forward the buffer contents to the output stream.
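
The intended behaviour on end-of-file can be sketched as below: the forwarder writes out whatever it still holds before stopping. This is a simplified, blocking illustration in Python 3, not the event-driven code in Plash, and the function name forward is hypothetical.

```python
import os


def forward(in_fd, out_fd, chunk_size=4096):
    """Copy in_fd to out_fd, keeping unwritten data in a buffer.

    Hypothetical sketch: on end-of-file the remaining buffer contents
    are written out instead of being discarded.
    """
    buf = b""
    while True:
        data = os.read(in_fd, chunk_size)
        if not data:
            break  # end-of-file on the input stream
        buf += data
        written = os.write(out_fd, buf)  # may write only part of buf
        buf = buf[written:]
    # End-of-file has been seen: flush the buffer rather than drop it.
    while buf:
        written = os.write(out_fd, buf)
        buf = buf[written:]
```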

Tasks to fix

There is still a flushing problem that can occur when the subprocess exits. The forwarder will read all the remaining data from the input stream, adding it to the buffer. However, it will not always successfully write all the buffered data to the output stream. This can occur if the output stream's buffer was full, in which case pola-run (for example) would exit and discard the data.

This was complicated by the fact that the forwarder can continue to forward data afterwards. I have changed FDForwarder so that it keeps track of the point in the output buffer that needs to be synced. pola-run can return once the output buffer has been flushed up to a saved point.
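
One way to track such a sync point is to count bytes queued and bytes written, and to save the queued count at the moment the subprocess exits. The sketch below illustrates the idea only. The class OutputBuffer and its methods are hypothetical names and do not reflect the actual FDForwarder interface.

```python
class OutputBuffer:
    """Hypothetical output buffer that can report when it has been
    flushed up to a previously saved point."""

    def __init__(self):
        self._pending = b""      # data queued but not yet written
        self._queued_total = 0   # bytes ever added to the buffer
        self._written_total = 0  # bytes ever written to the output

    def queue(self, data):
        self._pending += data
        self._queued_total += len(data)

    def mark_sync_point(self):
        # Everything queued so far must be written before this
        # sync point counts as reached.
        return self._queued_total

    def write_some(self, write_fn):
        # write_fn takes bytes and returns how many it accepted,
        # in the same way as os.write bound to a file descriptor.
        if self._pending:
            n = write_fn(self._pending)
            self._pending = self._pending[n:]
            self._written_total += n

    def is_flushed_to(self, sync_point):
        return self._written_total >= sync_point
```

Data queued after mark_sync_point() does not delay the sync point, so the caller can return as soon as is_flushed_to() is true even if forwarding continues.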

See EventLoopAndFDs for general implementation notes in this area.

PlashIssues/StdoutGetsTruncated (last edited 2008-03-30 17:06:44 by MarkSeaborn)