Add unit tests for existing modules
Status: done
Add unit tests for existing modules that currently lack unit tests. This includes:
FsObjReadOnly: test the proxy
pola_run_args: add tests for each command line option
-- This was mostly done under PlashIssues/PolaRunEqualsSyntax
plash.process: the code is covered by other test cases (e.g. libc_test), but it would be good to have a minimal test case
PlashObjectCapabilityProtocol: write a Python implementation to test the C implementation against
Add encoding/decoding of basic messages
Add pure-Python FD wrapper and event loop
Implement main part of PlashObjectCapabilityProtocol
Add call-return
Handle terminating connections
- Handle protocol violations
Handle exceptions occurring in callbacks from event loops
- Handle exceptions occurring in calls to cap_invoke() from the cap protocol
Implement event loop interface on top of gobject
Extend the test framework with testrunner.TestCase and testrunner.CombinationTestCase, which make it easier to run a test against multiple implementations
Hook up to test the C implementation against the newly-added tests
- Update call-return implementation to follow same encoding convention as the main codebase
- Modularize the marshalling code so that marshalling wrapper methods can be passed in when a connection is created (similar to the way the event loop is a parameter of the connection constructor), rather than being defined globally
Preserve equality (EQ) of objects sent across a connection
Make sure it works with TCP sockets as well as Unix domain sockets
Add test case for ECONNRESET ("Connection reset by peer") and handle it gracefully. ECONNRESET can occur when trying to read from a socket which the other end has closed without reading all the pending data. (If there is no data waiting to be sent to the other end, read() just returns an end-of-stream condition.) shutdown() also affects whether ECONNRESET is raised, with different results for TCP and AF_UNIX sockets.
- Handle Gtk global lock: allow all glib FD watches to be wrapped with Gtk lock
See EventLoopAndFDs for further notes
- Executable objects
Test that PlashGlibc's execve() works properly on executable objects
- Check that close-on-exec FDs are not passed
Check that FDs are not leaked by any test case
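The ECONNRESET item in the list above can be illustrated with a small Python sketch. It is not taken from the Plash test suite: the function name read_after_peer_close() and its leave_unread_data parameter are made up for this example. It assumes a platform where socket.socketpair() is available, and it reports what happens rather than asserting a particular result, because whether the reset is raised depends on the platform and socket type.

```python
import socket


def read_after_peer_close(leave_unread_data):
    """Close the peer, then read from our end and report what happened.

    Hypothetical helper for illustration. Returns "reset" if the read
    raised ECONNRESET, "eof" on a normal end-of-stream, or "data" if
    bytes were returned.
    """
    ours, theirs = socket.socketpair()
    try:
        if leave_unread_data:
            # The peer never reads this, so it is still pending when
            # the peer closes its end.
            ours.sendall(b"unread")
        theirs.close()
        try:
            data = ours.recv(16)
        except ConnectionResetError:
            # The graceful handling: treat the reset as a closed
            # connection instead of letting the exception escape.
            return "reset"
        return "eof" if data == b"" else "data"
    finally:
        ours.close()
```

A test case built on this would call the helper both with and without pending data and check that neither call lets an exception escape into the event loop.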
Test framework
The testing framework provided by the unittest module in Python's standard library has some shortcomings.
The biggest limitation is that there is no convenient way to parameterize tests. There are two cases for this:
(1) Test multiple implementations of an interface against the same test code (a small, fixed number of cases). For example, test PlashObjectCapabilityProtocol when used with a glib-based event dispatcher and when used with a poll()-based event dispatcher.
(2) Test against a large number of cases, e.g. encoding/decoding X protocol messages in X11ProxySpike.
(1) can be done using inheritance, but that is limited. Inheritance can't generalise in more than one direction without becoming unwieldy. Inheritance is usually best avoided.
(2) can be done using a simple "for" loop, but when it fails, the traceback doesn't show which subtest was running, and the first failure prevents the remaining subtests from running. (The full set of failures and successes can be useful for diagnosing a problem.) A partial solution is a wrapper function which runs every subtest and re-throws all the subtests' exceptions with descriptions (see subtests() in scratch/x11-proxy/server_test.py).
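As a rough illustration of the wrapper approach for case (2), here is a minimal sketch. It is not the subtests() function from server_test.py; the names run_subtests(), SubtestFailures and check_even() are invented for this example. Every case is run, and a single exception is raised at the end listing each failing case by its description.

```python
import traceback


class SubtestFailures(Exception):
    """Raised once, after all cases have run, if any case failed."""


def run_subtests(test_func, cases):
    """Run test_func(arg) for each (description, arg) pair in cases.

    Hypothetical helper: a failure in one case does not stop the
    others, and the final report names every failing case.
    """
    cases = list(cases)
    failures = []
    for description, arg in cases:
        try:
            test_func(arg)
        except Exception:
            failures.append((description, traceback.format_exc()))
    if failures:
        report = "\n".join(
            "--- subtest %r failed:\n%s" % (description, tb)
            for description, tb in failures)
        raise SubtestFailures(
            "%d of %d subtests failed\n%s"
            % (len(failures), len(cases), report))


def check_even(number):
    # Stand-in test body; a real one might encode and decode a message.
    assert number % 2 == 0, "%d is odd" % number
```

Calling run_subtests(check_even, [("two", 2), ("three", 3), ("five", 5)]) raises one SubtestFailures whose message mentions both 'three' and 'five'. The failures are still merged into a single test result, which is why this is only a partial solution.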
Other issues:
Many tests need to create temporary directories, and we want to ensure these always get deleted when the test is finished. It is easy to implement a mixin for unittest.TestCase that provides a make_temp_dir() method and deletes the temp dirs on tearDown(). A more general mechanism, however, would be to register callbacks for running at the end of the test to perform tidy-ups, as a replacement for tearDown(). That would avoid inheritance weirdness with tearDown().
- Failure aborts a test straight away. Sometimes it would be more useful to be able to output a failure (which marks the test as failed), but continue to run the test in order to accumulate more failure messages, which can help with debugging.
Test ordering: Often tests are defined simplest-first in the source. You usually want to explore failures of the more fundamental tests first. However, unittest loses the ordering when it retrieves tests from modules and classes.
- No support for isolating tests from each other, e.g. by fork()ing the process or simply re-ordering tests.
- No way to get more information out of a test beyond a success/failure status and a traceback. We might want:
- warnings
- trace information to aid debugging
- example output
golden files: where output differs from the golden file, we want a way to merge the differences into the golden file selectively using (e.g.) meld
- No way to enter pdb for debugging a traceback
Test discovery (i.e. searching for tests in Python modules and directories of modules) is the most well-known failing of unittest.
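The callback-based tidy-up suggested for temporary directories in the list above can be sketched as follows. This assumes a version of unittest that provides TestCase.addCleanup(), which registers a function to run after the test whether it passes or fails. The class names TempDirTestCase and ExampleTest and the helper run_example() are made up for this example.

```python
import os
import shutil
import tempfile
import unittest


class TempDirTestCase(unittest.TestCase):
    """Hypothetical base class providing make_temp_dir()."""

    def make_temp_dir(self):
        temp_dir = tempfile.mkdtemp(prefix="unittest-")
        # Register the tidy-up as a callback rather than overriding
        # tearDown(), so subclasses cannot forget to chain to it.
        self.addCleanup(shutil.rmtree, temp_dir)
        return temp_dir


class ExampleTest(TempDirTestCase):
    created = []  # records paths so that deletion can be checked later

    def test_write_file(self):
        temp_dir = self.make_temp_dir()
        ExampleTest.created.append(temp_dir)
        with open(os.path.join(temp_dir, "file"), "w") as handle:
            handle.write("data")
        self.assertTrue(os.path.isdir(temp_dir))


def run_example():
    # Run ExampleTest quietly and return the unittest.TestResult.
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

After run_example() returns, the directory recorded in ExampleTest.created no longer exists. The same registration call can cover other tidy-ups, such as closing file descriptors.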
Other test frameworks:
- Offers "generative tests", in which a test can return a list of (function, argument) pairs to be called. It's not clear whether these will nest.
doctest (part of Python standard library). I'm not keen on doctest tests:
- The tests are part of the main non-test code (rather than being in a separate test module), so tend to clutter it up. It's not unusual for test code to be several times larger than the code it is testing.
- Test code is embedded inside string literals, so it is not syntax highlighted by the editor. It is not included in static analysis done by tools like pyflakes. Scoping rules are not clear. Odd syntax is used to indicate test expressions and expected values. The framework is not extensible; it is awkward to refactor doctest test code.
Parts not done
Tests for these areas:
- filename resolver
- marshalling code
FsObjReal: include tests for what happens when files and symlinks go away or directories are moved
