Ideas

A page for ideas that don't have a better place to go.

Dependency coverage

Code coverage tracks which lines of code are executed (and sometimes, which branches are taken). Dependency coverage would track which dependencies of a component are used. For example, Firefox depends on libgtk, which contains various library and resource files. We can track whether those files are ever opened when we run Firefox.

Code coverage tells us what source code is not covered by automated tests; dependency coverage can tell us what components are not covered. Code coverage can alert us to dead code; dependency coverage can inform us about unused dependencies.
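As a rough sketch of the bookkeeping, assume we already have two lists: the files a dependency provides (for example from a package manifest) and the files that were opened while the program ran (for example from a system call trace). How those lists are gathered is left open here, and the function name and the file paths in the usage example are hypothetical.

```python
def dependency_coverage(provided_files, opened_files):
    """Split a dependency's files into those that were opened and those that were not.

    Returns (used, unused, ratio), where ratio is the fraction of the
    dependency's files that were opened at least once.
    """
    provided = set(provided_files)
    used = provided & set(opened_files)
    unused = provided - used
    # An empty dependency is treated as fully covered (an arbitrary choice).
    ratio = len(used) / len(provided) if provided else 1.0
    return used, unused, ratio


# Hypothetical file names, for illustration only.
provided = ["/usr/lib/libexample.so.0", "/usr/share/example/theme.rc"]
opened = ["/usr/lib/libexample.so.0", "/etc/passwd"]
used, unused, ratio = dependency_coverage(provided, opened)
```

A dependency whose ratio stays at zero across the whole test suite is a candidate for being unused, in the same way that a never-executed function is a candidate for being dead code.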

QEMU images

We could provide QEMU images for demonstrating Plash. Parts of Plash require root authority to install, such as ChrootSetuidJail; kernel modifications would also require root authority. If we ever have more of a POLA desktop, it would be useful to have a minimal system that boots up into that.

Killing sandboxed processes

One of the Nix papers points out: "Ensure that there are no processes left running under the uid selected for the builder. On modern Unix systems this can be done by performing a kill(-1, SIGKILL) operation while executing under that uid, which has the effect of sending the KILL signal to all processes executing under uid."

With ChrootSetuidJail, each sandbox runs under its own UID, so we could use this to kill all the processes in a particular sandbox.
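A minimal sketch of the operation the quote describes, assuming the caller has root authority and already knows the sandbox's UID. The function name is hypothetical and this is not taken from Plash or Nix. The work is done in a forked child so that the caller keeps its own UID.

```python
import os
import signal


def kill_all_processes_of_uid(uid):
    """Send SIGKILL to every process running under `uid` (needs root authority)."""
    if uid == 0:
        # kill(-1, ...) as root would signal nearly every process on the system.
        raise ValueError("refusing to run kill(-1, SIGKILL) as root")
    pid = os.fork()
    if pid == 0:
        # Child: switch to the sandbox's UID first. If setuid() fails, the
        # exception skips the kill() call, so we never signal under the wrong UID.
        try:
            os.setuid(uid)
            try:
                os.kill(-1, signal.SIGKILL)
            except ProcessLookupError:
                pass  # nothing else was running under this UID
        finally:
            os._exit(0)
    # Parent: wait for the child so that it does not linger as a zombie.
    os.waitpid(pid, 0)
```

On Linux, kill(-1, sig) does not signal the calling process itself, which is why the child can carry on to exit normally.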

Using Caja with Plash

Using an object-capability language like Caja (a tamed JavaScript) could help make Plash's package system a lot more flexible.

Plash implements a lot of the behaviour in the TCB. For Debian packages, it interprets APT source lines, combines repositories, chooses deb dependencies and initialises packages. It has a fixed file namespace, and it interprets some fixed file formats: the Debian Packages index file and the .pkg file format.

Describing data structures in Caja would be handy. Linked structures would be easier.

The copy-on-write arrangement is set up by Plash, but it could be set up by untrusted Caja.

Using Caja instead of Python solves a bootstrapping problem. Most languages would have to be run in a separate process, and we would have to fetch all dependencies using the same package system.

When I did the most work on the package system, in December 2006, Caja didn't exist. Meanwhile, E's promises system raises the question of how to integrate it with Plash.

See also: CapPython

Deallocating memory

There are trade-offs involved in returning pages to the OS. If you repeatedly free pages only to allocate new pages, the OS has to zero them each time. There is also the system call overhead of invoking the kernel and unmapping the pages. A possible fix is for the process to declare which pages it no longer needs by setting some fields in its address space. The kernel (or a keeper process) could read these fields when it needs to allocate pages elsewhere.

This could be applied to the stack, depending on conventions for use of the stack pointer. I think that dirtied stack pages never get deallocated under Linux until the process (or thread) exits.
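The proposed scheme can be modelled with a toy simulation. This is only a sketch of the bookkeeping, not of any real kernel interface: the class and function names are hypothetical, and pages are just integers.

```python
class Process:
    """Toy model of a process that marks pages as unneeded instead of unmapping them."""

    def __init__(self, num_pages):
        self.mapped = set(range(num_pages))  # pages currently backed by memory
        self.unneeded = set()                # the "fields" the kernel may read

    def release(self, page):
        # No system call: just record that the page's contents no longer matter.
        if page in self.mapped:
            self.unneeded.add(page)

    def reuse(self, page):
        """Claim a page again. Returns True if it was still mapped (no zeroing needed)."""
        self.unneeded.discard(page)
        if page in self.mapped:
            return True
        # The kernel reclaimed it in the meantime, so it must supply a fresh zeroed page.
        self.mapped.add(page)
        return False


def reclaim(process, wanted):
    """Kernel side: take up to `wanted` pages that the process has declared unneeded."""
    taken = sorted(process.unneeded)[:wanted]
    for page in taken:
        process.unneeded.discard(page)
        process.mapped.discard(page)
    return taken
```

In this model a page that is released and reused before the kernel needs memory costs neither a system call nor a zeroing pass. A real design would also have to deal with the race between the process reusing a page and the kernel reclaiming it, which the simulation ignores.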

GNU make

Makefiles are really complex, especially glibc's. It is really difficult to understand or get information out of them. Maybe we could get more help from GNU make by adding some Python extensions in useful places.

Accessing other processes' FD tables

Suppose there were a way to read and write another process's FD table. Then we could make a sandbox based on ptrace() in which the open() syscall is caught and the FD is granted by writing it into the sandboxed process's FD table.

We would be doing emulation at the kernel syscall ABI level. Advantages:

We would still need to find a way to do execve() safely with ptrace(). If there is a way to change another process's memory mappings, a user space implementation of execve() can use that.
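The read half of this is already available on Linux, where /proc/PID/fd contains one symlink per open FD. A small sketch of reading a process's FD table that way follows. The function name and the proc_root parameter are hypothetical, and the parameter exists only so the function can be pointed at a test directory. The write half is the part that remains hypothetical.

```python
import os


def read_fd_table(pid, proc_root="/proc"):
    """Return {fd_number: link_target} for a process, as listed under /proc/PID/fd."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    table = {}
    for name in os.listdir(fd_dir):
        try:
            table[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the FD was closed between listdir() and readlink()
    return table
```

Reading another user's /proc/PID/fd is subject to the kernel's usual permission checks, so a supervisor would need suitable authority over the sandboxed process.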

How to implement /proc/PID and /proc/self

The kernel already provides a "same sandbox?" check based on the sandbox's UID, which it performs when sandboxed processes use kill() or ptrace(). We could implement a similar check on PIDs in user space by reading the UID/GID of /proc/PID and using it to implement a filtered view of /proc. We could rebind /proc/self whenever a process does fsop_fork. This would require that run-as-anonymous tells us the UID of the sandbox it has created.
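The user-space check could look something like the following sketch, which lists the PIDs whose /proc/PID directory is owned by the sandbox's UID. The function name and the proc_root parameter are hypothetical, and the sketch only covers the filtering step, not rebinding /proc/self.

```python
import os


def pids_in_sandbox(sandbox_uid, proc_root="/proc"):
    """Return the PIDs whose /proc/PID entry is owned by `sandbox_uid`."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            owner = os.stat(os.path.join(proc_root, name)).st_uid
        except OSError:
            continue  # the process exited while we were scanning
        if owner == sandbox_uid:
            pids.append(int(name))
    return sorted(pids)
```

A filtered /proc would then expose only the directories for these PIDs to processes inside the sandbox.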

Problems:

Use User Mode Linux

For programs that don't run under Plash for whatever reason (e.g. use of /proc), we could run them under User Mode Linux, which would in turn run in a Plash sandbox.

Limitations: slow start-up time for UML?

Ideas (last edited 2008-07-20 14:14:51 by MarkSeaborn)