Native Client: Dynamic linking plan

This page has been moved to http://code.google.com/p/nativeclient/wiki/AcquiringDynamicLibraries and http://code.google.com/p/nativeclient/wiki/DynamicLinkingPlan

Background

Why use dynamic libraries?

Native Client is being extended to support dynamic loading. We would like to support dynamic linking and dynamic libraries on top of this. While the basic dynamic loading support (validating and loading a chunk of code) must be implemented in NaCl's trusted runtime, the dynamic linker that sits on top of this, implementing symbol resolution, library lookups, dlopen(), etc., will be entirely untrusted code.

Plan

The basic plan is to use GNU libc (glibc for short) to provide dynamic linking support. glibc includes a dynamic linker (ld.so).
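
As a rough picture of what the untrusted dynamic linker has to do, the sketch below models symbol resolution. It is a toy model, not glibc's implementation: the library names, the `needed` and `exports` tables, and both function names are made up for illustration. It assumes the default ELF behaviour in which objects are searched breadth-first starting from the executable and the first definition found wins.

```python
from collections import deque


def load_order(root, needed):
    """Return root and its transitive dependencies in breadth-first order."""
    order = []
    seen = {root}
    queue = deque([root])
    while queue:
        name = queue.popleft()
        order.append(name)
        for dep in needed.get(name, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


def resolve(symbol, order, exports):
    """Return the first object in search order that defines symbol, or None."""
    for name in order:
        if symbol in exports.get(name, set()):
            return name
    return None


# Hypothetical example data: what each object needs and what it exports.
needed = {"app": ["libfoo.so", "libc.so"], "libfoo.so": ["libc.so"]}
exports = {
    "libfoo.so": {"foo_init", "malloc"},  # libfoo also defines malloc
    "libc.so": {"malloc", "printf"},
}

order = load_order("app", needed)
```

In this example libfoo.so comes before libc.so in the search order, so a lookup of `malloc` binds to libfoo.so's definition.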

Changes involved

How libraries will be loaded

Each .so file can be fetched from a URL.

The NaCl browser interface already provides a Javascript interface to fetch a URL and return a NaCl file descriptor for the file. We can use this interface for fetching the executable and the .so files it requires. Using a file descriptor for the file is important if we want to provide an mmap operation for loading code or data. However, if code is always dynamically loaded by copying rather than mapping, there is no advantage in NaCl providing an mmap operation. In that case, the web app can fetch code by any means, such as XMLHttpRequest or any newer mechanism.
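
The difference between the two loading strategies can be illustrated with a small sketch. It uses Python's standard `mmap` module as a stand-in and is not NaCl's interface; the function names and the payload are hypothetical. The point it shows is that copying works on bytes obtained by any means, whereas mapping needs a file descriptor.

```python
import mmap
import tempfile


def load_by_copy(data):
    """Copy bytes obtained by any means (e.g. an HTTP response) into a buffer."""
    region = bytearray(len(data))
    region[:] = data
    return region


def load_by_mapping(fd, length):
    """Map a file read-only; this only works if we hold a file descriptor."""
    return mmap.mmap(fd, length, access=mmap.ACCESS_READ)


# Stand-in for a fetched library file (hypothetical contents).
payload = b"\x7fELF" + bytes(60)

# Copying: no file descriptor involved.
copied = load_by_copy(payload)

# Mapping: the data must first exist as a file we hold a descriptor for.
with tempfile.TemporaryFile() as f:
    f.write(payload)
    f.flush()
    with load_by_mapping(f.fileno(), len(payload)) as mapped:
        mapped_bytes = mapped[:]
```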

Same Origin Policy

XMLHttpRequest is constrained by a Same Origin Policy (SOP). NaCl's interface for fetching files will also be constrained by a SOP; note that the NaCl NPAPI plugin has to implement the SOP itself.
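
A minimal sketch of the kind of check the plugin would need is shown below. It assumes the usual definition of an origin as the (scheme, host, port) triple; the function names and the default-port table are illustrative and are not taken from the NaCl plugin.

```python
from urllib.parse import urlsplit

# Assumption: only http and https matter for this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(page_url, resource_url):
    """True if the fetched resource may be revealed to the page."""
    return origin(page_url) == origin(resource_url)
```

For example, `same_origin("http://example.com/app.html", "http://example.com:80/libfoo.so")` holds because port 80 is the default for http, while a different scheme, host, or port makes the check fail.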

The main reason for the SOP is that XHR requests convey cookies -- a type of ambient authority. The Same Origin Policy is not intended to prevent web apps from sending messages across origins (this is possible via redirects and <img> elements); it is only intended to prevent the web app from seeing the server's response to the request.

Loading libraries in NaCl is analogous to loading Javascript files via the <script src=...> tag. However, interestingly, <script> is not constrained by the SOP. The server is effectively opting in to revealing the response across origins by setting the content-type to "text/javascript". Supposedly the response is not revealed directly to the web app: the DOM, which is trusted, evaluates the Javascript and so the web app only gets access to the values the script assigns to variables. In NaCl's case, however, interpreting .so files is done by untrusted code. We have to reveal the fetched data to the web app, so NaCl cannot be as unconstrained as the <script> tag.

Sharing libraries across sites

It will be desirable to share library files across sites, so that the browser does not have to download identical files multiple times. This problem already exists for Javascript libraries. NaCl executables and libraries are expected to be larger than Javascript libraries so the problem becomes more important.

For Javascript libraries, the main (only?) mechanism for doing this is the <script> tag. This allows sharing in a centralised model in which multiple sites pick a central site to download the library file from. This works because <script> does not follow a Same Origin Policy. Sites that rely on the central site are vulnerable to it, since it can change the file contents it serves. The script text is not available across origins, so the web app cannot check the text against a hash before running it.

For NaCl, web apps could fetch libraries using CORS or Uniform Messaging (formerly known as GuestXHR), which are not NaCl-specific.

We might also wish to allow decentralised sharing of files. For example, sites A and B both host libfoo.so. If the browser has already downloaded libfoo.so from site A, it won't need to download it again from site B, and vice-versa. Schemes for doing this by embedding secure hashes into URLs have been proposed (see Douglas Crockford's post).
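
The idea of decentralised sharing can be sketched as a cache keyed by content hash rather than by URL. This is an illustration only, not the scheme from the post referred to above: the class name, the choice of SHA-256, and passing the expected hash as a separate argument (instead of parsing it out of the URL) are all assumptions.

```python
import hashlib


class ContentAddressedCache:
    """Cache fetched files by the hash of their contents, not by URL."""

    def __init__(self):
        self._by_hash = {}

    def fetch(self, url, expected_sha256, download):
        """Return the file with the given hash, downloading only on a miss.

        `download` is a caller-supplied function mapping a URL to bytes.
        """
        cached = self._by_hash.get(expected_sha256)
        if cached is not None:
            return cached  # already fetched, possibly from another site
        data = download(url)
        actual = hashlib.sha256(data).hexdigest()
        if actual != expected_sha256:
            raise ValueError("hash mismatch for %s" % url)
        self._by_hash[expected_sha256] = data
        return data
```

Because the hash is checked before the data is stored, a copy downloaded from site A can safely satisfy a later request that names site B, provided both requests carry the same hash.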

This problem is not unique to NaCl, so we should not adopt a solution which is NaCl-specific.

Prefetching

The naive approach is to fetch each library file as ld.so requests it. We could reduce latency by listing up front all the libraries we expect to load. The Javascript code can then request the files on startup, fetching them into the browser's cache. The only change is that the requests are pipelined rather than issued one at a time as ld.so discovers each dependency.
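
The effect of listing the libraries up front can be sketched as follows. In the real system this would be done by the page's Javascript; the Python below is only a model, and `prefetch`, the `fetch` callback, and the worker count are hypothetical names and values.

```python
from concurrent.futures import ThreadPoolExecutor


def prefetch(urls, fetch, max_workers=4):
    """Issue all requests at once and return a {url: body} mapping.

    `fetch` is a caller-supplied function mapping a URL to its contents.
    """
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so bodies line up with urls.
        bodies = list(pool.map(fetch, urls))
    return dict(zip(urls, bodies))
```

When ld.so later asks for a library, the request can be served from what has already been fetched instead of starting a new round trip.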

Versioning

As with static linking, each web app gets to choose its own version of libc and other libraries. Furthermore, different NaCl processes in the same web app can use different libc versions. Libc is not supplied by the browser.

We don't expect there to be a huge number of libc versions, but older and newer versions of the same libc are likely to be around at the same time, as are different libc implementations (such as newlib and glibc).

Web apps get to pick a set of libraries that are known to work well together. This is analogous to selecting a set of Javascript libraries, or selecting a set of packages for a software distribution such as Debian or Fedora. This way we can avoid "DLL hell"; libraries are not the responsibility of the end user.

This provides extra flexibility that is not available to typical applications on Linux when packaged with commonly-used packaging systems like dpkg or RPM. Packaging systems such as Zero-Install and Nix allow multiple library versions to coexist in the same way that I am proposing for NaCl.

Despite this extra flexibility, we still have all the versioning mechanisms normally available to ELF shared libraries: libraries can opt to provide stable ABIs and declare interfaces via sonames and ELF symbol versioning, and we keep the benefit of separate compilation.
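
One way to picture per-app library selection is a manifest that maps each soname to the URL the web app has chosen for it. The sketch below is an assumption about how this could look, not a defined NaCl format: the function name, the manifest layout, the URLs, and the version numbers are all hypothetical.

```python
def resolve_needed(needed_sonames, manifest):
    """Map each required soname to the URL chosen by the web app."""
    missing = [s for s in needed_sonames if s not in manifest]
    if missing:
        raise LookupError("no URL for: " + ", ".join(missing))
    return [manifest[s] for s in needed_sonames]


# Two hypothetical manifests that pin different builds of the same soname.
manifest_old = {"libc.so.6": "http://example.com/libs/glibc-old/libc.so.6"}
manifest_new = {"libc.so.6": "http://example.com/libs/glibc-new/libc.so.6"}

old_urls = resolve_needed(["libc.so.6"], manifest_old)
new_urls = resolve_needed(["libc.so.6"], manifest_new)
```

Two NaCl processes in the same web app could each be started with a different manifest and so load different libc versions under the same soname.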

Upgrading libraries is the responsibility of the web app.

Licensing issues

Disadvantages of dynamic linking

Terminology

These terms all mean much the same thing: dynamic library, shared library, shared object, dynamic shared object (DSO), and .so file.

See also

NativeClient/DynamicLibraries (last edited 2009-12-23 12:29:40 by MarkSeaborn)