Web powerbox interface
Update: A concrete Powerbox spec has now been posted on the public-device-apis mailing list, which partially supersedes the description on this page.
Background
We wish to provide a generic mechanism by which a user can grant one web app access to a service provided by another web app. Services are represented as web-keys (unguessable URLs), but these are not exposed to the user.
The immediate motivation is to provide a way for web apps to be granted access to local devices, such as microphones, cameras and GPS (geolocation). Others have proposed that these devices be accessed through new JavaScript interfaces built into the web browser. We propose instead that these devices be accessed as local web services, through RESTful APIs, and that the problem of granting access be solved in the general case, so that it works for non-local resources too.
Overview
The basic model is that the browser has a list of providers of services, similar to a list of bookmarks. Each provider in the list has a name, a web-key, a list of service type names, and other metadata.
A web page can passively request a service using an <input> element (with a new type attribute value), which is rendered as a button. Clicking the button opens a dialog containing the list of providers, filtered by type. Depending on the kind of provider, selecting a provider either grants a service to the requesting page straight away (a non-interactive provider), or it opens further UI, supplied by the provider, in a new tab (this is an interactive provider).
A web page can passively offer up a service provider using a new HTML element, also rendered as a button. Clicking the button opens a dialog to add to the user's list of providers, similar to adding a bookmark.
Rationale
Web browsers already provide one trusted-path "powerbox" user interface for authorising access: the file upload dialog. Web Powerbox goes beyond this:
- it attempts to be generic by supporting an open-ended set of service types;
- it supports multiple service providers;
- it provides a way to register service providers.
a) Web Powerbox aims to be generic because:
- It is clear that there are many types of service that could use this rendezvous/grant facility. The list of use cases currently has 13 items. It is likely that there are further uses that we have not thought of. The aim of Web Powerbox is that new services can be implemented without having to change the browser.
- We want to provide a uniform user interface across the different service types, to help the user to develop expectations about how the UI works. If every type of service had a different trusted UI, the user's experience with one would not carry over into another.
b) Web Powerbox supports multiple providers because:
- There are often genuinely multiple choices. For selecting an image file, there is a choice between:
- taking a photo with a built-in camera
- uploading an image from a local file
- selecting an image from a photo album, e.g. on Flickr
- selecting an image from a general-purpose cloud filesystem, e.g. Tahoe-LAFS
- Even for services where there is usually only one choice, multiple choices are still possible. To take the example of geolocation, the user can only be in one place at a time. However, it is still useful to select another entity's location, e.g. to plot the location of the user's cat's GPS tag on a map.
- Asking multiple-choice questions lets us avoid yes-or-no "Is this OK?" security questions. The user is performing an act of designation, not merely confirmation. The aim is that policy decisions disappear into the user's workflow. Yes-or-no questions are usually bad because they do not offer the user a genuine choice.
- Even if the filtered list of providers sometimes contains only one item (e.g. geolocation), we hope the user will understand that they have a choice, by virtue of having been exposed to the same UI in the multiple-choice case.
c) Web Powerbox provides a UI for registering new providers. We expect that some providers will be built-in/pre-installed -- in particular, local devices, such as camera, audio, geolocation, and maybe a local printer. But other providers cannot be pre-installed, because they are specific to the user accounts of whatever remote web services users choose to use.
We also want to provide a means for building new services out of existing services. For example, a user concerned about privacy might want a time-limited geolocation service which provides geolocation data for some period of time before expiring. A built-in geolocation service might not provide this. The user has the option of wrapping the built-in service with one that provides this revocation feature.
Powerbox: register a provider
Arguments:
obj: an unguessable URL (i.e. a web-key)
kind: specifies how obj should be treated:
Interactive provider: opens a new tab (obj can be a URL to POST to).
Non-interactive provider: does not open a new tab.
Simple service: The same service web-key is always granted, so grants are not separately revocable. This is simply an optimisation of a non-interactive provider that avoids a round trip to the provider's server.
type_names: list of service types that are applicable. This is used for filtering displayed provider lists.
nickname: short name describing the provider
desc: longer description of the functionality provided and how it is expected to be used
wants_identity: whether the provider wants to receive the requester's ID (see requester_id below)
info_url (optional): URL, possibly specific to the user, to visit to see information about this service, which might include features for reviewing and revoking grants.
UI:
- Opens a dialog for adding the provider to the user's list of providers.
- This is analogous to saving a file or bookmarking a page.
The user gets to choose a petname for the provider. The default petname is <provider_site_id> + <nickname>. (provider_site_id might typically be the site's domain name; see requester_id below. This is not the domain name from the obj URL, but the identity of the site making the register_provider request.)
Returns:
- Should this indicate whether the user kept the provider? Do we want to support lazily allocating web-keys?
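As an illustration only, the registration arguments above could be modelled as a small record. This is a sketch in Python, not part of the proposal: the class names, the enum values, and the single-space separator in the default petname are all assumptions, since the page only gives the default as <provider_site_id> + <nickname>.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ProviderKind(Enum):
    """The three values of `kind` described above (enum values are hypothetical)."""
    INTERACTIVE = "interactive"          # opens a new tab
    NON_INTERACTIVE = "non-interactive"  # invoked in the background
    SIMPLE_SERVICE = "simple-service"    # obj itself is always the granted web-key


@dataclass
class ProviderRegistration:
    """Arguments of a register-a-provider request, as listed above."""
    obj: str                      # unguessable URL (web-key)
    kind: ProviderKind
    type_names: List[str]         # used to filter displayed provider lists
    nickname: str
    desc: str
    wants_identity: bool = False  # whether requester_id should be passed on invocation
    info_url: Optional[str] = None


def default_petname(provider_site_id: str, registration: ProviderRegistration) -> str:
    """Default petname offered in the dialog; the user may change it.

    provider_site_id identifies the site making the request, not the
    domain in registration.obj. The space separator is an assumption.
    """
    return f"{provider_site_id} {registration.nickname}"
```

The user-chosen petname is deliberately kept out of the record, because it belongs to the user's provider list rather than to the request made by the site.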
Powerbox: request a service
Arguments:
type_names: list of service types that the requester can accept
args_for_provider: arguments to be passed to the provider
UI:
- Opens a dialog for choosing a provider from the user's list of providers.
- This is analogous to choosing a file to upload.
The displayed list of providers is filtered by type name. A provider is displayed if its type_names intersects the requested type_names.
- When the user chooses a provider, the browser invokes the provider.
- If the user picks a non-interactive provider, the browser invokes the provider's web-key in the background with a POST. The provider responds with a service web-key, which the browser returns to the requester.
- Otherwise, if the user picks an interactive provider, the browser opens a new tab with a POST to the provider's web-key. The provider can respond with a service web-key after interacting with the user in this tab. (This could be opened in a pop-up window instead of a new tab, but it may be better to reserve pop-up windows for trusted interfaces provided by the browser.)
- There should be a visual distinction between interactive and non-interactive providers in the list. Interactive providers could have an arrow icon to indicate that they involve further interaction.
- If we open a new tab, what state is the initiating tab (and the provider-chooser dialog box) left in, if the user switches back to it? While browser-provided file choosers are usually modal, untrusted tabs should not be modal.
- What does the requester's UI do while the browser/powerbox is waiting for the provider to respond? (This would not be an issue if we used promise objects; it would be the responsibility of the web app's UI to handle this.)
Returns: the granted service web-key
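The request flow described in this section can be sketched as follows. This is a minimal Python model under stated assumptions: the Provider record, the kind strings, and the two injected callables (post_in_background and post_in_new_tab, standing in for whatever the browser does to POST to a provider) are hypothetical names, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical signature: (provider web-key, args_for_provider) -> granted service web-key
Invoker = Callable[[str, Dict[str, str]], str]


@dataclass
class Provider:
    petname: str
    web_key: str
    kind: str  # "interactive", "non-interactive", or "simple" (assumed labels)
    type_names: List[str]


def filter_providers(providers: List[Provider], requested: List[str]) -> List[Provider]:
    """Keep providers whose type_names intersect the requested type_names."""
    wanted = set(requested)
    return [p for p in providers if wanted & set(p.type_names)]


def grant_service(
    provider: Provider,
    args_for_provider: Dict[str, str],
    post_in_background: Invoker,
    post_in_new_tab: Invoker,
) -> str:
    """Return the service web-key granted by the provider the user chose."""
    if provider.kind == "simple":
        # Simple service: the same web-key is always granted, so no round trip.
        return provider.web_key
    if provider.kind == "non-interactive":
        return post_in_background(provider.web_key, args_for_provider)
    if provider.kind == "interactive":
        # The provider replies only after interacting with the user in its tab.
        return post_in_new_tab(provider.web_key, args_for_provider)
    raise ValueError(f"unknown provider kind: {provider.kind!r}")
```

Passing the invokers in as parameters keeps the sketch independent of any particular REST binding, which the page leaves undefined.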
Invoking a provider
Arguments:
requester_id: human-readable string containing a description of the requester.
Only needs to be passed if the provider was registered with wants_identity=True.
This might contain the requester's domain name. In a web browser with a petname facility (e.g. the Petname extension for Firefox), this can contain the requester's petname.
- This is intended to be displayed by the service in the request UI (for interactive providers) and in any revocation/review UIs.
- This is not intended to be interpreted by the service. The service is expected not to use this string to make access control decisions.
args_for_provider: arguments passed by the web app in the powerbox request
Returns: the granted service web-key
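A provider invocation could be assembled along the following lines. The page does not define a wire format (the REST binding is still a TODO), so the JSON layout and the function name here are assumptions made purely for illustration. The one rule taken from the text is that requester_id is included only for providers registered with wants_identity.

```python
import json
from typing import Dict, Optional


def build_invocation_body(
    args_for_provider: Dict[str, str],
    wants_identity: bool,
    requester_id: Optional[str] = None,
) -> str:
    """Build a POST body for invoking a provider (JSON layout is hypothetical)."""
    body = {"args_for_provider": dict(args_for_provider)}
    if wants_identity and requester_id is not None:
        # Human-readable description for display in request and review UIs.
        # The provider is expected not to base access control decisions on it.
        body["requester_id"] = requester_id
    return json.dumps(body, sort_keys=True)
```

Keeping args_for_provider in its own nested object means a requester cannot overwrite the browser-supplied requester_id by choosing a clashing argument name.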
Invoking the powerbox
All web apps are given the ability to invoke the powerbox, to ask to register a provider or to get a service.
The interface is a DOM element (probably an <input> element for service requests). This means that the web app cannot open the dialogs asynchronously; the dialogs are only opened in response to a user click.
A powerbox request is made on behalf of a web site's identity. We want this identity to match the identity shown in the browser's trusted chrome (usually the URL bar), so that grants are recorded against an identity that the user expects, in order to help the user in reviewing and revoking grants. This means we probably want to disallow powerbox requests inside cross-origin iframes.
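One way to express the cross-origin restriction is to compare the origin of the frame making the request with the origin of the top-level page shown in the URL bar. The following Python sketch assumes the usual (scheme, host, port) notion of origin; the function names are hypothetical and a real browser would use its own origin machinery.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return (scheme, host, port), filling in the default port for http(s)."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else _DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def may_invoke_powerbox(frame_url: str, top_level_url: str) -> bool:
    """Allow a powerbox request only if the frame shares the top-level origin."""
    return origin(frame_url) == origin(top_level_url)
```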
Revocation
Providing a facility to revoke grants is the responsibility of service providers.
Revocation is not needed for one-shot services. For example, once a photo has been uploaded, the receiver's possession of the data cannot be undone.
However, it is possible and desirable to revoke access to future data from geolocation, microphone, camera etc.
Some services might want to display indicators as part of the browser's trusted chrome or as icons in the system tray; for example, a "Camera is recording" indicator with a "Stop" button. This ability might be reserved to local services that are built in to the system, such as microphone and camera. Chromium extensions may be one way that web apps can be granted the ability to display icons or notifications like this. Mechanisms for doing this are outside the scope of the Web Powerbox spec.
Use cases
Types of provider:
- one-shot camera or video access (interactive: displays preview before granting)
overlaps with Capture API (from DAP working group)
- real-time video input stream (camera)
- real-time audio input stream (microphone)
overlaps with Device element proposal
- audio output (local or remote)
overlaps with Pepper API (an extension to NPAPI) which grants audio output implicitly, and probably with other specs
- geolocation
- one off: snapshot of current location
- follow location
overlaps with Geolocation API
- file chooser (interactive: displays file listing)
- select from local files
- select from files on a Tahoe-LAFS distributed filesystem
overlaps with File API, directories API, File Writer API, File Reader/upload API
- printers
- an interactive printer provider can display a printer listing
- a non-interactive printer service can refer to a specific printer
overlaps with Cloud Printing Proposal
- scanners
- quantities of local or remote storage
overlaps with IndexedDB API (formerly WebSimpleDB), Web Storage API, directories API, all of which provide persistent local storage.
- contact lists, e.g. for importing contacts from a mail app into a social networking site
- add-event-to-calendar
- form-filling: providing a delivery address
- providing payment
overlaps with OpenTransact, based on OAuth
- sensor devices, e.g. orientation sensor, thermometer, heart rate monitor
It is possible to create wrappers that attenuate a service. For example:
- Time-limited geolocation. Revokes access after a time specified by the user. The time limit can be chosen when access is granted, using an interactive provider.
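A time-limited wrapper of this kind might look like the sketch below. It models the wrapped service as a plain Python callable rather than as a web service, and the class name, the PermissionError signalling, and the injectable clock are all assumptions made to keep the example self-contained.

```python
import time
from typing import Callable, Tuple

Location = Tuple[float, float]  # (latitude, longitude)


class TimeLimitedGeolocation:
    """Wraps a geolocation source and refuses access once the grant expires."""

    def __init__(
        self,
        get_location: Callable[[], Location],
        lifetime_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._get_location = get_location
        self._clock = clock
        # The lifetime would be chosen by the user when access is granted.
        self._expires_at = clock() + lifetime_seconds
        self._revoked = False

    def revoke(self) -> None:
        """Let the user end the grant early."""
        self._revoked = True

    def get_location(self) -> Location:
        if self._revoked or self._clock() >= self._expires_at:
            raise PermissionError("geolocation grant has expired or been revoked")
        return self._get_location()
```

The requester only ever holds the wrapper, so once the deadline passes it has no remaining path to the underlying service.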
Virtualised services:
- Audio input: could combine audio from multiple sources, or provide audio data streamed from a file
- Audio output: could set up my netbook to output sound through my desktop's better-quality speakers
- Geolocation: can provide the location of my cat's GPS tag
REST binding
The interfaces above are abstract. TODO: Define a concrete mapping onto a REST API.
Questions
Terms
What terminology should we use?
- "Object" or "service" or "provider"? One-shot camera access would return a URL for the photo that was taken (or chosen); this is a specific file rather than a service.
- "Provider" or "maker" or "factory"? "Factory" sometimes suggests an object capable of constructing multiple types of object. However, "maker" is vaguer, less formal, and less widely used as a term.
- Does "request" mean "request-an-object" or also "request-to-register-a-provider"?
Service type names
What format should type names take?
- To some extent it doesn't matter. Services can establish de-facto standards.
- In some situations we can use MIME types. Photo upload can take place via a file chooser or a one-shot camera usage. The take-a-photo service is displayed when the web app requests an image MIME type.
- Should register_provider and request_service take lists of type names? Should the type name(s) be passed to the provider?
- Should we attempt to display type names to the user? If not, it might not be clear why or how the displayed list of services is filtered when a service is requested.
Some types are subtypes of others. For example, "image file" is a subtype of "file". A file chooser can return image files as well as files more generally; a camera can just return image files. The browser does not need to know about these subset relationships. A requester can ask for ["file", "image file"], or a file chooser can advertise itself as providing ["file", "image file"].
Tab-scoping grants
How can we arrange a grant which is limited in scope to the lifetime of a tab? (That is, the grant is revoked when the user closes the tab or destroys the DOM by navigating away, or if the machine is rebooted.) The provider could receive an object (a browser-served URL) which represents the tab's lifetime. This needs to be fail-safe.
Exposure of web-keys
When are we hiding URLs-as-caps from the user and when are we exposing them?
- Granted service web-keys are always hidden.
- Provider URLs are exposed if they are interactive providers; the browser opens a tab on the URL. Non-interactive provider URLs are not exposed.
info_url will be exposed when visited.
Exposure of petnames
If the browser provides a petname system, what happens if the user changes the petname for a site?
There are three options, depending on how far we want to go in implementing Mark Miller's proposed petname markup system:
- Never expose a petname string to providers. Instead, expose an opaque key representing the requester's identity (e.g. a public key). Add a DOM element which takes this opaque key as an argument and displays the user's petname. This has the disadvantage that, since the provider app cannot get petname strings, it cannot (for example) sort a list of grants by petname.
- Send an identity key to the provider, but provide a DOM API for mapping this key to the user's current petname. Hence displayed petnames are up-to-date, but the provider can see the petnames that the user has chosen for sites that they have granted the provider's services to.
- Send the petname string to the provider at grant time.
Prototyping
We could prototype this system without changing the web browser, by implementing the powerbox itself as a local web service, perhaps one which is able to open windows on the screen. Instead of requesting/offering services/providers via a DOM element button, web apps can make powerbox requests via XHR on a powerbox web-key. This means that the provider-chooser dialog can be opened asynchronously (i.e. not in response to a user click) and interrupt the user, which is not ideal.
Each web site should be given a different powerbox web-key. How can these web-keys be set up? We could set this up manually by copying a web-key into a form on the web site, or we could use Greasemonkey (or similar) to make a web-key available through the DOM.
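For the prototype, per-site powerbox web-keys could be minted by the local service roughly as follows. This is a sketch under assumptions: the class name, the base URL, and the choice to put the secret token in the URL path are all hypothetical, and only the use of Python's standard secrets module for unguessable tokens is a known quantity.

```python
import secrets
from typing import Dict, Optional


class PowerboxKeyTable:
    """Maps unguessable powerbox web-keys to the site each was issued to."""

    def __init__(self, base_url: str = "http://localhost:8000/powerbox/") -> None:
        self._base_url = base_url  # hypothetical address of the local powerbox service
        self._site_by_token: Dict[str, str] = {}

    def mint(self, site_id: str) -> str:
        """Create a fresh web-key for a site; each call yields a new key."""
        token = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
        self._site_by_token[token] = site_id
        return self._base_url + token

    def site_for(self, web_key: str) -> Optional[str]:
        """Return the site a web-key was issued to, or None if it is unknown."""
        if not web_key.startswith(self._base_url):
            return None
        return self._site_by_token.get(web_key[len(self._base_url):])
```

Because the powerbox learns the requesting site from the web-key itself, it can record grants against that site without trusting anything else in the XHR request.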
References
"Why aren't most devices virtual web services?", Mark Miller, January 2010 on public-device-apis
Introduction to web-keys: Mashing with permission, Tyler Close, 2007

