Rethinking files

In Unix, everything is a file. This is usually counted as a strength, but the file is a poor abstraction for many things. This discussion goes into some of the issues.

I've previously written about the concept of nexuses. To briefly reiterate, an integration nexus is a commons which different applications on the same system can use to interoperate. A filesystem is a canonical example of an integration nexus, because it allows applications to consume files generated by other applications, without even necessarily knowing or caring what application generated the file. A maintenance nexus provides an operator with access to the various resources in a system so that they can operate on them. The Unix filesystem is both an integration nexus and a maintenance nexus. Unix sysadmins ssh into machines to administer them, and by and large the shell they are immediately given lets them work with the filesystem and the files on it.

Arguably, an operating system lives and dies by the quality of its nexuses. The quality of Unix's filesystem as an integration and maintenance nexus makes it a pleasing system to work with, and has doubtless earned it much favour.

How do we square this with the view that “everything is a file” is perhaps a mistake? First, let's observe what actual utility the “everything is a file” concept gave us. It seems clear that what it gave us was a common set of verbs with which to operate on files.

Different types of file (normal files, directories, block devices) have very different semantics, yet all of them are forced into the straitjacket of a limited set of common verbs. The result is that the semantics of read() and write() are massively overloaded and vary from one type of file to another.
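As a small illustration of that overloading, the sketch below applies the same open() and read() calls to a regular file and to a directory. It assumes a POSIX system such as Linux, where a directory can be opened read-only but read() on it is refused; the helper name try_read is made up for this example.

```python
import errno
import os
import tempfile


def try_read(path: str) -> str:
    """Apply the same open()/read() verbs to any path and report the outcome."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.read(fd, 16)
        return "ok"
    except OSError as exc:
        # Report the symbolic errno name rather than a platform-specific message.
        return "failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as directory:
    regular = os.path.join(directory, "notes.txt")
    with open(regular, "w") as handle:
        handle.write("hello")
    file_result = try_read(regular)    # a regular file yields its bytes
    dir_result = try_read(directory)   # a directory opens, but read() is refused

print(file_result)
print(dir_result)
```

Both objects are "files" and both accept open(), yet only one of them gives read() a meaning.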

The advantage of this seems to be that such files remain addressable by the tooling available to an operator, that is, by the maintenance nexus, in the form of commands like ls and cat. We force all types of file into a narrow set of overloaded verbs because by doing so, we can expose those verbs into our maintenance nexus so that humans can use them.

The underlying problem here is that the set of verbs in a Unix system cannot be extended. It is small and fixed. ioctl() is the most extreme example of the problems this causes; it exists for the deliberate purpose of allowing arbitrary semantics to be compacted into a single fixed verb.
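To make the ioctl() point concrete, here is a minimal sketch using Python's fcntl module on a POSIX system. It asks the kernel how many bytes are waiting in a pipe via the FIONREAD request, an operation that has nothing to do with reading or writing but still has to be squeezed through the one generic verb as an opaque request number plus a packed buffer. The helper name pending_bytes is an assumption for illustration, and I am assuming a platform (such as Linux) where FIONREAD is supported on pipes.

```python
import fcntl
import os
import struct
import termios


def pending_bytes(fd: int) -> int:
    """Ask the kernel how many unread bytes are queued on fd."""
    # The "arguments" of this verb are an integer request code and a raw
    # buffer; nothing about the call describes what it actually does.
    buf = struct.pack("i", 0)
    result = fcntl.ioctl(fd, termios.FIONREAD, buf)
    return struct.unpack("i", result)[0]


read_end, write_end = os.pipe()
try:
    os.write(write_end, b"hello")
    queued = pending_bytes(read_end)
finally:
    os.close(read_end)
    os.close(write_end)

print(queued)
```

Nothing here is discoverable: a tool has to know the request code and the buffer layout in advance, which is exactly what a richer, self-describing set of verbs would avoid.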

So if we were to replace this with something better, what would it look like?

open() and close() seem non-problematic. People are always going to want to name objects, so we need a way to convert a name into a handle to the resource it names. But suppose that handles to objects, in general, aren't expected to support read() or write() or indeed any verb besides close(). Also, open() is unlikely to be the only way to create objects; while it's perfectly viable to obtain a handle to a resource that already exists, no matter what the type of that resource is, many types of resource (e.g. sockets) don't lend themselves to creation in this way. Plan 9, for example, tried to expose the network as a filesystem, with TCP connections being created via standard filesystem operations, but the actual sequence of steps that must be performed to open a connection (open a clone file, read back a connection number, write a connect message to a ctl file, then open a separate data file) is bizarre and unwieldy, and certainly not as simple as open().

We do want a hierarchical structure of names we can use to identify resources. But unlike Unix, we want to support arbitrary verbs. We also want to support arbitrary types of resource. This means there needs to be a way to define new resource types at runtime, and which verbs they support.

Consider printing, for example. Suppose that a printing system is installed on an operating system. This requires a new resource type: a printer. It doesn't necessarily make much sense for a printer to support the read() or write() verbs; it might instead have a new verb, print().

On modern Unix systems, by contrast, printing may be implemented by a daemon, but it is not exposed on the filesystem natively, as a first-class citizen. A print daemon can create objects on the filesystem, but it cannot extend the filesystem with the set of concepts that printing adds to the computing domain.
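A minimal sketch of what "defining resource types and their verbs at runtime" could mean, written as an in-memory Python model. Everything here (ResourceType, TypeRegistry, the printer's print verb) is hypothetical and exists only to illustrate the idea; it is not a proposal for the actual interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResourceType:
    """A resource type is just a name plus the verbs it chooses to support."""
    name: str
    verbs: Dict[str, Callable[..., object]] = field(default_factory=dict)


class TypeRegistry:
    """Hypothetical system-wide table of resource types, extensible at runtime."""

    def __init__(self) -> None:
        self._types: Dict[str, ResourceType] = {}

    def define(self, name: str, **verbs: Callable[..., object]) -> ResourceType:
        rtype = ResourceType(name, dict(verbs))
        self._types[name] = rtype
        return rtype

    def verbs_of(self, name: str) -> List[str]:
        return sorted(self._types[name].verbs)

    def invoke(self, type_name: str, verb: str, *args: object) -> object:
        rtype = self._types[type_name]
        if verb not in rtype.verbs:
            raise LookupError(f"{type_name} does not support {verb}")
        return rtype.verbs[verb](*args)


registry = TypeRegistry()
# Installing a printing system adds a new type with a single verb, print.
# Note that it supports neither read nor write.
registry.define("printer", print=lambda printer, document: f"{printer} <- {document}")

print(registry.verbs_of("printer"))
print(registry.invoke("printer", "print", "office", "report.pdf"))
```

The point of the sketch is only that the set of types and verbs is data held by the system, not a fixed list compiled into it.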

Why can the set of verbs in a Unix system not be extended? One reason is that every verb needs a command line tool to expose it to the maintenance nexus, that is, to the shell. Such command line tools are essentially glorified converters of string arguments to system calls.

People like Unix's “everything is a file” approach because what it really means is “everything is exposed to the same nexus”. It means you need only ssh to a system and you have all the power to reshape all aspects of that system with a single interface, the command line, using a common set of highly composable tools. As soon as a new type of resource, such as a printer, is added to the system, we want to be able to invoke the print() verb it defines at the command line.

What's interesting is that somebody has already come up with answers here: PowerShell solves a lot of these problems.

Windows is an oddity in that it has essentially two filesystems: the ordinary filesystem, and the Windows registry. The latter isn't ordinarily considered a filesystem, but it sure looks like one to me: a system-wide hierarchy of key-value pairs used to store system state. This clearly wasn't a great design decision and is a needless distinction, but what's interesting is how PowerShell has worked around it.

PowerShell, you see, isn't limited to working with filesystem objects. You can literally cd into the Windows registry and operate on registry items. This is not due to any kind of kernel support; it's something that PowerShell models internally on top of the OS. It has essentially artificially reunified two filesystems into a single addressable namespace to provide a unified maintenance nexus. Amusingly, in doing so, it has arguably created a unified maintenance nexus superior to that of Unix.

What gets even more interesting is that PowerShell supports arbitrary verbs. PowerShell commands are named in Verb-Noun form, for example New-Item or Get-ChildItem. These verbs aren't executables on the filesystem. You can have new object types, and with new object types come their verbs. Registry items have different verbs from normal files. Even WMI (another nexus Windows has, which lets you query system information using WQL, an SQL-like language) is reachable from PowerShell: cmdlets such as Get-CimInstance accept a WQL query and return the matching objects.

Supporting this requires a capacity for reflection. It requires the ability to introspect the defined object types at runtime and their supported verbs, and what arguments those verbs have and how to convert strings typed on the command line to the correct format. Importantly, once you do this, you no longer need executables to adapt from the command line to system calls or method invocations; the shell itself can do the necessary conversions automatically. This eliminates the need to manually write such an executable-to-function-call adapter for every type of verb a system might support.
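The conversion step described above can be sketched with Python's inspect module standing in for the verb metadata. The verb print_document and the helper invoke_from_argv are hypothetical names; the assumption is simply that each verb declares its parameter types somewhere a shell can read them.

```python
import inspect
from pathlib import Path
from typing import Callable, List


def print_document(printer: str, document: Path, copies: int = 1) -> str:
    """Hypothetical verb; a real one would hand the job to a print service."""
    return f"{copies} x {document.name} -> {printer}"


def invoke_from_argv(verb: Callable[..., object], argv: List[str]) -> object:
    """Convert command-line strings to the types the verb's signature declares."""
    params = list(inspect.signature(verb).parameters.values())
    if len(argv) > len(params):
        raise TypeError(f"{verb.__name__} takes at most {len(params)} arguments")
    converted = []
    for param, raw in zip(params, argv):
        # Use the declared type as the converter; fall back to the raw string.
        has_type = param.annotation is not inspect.Parameter.empty
        convert = param.annotation if has_type else str
        converted.append(convert(raw))
    return verb(*converted)


# What a shell would do for: print_document office /some/file.pdf 2
result = invoke_from_argv(print_document, ["office", "/some/file.pdf", "2"])
print(result)
```

No per-verb adapter was written here: invoke_from_argv works for any function whose parameters carry type information, which is the role the verb metadata plays in the design above.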

How should new object types be defined? The last thing I want to do is introduce a new aspect of system configuration that has to be managed. Suppose that filesystems (“file”systems) can export new object types and supported verbs. You might have some FUSE-like filesystem for example, that you can just install into an existing system without kernel modifications. Maybe a new userspace filesystem is used to manage printers, for example. This filesystem, when it is mounted (really, bound into the global namespace at some prefix) communicates the object types it defines, and their verbs, to the system.

We shouldn't assume that mounting a filesystem is an operation performed by a trusted party, though. Unprivileged users may want to mount filesystems. So we need to ensure that untrusted parties defining new object types and verbs can't affect other parts of the system. We can do this by namespacing object types and verbs. Of course, when we think of “namespacing”, we think of the filesystem itself; there's no need to introduce a new namespace. Verbs and object types themselves can be objects which exist on the filesystem.

Let's suppose this printer filesystem is mounted at /printer. The object type printer might then be exposed at /printer/type/printer and the verb print might be exposed at /printer/verb/print. The /type/ and /verb/ parts here have no special meaning; a specific filesystem implementation can choose arbitrarily where to put object type objects and verb objects beneath itself.

Because all verbs are required to carry with them metadata communicating their arguments and how they are to be invoked, we don't need to create new executables to invoke the print verb. Our shell can do the invocation automatically based on the verb's metadata. The syntax could look exactly like executable invocation:

/printer/verb/print /printer/some/printer /some/file
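Here is a rough Python model of how a shell might resolve that command line, assuming a global namespace that filesystems bind their exports into. The Namespace class, its mount and run methods, and the contents of printer_fs are all invented for this sketch.

```python
import shlex
from typing import Dict


class Namespace:
    """Hypothetical global namespace mapping absolute paths to objects."""

    def __init__(self) -> None:
        self._objects: Dict[str, object] = {}

    def mount(self, prefix: str, exports: Dict[str, object]) -> None:
        """Bind everything a filesystem exports beneath the given prefix."""
        for relative, obj in exports.items():
            path = prefix.rstrip("/") + "/" + relative.strip("/")
            self._objects[path] = obj

    def lookup(self, path: str) -> object:
        return self._objects[path]

    def run(self, command_line: str) -> object:
        """Treat the first word as a verb path and the rest as its arguments."""
        verb_path, *args = shlex.split(command_line)
        verb = self.lookup(verb_path)
        if not callable(verb):
            raise TypeError(f"{verb_path} is not a verb")
        return verb(*args)


def print_verb(printer_path: str, file_path: str) -> str:
    return f"printing {file_path} on {printer_path}"


# The filesystem decides its own layout; type/ and verb/ are only conventions.
printer_fs = {
    "type/printer": {"kind": "type", "verbs": ["print"]},
    "verb/print": print_verb,
    "some/printer": {"kind": "printer"},
}

ns = Namespace()
ns.mount("/printer", printer_fs)
result = ns.run("/printer/verb/print /printer/some/printer /some/file")
print(result)
```

Because the verb lives under the mount prefix, an untrusted filesystem mounted elsewhere cannot shadow it or inject verbs into another part of the tree.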

Of course, the beauty of this is that because verbs are also filesystem paths, they could be added to a shell's search $PATH like ordinary executables. Thus, it is entirely up to the user whether they trust a filesystem and want to put some of its verbs into their $PATH.
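The search itself would work just as it does for executables. A tiny sketch, with the set of known paths standing in for the namespace and resolve_verb being a made-up helper:

```python
from typing import Iterable, Optional, Set


def resolve_verb(name: str, search_path: Iterable[str], known_paths: Set[str]) -> Optional[str]:
    """Resolve a bare verb name against the user's search path, first match wins."""
    if name.startswith("/"):
        return name if name in known_paths else None
    for directory in search_path:
        candidate = directory.rstrip("/") + "/" + name
        if candidate in known_paths:
            return candidate
    return None


known_paths = {"/bin/ls", "/printer/verb/print", "/mnt/untrusted/verb/print"}

# The user trusts /printer, so its verb directory is on the search path.
trusted = resolve_verb("print", ["/bin", "/printer/verb"], known_paths)

# Without that entry the bare name resolves to nothing, even though an
# untrusted filesystem elsewhere also exports a verb called print.
untrusted = resolve_verb("print", ["/bin"], known_paths)

print(trusted)
print(untrusted)
```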

No longer being limited to a small set of object types (normal files, directories, block devices, character devices, sockets, named pipes, symlinks), interesting things become possible. For example, you could introduce an SQL table object type which is emphatically not an opaque sequence of bytes in the Unix way, and which is operated on with verbs like SELECT, all natively, in the shell, à la PowerShell.

Having verbs be introspectable at runtime would also allow dynamic programming environments to expose new verbs automatically. For example, opening an object in Python could cause verbs supported by that object to automatically manifest on the resulting Python object. This is akin to the dynamic autogeneration of proxies which is popular with RPC systems in dynamic programming languages. In some sense, what we're conceiving of here is files as discoverable RPC.
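In Python, such a proxy might look roughly like the following. The open_object function and the describe callback (which stands in for whatever mechanism reports an object's verbs) are assumptions made for the sake of the example.

```python
from typing import Callable, Dict

Verb = Callable[..., object]


class ObjectProxy:
    """Exposes the verbs an object reports as ordinary Python methods."""

    def __init__(self, path: str, verbs: Dict[str, Verb]) -> None:
        self._path = path
        self._verbs = verbs

    def __getattr__(self, name: str):
        # Only called when normal attribute lookup fails, so real attributes win.
        verbs = self.__dict__.get("_verbs", {})
        if name not in verbs:
            raise AttributeError(name)
        verb = verbs[name]
        # Bind the object's own path as the verb's first argument.
        return lambda *args: verb(self._path, *args)

    def __dir__(self):
        # Make the discovered verbs visible to dir() and tab completion.
        return sorted(set(super().__dir__()) | set(self._verbs))


def open_object(path: str, describe: Callable[[str], Dict[str, Verb]]) -> ObjectProxy:
    """Hypothetical open(): asks the system which verbs the object supports."""
    return ObjectProxy(path, describe(path))


def describe(path: str) -> Dict[str, Verb]:
    # Stand-in for verb metadata; here every object is treated as a printer.
    return {"print": lambda printer, document: f"{document} -> {printer}"}


printer = open_object("/printer/some/printer", describe)
print(printer.print("/some/file"))
```

Nothing about printers is hard-coded in ObjectProxy; the method appears only because the object reported it.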

Addendum

Incidentally, although I have assumed above that objects are referenced by name, that needn't be the only way to reference them, since some objects are hard to name. How should TCP connections be named, for example? Possibly by their four-tuple, but that is likely to create a risk of race conditions: the connection could close, and a new one reuse the same four-tuple, between looking the name up and acting on it.

Instead of forcing things to be referenced by name, we can simply allow our shell to hold handles, much as today's Unix shells can open file descriptors. But the leap comes in allowing a shell to pass these handles to verbs or executables as a command line argument — and in allowing executables or verbs to return handles to a shell:

CONN=$(select * from /net/tcp/connections where destination_port=443 limit 1)
/net/verb/terminate $CONN

Of course, our pie-in-the-sky shell would also have proper array support:

for CONN in $(select * from /net/tcp/connections where destination_port=443); do
  /net/verb/terminate $CONN
done
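The handle-passing idea can be modelled in a few lines of Python. This is only a sketch under invented names (HandleTable, select_connections, terminate), with a list of dictionaries standing in for live TCP connections.

```python
import itertools
from typing import Dict, List, Optional


class HandleTable:
    """Hypothetical per-shell table of open handles, like file descriptors."""

    def __init__(self) -> None:
        self._next = itertools.count(3)  # 0-2 are conventionally the standard streams
        self._objects: Dict[int, dict] = {}

    def insert(self, obj: dict) -> int:
        handle = next(self._next)
        self._objects[handle] = obj
        return handle

    def get(self, handle: int) -> dict:
        return self._objects[handle]

    def close(self, handle: int) -> None:
        del self._objects[handle]


def select_connections(table: HandleTable, connections: List[dict],
                       destination_port: int, limit: Optional[int] = None) -> List[int]:
    """Return handles, not names, for the matching connections."""
    matches = [c for c in connections if c["destination_port"] == destination_port]
    if limit is not None:
        matches = matches[:limit]
    return [table.insert(c) for c in matches]


def terminate(table: HandleTable, handle: int) -> None:
    """The verb acts on whatever object the handle refers to, then releases it."""
    table.get(handle)["state"] = "closed"
    table.close(handle)


connections = [
    {"destination_port": 443, "state": "open"},
    {"destination_port": 80, "state": "open"},
    {"destination_port": 443, "state": "open"},
]
table = HandleTable()
for conn in select_connections(table, connections, destination_port=443):
    terminate(table, conn)

print([c["state"] for c in connections])
```

Because terminate receives a handle rather than a four-tuple, it acts on exactly the connection the query found, which sidesteps the naming race.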