An Engineer’s Guide to the Docuverse
Project Xanadu® is the original hypertext project — the brainchild of Ted Nelson, and the result of careful and passionate work by innumerable clever people over the course of nearly sixty years. Because of the length of its development period, the project has spawned and used many ideas of varying importance, and particularly important ideas have had many names. Because its history spans several eras of computing, ideas spawned by the project that were once considered radical have become commonplace and other ideas that were once commonplace have become forgotten and become radical again. Over this time, most documentation available to the public has been written by Ted, and intended for a non-technical or semi-technical audience.
I had the opportunity to work on Xanadu for five or six years, after spending years as a fan, trying to piece together a general idea of the project from scattered documentation. This gave me the privilege of being able to ask Ted for clarification on both technical and philosophical points, and it also gave me a better view of how different ideas and terms fit into different eras.
Xanadu has a poor reputation in many technical communities. Part of this is due to the popularity of a misleading and in parts factually incorrect 1995 Wired article about the project. However, I consider a larger issue to be the project’s tendency (normal through the 1980s but now very strange) to default to secrecy (even with regard to ostensibly non-secret ideas) and to consider any public release of information in terms of PR. This ultimately meant that most available information was lacking in technical detail or had its technical detail hidden behind a general-audience-friendly front. With a few exceptions (such as Udanax Green / xu88’s FeBe manual), there are no complete, well-organized, publicly-released explanations of Xanadu concepts aimed at engineers. Engineers have, as a side effect, failed to understand how certain ideas fit together and have filled in gaps in the explanation with ideas already familiar to them.
Xanadu concepts are, to a great extent, an alternate (and alien) universe. They were developed earlier than their rough general-computing counterparts in many cases, and are not widely understood outside of the community of former and current Xanadu developers.
I aim to write the guide I would have liked to have been able to read, back before Ted found me and invited me to the project. With any luck, this will clear up some misconceptions, introduce some unfamiliar ideas, and prepare people who are interested in joining the project with enough background to understand the internal documentation quickly.
This document is not intended as an introduction to Xanadu as a whole. In particular, I cannot & will not review the long chronology of the project, although when appropriate I will mention the time period in which something was developed in order to provide context to particular technical decisions. Explaining the historical importance of Xanadu and providing a gentle or gradual introduction to these ideas is also beyond the scope of this document: there is much to be explained and little time, so I will focus on the technical background and assume that readers already have a passing familiarity with Xanadu. For a non-technical introduction and brief chronology, see Ted’s video series Xanadu Basics.
I will be covering only projects I have intimate knowledge of — either from working on them, consulting on them, or immersing myself in their documentation. This leaves large gaps in the chronology (such as xu92 / Udanax Gold), and it means that I cannot adequately cover projects started after 2017 (including the current web-based demo). I will describe a few projects and ideas in passing that I don’t have detailed knowledge of, due to their importance — including OSMIC, xu92, and specialized xu88 enfilade types like the POOMfilade; however, my brief description should not be taken as complete or accurate. I will try to mark these cases clearly.
In some cases, I have independently written open source implementations of some data structures or algorithms developed under Project Xanadu. I will provide links to these when I have them. They are intended as didactic aids — clear examples of implementations — and are often missing optimizations or integrations with other features; they should therefore not be taken as complete or production-ready implementations.
Xanadu®, ZigZag®, and the flaming X logo are registered trademarks. When possible, I will be using the following generic terms: ‘xanalogical’ to refer to a system that incorporates ideas from Xanadu; ‘translit’ or ‘transliterature’ to refer to a xanalogical system for representing hypertext or hypermedia; ‘ZZStructure’ to refer to the underlying data structure and facilities within a ZigZag implementation in the absence of a UI layer.
While I was at one time privy to trade secrets belonging to Project Xanadu, to my knowledge nothing I describe here is subject to currently-live trade secret protection. Furthermore, while I have written code and documentation whose copyright is ceded to Project Xanadu, I am not writing this based on that code and documentation, but instead based on my memories of it. To my knowledge, nothing explained here is secret.
There are several live patents associated with ideas I outline here. Interactive ZigZag interfaces are covered by US262736B1, which was granted in 2001 & so should expire in a few years (while non-interactive ZigZag interfaces are not patent-encumbered at all). Slice intercalation (a means of federating multiple independent zzstructures) is covered by US8682884B2, and is not covered in this document because at the time I worked on the project it was a trade secret; it was granted in 2014. There are two abandoned applications for patents on transpointing windows: one in 2008 and another in 2012; the existence of these applications represents prior art that should ensure that no later patent filed for this technology will be granted. There is an application for a patent on navigable transpointing windows from 2017. And, finally, there is an expired patent on online payment systems. In my experience, the patent status of Xanadu ideas have been no impediment to independent implementation: Ted is generally happy to see his ideas carefully and seriously implemented.
This document is not officially blessed by Project Xanadu, and any inaccuracies are mine. I do not represent Project Xanadu or Ted Nelson.
The aerial overview: key concepts
Work on Xanadu can be divided into two general realms: transliterature, which involves hypertext and hypermedia, and ZigZag, which involves much smaller pieces of data. These realms very occasionally interact (see my section on ZZOGL / FloatingWorld), but for the most part the technologies involved remain separate, and they have distinct and almost contradictory concerns and philosophies.
If we wanted to find a very rough equivalent in the realm of normal computing, transliterature would be word processing while ZigZag would be the use of spreadsheets and databases.
Transliterature, in all its incarnations, involves several concepts missing from or alien to mainstream computing: unique permanent addressing for immutable documents and source data, indirect document delivery, visible connections, and external markup. All of these ideas stem from a single basic idea: the manipulation of immutable sequences of bytes with permanent addresses.
ZigZag’s universe is based around a particular data structure — a generalization of the spreadsheet, wherein a cell can have connections along arbitrarily many named dimensions.
I will focus on transliterature in this document because it has a longer history and more concepts associated with it. It is also what most people associate with the ‘Project Xanadu’ name.
An introduction to transliterature
Transliterature is the new name for what used to be called ‘hypertext’. Specifically: it is a way of thinking about text and other media, and a set of operations and features that way of thinking makes possible. Non-xanalogical hypertext systems like the World Wide Web implement some of the flashier features of transliterature without understanding or implementing some of the core ideas, and as a result, implementing other features is either impossible, awkward, or unreliable. Since these non-xanalogical systems dominate the discussion about ‘hypertext’, we’ve changed the name.
On the back end, there are three tenets of transliterature:
- All addresses are permanently affixed to the data they refer to
- All meta-information (such as links, formatting, and rearrangement rules) is distinct from the original information, and is therefore stored as a distinct chunk of data with a distinct address
- All operations are available to all users, because no operations are destructive
This shakes out in several different ways. One is indirect document delivery.
Indirect document delivery is the idea that a document, as visible to a user, is the result of the combination of material from various sources. The way this works in practice depends on the implementation, but the current method (used since the end of the xu92 project — so, dominant since 1990 or so) is the EDL and ODL. The EDL and ODL are chunks of data with permanent addresses like anything else, but they are interpreted by the client in a particular way.
An EDL (Edit Decision List) is a set of ‘spans’ — which is to say, permanent addresses of chunks of data with byte offsets and lengths. (In xu88, a span was called a ‘threespan’ because it had three components: an address, a start, and a length.) The spans are fetched and put together in the order in which they occur in the EDL document. The resulting frankenstein is known as the ‘concatext’ (a portmanteau of ‘concatenated text’).
An ODL (Overlay Decision List) contains rules about how to display content that falls within particular address spans. Because a concatext should never contain any kind of markup (even span and div tags count as meta-information about which parts of a document are supposed to be divisible), all formatting is performed using these rules. All font information is provided by an ODL assigning a font or emphasis style to a particular span of text, for instance. The ODL also contains information about links (by claiming that one or more spans are connected by a link with some link type), and about format (by claiming that some span should be interpreted as an image, a video, an audio clip, etc.) Information about text justification or page breaks are also encoded into the ODL.
An EDL provides the capacity to remix someone else’s document, or to create a new version of your own, without storing a duplicate of the original content in a way that isn’t clearly marked. (As a matter of optimization, it is possible for a local cache to store or serve a popular concatext and reverse the process to produce the original chunk of data with holes. What matters is that placement at the original address is canonical.) The use of a reference to a distinct source — quoting — is called transclusion (a portmanteau of transliterature inclusion).
While one would expect the author of a document to provide an ODL to go along with their EDL (if necessary), such a pairing is only a recommendation: an ODL is a set of formatting rules, and each rule can be overridden by the user. Furthermore, all ODLs loaded into the system act as a formatting ruleset against any document viewed. In this way, one can open many distinct documents and see their hidden connections (if one is lucky enough to have ODLs that display links between these documents or their sources).
On the front-end, links are displayed as bridges (also called beams) between the connected spans of open documents. (In the implementations I have worked on, we had a set of configurable rules for which colors the linked text and bridges were for particular link types, and procedurally generated a color from the hash of the link type if that type wasn’t in our color database.) The use of visible beams between open documents is called “visible connection” or “transpointing windows”.
Every visible document is considered editable, because editing such a document produces a new version with a distinct address.
Differences between translit implementations
Addressing & how EDLs and ODLs are stored changed a great deal after the introduction of the web.
Udanax Green / xu88, developed in the mid-80s at Autodesk and given an open source release in 1999, represents the culmination of a particular set of ideas about addressing begun in the 1970s at itty bitty machine co. In particular, Green uses a data structure called an ‘enfilade’ — a kind of tree with special fetch rules. The enfilade works in conjunction with a ‘tumbler address’ — a sequence of numbers that tells the server how to navigate the tree.
Each tree node contains zero or more numbered subtrees, and each leaf contains a piece of information. The server uses each piece of the tumbler address to determine which subtree to navigate to, and when it runs out of numbers in the address, it concatenates and returns the data of all the subtrees.
Green was designed for clients whose bandwidth, memory, and processing speed were poor, and so effort was made to put as much processing as possible on the server and to provide the client with something practically ready to display.
Udanax Gold / xu92, developed after Green, used a different structure called the Ent. Unfortunately, full source for Gold has not been released, and I cannot provide much information about it. I met one of the primary authors once & he told me that Gold supported cryptographic signature chains for versioning, in a fashion similar to a blockchain, but I can’t confirm this.
Beginning in the 1990s, transliterature systems began using URLs and HTTP for addressing. XanaduSpace and its successor XanaSpace used this system, as do the browser-based OpenXanadu and the current browser-based demo, XanaduCambridge. This presents a few problems. The data addressed by a URL is not guaranteed to remain static — in fact, even though the W3C maintains that changing the data addressed by a URL is rude and uncouth, nearly all web content is dynamic and sites have their content completely replaced on a regular basis. Furthermore, hosting user content becomes both a technical and (with DMCA) a legal problem. The browser-based transliterature systems have an additional problem: same-origin policy makes transcluding content from different domains difficult.
In my own xanalogical systems, I prefer to use CAN/DHT based systems like IPFS. A prototype using IPFS is here.
Another concept that exists in XanaduSpace and XanaSpace is the ‘permascroll’. The permascroll is the sole violation of the rule that chunks of data are immutable. It is an append-only log of content. The idea is that, while editing, anything typed would go into the permascroll, and the EDL would transclude that content from the permascroll.
There are two ideas for how permascroll use would work in conjunction with publishing. One is that there’s a private & public permascroll division — in other words, content being edited on someone’s own machine would have addresses in the private permascroll, but the process of publishing a document would append all the portions of the private permascroll that were transcluded by that document to a public permascroll (with the exception of any that might already be present) and the published EDL would have its span addresses and offsets edited to compensate. The other uses a single public permascroll that is encrypted using the method I will describe later under the heading transcopyright, with unpublished sections considered ‘not for sale’.
ZigZag and the ZZStructure
The ZZStructure is relatively straightforward: mathematically speaking, it is a directed graph with colored edges, where each node can have a maximum of two edges of each color — one going outward and one coming in. We call the nodes cells, and we call the edges connections. We call the color ‘dimension’. The direction is ‘posward’ — out — or ‘negward’ — in.
Each cell, in addition to having named pairs of connections to other cells, has content (although that content can be blank).
Another way of thinking of a ZZStructure is that each run of cells along some dimension (called a ‘rank’) acts as a doubly linked list — so a ZZStructure is a tangle of objects that occupy different positions on different doubly-linked lists simultaneously.
Given these basic rules, we have a data structure that, depending on how you look at it, can effectively express a large number of different things.
One way you can look at it is as a cell being subject to properties, defined by its posward connections along dimensions. This is similar to a prototype-based object system, and is used internally in XanaSpace and elsewhere as a configuration system.
One particular property, ‘clone’, is treated specially. When you request the content of a cell, under the hood, you are getting the content of the negward-most cell along the rank ‘d.clone’ — the ‘head’ of d.clone or the ‘clonehead’. Clones (defined as cells that have a neighbour negward along d.clone) are treated as references to their clonehead. Since a property can be a clone of some other cell, space can be saved or a property can reference a whole large group of other properties that might change independently.
Used as a personal organizer, mind-mapping tool, or database, this property model is quite useful. To the extent that ZigZag is used at all in practice, it is usually with a dual-pane interface as a personal organizer.
Unusual structures can be produced in ZigZag. For instance, a ringrank is a rank that is circular — it has no head. A zipper list is a pair of ranks along some dimension with a connection along some other dimension — representing an associative array or some other correspondence.
There are currently two officially-blessed implementations of ZigZag available to the public: a console-based system called azz and a graphical system called gzz. Prior to joining the project, I worked on two ZigZag-adjacent projects: dimscape, a graphical ZZStructure editor, and iX, a toy OS whose user interface and disk format was inspired by azz.
Officially-blessed implementations tend to have dual-pane interfaces because they tend to support a keyboard-based standardized interface system called KBLANG, which requires at least two panes. KBLANG is based around dividing a QWERTY keyboard into two direction pads (controlling the currently highlighted cell in their respective panes) and using the remaining keys to perform operations or switch visible dimensions.
Some effort has been made to develop programming languages that use ZigZag natively, the way spreadsheet formula languages take advantage of spreadsheet structure. There is some discussion here, and here is my contribution to the problem.
Transcopyright is a concept that gets a lot of push-back from people who don’t really understand what it’s trying to do. I’ll do my best to explain it, with special emphasis on the confusions I have seen.
Transcopyright is a system for monetizing or controlling the distribution of one’s material within a transliterature system. It is not DRM, because it applies to data the same ownership rules that are applied to objects: what one has purchased, one owns in perpetuity, including the right to remix and release remixes.
It is specifically intended to simplify rights negotiations on derivative works, and it uses the same model as RiffTrax: remixers are providing overlay or re-arrangement rules for material already owned by the people downloading the remix. It relies upon permanent addressing for this.
On a technical level, how this works is: when a work subject to transcopyright is published, a one-time pad is generated that is the same length as all the newly-published material in the work. The material is encrypted with that one-time pad, and it is the cyphertext that is published. Anyone can view any span of the cyphertext. Anyone who wants to view a portion of the plaintext must request that portion of the one-time pad from the author or some trusted oracle.
A remix or edit can be downloaded by anyone, but those pieces the viewer doesn’t yet have permission to view will be cyphertext (subject to an ODL entry that blacks it out and provides information about how to purchase the rights to it). One may decide to pay for only the bytes present in the remix. Remixes of things the viewer already owns will not require a second copy.
Once you have a copy of the plaintext, there are no technical mechanisms to prevent you from doing what you like with it. However, etiquette dictates that you do not distribute the plaintext or encryption key beyond a reasonable level, and normal copyright law kicks in at that point. The software itself does not provide the facility to distribute either on your behalf, for material you have not created.
ZZOGL / FloatingWorld
XanaduSpace and XanaSpace use a unique system for handling display, wherein a ZZStructure controls the placement, shape, color, and interaction profile of objects in 3d space. This is called ZZOGL (for ZigZag OpenGL) or FloatingWorld (FW for short).
FW is an extreme form of the ‘property orientation’ model mentioned in the ZigZag section. In it, cells correspond either to object heads (around which properties coalesce) or property values.
Starting from a cell representing the start of the structure (the zzogl head), one dimension represents the draw order. We iterate over objects along this dimension twice. During the first pass, we calculate sizes and locations; during the second pass, we draw using our calculated sizes and locations.
Each object calculates its location with respect to some other object within a location-offset DAG. Its parent’s size in three dimensions is multiplied by a vector of between 0 and 1 (the ‘parentcenter’ — the point on the parent with respect to which we move), we add the delta, and then position it on a point on ourselves (size * center).
pos = parent.pos+(parent.size*(parent.center-parentcenter))+delta-(size*center)
This kind of dynamic relative positioning made it possible to represent relatively complex relationships between objects. We used it to display both transliterature and ZigZag systems.
There were a couple kinds of objects:
- ‘groups’: invisible dummy objects for moving whole groups of objects at the same time
- ‘slabs’: rectangular prisms
- ‘beams’: tetragonal two-dimensional shapes with a solid color
- ‘tetroids’: textured two-dimensional shapes that are shaped like tetris-blocks or paragraphs of text (in other words, a rectangle with at most two smaller rectangles abutting it at top or bottom)
Unfortunately, getting OpenGL to render large quantities of text on top of 3d objects while being responsive enough to edit that text in real time was a difficult problem, and one that we were never able to solve. As a result, XanaduSpace and XanaSpace never had a release.
OSMIC is an alternative document versioning system — a general-purpose journal-based system for tree-based versioning of any byte sequence. Each version is an address in a list of steps. A step can be one of:
- insert bytes at a particular point
- delete bytes from a particular point
A version’s address is, like a tumbler address, a period-separated sequence of numbers. Each time a new version is created, the last number increments. Each time the version tree forks, a new number is added at the end, starting with zero.
While it’s possible to reconstruct a version from scratch by running the journal, it’s also possible to rewind to a particular version by treating every delete as an insert and every insert as a delete. An OSMIC journal can be much larger than the byte string it represents, so a compact representation is important. I am not aware of any work on a compact binary representation of OSMIC journals.
I haven’t worked with OSMIC, but at one time it was suggested that XanaduSpace should be retrofitted with an OSMIC-based system for storing the state and history of the ZZOGL system (and thus, all internal information necessary to display and edit documents).
Several systems exist for transliterature editing, some of them released. None of them are, to my mind, satisfying. All are based around selecting and re-ordering spans from existing documents. My implementation is called SPANG and is intended for use with OpenXanadu. XanaduCambridge has a less elaborate, more manual system.
We had pitched a more natural system, wherein selected text could be dragged out of a window to create a new document, or dragged to snap to the end of a document (or to the middle of a document as an insertion) to produce transclusions. We suggested dragging a selected span of text on top of another selected span of text to produce a link between them.
Such a system would require an environment where documents floated and text could be dragged between widgets — in other words, something that the cross-platform graphics libraries we were working with didn’t support.
On-disk formats (for ZZStructures, EDLs, and ODLs) were constantly in flux while I was at Xanadu. Some were, from my perspective as a professional programmer, abysmal. Most were either based on CSV or key-value pairs. In some cases, they used CSV for mandatory fields and key-value notation for optional fields.
I prefer a subset of YAML or JSON for ZZStructures, because ZZStructures fit that model well. However, since these structures have explicit hierarchy, they were controversial.
Generally speaking, EDL formats have been one line per span, with ordered triples (address, start, end). Whether or not keys are used differs from spec to spec, as does how (or whether) one can specify a whole document without specifying its length.
Generally speaking, ODLs have been lists of addresses pointing to “link” files, and those “link” files contain a type followed by sections called ‘endsets’ that contain one or more spans each. Generally, a link has one or two endsets. If a single endset has non-contiguous spans, the resulting beam resembles a starfish or a hand. I argued in favor of arbitrarily many endsets, each with arbitrary many spans. I also argued in favor of zero-length single-sided links, which I considered to be like bookmarks. I implemented page breaks in XanaSpace as zero-length single-sided links.
There have been arguments over whether or not it is appropriate to support spans that are with respect to the concatext of some document (and if so, how to specify that). Ted has generally been against this feature, and I have generally been in favor of it — both in EDLs and ODLs.
If you’re planning to implement a xanalogical system, compatibility with an existing format will not be a problem: there are no genuine standards, even within the project. However, you should consider the problems mentioned above.