As we saw in our examination of the media domain, word processors and desktop publishing applications tend to straddle the divide between the media domain (what a document looks like) and the document domain (how it is organized). While they are built on a basic set of document domain objects—pages, paragraphs, tables, etc.—they use a WYSIWYG display to keep the author working and thinking mostly in terms of styles and formatting—the concerns of the media domain. This makes it difficult to apply meaningful document domain constraints to the author’s work, or to record which constraints the author has followed. For that we need to move to the document domain.
But it is not simply a matter of expressing document domain constraints in the document domain and media domain constraints in the media domain. Rather, the simplest reason for moving to the document domain is actually to enforce media domain constraints that are hard to enforce in the media domain itself. In fact, one of the consistent patterns in structured writing is moving to the next domain to enforce, or factor out, constraints in the previous domain.
Consider a list. You may want to impose a constraint that the spacing above the first item of a list must be different from the spacing between other items of the list. This is a media domain constraint, but it is hard to enforce in the media domain. Most media domain applications create lists by applying styles to ordinary paragraphs. The usual way to apply the extra space above the first item is to create two different styles, which we can call first-item-of-list and following-item-of-list. The first-item-of-list style would then be defined with more spacing above.
The problem is that an author can forget to use the first-item-of-list style. Or they could add a new first item above it and not realize that the second item in the list now has first-item-of-list style.
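To make the failure mode concrete, here is a minimal sketch in Python of a media domain document as a flat sequence of styled paragraphs, with a check that finds list paragraphs carrying the wrong style for their position. The `(style, text)` data shape and the function name `misapplied_styles` are assumptions made for illustration; no real word processor exposes its content this way.

```python
FIRST = "first-item-of-list"
FOLLOWING = "following-item-of-list"


def misapplied_styles(paragraphs):
    """Return indexes of list paragraphs whose style does not match their position.

    `paragraphs` is a flat sequence of (style, text) pairs. A run of
    consecutive list-styled paragraphs is assumed to be a single list,
    which is itself a guess: the flat structure does not record where
    one list ends and the next begins.
    """
    problems = []
    previous_was_list_item = False
    for index, (style, _text) in enumerate(paragraphs):
        is_list_item = style in (FIRST, FOLLOWING)
        if is_list_item:
            expected = FOLLOWING if previous_was_list_item else FIRST
            if style != expected:
                problems.append(index)
        previous_was_list_item = is_list_item
    return problems


# The author added "Potatoes" above "Carrots" and left the old style in place.
document = [
    (FIRST, "Potatoes"),
    (FIRST, "Carrots"),
    (FOLLOWING, "Celery"),
]
print(misapplied_styles(document))
```

A check like this can only report the mistake after the fact. It does not stop the author from making it.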
As we noted before, structured writing works by factoring out invariants. Most constraints are invariants—rules that apply to all instances of the same content structure (such as all lists must have extra space before the first item). The easiest way to enforce a constraint, therefore, is not actually to enforce it on the author, but to factor it out altogether.
To factor out the spacing-above-lists constraint, we remove the need for the author to specify the style at all. You can’t do this in a typical media domain application, because the only way to create a list is to apply list formatting to a set of paragraphs. To factor out the formatting step, we need another way for the author to specify that a list is a list.
To do this, we create a list object—not a styled paragraph, but an object that is specifically a list. Lists belong to the document domain because they are a common rhetorical tool, a way of organizing information, that is not specific to any one subject area (subject domain) and can be formatted in a wide variety of ways (media domain). Once we have a list object, we can create rules, in a separate file, about how lists are formatted.
Structurally, a list object looks something like this:
list:
    list-item: Carrots
    list-item: Celery
    list-item: Onions
Here’s that same structure in HTML (actually, this is a slightly more specific structure, but we’ll get to that):
<ol>
    <li>Carrots</li>
    <li>Celery</li>
    <li>Onions</li>
</ol>
Now that we have a distinct list object, we can factor out our invariant list formatting rule into a separate file that contains list formatting rules. For HTML, this is usually done with a cascading stylesheet (CSS):
li:first-child { padding-top: 5pt; }
Now, the author doesn’t have to think about the correct spacing above lists. In fact, they cannot manipulate it, even if they want to. They just create document domain list objects. Media domain list formatting rules have been factored out of the author’s world. The media domain constraint about spacing above a list will now be followed automatically and reliably by algorithms.
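As a rough sketch of what such a formatting algorithm does, the following Python takes a list object and decides the spacing for each item by its position. The dictionary shape, the function name `format_list`, and the spacing values are all assumptions chosen for illustration, not the behavior of any particular tool.

```python
# Hypothetical spacing values, in points; a real stylesheet would define these.
FIRST_ITEM_SPACE_ABOVE = 5
OTHER_ITEM_SPACE_ABOVE = 0


def format_list(list_object):
    """Return (space_above, text) pairs for each item of a list object.

    The author supplies only the items. The spacing rule lives here,
    in the formatting algorithm, so it cannot be forgotten or misapplied.
    """
    formatted = []
    for position, item in enumerate(list_object["items"]):
        if position == 0:
            space_above = FIRST_ITEM_SPACE_ABOVE
        else:
            space_above = OTHER_ITEM_SPACE_ABOVE
        formatted.append((space_above, item))
    return formatted


shopping = {"items": ["Carrots", "Celery", "Onions"]}
print(format_list(shopping))
```

If the author later inserts a new first item, nothing needs to be restyled: the spacing is recomputed from position every time the list is formatted.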
But wait! That’s fine if all lists are formatted exactly the same way, but we know that is not true. At the very least, some lists are bulleted and some are numbered. And then there are nested lists, which are formatted differently from their parents, and specialized lists, like lists of ingredients, definitions, or function parameters. If we are going to create list objects in the document domain rather than applying styles in the media domain, how do we make sure each of these types of lists gets formatted appropriately?
Extensibility
At this point it is worth looking at a very important feature of all structured writing systems — extensibility. In media-domain word processing and desktop publishing programs, authors may need many different styles to format their documents, and these applications do not attempt to anticipate or provide all the styles every author might need. Some, like Word, come with a basic set of styles that may meet some basic needs, but all these programs let authors define new styles as well. The set of document domain objects in these programs is small and fixed, but the set of media domain styles is extensible—you can create as many as you need.
Essentially, a word processor or desktop publishing application that supports the definition of styles creates an extensible media domain environment. Styles are media domain structures that abstract out a set of style metadata that you can attach to a block of text to specify how to display it. Every time you create a new style you are extending your set of media domain structures.
This need for extensibility is another common pattern in structured writing. If you are working in the media domain, you may need to extend your set of styles. If you are working in the document domain, you may need to extend your set of document structures.
But extending the document domain is not as easy as extending the media domain. For one thing, document domain structures are more abstract than media domain styles, which makes them harder to think about. For another, content in the document domain must be processed by algorithms before it can be published in the media domain. That means when you create new document domain structures, you also need to create the corresponding algorithms, which requires a skill set somewhat more complex than defining style sheets.
This added difficulty in extending the document domain means that alternatives have been developed to manage the need for different document domain structures. There are four basic approaches:
- Languages like Markdown, reStructuredText, and HTML provide a default set of structures, but no way to add your own. Either they provide enough document domain structures for your needs or they don’t. Since these systems are not extensible, some people will create variants of them to meet specific needs (that is, extend them by modifying the core language itself). Thus there are a number of variants of Markdown designed for particular purposes.
- Metalanguages like XML provide a way to define your own structures, but no default set to start with. If you start from scratch in XML, you will need to define all the document structures you need for yourself. The upside of this is that you can constrain those languages very closely to exactly meet your needs. The downside is that you then have to write and maintain the algorithms as well.
- Systems like DITA, DocBook, and SPFE provide both a default set of structures, and a way to add your own if you need to. DocBook and DITA both provide a large set of standard structures, whereas SPFE provides a library of simple structures that you can combine to meet specific needs.
- Moving content to the subject domain can factor out much of the variation in document domain structures. A number of systems like SPFE, DocBook, and DITA can extend into the subject domain. You can also choose from a large number of existing purpose-built subject domain languages. (We’ll address factoring out document domain structures by moving to the subject domain in a future article.)
Which of these approaches you choose depends on what you need to do—which constraints you need to observe and express in your content.
Now, let’s get back to the discussion of lists. If we need more than one type of list object in the document domain, we either need to extend our document domain language with new list types, or choose an existing document domain language that already has the list types we need. But how many types do we need?
One obvious formatting difference between lists is that some are numbered and some are bulleted. How does a formatting algorithm tell whether to use bullets or numbers to format a given list? One way would be to add a style attribute to specify bullets or numbers, but then the author would be working in the media domain again. To keep the author in the document domain, we create document domain objects that contain the metadata needed to make those decisions at the formatting stage.
The common way to handle bullets versus numbers is to create two different list object types, the ordered list and the unordered list. Different markup languages use different names—ol and ul in HTML, orderedlist and itemizedlist in DocBook, for example—but they are conceptually the same thing. Thus the HTML example above is a little more specific than just being a list object. It is an ordered list object (<ol>).
The choice of the terms “unordered” and “ordered” is important, because it focuses on the document-domain properties of a list—whether its order matters—rather than on its media domain properties (bullets or numbers). The decision to format an ordered list with numbers or letters or Roman numerals belongs entirely to the media domain. It has been factored out of the document domain structures.
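A small Python sketch may help show where that decision lives. The list object records only whether it is ordered. The choice of marker is a parameter of the formatter. The names `marker` and `render`, the `kind` field, and the two style names are assumptions made for this illustration.

```python
import string


def marker(kind, index, ordered_style="decimal"):
    """Pick the visible marker for the item at `index` (0-based).

    `kind` comes from the document domain ("ordered" or "unordered").
    `ordered_style` is a media domain decision made by the formatter.
    """
    if kind == "unordered":
        return "*"
    if ordered_style == "lower-alpha":
        return string.ascii_lowercase[index % 26] + "."
    return f"{index + 1}."


def render(list_object, ordered_style="decimal"):
    """Render a list object to plain text lines."""
    return [
        f"{marker(list_object['kind'], i, ordered_style)} {text}"
        for i, text in enumerate(list_object["items"])
    ]


steps = {"kind": "ordered", "items": ["Chop", "Fry", "Serve"]}
print(render(steps))                               # numbered
print(render(steps, ordered_style="lower-alpha"))  # lettered, same content
```

The same `steps` object produces numbers or letters depending only on the formatter's setting. Nothing in the content has to change.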
When we work in the document domain we think specifically in terms of document structures, not formatting, and so each document domain object we create needs to make sense in document domain terms, not media domain terms. For example, consider nested lists. While nested lists are formatted differently, we don’t need a separate nested list document domain object. Instead, we express the fact that a list is nested by actually nesting it inside its parent list. For instance, we can nest one ordered list inside another ordered list:
<ol>
    <li>
        <p>Dogs</p>
        <ol>
            <li>Spot</li>
            <li>Rover</li>
            <li>Fang</li>
            <li>Fluffy</li>
        </ol>
    </li>
    <li>
        <p>Cats</p>
        <ol>
            <li>Mittens</li>
            <li>Tobermory</li>
        </ol>
    </li>
</ol>
In the document domain, both are ordered list objects. In the media domain, one is formatted with Arabic numerals and the other with letters.
One Document Domain Object, Two Media Domain Styles
In this case, the algorithm that formats the page distinguishes the inner and outer lists by looking at their parentage. For instance, in CSS:
ol>li>ol>li { list-style-type: lower-alpha; }
This ability to distinguish objects by context is vital to structured writing, because it enables us to reduce the number of structures we need to fully describe our content, particularly in the document and subject domains. Distinguishing objects by context also allows us to name structures more logically and intuitively, since we can name them for what they are, not how they are formatted or where they reside in the hierarchy of the document as a whole.
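The same idea can be sketched outside CSS. The Python below parses a shortened version of the nested list with the standard library's `xml.etree.ElementTree` and picks the numbering style from nesting depth alone. The function name `render_list` and the plain text output format are assumptions for illustration.

```python
import string
import xml.etree.ElementTree as ET

SOURCE = """
<ol>
  <li><p>Dogs</p>
    <ol><li>Spot</li><li>Rover</li></ol>
  </li>
  <li><p>Cats</p>
    <ol><li>Mittens</li></ol>
  </li>
</ol>
"""


def render_list(ol, depth=0):
    """Render an <ol> element, choosing the label style from its context."""
    lines = []
    for index, li in enumerate(ol.findall("li")):
        # Context decides the style: numerals at the top level, letters when nested.
        if depth == 0:
            label = f"{index + 1}."
        else:
            label = string.ascii_lowercase[index % 26] + "."
        paragraph = li.find("p")
        text = (paragraph.text if paragraph is not None else li.text) or ""
        lines.append("  " * depth + f"{label} {text.strip()}")
        # The same element type, found one level down, is simply rendered deeper.
        for nested in li.findall("ol"):
            lines.extend(render_list(nested, depth + 1))
    return lines


print("\n".join(render_list(ET.fromstring(SOURCE.strip()))))
```

There is only one list element in the markup. The formatter tells the inner and outer lists apart purely by where they sit.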
Flat versus Hierarchical Document Structures
This reliance on context also highlights another important difference between the ways media domain and document domain writing are usually implemented. The media domain almost always uses a flat structure with paragraphs, tables, etc. following one after the other. For instance, a nested list in Word is constructed as a flat sequence of paragraphs with different styles. Inner and outer lists are expressed primarily by the indent applied to the paragraphs. (Word tries to maintain auto-numbering across such nested structured lists, but does not always get it right.)
In the document domain, document structures are almost always implemented hierarchically. List items are inside lists. Nested lists are inside list items. Sections are inside chapters. Subsections are inside sections. Where the media domain typically only has before and after relationships (except in tables), the document domain adds inside/outside relationships to the mix. This use of nested, rather than flat, structures helps to create context, which helps to reduce the number of different structures you need.
For example, HTML, though a document domain language, is relatively flat in structure. It has six different heading elements, H1 through H6. DocBook, by contrast, is much more hierarchical in structure and has only one element for the same purpose: title. But DocBook’s title element can occur inside 84 different elements, and therefore can potentially be formatted in 84 different ways based on context. In fact, it can potentially be formatted in more ways than that, since some of the elements that contain it can also be nested, creating even more contexts.
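To illustrate how a single title element can stand in for several heading levels, here is a hedged Python sketch that derives HTML heading levels from nesting depth. The fragment uses the DocBook element names chapter, section, and title, but it is a simplified illustration that has not been validated against the DocBook schema, and the function name `headings` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A simplified DocBook-style fragment (illustrative only).
DOC = """
<chapter>
  <title>Lists</title>
  <section>
    <title>Ordered lists</title>
    <section>
      <title>Nested lists</title>
    </section>
  </section>
</chapter>
"""

CONTAINERS = ("chapter", "section")


def headings(element, level=1):
    """Return HTML headings, choosing the level from how deeply each title is nested."""
    result = []
    title = element.find("title")
    if title is not None:
        capped = min(level, 6)  # HTML stops at h6
        result.append(f"<h{capped}>{title.text}</h{capped}>")
    for child in element:
        if child.tag in CONTAINERS:
            result.extend(headings(child, level + 1))
    return result


print(headings(ET.fromstring(DOC.strip())))
```

The author never chooses a heading level. Moving a section deeper or shallower in the hierarchy changes its heading level automatically.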
A balance must be struck here, however. Nested structures are harder to create and can be harder to understand. Often they require the writer to find just the right place in a hierarchical structure to insert a new piece of content, which is much more difficult than simply starting a new paragraph in Word or Frame.
These considerations make the need for more than one document domain language apparent, and demonstrate the importance of extensibility. A single document domain language that captured all the document domain structures that anyone might want would be very large and very complex.
Worse, a universal document domain language would not express the unique and specialized document domain constraints that individual organizations need to manage their content creation and management processes efficiently. Much of the virtue of going to the document domain lies in the ability to impose such constraints, and that means that the world has and needs many document domain languages. And the document domain systems that are designed to be extensible (for example: DocBook, SPFE, DITA) are also designed to allow you to add constraints as well.
Finally, writing an algorithm to transform a large unconstrained document domain language into the media domain would be a daunting task, since it would need to have a rule to format every single combination of document domain structures that could occur in that language. With a large number of elements and few constraints on how they can be nested, the number of combinations would grow exponentially.
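A back-of-envelope calculation shows how quickly this grows. Assume, purely for illustration, a language in which any of its element types may be nested inside any other, and count the distinct ancestor paths up to a given depth. The function name `context_count` and the figures used are hypothetical. Real languages impose constraints that reduce these numbers considerably.

```python
def context_count(element_types, max_depth):
    """Count distinct nesting paths of length 1..max_depth.

    Assumes every element type may appear inside every other, which is
    the worst case for an unconstrained language.
    """
    return sum(element_types ** depth for depth in range(1, max_depth + 1))


print(context_count(10, 3))  # 10 + 100 + 1000
print(context_count(50, 4))  # 50 + 2500 + 125000 + 6250000
```

Even a modest vocabulary produces more contexts than anyone could write and test formatting rules for, which is why constraints on nesting matter so much to the formatting algorithm.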
In fact, you will find in practice that some large document domain languages, even though they are constrained in many ways, can permit some combinations of structures for which their common formatting scripts do not have full support. In those cases, you would need to check your outputs and possibly fix your document domain markup to get it to format correctly. This is not necessarily the fault of the language. Technically, it is the fault of the scripts. But a language that allows for a lot of edge case combinations rather invites the creation of scripts that don’t cover all those rare or unexpected cases (due to the expense of creating, testing, and maintaining all the necessary code).
In a perfect world, the structured tagging language we choose will guarantee that if a document is valid according to its schema (or other validation checks) it will format correctly in all media domains. If we can achieve this then we remove the need for the author, or anyone else, to inspect individual outputs for conformance to the desired media domain constraints. In some circumstances, this presents huge potential for cost and time savings. But to get there, or even to approach this state, we have to think seriously about constraining the structures of our documents so that we can ensure our formatting algorithms are comprehensive and reliable.
Constraining Structure
Another big reason to work in the document domain (besides abstracting out formatting rules) is to constrain how documents are structured. Let’s say that you want to make sure that all graphics inserted into your documents have a figure number, a title, and a caption. This is a document domain constraint rather than a media domain constraint. The requirement for a graphic to have a figure number, title, and caption is one of document structure and organization. It does not say anything about how the title or caption should be formatted.
In the media domain, you can make styles available for figure-numbers, titles, and captions, but you can’t enforce a rule that says all graphics must have these elements (which is, by its very nature, a document domain rule). In the document domain, you can express these constraints. You can make sure that the only way to include a graphic is to make it a figure and give it a title and a caption by making it illegal to place an image element anywhere else in the document structure. The figure structure would look something like this:
figure:
    title: Cute kitty
    caption: This is a cute kitten.
    image: images/cute.jpg
If the only way to include an image is to use the image element, and the only place where the image element is allowed is inside the figure element, and if the title and the caption elements are required and must have content, then there is no way for an author to add a graphic without a figure, title, and caption. A document that lacked these elements would be rejected by the schema and reported either by the editor or by the processing software. (We’ll look at how these constraints are expressed and enforced in a later article.) As for the figure number, it would be generated automatically, just like the numbers in an ordered list. That constraint has been factored out rather than enforced.
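In practice, rules like these would normally be written in a schema language and enforced by a validator. Purely as an illustration of the logic, here is a hand-written check in Python over an XML rendering of the figure structure. The element names follow the example above, while the wrapper element `doc` and the function names `check_figures` and `number_figures` are assumptions.

```python
import xml.etree.ElementTree as ET


def check_figures(root):
    """Return a list of figure-related constraint violations."""
    errors = []
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for image in root.iter("image"):
        parent = parent_of.get(image)
        if parent is None or parent.tag != "figure":
            errors.append("image outside figure")
    for figure in root.iter("figure"):
        if figure.find("image") is None:
            errors.append("figure missing image")
        for required in ("title", "caption"):
            child = figure.find(required)
            if child is None:
                errors.append(f"figure missing {required}")
            elif not (child.text or "").strip():
                errors.append(f"figure has empty {required}")
    return errors


def number_figures(root):
    """Generate figure numbers from document order; the author never types them."""
    return [(n, fig.findtext("title")) for n, fig in enumerate(root.iter("figure"), start=1)]


good = ET.fromstring(
    "<doc><figure><title>Cute kitty</title>"
    "<caption>This is a cute kitten.</caption>"
    "<image>images/cute.jpg</image></figure></doc>"
)
print(check_figures(good))
print(number_figures(good))
```

A document with a bare image in a paragraph, or a figure with no caption, would come back with a non-empty list of errors and could be rejected before it is ever formatted.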
This figure structure shows why document domain languages tend to have hierarchical rather than flat structures: hierarchy is what makes it possible to constrain document structure. Typically, media domain applications place no restrictions on the order in which paragraph styles can be applied. If you want to put a level two heading between two steps in a procedure, nothing other than common sense will stand in the way of your doing so. A document domain language, however, usually disallows that kind of thing.
Instead, the document domain language includes a set of constraints on how procedures are constructed. For example, you might have procedure objects, which have step objects nested within them. A step is only allowed to appear inside a procedure. Only certain text elements—such as paragraphs, lists, or code blocks—are allowed to occur inside a step. A second-level heading cannot be placed inside a procedure.
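The procedure rules can be sketched as a pair of lookup tables and a recursive walk. This is an illustrative Python sketch under stated assumptions, not how any particular schema language works: the element names `procedure`, `step`, `p`, `list`, `codeblock`, and `h2`, the wrapper `doc`, and the function name `violations` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Which children each constrained element may contain (hypothetical names).
ALLOWED_CHILDREN = {
    "procedure": {"step"},
    "step": {"p", "list", "codeblock"},
}
# Which parents a constrained element may appear inside.
ALLOWED_PARENTS = {"step": {"procedure"}}


def violations(element, parent_tag=None):
    """Walk the tree and report every structure the rules do not allow."""
    found = []
    if element.tag in ALLOWED_PARENTS and parent_tag not in ALLOWED_PARENTS[element.tag]:
        found.append(f"<{element.tag}> not allowed inside <{parent_tag}>")
    allowed = ALLOWED_CHILDREN.get(element.tag)
    for child in element:
        if allowed is not None and child.tag not in allowed:
            found.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        found.extend(violations(child, element.tag))
    return found


bad = ET.fromstring(
    "<doc><procedure>"
    "<step><p>Open the box.</p></step>"
    "<h2>Oops</h2>"
    "<step><p>Close it.</p></step>"
    "</procedure>"
    "<step><p>Stray step</p></step></doc>"
)
for problem in violations(bad):
    print(problem)
```

Here the heading inside the procedure and the step outside any procedure are both reported, while the well-formed steps pass silently.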
Constraints like these are important to document domain languages. If you want to control how procedures are written, or how graphics are labeled, you need to create specific document domain structures for these things, and to constrain them so that they cannot be misused. Without such constraints, it is easy for a language to slip back into the media domain, something that has happened to HTML.
In fact, authors backsliding into the media domain is one of the biggest problems in structured writing, and one that can undo all the business benefits the system was implemented to achieve. We will look at backsliding and some of the ways to avoid it in the next article.