Lightweight DITA: a Preview from Michael Priestley

A new buzzword has appeared in the worlds of structured authoring and content reuse: Lightweight DITA. On November 28, Michael Priestley, one of the lead DITA architects and a Senior Technical Staff Member at IBM, gave a presentation to the Toronto Chapter of the STC about Lightweight DITA and how IBM has used DITA as a key part of a 60-million-page knowledge center.

Ever since DITA was introduced a decade ago, people have been saying that full DITA is too complex, but according to Priestley, what they mean is that it has features they don’t need. There have been several attempts to simplify it, but there’s no standardization between them. Lightweight DITA will introduce a simpler, standard set of features that makes it much easier for people to start using DITA while maintaining compatibility with the full DITA standard and remaining customizable and easy to extend. Lightweight DITA will become part of the official DITA standard, although it may not make it into the forthcoming DITA 1.3 specification.

One of the questions asked during its development was “How do you get to something that’s lighter weight given that everybody wants a different light weight?” Part of the solution was to look for areas of overlap that people could start with and extend outwards, instead of having to understand the whole thing and narrow it down. To make this easier, Lightweight DITA will include a subset of specialization.

Because it’s built on top of the full DITA, Lightweight DITA users get a whole set of architectural behaviors for free: specialization, inheritance, re-use, filtering, collections, and metadata; these are already defined and can be used without having to be re-invented.

Within IBM, there are several scenarios driving the implementation of Lightweight DITA:

  • Contribution: SMEs creating content that will be used by a full DITA system. (This is also a common scenario in publishing companies.)

  • Collaboration: Similar to contribution, but the contributors also maintain the content over time.

  • Parallel adoption: People want to be able to share a content management system’s or publishing system’s capabilities without necessarily sharing the content.

  • New adoption: Where people are moving to XML, it’s almost certainly to DITA. One of the benefits of Lightweight DITA is that people don’t have to decide what to skip or what they don’t need at the beginning.

Within IBM, DITA is widely used and its use is growing. It’s being used for documentation, product announcements, semiconductor design manuals, learning and training material, support content, marketing content, white papers, and policies and procedures.

Lightweight DITA is substantially less complex than full DITA:

  • Topics: Reduced from 94 elements to 27.

  • Maps: Reduced from 10 elements to 2.

  • Specialization: Reduction in what you can specialize.

  • Document types: Reduced from 23 to 6.

Everything that’s being done for Lightweight DITA is a valid implementation of DITA. The DTDs will be different, but the content is valid DITA and can be shared with systems using full DITA. Any tool that supports full DITA should be able to process Lightweight DITA, because Lightweight DITA isn’t a new standard; it’s a very carefully defined and built-out subset of full DITA.

Priestley provided more details in a discussion with the STC members.

  • Mixed content will not be allowed. All text must be in a paragraph tag. This will make conrefs possible between any text elements and will simplify re-use, processing, and tool creation.

  • All attributes are now managed as functional groups, which can be removed or enabled as groups; for example, re-use or variable text.

  • Nesting will be configurable.

  • Specialization will have less functionality, but it will be much easier. The goal is to make authoring a specialization as easy as creating a topic.
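To make the mixed-content rule concrete, here is a minimal sketch of a check that flags text sitting directly inside a container instead of inside a paragraph tag. It is an illustration only: the `BLOCK_CONTAINERS` set, the `find_mixed_content` helper, and the two sample topics are assumptions made up for this example, not part of any Lightweight DITA DTD or tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of containers that should not hold bare text.
# Chosen for illustration; not taken from a Lightweight DITA DTD.
BLOCK_CONTAINERS = {"section", "li"}


def find_mixed_content(xml_text):
    """Return the tags of containers that hold text outside a child element."""
    root = ET.fromstring(xml_text)
    offenders = []
    for elem in root.iter():
        if elem.tag not in BLOCK_CONTAINERS:
            continue
        # Text before the first child element.
        has_direct_text = bool(elem.text and elem.text.strip())
        # Text that follows a child element but is still inside the container.
        has_tail_text = any(child.tail and child.tail.strip() for child in elem)
        if has_direct_text or has_tail_text:
            offenders.append(elem.tag)
    return offenders


# Sample topic with loose text next to a paragraph (mixed content).
MIXED = (
    "<topic id='t1'><title>Demo</title><body>"
    "<section>Loose text <p>and a paragraph</p></section>"
    "</body></topic>"
)

# The same content with every run of text wrapped in a paragraph.
CLEAN = (
    "<topic id='t1'><title>Demo</title><body>"
    "<section><p>Loose text</p><p>and a paragraph</p></section>"
    "</body></topic>"
)

print(find_mixed_content(MIXED))  # ['section']
print(find_mixed_content(CLEAN))  # []
```

With every run of text wrapped in a paragraph, as in the second sample, each piece of text has its own element, which is what would let a tool address it for re-use.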

For more details about how topics, maps, and specialization have been simplified in Lightweight DITA, refer to Michael Priestley’s SlideShare presentation on Lightweight DITA. Some areas are still in flux; for example, whether some subset of the CALS table model should be included. Priestley’s goal is to have Lightweight DITA ready early next year, with a starter set of document types, followed later by one including the specialization topic.

As part of his talk, he gave a demonstration of IBM’s Knowledge Center, a massive knowledge base containing more than 20 million DITA topics and 40 million HTML pages built from content contributed by IBM’s 1,500 technical writers. To say that it’s impressive is an understatement, and it would have been impossible (or at least, far more difficult) without DITA.

