Content Traceability and Standardization


Safe consumption and use of a wide variety of products dictates traceability of component materials, chemicals, parts and other elements. It’s something we see and read about frequently when it comes to food, children’s toys, drugs, automobiles, transportation and more. In this article, we talk with Alexandre Loukakos, CEO of DocTech Software, about content traceability and standardization, and the tools and processes companies can use to ensure both throughout their documentation.

Documenting that traceability, in the manner prescribed by law or industry, is one facet of content traceability. The other, less readily known or discussed outside of technical communication circles, is ensuring that the content regarding the product can be traced through all of its iterations to ensure proper review and approvals that minimize risks arising from inaccuracy, obsolescence or incompleteness.

For manufacturing and other goods-producing companies, traceability and standardization represent key factors in business and operations models. These organizations invest large amounts of money and resources to deliver products that produce a profit, while meeting regulatory and legal requirements for sourcing, production and distribution. Documentation and other technical content on products and components can represent a significant percentage of those investments.

The Complex Relationship Between Content Traceability and Standardization

To understand how interconnected traceability and standardization are within a business, let’s start by defining traceability. Oxford dictionary online defines traceability in the business context “as the ability to discover information about where and how a product was made.” Companies need to be able to trace their products’ supply chains from the source all the way to the consumer, and particularly in highly regulated environments, traceability is usually a legal requirement. And full traceability includes content on the proper setup or preparation, storage, consumption, and disposal of these products.

Alexandre notes simply that “Content traceability and standardization go hand-in-hand. You need standardized content to be able to get any kind of relevant traceability.” In fact, standardized content is an absolute necessity for both traceability and reuse.

He argues that, to begin the standardization of content, a content team should perform a step-by-step segmentation of the information into “topics” or “data models”, based on the content standard the organization applies (e.g., DITA or S1000D). “In my view, no matter the standard one chooses to follow, the key to maximize ROI is to take the opportunity to adequately segment, structure, and tag the content.”
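As a rough illustration of what "segment, structure, and tag" can mean in practice, the sketch below models a topic as a small record with a type and a set of tags, then filters topics by tag. The `Topic` class, the `find_topics` helper, and the tag names (`product`, `audience`) are hypothetical and are not taken from DITA, S1000D, or any DocTech product.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One self-contained unit of content (hypothetical model, not a real schema)."""
    topic_id: str
    topic_type: str  # e.g. "concept", "task", "reference"
    title: str
    body: str
    tags: dict[str, str] = field(default_factory=dict)


def find_topics(topics, **criteria):
    """Return the topics whose tags match every key/value pair in criteria."""
    return [
        t for t in topics
        if all(t.tags.get(key) == value for key, value in criteria.items())
    ]


# Illustrative content only; the products and audiences are invented.
topics = [
    Topic("t-001", "task", "Replace the filter", "Power off, then swap the filter.",
          {"product": "pump-a", "audience": "technician"}),
    Topic("t-002", "concept", "How the filter works", "The filter traps particles.",
          {"product": "pump-a", "audience": "operator"}),
    Topic("t-003", "task", "Replace the filter", "Power off, then swap the filter.",
          {"product": "pump-b", "audience": "technician"}),
]

technician_tasks = find_topics(topics, audience="technician")
print([t.topic_id for t in technician_tasks])  # ['t-001', 't-003']
```

Once content carries tags like these, questions such as "which topics apply to this product?" become simple queries rather than manual searches.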

Determining the Right Content Standard for Your Organization

International content standards allow companies to operate more efficiently while conforming to regulatory and legal requirements. Even companies that don’t operate in a highly regulated environment stand to benefit from implementing standards. Alexandre points out that the three primary international standards, DITA, S1000D, and ISO (AS9100, EN9100), “are minimal requirements; they are not a barrier. Nothing prevents you from going above them if there is a benefit in doing so. In fact, standards implementation may bring you additional business you wouldn’t qualify for otherwise.”

How should your company decide which standard would best serve its needs and objectives? Alexandre provides some background on how the standards evolved.

DITA was created and initially owned by IBM. Originally designed for software code production, DITA was made public in 2005 when ownership transferred to OASIS.

S1000D is an international specification for the production of technical publications that dates back to the 1980s. While it is gaining popularity in sectors such as rail and roadway transportation, biotech, engineering and machining, S1000D is mostly used in the air, space and defense sectors. It is therefore more specialized and limited in application by companies outside of these sectors.

“Both S1000D and DITA are based on single source principles,” Alexandre explains. “Therefore, both allow reuse of blocks of content in the forms of data modules (S1000D) or topics (DITA). These blocks can be reused within different documents eliminating the need to rewrite content over and over.”
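The single-source idea can be sketched in a few lines: each block of content is stored once, and documents are assembled from references to those blocks. The `topic_store` and `documents` structures and the `assemble` and `where_used` helpers below are hypothetical names chosen for illustration. They loosely resemble the role of a map over topics, but they are not the DITA or S1000D mechanisms themselves.

```python
# Hypothetical topic store: each block of content is written once, under an ID.
topic_store = {
    "safety-warning": "Disconnect power before servicing.",
    "install-steps": "Mount the unit, then connect the supply line.",
    "maintenance-steps": "Inspect the seals every six months.",
}

# Each document is an ordered list of topic IDs rather than a copy of the text.
documents = {
    "installation-guide": ["safety-warning", "install-steps"],
    "maintenance-manual": ["safety-warning", "maintenance-steps"],
}


def assemble(doc_name):
    """Build a document's text from the current content of its topics."""
    return "\n".join(topic_store[topic_id] for topic_id in documents[doc_name])


def where_used(topic_id):
    """List every document that reuses the given topic (basic traceability)."""
    return sorted(name for name, ids in documents.items() if topic_id in ids)


# Updating the shared block once updates every document that references it.
topic_store["safety-warning"] = "Disconnect and lock out power before servicing."

print(where_used("safety-warning"))
print(assemble("installation-guide"))
```

Because the warning lives in one place, a revision reaches both documents at once, and `where_used` shows which deliverables are affected by the change.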

However, S1000D remains more specific to these industry sectors, which limits its usefulness in other areas. DITA is not constrained in this fashion, and as such has more applications. In fact, “Depending on the corporate activity, you can adapt DITA to S1000D through ‘specializations’ when needed,” Alexandre says. “The opposite is not true–S1000D is frozen and limitative with no ‘specializations’ available.”

Because the flexibility of DITA is what is missing in S1000D, Alexandre recommends that companies question whether they really need S1000D. Its limitations can constrain an organization unnecessarily. For companies that have not implemented standards previously, “we suggest starting with DITA, as it will produce higher ROI, and it can be adapted later on to be S1000D compliant.”

Benefits of Content Traceability and Standardization

Like any other major business decision, the choice to implement content traceability and standardization relies on a cost-benefit analysis. The factors impacting that analysis can vary widely based on the industry, the maturity of the company’s processes and content models, and a range of risk factors related to compliance, regulatory requirements, geographic distribution, and more. The following table describes a partial list of both:

Costs to Consider | Benefits
Conversion from legacy systems and condition of legacy content | Higher productivity from automation of structuring and tagging, and elimination of “copy & paste” and version inaccuracies
Segmentation and tagging of content types | Better reuse of content, reduced content update and revision time
Compliance and Audit review processes (including processing of large amounts of data) | Simplified review processes and reporting mechanisms
Customization of standard features/functionality to address specific needs | End-to-end process that reduces time to market
Training for system users | Increased productivity and improved subject matter expertise
Maintenance and enhancements | Proactive approach to improving documentation and compliance processes

Alexandre points out that in his experience, upfront costs of converting legacy systems can be quite high, as many organizations fail to consider traceability and the need for content standardization during design and development of the products they sell. “Willing organizations are able to adjust legacy tools to support standardization, others are not,” he says. “Some organizations would be better off starting from scratch and doing a batch import later, instead of dealing with so many upfront costs to get legacy system content into shape.”

Determining the real costs of implementing or not implementing standardization and traceability can be quite challenging. Alexandre provides a hypothetical scenario of the risk.

Let’s say four people (A, B, C and D) exchange emails with the common goal of putting the best content and ideas in a DOC file. To track their contributions as well as they can, they use the revision features of their favorite word processor. A and B each send their version with their comments. D sends his file, which is based on B’s file, but some of B’s modifications have disappeared because D’s modifications overwrote them. C was on vacation and catches up by sending his version. Then A spends a day or more merging all the files. To save some time, A takes only C’s and D’s files, seeing that D’s work was based on B’s file. As a result, they lose B’s modifications that were unwittingly deleted by D before he sent his file.

Long story short: What is the price of that best idea your company just lost?
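A minimal sketch of how tracked, segmented content could surface this kind of loss: each contributor's version is compared with a shared baseline, and any change that does not survive in the merged file is reported. The paragraph IDs, sample texts, and the `changes` and `lost_contributions` helpers are all invented for illustration, and A's own version is left out for brevity.

```python
# Shared baseline that every contributor started from (paragraph ID -> text).
base = {"p1": "Intro", "p2": "Specs", "p3": "Warranty"}

# Versions sent by email. D worked from B's file and overwrote B's change to p2.
versions = {
    "B": {"p1": "Intro", "p2": "Specs (revised by B)", "p3": "Warranty"},
    "C": {"p1": "Intro (revised by C)", "p2": "Specs", "p3": "Warranty"},
    "D": {"p1": "Intro", "p2": "Specs (rewritten by D)", "p3": "Warranty"},
}

# The file A produced by merging only C's and D's versions.
merged = {"p1": "Intro (revised by C)", "p2": "Specs (rewritten by D)", "p3": "Warranty"}


def changes(version):
    """Return the paragraphs in a version that differ from the baseline."""
    return {pid: text for pid, text in version.items() if base.get(pid) != text}


def lost_contributions(versions, merged):
    """Report, per author, the changes that are absent from the merged file."""
    lost = {}
    for author, version in versions.items():
        missing = {
            pid: text
            for pid, text in changes(version).items()
            if merged.get(pid) != text
        }
        if missing:
            lost[author] = missing
    return lost


print(lost_contributions(versions, merged))
# {'B': {'p2': 'Specs (revised by B)'}}
```

With emailed word-processor files nobody runs a check like this, which is why B's contribution disappears unnoticed.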

Key Features of Traceable Content

Before embarking on an initiative to build traceable and standardized content, familiarize yourself with the features of content that make it more readily traceable, and how these features can be built into the processes your team would follow:

  • Importing legacy content
  • Single-sourcing
  • Structuring content
  • Tagging structured content
  • Defining and managing logs to demonstrate conformity
  • Defining content workflow and approval processes
  • Defining desired output layout (CSS stylesheets) and format (PDF, HTML, etc.)
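Two of the items above, logs that demonstrate conformity and a defined approval workflow, can be sketched together: every state change of a topic is checked against the allowed transitions and recorded with the user and a timestamp. The state names, the `ALLOWED` table, and the `TopicWorkflow` class are assumptions made for this example, not the workflow of any particular CMS.

```python
from datetime import datetime, timezone

# Hypothetical approval workflow: each state lists the states it may move to.
ALLOWED = {
    "draft": {"in_review"},
    "in_review": {"approved", "draft"},
    "approved": {"published"},
    "published": set(),
}


class TopicWorkflow:
    """Tracks one topic's state and keeps a log of every transition."""

    def __init__(self, topic_id):
        self.topic_id = topic_id
        self.state = "draft"
        self.log = []

    def move_to(self, new_state, user):
        """Apply a transition if the workflow allows it, and record it."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state} -> {new_state} is not an allowed transition")
        self.log.append({
            "topic": self.topic_id,
            "from": self.state,
            "to": new_state,
            "user": user,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.state = new_state


wf = TopicWorkflow("t-001")
wf.move_to("in_review", user="author-a")
wf.move_to("approved", user="reviewer-b")
print([(entry["from"], entry["to"], entry["user"]) for entry in wf.log])
```

The log answers the audit question "who approved this content, and when?", and the transition table prevents a draft from being published without review.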

Building Content Traceability into a Documentation Project

Alexandre explains that whether a team is starting a new project or seeking ways to optimize an existing one, they need to review the source content before importing it into a content management system (CMS), in order to develop a method for segmenting it. For a new project, “information can easily be segmented. What’s missing most of the time is to find the appropriate structure and tags.”

Consider building steps for content traceability and standardization into the content lifecycle your organization maintains.

Existing materials

In existing initiatives, the effort required to build traceability and standardization into the system depends greatly on the condition of the legacy content and what strategic and tactical choices the team made initially.

For example, if the team is working with content coming from word processing tools such as Microsoft Word or Open Office, the choices and approach are similar to what they would consider in starting a new project. But in cases where a CMS already exists, the team will need to review the structure and tagging approach and build out a plan for “batch conversion in order to have the same structure, data modules, tags that they originally implemented.” And if the team determines that changes to structure are required, they must plan adequately, because structural changes can significantly complicate conversion and implementation. Alexandre also notes that in converting to a new system such as DocTech, the team should consider the technical feasibility of maintaining logs from before the conversion.

While it’s possible to prepare your content in multiple formats, such as word processor, spreadsheet, presentation, or PDF files, the risks and resource requirements of manual conversion processes can quickly become prohibitive.

With a specialized tool such as DocTech, the team can import source content from Microsoft Word or Open Office files, review and modify the structure, and then complete the import so that structure and tags apply across all of the content modules. If files are “clean,” with appropriately applied style tags (such as Heading or Title), “it speeds things up considerably, allowing a team to build traceable content instantly,” Alexandre says. “Teams can choose to manually segment content further after import, and apply more specific attributes and specialized tags to gain additional ROI.”
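To show why consistently applied styles matter, here is a minimal sketch of heading-based segmentation. It assumes the source file has already been read into a list of (style, text) pairs. The `segment_by_heading` function and the sample paragraphs are hypothetical and do not describe how DocTech or any other tool performs its import.

```python
def segment_by_heading(paragraphs):
    """Start a new topic at every Title or Heading style; attach other text to it."""
    topics = []
    current = None
    for style, text in paragraphs:
        if style == "Title" or style.startswith("Heading"):
            current = {"title": text, "style": style, "body": []}
            topics.append(current)
        elif current is not None:
            current["body"].append(text)
        else:
            # Text that appears before any heading still needs a home.
            current = {"title": "(untitled)", "style": None, "body": [text]}
            topics.append(current)
    return topics


# Invented sample of a "clean" file: every section starts with a styled heading.
paragraphs = [
    ("Title", "Pump A User Manual"),
    ("Normal", "This manual covers installation and maintenance."),
    ("Heading 1", "Installation"),
    ("Normal", "Mount the unit on a level surface."),
    ("Normal", "Connect the supply line."),
    ("Heading 1", "Maintenance"),
    ("Normal", "Inspect the seals every six months."),
]

topics = segment_by_heading(paragraphs)
print([(t["title"], len(t["body"])) for t in topics])
# [('Pump A User Manual', 1), ('Installation', 2), ('Maintenance', 1)]
```

If authors had formatted headings by hand (bold text in the Normal style, for example), this approach would find no boundaries, and the segmentation would have to be done manually.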

What to Watch for in Implementing Content Traceability and Standardization

Once the decision is made to use a CMS and eliminate dependence on word processor files, existing PDFs and email chains for technical, legal and other sensitive documentation needs, the team must choose the right tool to get the job done, what Alexandre refers to as the “How to get there?” phase.

“Sadly, that’s where a lot of content teams fail,” he says. “They know where they are and where they want to be, but they only think they know perfectly how they want to get there, which path to follow. They fail to ask the end users (both readers and contributors) what kind of platform they would find useful.”

End-user feedback is often overlooked, yet it is a key factor that the content team should incorporate to get the most productivity gains from content standardization.

Feature overload represents a real risk to those who are considering a content traceability and standardization effort. “Too many or overly complicated features can annihilate most productivity gains from structured and reusable content,” he says.

To avoid feature overload, Alexandre recommends that content teams first evaluate the current state of their documentation, the “Where we are” phase. Then they should define the “Where we want to be” phase, researching and deliberating on the goals they want to achieve, including the extent to which they want to implement standardization within the organization.

For instance, one team may want to limit CMS use strictly to user PDF manual generation, while another may want to use it for all documents across the whole organization including PDF generation as well as a content engine for an intranet HTML-based FAQ board or Emergency Operations dashboard.

In all cases and for each implementation, the team should evaluate the cost of continuing in the current state, especially in terms of time lost: exchanging emails with voluminous and sensitive attachments, searching for files, working on the wrong file, and merging and tracking multiple files from different sources. And always incorporate potential end users at the beginning of the project and throughout the testing period.

The best scenario, Alexandre says, is one where the organization encourages its content teams to take the initiative and to always look at corporate processes with a critical eye for areas of improvement. “The key is to identify the bottlenecks within the corporate’s content management processes,” he says.


DocTech has worked with many clients to implement DITA/XML standards as part of their CMS initiative. The ability to recycle content and eliminate the need to rework each document manually benefits the whole organization, in terms of productivity and full traceability of the original content.

Content traceability represents an initial investment that may be large or small, with big returns or losses depending on the path chosen and the starting point. “Content standardization is the only way to make it traceable and reusable,” Alexandre states. “It implies content segmentation and tagging, which serve as core of the standardization investment and more efficient content management.”

Organizations benefit when their content teams are freed from time-consuming search, manual updates and revisions, multiple versioning, and the other problems that plague many legacy systems. “Content traceability and standardization in a well-designed system results in more free time that the end user and the organization could use to generate more business and to focus more on the quality of the content.”
