Ascii1000D: Lightweight Markup S1000D-style, part 1
In this piece we discuss the background of CCS (Component Content Systems, aka topic-based authoring) with regard to S1000D, and how we can replicate the business architecture of S1000D in a lightweight-markup (Asciidoc) "Docs-as-Code" publishing system.
Follow along with the public-facing Ascii1000D sandbox: https://github.com/lopsotronic/Ascii1000D
Introduction: What is the Bare Functionality of Topic Based Authoring?
[New software licenses at ten grand per person per month]
+
[1 additional dedicated tools Level IV salaried headcount per five writers]
=
[saving an hour per person per day].
Topic-based content systems - when attempted by actual real writers with a rapidly changing new product - have an occasional tendency to mutate into money pits.
See above calculation for the best-case scenario.
Punch those numbers in your calculator and it'll make a frowny face. Add in that the end product - the PDF "book" - is, well, worse, no getting around that, and it's not long before management starts giving you stinkeye. Unless you truly are saving gargantuan amounts of money with topic based authoring.
There are lots of reasons for this: the huge range in writer skill sets, the woes of XML-based publishing, and something I call "The Applicability Trap," among others. The first one and the last one can (and will) bite you regardless of what tool/markup/vendor you're using - that's a story for later.
So, topic-based authoring - particularly conditional, applicability-driven content - is it doomed? No. There's good, quantifiable functionality in these Topic Based Specifications. Particularly - well, exclusively - for those organizations who share a lot of content between deliverables. So what are the bare necessities? Transclusion, Partial Transclusion, and Conditionals. That's it.
That functionality expands out into a handful of more granular categories.
So there's some of our useful functions from component content systems.
For the functions that are worthwhile, what's the lowest cost point at which we can make these happen, while still getting old-fashioned print output?
Time to Get Stupid
Right around now I'm going to say something that most content professionals would call pretty stupid. Maybe even monumentally, astonishingly stupid. So stick with me a second. Let's toss out our notions about strict validation, content information typing, and schema definition languages. No one wants to pay for those, anyway. It's time to get stupid.
So: lowest cost point at which you can do component authoring? Asciidoc.
We're keeping all the aero/def business architecture from S1000D (SNS, incodes, etc.), but we're going to do the actual work with standard programmer tools: text files, Atom, VSC (Visual Studio Code), git, and CLIs (command line interfaces, or shell interfaces like Bash or MS PowerShell). Modern text editors - VSC and Atom - are awesomely powerful compared to a typical dedicated XML editor. Asciidoc-to-DocBook interoperability lets us push to the XML world if we need to, but we don't need to rebuild our entire business around namespaces, custom parsers, or mixed-mode XSD/DTD validations. HTML+JavaScript gives you some extremely sophisticated IETM behavior all the way to L5, while you generate PDFs using a range of technologies, from the simple (the standard `asciidoctor-pdf` gem) to the complex (FOPUB, which integrates the DocBook-XSL PDF processor).
Anyway, we just want docs, so let's use a doc format. It's simple. It's stupid. Text files talking to text files.
Non-Apology Apology:: I'm not advocating we all start writing aerospace publications in lightweight markup languages. Aircraft are complicated things, and the modern sustainment pipeline makes them fifty times as complicated. Integrating lightweight markup with that, from scratch, is going to take work, and if you have a unified ERP/LSA environment then that work's already been done for you. Center your pubs operation around that instead. Having said that, if you're in a situation where 1) you have no money for tooling, 2) you need to push content in a hurry, 3) you have non-integrated business information all over the place (PDM, LSA, CAD, ERP, LMS, etc), and 4) you have a requirement for multiple active contributors in a topic-based system, then maybe take a page from the software development world. Lightweight markup in text files, programmer-style text editors, and off the shelf version control systems will get the job done, and it's not a galaxy away from how we would do things in S1000D anyway. Alright! Editorial non-apology apology is over.
Let's review some useful Asciidoc equivalents of S1000D constructs. Today, let's take a look at Asciidoc equivalents of Publication Modules (PMs), Applicability, and Common Information Repositories (CIRs).
Simple Ascii1000D in Action
No ACT/PCT/CCT, no CIRs: in the below diagram, the Publication Module brings in Data Modules via the Asciidoc include directive. The Publication Module then sends the resulting package to a document processor, which creates PDF, HTML, and more.
Publication Module (PM)
PM is an easy one. A PM equivalent is an asciidoc file that includes other asciidoc files.
That's it.
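As a minimal sketch - the title and the second data module filename here are my inventions; the first DM filename appears later in this article - a PM could be as simple as:

```asciidoc
= Demo Engine Maintenance Manual
:doctype: book

include::DMC-DEMO-000-10-00-01A-280A-A.adoc[leveloffset=+1]
include::DMC-DEMO-000-20-00-01A-520A-A.adoc[leveloffset=+1]
```

One heading, a stack of includes, done.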
We'll be seeing the include directive rather a lot when engineering S1000D equivalence. In programmer-y terms, Asciidoc `include` is an implementation of transclusion, not dissimilar to pmref or topicref or object or xinclude, but far simpler. In Asciidoc, we can have transclusion and we can also have partial transclusion, which we'll talk about later with CIRs.
To help make sense of include, let's take a look at our file system. The file system scheme for our Ascii1000D project might look something like this:
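For illustration - the PMC and ICF filenames here are hypothetical, while the DM and CIR names are the ones used elsewhere in this article - a flat layout could be:

```
PMC-DEMO-00001-00.adoc               (Publication Module)
DMC-DEMO-000-10-00-01A-280A-A.adoc   (Data Module, procedure)
DMC-DEMO-000-10-00-01A-901A-A.adoc   (ICF, illustration control file)
CIR/
  DMC-DEMO-000-00-00-01A-0A4A-A.adoc (CIR, warnings)
```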
If you're from S1000D, a lot of those acronyms and numbers mean something, with the exception of ICF (illustration control file), which is a method for handling applicability (conditional content) for illustrations (ICFs are a way of moving applicability chunks for graphics away from the narrative). I am, unfortunately, assuming that you're bringing in a little S1000D knowledge to this article. If you aren't, merely marvel at the amazing length of these filenames.
Also, wow, that's a flat folder structure, isn't it? Why is that? It's all about relative paths. By default, the Asciidoc processor considers the including document to be the current location - regardless of where the included document might be. When you run the PMC through the processor, it thinks it's in ./PMC even as it processes the included files. That's why `include` is called a pre-processor directive - it pulls the includes in before it starts transforming. We could hack this by using user-defined attributes or built-in attributes like imagesdir in our included files, but for today, let's keep it nice and simple with a flat hierarchy of data modules. PMs and DMs are at the same level, so we don't need to worry about relative paths changing from DM to PM. ./CIR is always the place to find Common Information Repositories, regardless of whether you are running from a DMC or a PMC or somewhere else. One step up, and one step over.
Let's crack open one of those PMs, viewing it in Microsoft Visual Studio Code with the Asciidoctor plugin activated[3].
The Publication Module starts with a level one heading - the PMTITLE of the deliverable - and those ::includes:: can be arranged in whatever nesting of headings (PMENTRIES) you might desire. At the top of the PM, we also have a whole bunch of book metadata Asciidoc carries over from Docbook, which is a good thing - we use all of it.
PMC ::includes:: need to use the leveloffset attribute in the include. Why? A data module will always have a level one heading - the DMTITLE. Included directly, without leveloffset, the DMTITLE will have an argument with the publication module's own level one heading ("Who's the title? I'M THE TITLE"). Leveloffset lets you tell the included file what heading level it's supposed to start at: leveloffset=+2 adds two heading levels to whatever is being included.
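A quick sketch of the idea, using the DM filename from later in this article and a made-up PMENTRY heading:

```asciidoc
== Maintenance Procedures

include::DMC-DEMO-000-10-00-01A-280A-A.adoc[leveloffset=+2]
```

The DM's level one DMTITLE lands at level three, nesting neatly under the level two PMENTRY heading.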
If you're using DocBook-XSL (also present in FOPUB and the "boxed" Asciidoc editor AsciidocFX) at all, don't skip heading levels: if you go to heading 3 from heading 1 with a DocBook processor, it will complain mightily. It expects heading increments to be 1.
The PM also uses document header attributes to tell the processor "I Am a Book" as opposed to "I Am an Article" or "I am a Data Module". PM doc header also contains information like title, author, date, revremarks, all that stuff.
Finally, the publication module is where user-defined attributes are declared for applicability. Asciidoc applicability is very stripped down from the powerful S1000D applicability model, but, on the other hand, it does work out of the box, for free. The PM is just where the applicability is declared - the conditions are used in the data modules. You might see it declared in a PM as follows:
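As a sketch - CONFIG1 is a made-up condition name, used here because it matches the example later in this article - the PM doc header could carry:

```asciidoc
= Demo Engine Maintenance Manual
:doctype: book
:CONFIG1:
```

Declaring the attribute with an empty value simply sets the condition "on" for everything the PM includes.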
Those attributes are global, so they will be in effect for any and all `included` files that the PM brings in. The publication module, then, mirrors precisely the configuration state of its corresponding deliverable.
For example, if you're writing engine documents, and the new engine is a Block IV Flexifuel, then the publication module might have attributes that look like :BLK4: and :FLEXFUEL:, along with (if you track such things) a starting serial number as the respective attribute values. Don't forget that document attributes have names and can carry values as well, and you can use ifeval for serial number range applicability if you have a variant that isn't classified as a block or a mod dot yet . . a subject for later, and a wee bit outside of our publications sandbox.
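A hypothetical sketch of that ifeval idea (the attribute name and serial values are my inventions):

```asciidoc
// in the PM doc header
:serialno: 1234

// in a DM
ifeval::[{serialno} >= 1001]
NOTE: Applies to units serial number 1001 and subsequent.
endif::[]
```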
Anyway, those are your applic declarations, they ride in the PM, and all the DMs inherit them. Which is a pretty good segue into our next item.
Applicability
Whew! PMCs took way too long. Let's dive straight into applicability at the DM level.
Applicability aka Conditional Content in Asciidoc is quite a bit lighter than in other CCM/CCS vocabularies, as shown below.
Conditional Content aka Ascii1000D Applicability is done using Asciidoc Conditional Directives in the DMs. Let's look at a Data Module (DM) included by our PM: `DMC-DEMO-000-10-00-01A-280A-A.adoc`.
Ignore those other includes for the moment. As you might be able to tell from the Information Codes (INCODEs), those are Asciidoc equivalents to Common Information Repositories. We'll get to CIRs in a second.
Notice the ifdef conditional directive in that procedure. When this data module is run by itself, that step - checking for the yellow warning - does not appear. However, when it is included by the PM declaring CONFIG1 as an applicable attribute, this toggles the content "on", thus giving you some customization for shared document components.
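In sketch form - the surrounding step wording is hypothetical, but the yellow-warning step is the one discussed here - the conditional block inside the DM might read:

```asciidoc
. Open the access panel.
ifdef::CONFIG1[]
. Check that the yellow warning light is extinguished.
endif::CONFIG1[]
. Disconnect the harness.
```

Run standalone, the processor skips the ifdef block; run from a PM that declares :CONFIG1:, the step appears.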
See below for a side-by-side of HTML output, one with no applic declared, and the other as called from the PM with CONFIG1 declared.
Note that unless otherwise stated, from here on out all the sample renders are HTML. It's just a heck of a lot faster.
See there? Since the one on the right is being run from a PM with CONFIG1 declared, that procedural step shows up. When it's not declared, the step is suppressed in the output deliverable.
Now you might see what those Illustration Control Files (ICFs) are for. When you have multiple configurations being described by one DM, you need to have a ton of applicability blocks to toggle between all the different graphics, because individual graphics can't really have bits and pieces that can be toggled. Not well, anyway. The ICF gives all those applicability-driven graphics a place to live, so that the writer doesn't really need to worry about juggling those. Whoever's doing the heavy illustration work can do the ICFs, and then the writer just needs to `include` those. When the publication is run from the PM, the PM sets the applicability, and it's persistent all the way down to the ICFs, filtering the applicable graphics.
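A bare-bones ICF might be little more than a stack of conditional image macros - the filenames here are hypothetical:

```asciidoc
ifdef::CONFIG1[]
image::access-panel-config1.png[Access panel, CONFIG1]
endif::CONFIG1[]
ifndef::CONFIG1[]
image::access-panel-baseline.png[Access panel, baseline]
endif::CONFIG1[]
```

The writer includes the ICF once; the PM's attribute declarations pick which graphic survives into the output.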
Now, when it comes to integration with CAD and 3D content, that's a whole other article. Stay tuned!
CIRs
Another usage of the include directive is for bringing in parts of other data modules, an instance of partial transclusion. Let's say we want to use a shared warning. Using the above example, I might have a procedure step that goes like this
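Something like this, say (the step and warning wording are hypothetical; the `+` is Asciidoc list continuation, which attaches the warning to the step above it):

```asciidoc
. Remove the power supply cover.
+
WARNING: High voltage is present at the power supply terminals. Use caution.

. Disconnect connector P1.
```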
This is pretty important! But say I get a call from the safety office. It turns out that the voltage KILLS - and we need to say that everywhere, every single place we have that warning. Over the years, we've worded this warning all sorts of ways, across hundreds of books, so this could get to be a nightmare.
But what if we made a centralized Common Information Repository (CIR) that contains all our warnings, separated into tagged regions? I might have a file ./CIR/DMC-DEMO-000-00-00-01A-0A4A-A.adoc (note the incode, S1000D folks, that IDs it as a CIR), and the warning in that CIR might look like this:
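The tagged region inside that CIR might be sketched as follows - the tag name `warn-hv` is my invention:

```asciidoc
// tag::warn-hv[]
WARNING: High voltage is present at the power supply terminals.
Contact with energized terminals KILLS. Disconnect and lock out
external power before servicing.
// end::warn-hv[]
```

Everything between the `tag::` and `end::` comment lines becomes an addressable chunk.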
Now, to use that CIR, we use a partial include to that tagged region of the CIR in our procedures, everywhere we want that warning to show up. Note the tag name declaration in the include directive below.
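That partial include might be sketched like so - the `warn-hv` tag name and the relative path are assumptions, and the `+` is list continuation so the warning attaches to the step:

```asciidoc
. Remove the power supply cover.
+
include::CIR/DMC-DEMO-000-00-00-01A-0A4A-A.adoc[tag=warn-hv]

. Disconnect connector P1.
```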
Now, when we fix a warning in that CIR (incode 0A4A), it's fixed everywhere, all at once, wherever it's used. Bam! Here's how it looks when we run it.
We can do this for data restrictions (ITAR and export statements), regulatory statements, acronyms, cautions, wiring data, and a giant bucket of other stuff. It's pretty easy, it's relatively simple, and -- are you getting tired of hearing this yet? -- it works right out of the box, for free.
Sort-Of-Conclusion
I'll be doing some more in this series, but this gives you at least a basic idea of where lightweight markup is these days, in terms of supporting topic or component based authoring. It's a very different place than it was in the mid-2000s!
Notes
[1] No, I don't care about GIANT USER MANUAL metrics at the book level - we need tracing at a component or task level. (Also, a git branch should correspond to an ECR/ECN/Engineering-Ticket-Thing, not, repeat, NOT, a book-level deliverable. The latter is far too big and long-lived for a branch.) We need to be able to trace procedures and files, like "Superbomb trigger, variable yield - Inspection." "That proc's been hit with part changes sixty times in the last four months!" is a pretty good red flag for problems upstream of the publications department. Or, different situation, same time frame: you can see Jane Doe's done two hundred sixty commits while John Smith had fifty. What if John's been working on the Superbomb trigger system? Given what we saw with that system, the manager should maybe check and see if Superbomb trigger is something heinously complicated and/or broken. If it is, John probably needs some help from outside the group to piece together whatever the heck is going on with that system. Or the manager could go back to the programs office and tell 'em, "Your stuff's broke, we'll document it when you make something that works." Or, maybe, John's slacking. Or "all of the above."
[2]Anyone who says otherwise is selling you something. The tradeoff for ugly is better consistency, re-use, process efficiency, and more opportunities for integration. To continue the LEGO metaphor, you can theoretically make loads of other toys from the same set of LEGO blocks, which is why they cost so much. Whether that's worth the cost is a decision your business will have to make. And if you're not getting anything out of it, drop topic-based component content like a hot rock. You do not want to be using component content management unless you're re-using at least 75% of the assets. Otherwise, it's like buying and building a brand new LEGO kit for every new toy you wanted to make. This represents a nauseating amount of waste in both human and capital terms. There's a reason LEGO planes can't fly. And if they ever made one that did, it would be hand-crafted from those stupid customized purpose-built pieces that only do the one thing, and now - you poor fool - you've got multiple bespoke configurations to deal with, because the bespoke top level assembly is built from them.
[3] At the end of the day, it's just text, so you could use your favorite text editor with the Asciidoc lexer enabled. Visual Studio Code gives you more functionality than I could comfortably summarize here: customized autocomplete, reference handling, image tools, etc. Alternatively, if you're starting out, you could use the standalone AsciidocFX editor which, although under-maintained, integrates several other useful libraries in one piece of "boxed" software. It's great for learning Asciidoc in general, and the DocBook piece works well. FX does use the older DocBook-XSL (FOPUB) toolchain for PDF output, however, so you will miss some Asciidoc-only features, but you gain a larger degree of freedom when it comes to PDF formatting. DocBook-XSL is very old, very kludgy, but very configurable when it comes to print. For some more fun in PDF land, check out my article on the subject.