Object-oriented content modeling: thoughts on similar-but-different content types
John Collins
Sr Localization Manager (Atlassian); principal content architect (Collins Content)
Recently, Aaron Bradley asked a question:
What is the correct degree of separation between types when building a content model? Does any difference between elements warrant a new type, or is it better to consolidate types as much as possible - even if this means overloading it for some users? To what degree should content authoring be a consideration, and should authoring requirements allow duplicative types for the sake of comprehensibility?
If we lived in a perfect world, this might be easy to answer, but alas, the world is complex, and the answer, of course, is “It depends.”
But Aaron’s question mirrors some musings I’ve had over the last few years, and I wanted to share them here because they're more than would be appropriate in a comment.
Some principles
It’s helpful to define principles that can help us sort through complexity. Here are some content modeling principles that I've come to realize I've held over the last few years.
We can lean into them for this question.
Build content models that are foolproof for authors who are going to be using them
To Aaron’s question, I absolutely believe that the author experience needs to be considered when creating content models. In fact, I propose that content authors are the content architect’s first user.
Content authors need to understand how to create content that aligns to models so that they can create semantically useful and accurate content. Well-designed content models can foster this.
Content models with very similar-but-different content types can start to confuse authors, and those authors may misuse the system, putting content in inappropriate content types.
When content is used in the wrong content types, the entire system starts to crumble and your good intentions will get you nowhere.
The situation can get even more complex when authors from different parts of the business need to use these content types. In fact, this might be one of the reasons you’d have to sort out whether or not to build a new type. It is also where the next principle comes in.
Build content models that a system can process easily
The content architect working on content models should also be thinking about how a developer will be consuming the content that comes from these models. The developer is the content architect’s second user.
Ask yourself how developers will build queries to get this content. Will the query be a single query? Will it require multiple queries? This, I think, becomes key to Aaron’s question. If the developer wants all of something they expect to be a content type, will they know it if you actually have three content models that represent that content and they need to query all three?
Will there be nested queries? If you use references from one content type to another, this could complicate the developer’s life. Nesting can probably go too far, but it may be a useful technique in some scenarios.
Build content models that reflect the content appropriately
How many times have you seen someone (maybe yourself) use a content type in the CMS that wasn’t intended for the content they put into it? Like maybe you’ve got an online community and someone created a “video lecture” entry when they should have used a “meetup” entry, since the event was happening in person and being recorded–but the “video lecture” entry had a video URL field and the “meetup” entry did not.
Now you’ve got an entry that misrepresents the nature of the event. Probably not a dealbreaker, but it may cause confusion. Attendance might be low.
Sometimes, to avoid this, we over-generalize content types and all of the content types are made up of title and body fields. Now, there’s nothing to differentiate between the content types, so why have content types at all?
(As an aside, I’ve got a slightly tongue-in-cheek idea that many, many content types should be only a title and body field, just with differing metadata capabilities.)
But no, we want distinct content types that reflect the reality of the content, the uniqueness of the content, the meaning of the content.
A content architect is the custodian of meaning in content systems.
By preserving the appropriate meaning of content through content models, the content architect is serving a third user: the content consumer.
Borrowing an idea from object-oriented programming
Several years ago, as a content architect, I reported to a senior engineering manager, Kent Gillenwater. We were having a discussion about basically the same question Aaron has raised, and that sparked this idea.
(Caveat: I do not consider myself a developer. I like to say I can’t code but I know enough to be dangerous. So, if you’re more comfortable with code than me, excuse any mistakes you see in this section.)
In object-oriented programming (OOP), there’s the concept of abstract classes and concrete classes. We might be able to adapt this for use in content modeling.
An abstract class is meant to be used as the base class from which other classes are derived. The derived class is expected to provide implementations for the member functions that are not implemented in the base class. A derived class that implements all the missing functionality is called a concrete class.
— Bruno R. Preiss in Data Structures and Algorithms with Object-Oriented Design Patterns in Java
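For readers who, like me, think better with something in front of them, here is a minimal Python sketch of that definition using the standard `abc` module. The names `Listing`, `PartnerListing`, and `extra_fields` are hypothetical and chosen only for illustration; they do not come from Preiss's book or from any CMS.

```python
from abc import ABC, abstractmethod


class Listing(ABC):
    """Abstract base class: holds what every listing shares."""

    def __init__(self, title: str, body: str) -> None:
        self.title = title
        self.body = body

    @abstractmethod
    def extra_fields(self) -> list[str]:
        """Left unimplemented here; each derived class must supply it."""

    def all_fields(self) -> list[str]:
        # Shared fields first, then whatever the concrete class contributes.
        return ["title", "body"] + self.extra_fields()


class PartnerListing(Listing):
    """Concrete class: implements the one missing method."""

    def extra_fields(self) -> list[str]:
        return ["promo_video_url"]


listing = PartnerListing("Chore Tracker", "Assign and track chores.")
fields = listing.all_fields()

# The abstract class itself cannot be instantiated; Python raises TypeError.
try:
    Listing("x", "y")
    abstract_is_instantiable = True
except TypeError:
    abstract_is_instantiable = False
```

The point to carry forward is the split of responsibilities: the base class owns what is common, and each derived class owns only what is different.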
I believe that the content architecture community needs concrete content types and abstract content types.
Abstract content types should be highly-structured, purpose-built for re-use, standalone—not bound to any specific property or user experience. They should contain the core set of common user-facing and metadata fields for the content type.
Concrete content types should also be highly-structured, but designed for the unique content needs of variants of the content type, whether for user-facing fields or metadata fields. Reuse isn’t a key goal of these content types.
Some examples
All of this may be a little too … abstract … so let’s get a little more … concrete …
(Caveat: I couldn’t use real-life examples, so this example is fictitious. It is slightly over-simplified, so please don’t get pedantic about the details of the scenario or model.)
The scenario
Here’s an example: The company Acme Startup has built a fantastic software product for managing the modern household. The core product was designed more as a platform, and is extremely extensible. In fact, it has its own app store ecosystem.
Acme wants to use a content management system (CMS) as the backend for app store listings. While company-developed apps, partner-developed apps, and independently-built apps should all have fairly common capabilities in the app store, Acme Startup anticipates slightly different content abilities across the three, along with vastly different metadata expectations.
For instance, Acme Startup apps get additional metadata that will be used to suggest apps at the appropriate time. Acme Startup and partner apps get the ability to have promo videos, but independent developers don’t because they have a lower barrier to entry and aren’t in such a close relationship with Acme. Acme Startup apps are all covered under the Acme Terms of Service, but non-Acme apps may fall under additional terms of service, so there’s a field for the URL of those terms.
Acme Startup doesn’t want to expose all this to the people creating app store entries because it would cause confusion for the authors and also, potentially for developers and consumers in the app store.
Approach 1: Three different “app store” content types
In this approach, the CMS has three content models:
Because this is a fictional example, I declare that the authors creating content for the app store see only the content and metadata fields they are meant to see. However, developers building the app store need to make sure that their queries are checking for three different content types.
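To show what "checking for three different content types" might look like for a developer, here is a rough Python sketch. It uses an in-memory list as a stand-in for the CMS. The content type IDs (`acme_app`, `partner_app`, `independent_app`), the field names, and the `query` helper are all assumptions made up for this illustration, not any real CMS API.

```python
# Hypothetical content type IDs, one per kind of app creator.
APP_TYPES = ("acme_app", "partner_app", "independent_app")

# A tiny in-memory stand-in for entries stored in the CMS.
ENTRIES = [
    {"type": "acme_app", "title": "Meal Planner", "suggestion_tags": ["dinner"]},
    {"type": "partner_app", "title": "Chore Tracker",
     "promo_video_url": "https://example.com/promo"},
    {"type": "independent_app", "title": "Pet Feeder",
     "terms_url": "https://example.com/terms"},
    {"type": "blog_post", "title": "Release notes"},
]


def query(entries, content_type):
    """Return entries of exactly one content type, like a single CMS query."""
    return [e for e in entries if e["type"] == content_type]


def all_app_listings(entries):
    """The developer has to know about all three types and merge the results."""
    results = []
    for content_type in APP_TYPES:
        results.extend(query(entries, content_type))
    return results


listings = all_app_listings(ENTRIES)
titles = [e["title"] for e in listings]
```

If someone later adds a fourth app content type and forgets to update `APP_TYPES`, those listings silently vanish from the store. That is the cost this approach puts on the developer.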
When it comes to implementation, you’d want to use some sort of naming convention to signify the similarity of the content types, such as:
Approach 2: One “app store” content type with a reference field to three other content types
In this approach, the CMS has one “app store” content type. However, it has a reference field (a field that links to other content types) that presents three other content types with the appropriate additional fields for the different author types.
Without additional logic, the authors would see all three options and have to make the appropriate choice for themselves.
Developers would, more or less, have all the content in one query, but there is one level of nested data for them to traverse to get everything that goes with the entry.
This probably plays out as a parent-child content type relationship in the content model.
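Here is a small Python sketch of that parent-child shape, using dataclasses. Everything named here (`AppListing`, `AcmeDetails`, `PartnerDetails`, `IndependentDetails`, the field names, and the `flatten` helper) is hypothetical; the fields loosely follow the Acme scenario above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class AcmeDetails:
    suggestion_tags: list[str]
    promo_video_url: str


@dataclass
class PartnerDetails:
    promo_video_url: str
    terms_url: str


@dataclass
class IndependentDetails:
    terms_url: str


# The reference field may point at any one of the three child types.
Details = Union[AcmeDetails, PartnerDetails, IndependentDetails]


@dataclass
class AppListing:
    title: str
    description: str
    details: Details  # the reference field: one level of nesting


def flatten(listing: AppListing) -> dict:
    """Traverse the single level of nesting to get one flat record."""
    flat = {"title": listing.title, "description": listing.description}
    flat.update(vars(listing.details))
    return flat


listing = AppListing(
    "Chore Tracker",
    "Assign and track chores.",
    PartnerDetails("https://example.com/promo", "https://example.com/terms"),
)
flat = flatten(listing)
```

One query returns the parent, and the developer follows `details` once to collect the rest.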
Approach 3: One “app store” content type with all the fields
In this approach, the CMS has one content type that contains all the fields. In a default implementation, everyone sees all the fields, even fields that don’t apply to them. Expect confused authors.
The developer gets everything they need in one query, with no nesting (thanks to my simplified fictitious model). However, they could get invalid content/metadata if authors don’t use the content type properly.
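One way a developer might guard against that invalid content is a check like the following Python sketch. The rule table, the field names, and the `creator` values are assumptions derived from my fictional scenario, not from a real system.

```python
# Hypothetical rule table: which optional fields each creator type may fill in.
ALLOWED = {
    "acme": {"suggestion_tags", "promo_video_url"},
    "partner": {"promo_video_url", "terms_url"},
    "independent": {"terms_url"},
}

# Every optional field lives on the single content type.
OPTIONAL_FIELDS = {"suggestion_tags", "promo_video_url", "terms_url"}


def invalid_fields(entry: dict) -> list[str]:
    """List optional fields that are filled in but not allowed for this creator."""
    allowed = ALLOWED[entry["creator"]]
    return sorted(
        name for name in OPTIONAL_FIELDS
        if entry.get(name) and name not in allowed
    )


# An independent developer filled in a promo video, which the model allows
# structurally but the business rules do not.
bad_entry = {
    "creator": "independent",
    "title": "Pet Feeder",
    "promo_video_url": "https://example.com/promo",
    "terms_url": "https://example.com/terms",
}
good_entry = {
    "creator": "partner",
    "title": "Chore Tracker",
    "promo_video_url": "https://example.com/promo",
}

problems = invalid_fields(bad_entry)
no_problems = invalid_fields(good_entry)
```

Notice that the rules the content model could have enforced now live in application code, where authors never see them.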
Approach 2b: One “app store” abstract content type that displays concrete content types, depending on app creator
The second approach comes very close to the model that I would probably choose, with one difference in the implementation. Depending on the CMS and its capabilities, you might be able to customize your implementation to display only the one content model appropriate for the author type. (Assuming for a moment that either the “app creator” field tells us the developer type directly or there’s additional metadata to do so that is not shown in the simplistic models here.)
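As a rough illustration of the idea, here is a Python sketch in which a base dataclass plays the abstract content type and three subclasses play the concrete ones. The class names, field names, and the `form_fields` helper are hypothetical, and I am not claiming any CMS works this way today.

```python
from dataclasses import dataclass, fields


@dataclass
class AppListing:
    """Plays the abstract content type: the core fields every listing has."""
    title: str
    description: str
    app_creator: str


@dataclass
class AcmeAppListing(AppListing):
    suggestion_tags: str = ""
    promo_video_url: str = ""


@dataclass
class PartnerAppListing(AppListing):
    promo_video_url: str = ""
    terms_url: str = ""


@dataclass
class IndependentAppListing(AppListing):
    terms_url: str = ""


# Hypothetical lookup from the "app creator" value to a concrete type.
CONCRETE_BY_CREATOR = {
    "acme": AcmeAppListing,
    "partner": PartnerAppListing,
    "independent": IndependentAppListing,
}


def form_fields(app_creator: str) -> list[str]:
    """Fields an author of this kind would see: core fields plus their own."""
    concrete = CONCRETE_BY_CREATOR[app_creator]
    return [f.name for f in fields(concrete)]


independent_form = form_fields("independent")
acme_form = form_fields("acme")
```

An independent developer never sees the promo video field, yet every entry is still an `AppListing`, so a developer could treat them all as one type.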
The parent-child content type relationship might take the form of something like abstract.AppListing →
So, those are my thoughts.
What I’d love to see is CMS vendors starting to build this abstract/concrete capability into their content modeling abilities. Is anyone doing anything like this?
How do you handle this in your content modeling?
CEO and Co-Founder @Hygraph
8 months ago: Great article John! I completely agree that the world of programming languages and data structures has many, many concepts that content management may adapt. Inheritance for sure being one of them, and there is so much more. But with great flexibility comes potentially high cognitive load that could make things more complicated for the content editor in general. On: "Ask yourself how developers will build queries to get this content. Will the query be a single query? Will it require multiple queries?". This is one of the reasons at Hygraph, we always found that GraphQL is the perfect interface for content, because the query can be adapted and composed by the developer who's consuming the content. Having this, the dimension of "Build content models that a system can process easily" can be removed from the equation.
Partner & Division Director of Advanced Content at Enterprise Knowledge | Content Strategy and Operations | Content Engineering | CMS Solutions
8 months ago: Must stay on target with current tasks, but flagging to review later.
GVP, Platform Ecosystem at Uniform | #integrations #apis #AI #growth
8 months ago: John Collins (and Aaron Bradley), y'all are onto something. Back in the day, I almost bought object-oriented-content.com :D I often reference bounded context and domain modelling when thinking about content modelling. Bounded Context: https://martinfowler.com/bliki/BoundedContext.html Domain Modeling: https://www.thoughtworks.com/insights/blog/agile-project-management/domain-modeling-what-you-need-to-know-before-coding
Creator of The Content Technologist, web evangelist, and results-focused digital content strategy consultant
8 months ago: Are you familiar with OOUX and Sophia V Prater? I'm just digging into her work but it feels appropriate to mention as an approach to tackling this problem.
Content Engineering Consultant at Enterprise Knowledge, LLC. | Digital Asset Management (DAM) | Modular & Dynamic Content | Content Operations
8 months ago: Love this quote, “A content architect is the custodian of meaning in content systems.” It made me go down a whole rabbit hole of how custodians manipulate the space to better support good habits by occupants (or authors in our case). We’re using an abstract vs concrete approach right now with a client, drawing on EK’s knowledge graph experience and the use of entities as business concepts. Fun challenge!