Semantic Business Rule Extraction

We tell people that if you want to modernize safely, you must completely understand the semantics of your legacy data and the transformations that the business logic in your legacy applications applies to that data. The emphasis here is on “completely” because the most common reason for a project to go wrong is an incomplete understanding of both, compounded by not knowing that your understanding is incomplete.

This remains true even for projects accompanying a digital transformation initiative. One rarely plans to exactly reproduce the legacy system in new technology, but if you want to ensure success you must start with the understanding that would allow you to do so.

You may decide to ignore the technical logic from the legacy code, but preserving the business logic for the first sprint in new development allows the use of test driven development. Changes or enhancements to the business logic can come in subsequent sprints via refactoring. This is the method to do greenfield development safely when replacing a legacy system: first prove you haven’t broken anything or missed anything, then you can refactor to your heart’s content.

What Are Business Rules, Anyway? Perhaps Not What You Think…

A business analyst’s job is to gather functional and non-functional requirements upon which to base a software implementation. Functional requirements can be divided into functions (what the software must do) and business rules (how each function must operate). Too often the distinction between requirements and business rules is blurred, with some people using the terms interchangeably.

The business analyst’s description of business rules typically will be non-procedural, expressed as a constraint (“data element X must be numeric”) or as a decision which evaluates to a result (“if X = 1 then A = B * 10 otherwise A = B * 20”). The programmer’s job is to translate the requirements into a functional system, almost always using a traditional procedural language such as COBOL, C/C++/C#, or Java. Thus, to a programmer, business rules are procedural.
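The two rule shapes quoted above can be written down as small, side-effect-free functions. This is only a sketch in Python, chosen for illustration: the function names are hypothetical, the pattern used to define "numeric" is an assumption, and X, A and B are the placeholder elements from the text.

```python
import re

# Assumption: "numeric" means an optionally signed decimal number.
_NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?")


def x_is_numeric(x: str) -> bool:
    """Constraint: data element X must be numeric."""
    return _NUMERIC.fullmatch(x) is not None


def decide_a(x: int, b: float) -> float:
    """Decision: if X = 1 then A = B * 10, otherwise A = B * 20."""
    return b * 10 if x == 1 else b * 20
```

Neither function says anything about when it runs, where X comes from, or where A is stored. Those questions belong to the procedural code that calls them.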

What We Have Here Is A Failure To Communicate

A procedural language, sometimes called an “imperative” language, is one in which the order of instructions matters, i.e., there is an explicit sequence of steps that must be followed to produce the desired result. A non-procedural language, sometimes called a “declarative” language, describes relationships between variables in terms of functions or inference rules, and the language executor, whether interpreter or compiler, applies some fixed algorithm to these relations to produce a result. Examples of non-procedural languages include LISP, Haskell, Prolog and SQL.

This is a critical distinction and explains why business analysts and business owners have difficulty reaching understanding with programmers, and vice versa. The business people think non-procedurally, i.e., about what the business rules mean, while the programmers think procedurally, i.e., about the sequence of steps required to implement the business rules. I’ve listened to many conversations between business people and programmers, both discussing business rules, without either realizing that they were discussing different concepts.

Semantic Business Rule Extraction

The process of semantic business rule extraction from legacy code, which seeks to extract the meaning of the rules from the legacy code, always is performed on a procedural language. It is a process of inferring what the data means and what was the original intention of the procedural logic. This is complicated by the almost universal intertwining of technical logic (sending/receiving messages, reading/writing data, controlling the execution path through the program, etc.) with business logic (validation constraints and decisions selecting among alternative transformations to apply to data), with related clumps of business logic spread around within a given program or group of related programs.

Unfortunately, most other forms of business rule extraction yield only a distilled form of the procedural logic, which is not what the business owners want. They want the original, non-procedural description recovered from the code. Furthermore, the business owners usually want just the business logic, and often don’t care about the technical logic. Indeed, in many modernization projects the technical logic is mostly thrown away, while the business logic must be carefully preserved or enhanced. Separating the two is very much like withdrawing each strand of spaghetti from a bowl, one at a time.

What Are Business Rules, Again?

Much of the discussion that follows in this section is derived from the final report of the Business Rules Group.

According to this seminal paper released in 2000, business rules are either structural, e.g., files or database tables, or an action. Generally, one tends to think of the actions first, but actions cannot occur in a vacuum – there must be something to act upon. So, first we must define the structure and then the actions follow.

The structure is derived from a precisely defined business vocabulary, which allows us to talk unambiguously about the structural elements, their relationships, and their transformations. Assembling this vocabulary can be a contentious and difficult process because of differing concepts among different groups. We deal with this issue by making the vocabulary, or “business language phrase”, an attribute of the extracted structure’s technical name in the business rule repository. We provide a recommended vocabulary, allowing the groups to have their semantic arguments without impeding the progress of our project, and we can adjust the vocabulary at any time from the repository once a consensus is reached.

Our nomenclature for structure is “conceptual business rules”, which comprise the Business Rules Group’s discussion of business terms, concepts and calculations. This is very close to the nomenclature of semantic web programming, also known as Web 3.0. Alternatively, we could easily substitute the nomenclature of entities and attributes, or of database tables and columns, though there are some subtle differences that are outside the scope of this essay. Once we have our conceptual business rules, then we can discuss actions.

The Business Rules Group paper states:

...a business rule is a statement that defines or constrains some aspect of the business. It is intended to assert business structure, or to control or influence the behavior of the business.

Accordingly, a business rule expresses specific constraints on the creation, updating, and removal of persistent data in an information system.

We restate this definition: a business rule controls the change of state in a persistent data store. Since state changes occur from database transactions, our nomenclature for action is “transactional business rules” which comprise validations – that prevent a change of state if violated – and decisions that select a calculation or other transformation to be applied to the data. Implicitly, it is up to the controlling technical logic to execute the persistence.

Our semantic business rule extraction frequently views conceptual business rules derived from a denormalized data model. To remove ambiguities and prepare the consumer of the business rules to make use of them, we renormalize the data model into a form with ambiguities removed, and furthermore only consisting of data elements that have business meaning. Purely technical data elements are documented as such and not considered further, once our customer signs off or corrects our assessment.

The transactional business rules are expressed in a non-procedural form, in a narrative for relatively simple business rules and in a decision table for complex business rules. If the results were to be implemented in a conventional form, we recommend defining methods on the classes derived from the business terms for the simple rules and the use of a business rule management system (“rules engine”) for the complex rules. There are also non-conventional implementations using semantic web concepts which offer many business benefits at the cost of unfamiliarity.

Atomic Data Elements and Atomic Business Rules

The simplest change of state in a persistent data store is a change to a single data element. Our conceptual business rules are derived from data elements at the lowest level of granularity, and are therefore “atomic” because they cannot be further subdivided. In COBOL terms, we eliminate REDEFINES, OCCURS, and group levels subordinate to the 01 level, and select the lowest level of the hierarchy which has unitary business meaning, which usually means selecting elementary items. The 01 level maps to the business term (or entity), and the selected items map to the facts (or attributes).
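To make the flattening concrete, here is a rough sketch in Python of walking a record layout down to its elementary items. Everything in it is an illustrative assumption: the `Item` class, the field names, and the simplification that a REDEFINES entry is skipped outright. It also always descends to elementary items, whereas an analyst may stop at a higher level when that level carries the unitary business meaning.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One entry in a simplified, hypothetical COBOL-style record layout."""
    name: str
    children: list["Item"] = field(default_factory=list)
    redefines: bool = False  # True if this entry is a REDEFINES of another


def atomic_elements(record: Item) -> list[str]:
    """Return the names of the elementary items under a 01-level record."""
    found: list[str] = []

    def walk(item: Item) -> None:
        if item.redefines:
            return  # alternate view of storage that is covered elsewhere
        if not item.children:
            found.append(item.name)  # elementary item: a candidate fact
            return
        for child in item.children:
            walk(child)  # group level: descend, do not record the group

    for child in record.children:
        walk(child)
    return found


# Hypothetical 01-level record: the business term (entity).
policy = Item("POLICY-RECORD", [
    Item("POLICY-ID"),
    Item("EFFECTIVE-DATE", [Item("EFF-YEAR"), Item("EFF-MONTH"), Item("EFF-DAY")]),
    Item("EFFECTIVE-DATE-NUM", redefines=True),
    Item("PREMIUM-AMOUNT"),
])
print(atomic_elements(policy))
```

The record name plays the role of the business term, and the returned names are the candidate facts.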

Every transformation to an atomic data element which is subsequently persisted to disk maps back to a decision to select that transformation. Therefore, each transformed data element maps to one decision (narrative or decision table) which is itself therefore atomic. Each instance of a transformation to an atomic data element maps to one of the outcomes of the decision. Each decision is an atomic transactional business rule and maps to an atomic data element. The decision and all specified transformations comprise the set of atomic transactional business rules for that data element.

This is described in the context of a single program or a single related group of programs (the “executing unit”). The output goes into the business rule repository associated with that executing unit.

Molecular Decision Services

Our recommended implementation architecture is micro-services based, with each atomic transactional business rule implemented as a stateless decision service. However, because a single persistence operation could be preceded by hundreds or thousands of atomic business rules being invoked, practical performance concerns suggest that the atomic decisions should be grouped into “molecules” which can be invoked as a single stateless decision service. Because they are stateless, multiple molecular decisions can execute in any order in a choreographed implementation leading to optimal scalability.
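The grouping of atoms into a molecule might look something like the following Python sketch. The names `make_molecule`, `tax_cents` and `fee_cents`, and the rates used, are hypothetical. The one property the sketch tries to show is that each atomic decision reads only the input facts, so the order of evaluation cannot change the outcome.

```python
from typing import Callable, Mapping

# An atomic decision is a pure function from input facts to one new value.
Decision = Callable[[Mapping[str, int]], int]


def make_molecule(
    atoms: Mapping[str, Decision],
) -> Callable[[Mapping[str, int]], dict[str, int]]:
    """Bundle atomic decisions into one stateless call.

    Every atom reads only the original input facts, never another atom's
    output, so the order of evaluation cannot change the result.
    """
    def molecule(facts: Mapping[str, int]) -> dict[str, int]:
        return {target: decide(facts) for target, decide in atoms.items()}

    return molecule


# Hypothetical atomic decisions, one per target data element (amounts in cents).
atoms: dict[str, Decision] = {
    "tax_cents": lambda f: f["premium_cents"] * 5 // 100,
    "fee_cents": lambda f: 2500 if f["premium_cents"] < 100_000 else 0,
}

price_molecule = make_molecule(atoms)
print(price_molecule({"premium_cents": 80_000}))
```

One call to `price_molecule` replaces two service invocations, which is the performance argument for molecules in miniature.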

As we document the full system, we do identify the business functions of the system, but only to the extent necessary to link each function to the conceptual and transactional business rules which define how it is to operate. Thus, the non-procedural implementation of the business rules is linked to a business function in the technical logic which is necessarily procedural. But that business function contains no business logic – it only invokes the business logic, resulting in a clean separation (or “externalization”) of the business logic from the technical logic. This externalization has some very significant implications that are not immediately obvious.

Externalization of Business Rules Eliminates Side Effects

“Side effects” is an unfamiliar term defining a familiar experience to most legacy programmers. Anyone who modifies a legacy program and produces new defects inadvertently and unexpectedly while implementing the desired changes is experiencing side effects. Then you fix those new defects and 10 others appear. You continue the battle until eventually all defects are defeated and the updated program can be put into use, at vastly more time and effort than should ever have been necessary. This is why legacy programs are so costly and time consuming to modify as needed by the business.

The most common source of side effects is global data elements (“working storage” in COBOL terms). For example, when one set of logic assigns a new value to a given data element, other logic that references that data element can fail because it was never programmed to handle the new value. Our semantic business rule extraction eliminates global data elements and other forms of interaction between data elements so that a change to any data element has no opportunity to impact another. Atomic data elements are thereby wholly disjoint and immune to disruption by other changes in the logic.

Programmers often rebel at this change. Eliminating global variables means that intermediate calculations have to be repeated each time they are referenced. That's right: we are spending CPU cycles, which are effectively free, to save on human programmer cycles, which get more and more expensive every year. This is a financial analysis, not a technical analysis.
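As a small illustration of that trade, consider the Python sketch below. The function names and rates are hypothetical. The intermediate value is recomputed by each rule that needs it instead of being parked in a shared variable that some other piece of logic could overwrite.

```python
def taxable_base(premium_cents: int, discount_cents: int) -> int:
    """Intermediate calculation, recomputed on each use, never stored globally."""
    return premium_cents - discount_cents


def tax(premium_cents: int, discount_cents: int) -> int:
    """Depends only on its arguments: 5% of the taxable base."""
    return taxable_base(premium_cents, discount_cents) * 5 // 100


def commission(premium_cents: int, discount_cents: int) -> int:
    """Repeats the intermediate calculation rather than sharing it: 10%."""
    return taxable_base(premium_cents, discount_cents) * 10 // 100
```

The subtraction runs twice, which costs a few cycles. In exchange, changing how `commission` is computed cannot alter the result of `tax`.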

However, because each set of business rules is linked to its anchoring atomic data element through all of the twists and turns of its lifecycle, side effects may be eliminated within an executing unit but still occur between executing units. This is why our analysis links together all the business rules from each transformation across the lifecycle, exposing the lineage of the transformations and thereby allowing rationalization to control side effects for the future. This is especially true when the modernization includes the merging of multiple legacy systems.

Code Slicing Is The Key To Unraveling The Spaghetti

Within a single executing unit, side effects are eliminated by our use of the analytical method called “code slicing” while extracting the business rules. With this method we focus on a single atomic data element at a time until its transactional business rules (validations, decisions and transformations) are identified. One atomic data element = one set of decisions (either expressed narratively or in a decision table) and one transformation for each business rule in the set of decisions. Every transformed data element must have at least one linked decision narrative or decision table, though it can have zero, one or more validations which prevent transformation.

[Image: redacted sample of legacy code from a business rule extraction project]

This is a sample of actual code taken from a business rule extraction project, redacted to maintain our non-disclosure requirements. This can be translated into a narrative decision like the following, using generic business language phrases within double quotes:

If this is a "Multi Segment Policy" 
and the "Payer Type" is "INSURED", 
then the "Written & Spread surcharge" 
is set equal to "Written & Spread surcharge" 
+ "Written amount of CFEI segments billing in next bill"

If this is a "Multi Segment Policy" 
and the "Payer Type" is "THIRD-PARTY", 
then the "Written & Spread surcharge" 
is set equal to "Written & Spread surcharge" 
+ "Written amount of CFEI segments billing in next bill"

If this is a "Multi Segment Policy" 
then the "Written & Spread surcharge" 
is set equal to "Total written premium of insured paid segments"

Each narrative expression is evaluated in turn, stopping at the first one which resolves to True, in which case that transformation is executed and the evaluation terminates.
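The first-match evaluation just described can be sketched in Python as an ordered list of condition and outcome pairs. The class and field names are hypothetical stand-ins for the business language phrases, and returning `None` when no rule fires is an assumption, since the narrative does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Facts:
    """Input data for the decision; field names are illustrative."""
    multi_segment_policy: bool
    payer_type: str
    written_and_spread_surcharge: float
    cfei_written_next_bill: float
    insured_paid_total_written_premium: float


@dataclass(frozen=True)
class Rule:
    condition: Callable[[Facts], bool]
    outcome: Callable[[Facts], float]


def _add_cfei(f: Facts) -> float:
    return f.written_and_spread_surcharge + f.cfei_written_next_bill


# Order matters: the list mirrors the three narrative rules, top to bottom.
RULES = [
    Rule(lambda f: f.multi_segment_policy and f.payer_type == "INSURED", _add_cfei),
    Rule(lambda f: f.multi_segment_policy and f.payer_type == "THIRD-PARTY", _add_cfei),
    Rule(lambda f: f.multi_segment_policy,
         lambda f: f.insured_paid_total_written_premium),
]


def decide_surcharge(facts: Facts) -> Optional[float]:
    """Evaluate each rule in turn and stop at the first whose condition is True."""
    for rule in RULES:
        if rule.condition(facts):
            return rule.outcome(facts)
    return None  # assumption: no rule fired, so no transformation is selected
```

The function computes a value and nothing more. Whether and where that value is stored is left to the caller.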

This example is simple enough that the narrative form is sufficient. Alternatively, it could be expressed in a simple decision table: 

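[Image: decision table for the “Written & Spread surcharge” business rule, with the resulting transformation in column B and the input data in columns D and E]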

Columns D and E of the spreadsheet represent the input data and column B contains the resulting transformation. Each row is evaluated in sequence, starting with sequence 1 (row 3), and halting when D and E are both true on the same row. This is a basic, generic decision table. (Column C indicates whether or not dynamic business rule extraction was performed on this business rule, but that is the topic for another posting.)

The business rules neither persist the transformed data nor prevent the persistence operation. By definition, if they are stateless services they cannot perform stateful operations. The technical logic which invokes each decision service must interpret the results and execute the correct technical logic. Though business logic and technical logic handshake in this manner, they do so at arm’s length.
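One way to picture this handshake is the Python sketch below. It is illustrative only: `DecisionResult`, `surcharge_service`, `update_surcharge`, the dictionary standing in for a data store, and the non-negative validation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionResult:
    """What a stateless decision service hands back; it never touches storage."""
    valid: bool
    new_value: Optional[float] = None
    reason: str = ""


def surcharge_service(current: float, addition: float) -> DecisionResult:
    """Business logic (hypothetical): validate, then propose a new value."""
    if addition < 0:
        return DecisionResult(valid=False, reason="addition must not be negative")
    return DecisionResult(valid=True, new_value=current + addition)


def update_surcharge(store: dict[str, float], key: str, addition: float) -> bool:
    """Technical logic: read, invoke the service, interpret, then persist or not."""
    result = surcharge_service(store[key], addition)
    if not result.valid:
        return False  # validation failed, so the stored state is left unchanged
    store[key] = result.new_value  # only the technical logic writes to the store
    return True
```

The service can be tested without any data store at all, and the storage mechanism can be swapped without touching the rule.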

Once this “Written & Spread Surcharge” atomic data element has had its business rules evaluated, the analysis proceeds to the next atomic data element with transformations. Once again we ignore all logic that does not impinge on this new atomic data element, especially ignoring logic that also resulted in a transformation for another data element. We want the data and the decisions affecting each data element to be atomic, and only combined into molecules when required by the technical logic for performance reasons. This is the key to unraveling the spaghetti code.
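The idea of following one data element and ignoring everything else can be shown with a toy backward slice in Python. This is a deliberately simplified sketch and not a description of any particular tool. It handles only straight-line assignments, whereas slicing real legacy code must also follow branches, loops and calls. The statement model and the COBOL-like names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One assignment in a toy straight-line program: target := f(sources)."""
    line: int
    target: str
    sources: frozenset[str]


def backward_slice(program: list[Stmt], element: str) -> list[int]:
    """Return the line numbers of the statements that can affect `element`."""
    relevant = {element}
    kept: list[int] = []
    for stmt in reversed(program):
        if stmt.target in relevant:
            kept.append(stmt.line)
            relevant.discard(stmt.target)  # this statement defines the value
            relevant |= stmt.sources       # so its inputs become relevant
    return sorted(kept)


program = [
    Stmt(1, "WS-BASE", frozenset({"PREMIUM", "DISCOUNT"})),
    Stmt(2, "WS-MSG-LEN", frozenset({"MSG-TEXT"})),        # technical logic
    Stmt(3, "WS-SURCHARGE", frozenset({"WS-BASE", "RATE"})),
    Stmt(4, "WS-RETURN-CODE", frozenset({"WS-MSG-LEN"})),  # technical logic
    Stmt(5, "WS-SURCHARGE", frozenset({"WS-SURCHARGE", "CFEI-AMOUNT"})),
]
print(backward_slice(program, "WS-SURCHARGE"))
```

Slicing on `WS-SURCHARGE` keeps lines 1, 3 and 5 and drops the two message-handling statements. Slicing the same program on `WS-RETURN-CODE` would keep only lines 2 and 4.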

Externalization Of Business Rules Leads To A “Low Code” Future

This externalization of transactional business rules fulfills one vision of the “low code” approach to programming now gathering adherents, since the only significant code that has to be written is the technical logic. The simple decisions will have some code in the methods on the common classes, but that logic will be simple, non-procedural, and independent of other decisions. The complex business rules will not be implemented in code, but in some executable form such as a rules engine or a rules language.

By necessity, technical logic must be procedural, as, for example, you must read data into the program before you can transform it and write it back to disk. However, what is not obvious is the relative complexity of the technical logic versus the business logic. There is an extraordinary range of complexity in extracted transactional business rules, from trivial rules that unconditionally initialize a data element to monstrous decision tables of literally astronomical complexity. Technical logic is rarely trivial, but it never comes close to the complexity of the monster decision tables. Our conclusion is that because future coding will be applied only to technical logic, it will be straightforward and not very complex, speeding implementation and subsequent maintenance while minimizing cost.

This approach has many implications for the elimination of present code level technical debt and the inhibition of recurring technical debt in the future, which will be the subject of another essay.
