Component Cohesion Principles

In the first articles of this newsletter we concluded that by following the "SOLID Principles of Object Oriented Design", together with some additional rules that complete them, we could achieve good design and a clean architecture. That is not false, but as a project grows even this approach is not enough. Grady Booch, one of the fathers of UML and Chief Scientist of Rational Software Corporation, wrote in 2000 (1): "class is a necessary but insufficient vehicle for decomposition, large systems of objects and classes would be overwhelming, a way must exist to deal with groups of classes otherwise, it is almost like building a sand castle from individual grains of sand". The correct use of components is exactly what must complete our design approach.

In the previous article we saw several definitions of a component. Here is another, slightly different one, which I hope is better suited to our discussion of the principles of good componentization. It is the definition found in Robert Martin's bestseller (2): "The components are aggregates of concrete and abstract classes and constitute the units of distribution. They are the smallest entities that can be deployed as part of a system. In Java they are jar files. In .NET they are DLLs. In compiled languages, they are aggregations of binary files. In interpreted languages, they are aggregations of source files. In all languages, they are the grain of distribution. Components can be linked together into one executable. Or they can be distributed independently as separate dynamically loaded plug-ins, as .jar or .dll or .exe files. Regardless of how they are finally deployed, well-designed components always maintain the ability to be independently deployable and, therefore, independently developable".

We finished the previous article by concluding that software components can be a good way to manage the complexity of software systems, as long as the partitioning of the system into components respects certain rules and principles, which we will see in this article and in the next. Also in Robert Martin's bestseller (2) we find the sentence shown at the top of the image that introduces this article: "If the SOLID principles tell us how to arrange the bricks into walls and rooms, then the component principles tell us how to arrange the rooms into buildings. Large software systems, like large buildings, are built out of smaller components".

In this article we will talk about the first three principles relating to components, those concerning “cohesion”. But before stating and analyzing these three principles, let's introduce a very general concept: "orthogonality". Orthogonality between components is as fundamental to correct software componentization as the principles that we will see immediately afterwards.

Orthogonality is a fundamental concept if you want to produce systems that are easy to design, build, test and extend. But what is orthogonality? "Orthogonality" is a term borrowed from geometry. Two lines are orthogonal if they meet at right angles, like the axes of a graph. In vector terms, the two lines are independent. Move along one of the lines and your projected position on the other does not change.

In computer science, the term has come to mean a kind of independence or decoupling. Two or more things are orthogonal if changes in one do not affect any of the others. In a well-designed system, the database code will be orthogonal to the user interface: you can change the interface without affecting the database and swap databases without changing the interface.

We want to design components that are self-sufficient: independent and with a single, well-defined purpose (what Yourdon and Constantine were the first to call cohesion (3)). When components are isolated from each other, we know we can change one without having to worry about the rest. As long as we do not modify the external interfaces of that component, we can rest assured that we will not cause problems that ripple through the whole system.
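To make the idea concrete, here is a minimal C++ sketch, under the assumption of hypothetical names (`WorkoutStore`, `InMemoryStore`, `SummaryView`) that do not come from any real project. The user interface part depends only on a small external interface, so the storage behind it could be swapped without touching the interface code.

```cpp
#include <map>
#include <string>

// External interface of the storage component: the only thing the UI sees.
class WorkoutStore {
public:
    virtual ~WorkoutStore() = default;
    virtual void save(const std::string& user, int minutes) = 0;
    virtual int totalMinutes(const std::string& user) const = 0;
};

// One possible implementation (hypothetical). A database-backed store
// could replace it without any change to SummaryView below.
class InMemoryStore : public WorkoutStore {
public:
    void save(const std::string& user, int minutes) override {
        totals_[user] += minutes;
    }
    int totalMinutes(const std::string& user) const override {
        auto it = totals_.find(user);
        return it == totals_.end() ? 0 : it->second;
    }
private:
    std::map<std::string, int> totals_;
};

// The "user interface" part: it formats data and knows nothing
// about how or where the data is stored.
class SummaryView {
public:
    explicit SummaryView(const WorkoutStore& store) : store_(store) {}
    std::string render(const std::string& user) const {
        return user + ": " + std::to_string(store_.totalMinutes(user)) + " min";
    }
private:
    const WorkoutStore& store_;
};
```

In this sketch, replacing `InMemoryStore` with another class derived from `WorkoutStore` leaves `SummaryView` untouched, and changing how `SummaryView` formats its text leaves the store untouched.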

Writing orthogonal systems brings three main advantages: increased productivity, reduced risk and decreased test time.

-) Let's see why we get a productivity gain. Changes are localized, so development and testing times are reduced. It is easier to write relatively small, self-contained components than a single large block of code. Simple components can be designed, coded, unit tested and then left alone: we don't need to keep modifying existing code while adding new code. An orthogonal approach also promotes reuse. If components have specific and well-defined responsibilities, they can be combined with new components in ways that were not intended by their original implementers. The more loosely coupled our systems are, the easier they are to reconfigure and redesign. There is also a productivity gain when combining orthogonal components. Let's assume that one component does M distinct things and another does N things. If they are orthogonal and we combine them, the result does M × N things. However, if the two components are not orthogonal, there will be overlaps and the result will do less. We get more functionality per unit of effort by combining orthogonal components.

-) Now let's see why we get a decrease in risk. An orthogonal approach reduces the risks inherent in any development. Bad sections of the code are isolated. If a module is bad, it is easier to replace it with a good one, or to remove it, fix it and plug it back in. The resulting system is less fragile. If we make small changes and corrections to a particular area, any problems they generate will be limited to that area. An orthogonal system will probably be better tested, because it will be easier to design and run tests on its components. We won't be so closely tied to a toolkit, a platform or a library, because the interfaces to these third-party components will be isolated to smaller parts of the whole system.

-) And now let's see why the test time decreases. A system designed and implemented orthogonally is easier to test because the interactions between system components are formalized and limited, so more of the system testing can be performed at the single module level. This is good news, because module (or unit) level tests are considerably easier to perform than integration tests. In fact, each module should have its own unit test built into the code, and these tests should be run as part of the normal build process. A simple rule to check whether a component has been developed orthogonally is the following: "if in order to create a unit test we have to connect to a large percentage of the rest of the system just to compile, then our module is definitely not orthogonal to the rest of the system". Bug fixes are also a good time to evaluate the orthogonality of the system as a whole. When we encounter a problem, we need to evaluate how localized the solution is. Did we change only one module, or were the changes scattered throughout the system? When we make a change, is everything fixed, or do other problems arise? This is a good opportunity to take advantage of automation. If we use a source control system, we can tag bug fixes and then periodically run reports analyzing trends in the number of source files affected by each bug fix.

Now it's time to look at a real case. Let's consider the software of a treadmill ("tapis roulant") and try to organize it into components in an orthogonal way.

figure 1.

In the High Level architecture diagram we can see the following main components:

Equipment is an abstraction of the actual machine.

Training program contains the training logic (e.g. constant heart rate training, weight loss training).

User Interface handles the dialogue with the user. A GUI is an example.

Tutor can monitor all the events, give the user hints and/or control the other components, including the User Interface.

The guiding principles of the architecture are quite simple:

-) No direct interaction between the User Interface and the Equipment. The User Interface notifies the Training Program when any relevant event occurs. The Training Program contains all the logic to control the Equipment.

-) Event-based communication (see the "Technical details" diagram in figure 1). The communication between the Training Program, the Equipment and the User Interface is based on a subscribe-notify model. Each part can subscribe to (some of) the events generated by other parts, and can fire events that are then dispatched to all the subscribers.

This allows a high degree of decoupling between parts and makes monitoring easy (for the Tutor). It also makes it easy to fire events from new sources (such as a remote training program running on another computer).

As far as the "User Interface" is concerned, the "Equipment" is only a source of events. The "Training Program" is both a source and a potential sink of events.

The class diagram "Technical details" in figure 1 shows the relevant portion of the event handling. The most relevant classes are the yellow ones:

Event is a base class for all the events. Each event is defined by a tag. Derived classes may add more data and/or behavior. Tags are partitioned between layers as follows:

-) 0 - 1,000,000,000 are reserved for the infrastructure.

-) 1,000,000,001 - 2,000,000,000 are reserved for the Equipment.

-) 2,000,000,001 - 3,000,000,000 are reserved for the Training Program.

-) 3,000,000,001 - 4,000,000,000 are reserved for the GUI.

-) 4,000,000,001 - MAX_UINT are reserved for the infrastructure.
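The tag partitioning can be checked mechanically. The following is only a sketch: the names `Tag`, `Layer` and `layerOf` and the two sample tags are hypothetical, and it assumes that tags are 32-bit unsigned integers, so that MAX_UINT is 4,294,967,295.

```cpp
#include <cstdint>

// Assumption: tags are 32-bit unsigned values.
using Tag = std::uint32_t;

enum class Layer { Infrastructure, Equipment, TrainingProgram, Gui };

// Maps a tag to the layer that owns it, following the ranges listed above.
constexpr Layer layerOf(Tag tag) {
    if (tag <= 1000000000u) return Layer::Infrastructure;
    if (tag <= 2000000000u) return Layer::Equipment;
    if (tag <= 3000000000u) return Layer::TrainingProgram;
    if (tag <= 4000000000u) return Layer::Gui;
    return Layer::Infrastructure;  // 4,000,000,001 up to MAX_UINT
}

// Hypothetical tags, for illustration only.
constexpr Tag kBeltSpeedChanged = 1000000001u;
constexpr Tag kProgramFinished = 2000000001u;

// Compile-time checks that each tag falls in the range of its layer.
static_assert(layerOf(kBeltSpeedChanged) == Layer::Equipment, "equipment range");
static_assert(layerOf(kProgramFinished) == Layer::TrainingProgram, "training range");
```

With checks like these next to the tag definitions, a tag placed in the wrong range is caught at compile time instead of at run time.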

All the Event-derived classes must implement the Clone method (as in the Prototype pattern).
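As an illustration of the Clone requirement, here is a minimal sketch. It is not the code of the original project: `HeartRateEvent` and its tag value are hypothetical, and `std::unique_ptr` is used in place of a raw pointer or `auto_ptr` as a modern substitute.

```cpp
#include <memory>

// Base class: every event is identified by a tag and can copy itself.
class Event {
public:
    explicit Event(unsigned tag) : tag_(tag) {}
    virtual ~Event() = default;
    unsigned Tag() const { return tag_; }
    // Prototype pattern: returns a copy with the same dynamic type.
    virtual std::unique_ptr<Event> Clone() const {
        return std::make_unique<Event>(*this);
    }
private:
    unsigned tag_;
};

// Hypothetical derived event that carries extra data.
class HeartRateEvent : public Event {
public:
    static constexpr unsigned kTag = 1000000002u;  // illustrative Equipment tag
    explicit HeartRateEvent(int bpm) : Event(kTag), bpm_(bpm) {}
    int Bpm() const { return bpm_; }
    // Overriding Clone keeps the extra data when copying through a base reference.
    std::unique_ptr<Event> Clone() const override {
        return std::make_unique<HeartRateEvent>(*this);
    }
private:
    int bpm_;
};
```

Calling `Clone` through a reference to `Event` still produces a `HeartRateEvent`, so code that only knows the base class can copy any event without losing its data.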

EventSource is the base class for all the sources of events. Therefore, both the concrete Equipment classes and the concrete Training Program classes inherit from EventSource. The principal methods in EventSource are:

Subscribe( tag, EventSink* ) : used by the EventSink to subscribe to a specific kind of event. Subscriptions can be added at any time.

Unsubscribe( EventSink* ) : used by the event sink to detach itself from a source.

Dispatch( Event* ) : used to dispatch an event from a source.

EventSink is the base class for the receivers of events: all the classes that want to receive events must derive from EventSink. When an EventSource dispatches an Event to which a specific EventSink has subscribed, the source clones the event and calls the Notify method of the sink. The default behaviour of the Notify method is to add the event to the EventQueue associated with the sink. Derived classes can pop the events from the queue by calling GetEvent. Note that GetEvent returns an auto_ptr< Event > since the receiver is ultimately responsible for deleting the received event.
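The subscribe-notify mechanism described above can be sketched as follows. This is a simplified reconstruction from the description, not the original code: it uses `std::unique_ptr` where the text mentions `auto_ptr`, a plain `std::queue` in place of the EventQueue class, and it omits error handling and thread safety.

```cpp
#include <algorithm>
#include <map>
#include <memory>
#include <queue>
#include <utility>
#include <vector>

class Event {
public:
    explicit Event(unsigned tag) : tag_(tag) {}
    virtual ~Event() = default;
    unsigned Tag() const { return tag_; }
    virtual std::unique_ptr<Event> Clone() const { return std::make_unique<Event>(*this); }
private:
    unsigned tag_;
};

class EventSink {
public:
    virtual ~EventSink() = default;
    // Default behaviour: store the received event in the queue of the sink.
    virtual void Notify(std::unique_ptr<Event> e) { queue_.push(std::move(e)); }
    // Returns the oldest queued event, or nullptr when the queue is empty.
    // Ownership passes to the caller.
    std::unique_ptr<Event> GetEvent() {
        if (queue_.empty()) return nullptr;
        std::unique_ptr<Event> e = std::move(queue_.front());
        queue_.pop();
        return e;
    }
private:
    std::queue<std::unique_ptr<Event>> queue_;
};

class EventSource {
public:
    virtual ~EventSource() = default;
    void Subscribe(unsigned tag, EventSink* sink) { subscribers_[tag].push_back(sink); }
    // Detaches the sink from every tag it subscribed to.
    void Unsubscribe(EventSink* sink) {
        for (auto& entry : subscribers_) {
            auto& sinks = entry.second;
            sinks.erase(std::remove(sinks.begin(), sinks.end(), sink), sinks.end());
        }
    }
    // Each subscriber of the event's tag receives its own clone.
    void Dispatch(const Event* e) {
        if (e == nullptr) return;
        auto it = subscribers_.find(e->Tag());
        if (it == subscribers_.end()) return;
        for (EventSink* sink : it->second) sink->Notify(e->Clone());
    }
private:
    std::map<unsigned, std::vector<EventSink*>> subscribers_;
};
```

In this sketch a source never calls any method of a sink other than `Notify`, and a sink never sees the concrete type of the source, which is what keeps the two sides decoupled.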

Here is a portion of a dialog box which subscribes to (and handles) events from a training program (figure 2). Here CManual is the class of a concrete Training Program. The event handled does not carry any data, so only the Tag is needed.

figure 2 : snippet 1

Below is an example of implementation for an event carrying data (figure 3):

figure 3 : snippet 2

I showed you this example because, as you can see, the design is completely orthogonal: thanks to a fully event-driven architecture, each component of the main architecture reveals nothing unnecessary to the other modules and does not rely on their implementations.

From the analysis of orthogonality made so far we can extract three rules:

-) "Each component must have specific and well-defined responsibilities".

-) "It must be easy to develop the Unit Test of a component".

-) "Be careful to preserve the orthogonality of your system as you introduce third-party toolkits and libraries".

Before moving on to the analysis of Robert Martin's principles for components, I want to talk about another interesting aspect of orthogonality: "design by aspects". Aspect-Oriented Programming (AOP) was born as a research project at Xerox PARC and was then brought to Java (JVM) and C# (.NET). AOP allows us to express in one place a behavior that would otherwise be scattered throughout the source code. For example, log messages are normally generated by sprinkling explicit calls to some log function throughout the source. With AOP, we implement logging orthogonally to the things being logged. Using the Java version of AOP, we can write a log message whenever we enter any method of class Fred by coding the following aspect:

figure 4 : snippet 3

If we put this in our code, log messages will be generated. If not, we won't see any messages. In any case, our original source is unchanged.
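C++ has no built-in aspect weaving, so the following is not AOP. It is only a loose approximation of the same separation, using a wrapper around a callable. The names `withEntryLog`, `logLines` and `caloriesBurned` are hypothetical.

```cpp
#include <string>
#include <utility>
#include <vector>

// Collected log lines. A real system would write to a log file instead.
inline std::vector<std::string>& logLines() {
    static std::vector<std::string> lines;
    return lines;
}

// Wraps any callable so that a message is logged before each call.
// The wrapped callable itself contains no logging code.
template <typename Fn>
auto withEntryLog(std::string name, Fn fn) {
    return [name = std::move(name), fn = std::move(fn)](auto&&... args) {
        logLines().push_back("-> Entering " + name);
        return fn(std::forward<decltype(args)>(args)...);
    };
}

// Business logic, unaware of logging.
inline int caloriesBurned(int minutes, int perMinute) { return minutes * perMinute; }
```

For example, `auto logged = withEntryLog("caloriesBurned", caloriesBurned);` produces a callable that records one log line per call, while calling `caloriesBurned` directly records nothing. Unlike a real aspect, the wrapper has to be applied explicitly at each place where logging is wanted.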

How useful this programming paradigm would have been to the programmers of the Android operating system, to avoid such intricate code!

The Android code base is deeply entangled with "mobile" (smartphone-like) concerns that do not apply, for instance, to industrial devices or to other user devices (like a treadmill). The "battery" concern, to name one, pervades the entire code base, yet some devices have no battery. There are in fact a large number of cross-cutting concerns inside the Android code base. This leads to strong coupling and is bad for modularity. For example, the audio manager ends up depending on the telephony operator: this is right for a phone, but bad for everything else. Many significant changes to the Android code end up scattered across different compilation units. It is easy to lose sight of what a feature requires when it is implemented by changing a few lines in a couple of files in one module, other lines in a different module, and so on. Because of this, and of the high volatility of the code between versions, porting a customization to a later version requires significant effort. It would have been very good for the Android developers to handle cross-cutting concerns using AOP. This would also make it easy to remove unnecessary functionality. Many of our devices (industrial or not) have no phone, no GPS and no battery. Maybe they have Ethernet instead, and supporting it was difficult because the "network" concern is not well modularized but "embedded" in the Wi-Fi or 3G/data code. It would have been useful to customize the base code using aspects, instead of modifying the existing code. In this way, small changes to different files of the same module would not be lost and our changes would not be mixed with the base code. This would provide a much better context when thinking about changes. I invite you to look at the study done by Carlo Pescio on the opportunity to implement aspects within the Android O.S. architecture.

After this discussion of the concept of orthogonality, we can now move on to the analysis of the component cohesion principles.

Let's recall the concept of cohesion in the computer science context. In short, it refers to the degree to which some given elements belong together. We can think of it as a measure of the relationship between methods, data structures, files, modules and components. When we think about component cohesion, we have three principles to use as guides. These principles are defined by Robert Martin in his bestseller (2), and they help us to create reusable and highly cohesive software.

Let's start with the first one: the "Reuse/Release Equivalence Principle (REP)". It states that "the granularity of reuse is the granularity of release". The basic idea is that components must be separately released, versioned and tracked to ensure the reusability of the code. Each component must have an associated release number, so that modifications are easily communicated through version number changes and users of the component can choose the version they want. The designer has to organize the classes into reusable components and then track them through releases. Without release numbers, there would be no way to ensure that all the reused components are compatible with each other. With this principle in particular, we need to pay attention to which modules we keep together, because what gets reused is the component as a whole and not a particular element inside it. Sometimes we are inclined to make a few larger components rather than many smaller ones, to avoid the nuisance of running more release processes. But this can create big problems. For instance, if we create a single component that provides two features, "audio management" and "battery management", which are definitely orthogonal to each other, we inconvenience all the modules that use it. A module that depends on this component only for audio management has to be updated whenever a new version of the component is released, even when the new version only changes battery management. Clearly we should have made two distinct components: one providing audio management and one providing battery management.
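As a small illustration of how release numbers let a client reason about compatibility, here is a sketch. The compatibility rule is an assumed semantic-versioning-style convention (same major number, equal or newer minor and patch), not something the REP itself prescribes, and all names and numbers are hypothetical.

```cpp
// Release number attached to a component.
struct Version {
    int majorPart;
    int minorPart;
    int patchPart;
};

// Assumed convention: a released version can stand in for the one a client
// was built against when the major numbers match and the release is
// equal or newer in (minor, patch).
constexpr bool isCompatible(const Version& required, const Version& released) {
    return released.majorPart == required.majorPart &&
           (released.minorPart > required.minorPart ||
            (released.minorPart == required.minorPart &&
             released.patchPart >= required.patchPart));
}

// Two separately released components (hypothetical numbers).
constexpr Version kAudioManagement{2, 1, 0};
constexpr Version kBatteryManagement{5, 0, 3};
```

Because the two components carry separate release numbers, a new release of battery management changes only `kBatteryManagement`, and a client that requires audio management 2.0.0 is not affected by it.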

The second principle is the "Common Closure Principle (CCP)". It states: "Those classes that could change for the same reasons have to be grouped into the same component" and also "Just as classes should contain methods that change for one reason, Components should contain classes that change for one reason". Classes that could change at different times and for different reasons have to be separated into different components. This is important because, if the code changes, we would rather have all of the changes occur in one component than be distributed across many components. If changes are confined to a single component, we need to redeploy only that component. Other components that don't depend on the changed component do not need to be revalidated or redeployed. This principle is also related to the Open Closed Principle (4) (classes should be closed for modification but open for extension), so we should always think about how to change the behavior of our software by extending it rather than by changing existing code.

Finally, last but not least among the component cohesion principles is the "Common Reuse Principle (CRP)". It states: "don't force users of your component to depend on things they don't need". Classes and modules that tend to be reused together belong in the same component. Have you ever had to implement a method from an interface only to return null? Or had to create a parameter just to be able to call a method? This type of problem often occurs when you are using a module that implements behaviors you don't need, and the unnecessary dependency can end up causing errors. The Common Reuse Principle (CRP) helps us solve this problem by telling us that we shouldn't force our users to depend on things they won't use. While implementing a component, we need to be aware of the dependencies we create, because we can easily make our component harder to use, causing reusability and maintenance issues. Therefore, we must keep in mind that the elements in a component will be used together. This means that the elements within the component must refer to the same context or functionality. We can achieve this with code organization techniques such as package by feature instead of the traditional package by layer. When the elements refer to the same context, we increase the cohesion of the modules, because the force that binds them together is the context itself.
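At the level of a single interface, the same idea can be sketched like this. The names `AudioControl`, `BatteryStatus`, `TreadmillAudio` and `mute` are hypothetical, and the example only mirrors at class level what the CRP asks for at component level.

```cpp
// One wide interface would force an audio-only client to depend on
// (and stub out) battery operations. Two narrow ones avoid that.
class AudioControl {
public:
    virtual ~AudioControl() = default;
    virtual void setVolume(int percent) = 0;
    virtual int volume() const = 0;
};

class BatteryStatus {
public:
    virtual ~BatteryStatus() = default;
    virtual int chargePercent() const = 0;
};

// A mains-powered device implements only what it actually has:
// no battery method to fill in with a dummy return value.
class TreadmillAudio : public AudioControl {
public:
    void setVolume(int percent) override {
        // Clamp the requested value to the 0..100 range.
        volume_ = percent < 0 ? 0 : (percent > 100 ? 100 : percent);
    }
    int volume() const override { return volume_; }
private:
    int volume_ = 50;
};

// Client code depends on AudioControl alone.
inline void mute(AudioControl& audio) { audio.setVolume(0); }
```

If `AudioControl` and `BatteryStatus` lived in separate components, code like `mute` would never need to be rebuilt or revalidated because of a battery-related change.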

Let's conclude with some final considerations on the component cohesion principles.

As we have just seen they are a guide for arranging classes into components to make them more organized and manageable. But they tend to conflict with each other.

The REP and CCP are inclusive principles: both tend to make components larger. The CRP is an exclusive principle, driving components to be smaller. It is the tension between these principles that good architects seek to resolve. We need to find the right balance.

The tension diagram shown in Figure 5 indicates how the three principles of cohesion interact with each other. The edges of the diagram describe the cost of abandoning the principle on the opposite vertex.

An architect who focuses on just the REP (Reuse/Release Equivalence Principle) and CRP (Common Reuse Principle) will find that too many components are impacted when simple changes are made. In contrast, an architect who focuses too strongly on the CCP (Common Closure Principle) and REP will cause too many unneeded releases to be generated.

Generally, projects tend to start on the right-hand side of the triangle, where reuse is sacrificed for developability. As the project matures, and other projects begin to draw from it, the project will slide over to the left. This means that the component structure of a project can vary with time and maturity.

figure 5 : Tension diagram of the cohesion principles


In the next article we will see principles and rules about coupling between components.

If you haven't already read it, I invite you to read my previous article, in which I introduced software complexity management.

Thanks for reading my article. I hope you have found the topic useful.

Feel free to leave any feedback; it is appreciated.


Stefano


References:

1. Grady Booch, “Object-Oriented Analysis & Design with Applications”, Addison Wesley.

2. Robert C. Martin, “Clean Architecture - a craftsman's guide to software structure and design”, Prentice-Hall (November 2018).

3. Yourdon, Constantine, “Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design”, Prentice Hall.

4. S. Santilli, “Open Closed Principle”, https://www.dhirubhai.net/pulse/open-closed-principle-stefano-santilli/
