Concurrency in Java – Part 4, Design Thread Safe Classes and Components

This post is part 4 of the Concurrency in Java series, which touches the core concepts of concurrency in Java and provides a balanced view from the JVM memory model specification as well as from the programmer's perspective. To see the list of posts in this series, please visit here.

In the previous post, we discussed the two state visibility guarantees that the Java memory model offers to its practitioners. In this episode of the Concurrency in Java series we shall discuss in detail the critical guidelines that we need to follow when designing thread-safe concurrent Java applications.

Design Thread Safe Classes and Components

Designing thread-safe classes is not a herculean task but rather a matter of following a simple 'play by the book' set of rules. This section outlines a set of basic guidelines for designing thread-safe classes. As long as we stick to the basic OOP principles of encapsulation and abstraction, constructing a thread-safe class is not a big deal.

Constructing thread-safe classes and components involves two things to know and care about:

  • Design principles: These principles relate to what you should and should not do while designing, coding, and refactoring thread safe classes or components. At a high level, these principles can be divided into instance confinement strategies and thread confinement design principles.
  • Documentation principles: These principles advise being explicit and forthcoming about the synchronization policies of the class or component. Typically they ask us to answer questions such as: What is the class intended to do? How does it do it? What considerations and traps should its clients or callers be aware of? What synchronization policies does the class employ?

Design Principles – Shared State Ownership

One of the primary rules to ensure thread safety is to outline shared state ownership rules. These rules typically answer the following questions without any reservations:

  • Who owns the shared state?
  • Who can mutate this shared state and how?
  • Who can access this shared state without locking and how?
  • Who should lock and wait while the state updates?

When designing thread-safe classes it is always best not to rely on clients or callers of your classes and objects to provide synchronized access to state mutation operations. Because the clients may not be fully aware of what state mutates and when, they may synchronize their calling code incorrectly, which can impact the thread safety of the overall application. This is a common problem with the synchronized collections, which are created using the Collections.synchronizedXxx API calls (for example Collections.synchronizedMap).

The collections returned by the Collections.synchronizedXxx calls use the decorator pattern to add synchronization on top of existing collection constructs, making the individual collection operations atomic. This means that each write and each read is atomic in itself. However, client programs that use these collections almost always have compound invariants that involve a read, mutate and write sequence on the collection, and that sequence should also be atomic. So the following code may seem harmless in a single-threaded setup but would fail in a multi-threaded model.

The addNewEmployee, updateEmployee, deleteEmployee and getEmployee methods are not synchronized within their own context but rely on the synchronization provided by the Collections.synchronizedMap wrapper.

Assume two threads work on the same employee in parallel, one updating it and the other reading it. While Thread 1 reaches line 30 and receives the employee, Thread 2 is in the process of updating the same entry. This could lead to stale data. Imagine the same scenario, but this time one thread is reading the employee and another thread is trying to delete it. While the reading thread has completed the contains statement at line 29 and found that the employee exists, the deleting thread executes line 21 and deletes it.
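The race described above can be sketched as follows. This is a minimal illustration, not the original listing: the Employee type, its fields and the RacyEmployeeManager name are assumptions, and the line numbers mentioned in the text refer to the original listing rather than to this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical value type used only for this sketch.
class Employee {
    final String id;
    final String name;

    Employee(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Each single map call is atomic, but every method below makes two calls,
// and another thread can run between them.
class RacyEmployeeManager {
    private final Map<String, Employee> employees =
            Collections.synchronizedMap(new HashMap<>());

    void addNewEmployee(Employee e) {
        if (!employees.containsKey(e.id)) {   // check
            employees.put(e.id, e);           // act: may overwrite a concurrent add
        }
    }

    void deleteEmployee(String id) {
        if (employees.containsKey(id)) {
            employees.remove(id);
        }
    }

    Employee getEmployee(String id) {
        if (employees.containsKey(id)) {      // check passes...
            return employees.get(id);         // ...but the entry may already be gone
        }
        throw new IllegalArgumentException("No employee with id " + id);
    }
}
```

In a single thread this behaves as expected; with several threads, getEmployee can return null even though its own contains check succeeded a moment earlier.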

To fix these problems we need to synchronize the EmployeeManager to allow only one operation at a time. How about this?

https://gist.github.com/vireshwali/08f22035909a3e47ff4c73d255a4c8a7#file-snippet_8-java

snippet_8.java

Is this class thread safe now?

The answer is no. Why?

Because this class synchronizes on the this object whereas it should synchronize on employees. This is what the Javadoc for Collections.synchronizedMap states, as shown in the image below.

[Image: Javadoc excerpt for Collections.synchronizedMap]

This means that internally, in employees, the synchronization is based on the lock of the synchronized map object itself. But in our client class EmployeeManager it is based on the instance of EmployeeManager represented by 'this'. Clearly these are two different objects, so a thread holding one of these locks does not block a thread that acquires the other, for example one that reaches the map directly. Also, if employees is accessed from a subclass of EmployeeManager, then synchronizing on 'this' won’t help in blocking that access either. It’s the map operations that we need to make atomic, not our own code. So we must synchronize on the employees variable instead of this.

The following code example shows one of the ways to do it correctly.

https://gist.github.com/vireshwali/7f6e2cc61301271cb70438c13a9091c0#file-snippet_9-java

snippet_9.java
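A minimal sketch of this kind of locking on the map itself (not the linked snippet; the Employee type and the MapLockedEmployeeManager name are assumptions). The map reference is declared final here so that the lock object can never be swapped.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical value type used only for this sketch.
class Employee {
    final String id;
    final String name;

    Employee(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

class MapLockedEmployeeManager {
    private final Map<String, Employee> employees =
            Collections.synchronizedMap(new HashMap<>());

    // Returns true if the employee was added, false if the id already existed.
    boolean addNewEmployee(Employee e) {
        synchronized (employees) {            // same monitor the wrapper uses internally
            if (employees.containsKey(e.id)) {
                return false;
            }
            employees.put(e.id, e);
            return true;
        }
    }

    Employee getEmployee(String id) {
        synchronized (employees) {            // contains + get now form one atomic step
            if (!employees.containsKey(id)) {
                throw new IllegalArgumentException("No employee with id " + id);
            }
            return employees.get(id);
        }
    }

    // Returns true if an entry was actually removed.
    boolean deleteEmployee(String id) {
        synchronized (employees) {
            return employees.remove(id) != null;
        }
    }
}
```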

Design Principles – Instance Confinement

Instance confinement is a technique that uses encapsulation and a couple of design pattern paradigms to restrict access to shared mutable data. This technique instills the discipline of making the encapsulating class or its subclass the owner of the shared mutable data and does not allow direct data mutation from outside. In some cases, a limited mutation may be allowed but is controlled using thread confinement techniques.

EmployeeManager5 is a good example of an instance-confined thread-safe class (though not 100%).

  • All operations on the ‘employees’ are encapsulated inside.
  • EmployeeManager5 is about 95% thread-safe because its methods addNewEmployee, updateEmployee and deleteEmployee are thread-safe.

However,

  • The getEmployee method is not 100% thread-safe as it lets a live Employee object escape from the EmployeeManager5 class.
  • To fix this we should make a copy of the employee object once it is received from the map and then return that copy.
  • EmployeeManager5 does not handle serialization effectively.

Instance confinement is used heavily in the Collections API to make synchronized counterparts of normal collections like HashMap and the List implementations. The API uses the decorator pattern to create a synchronized version of the collection and keeps the synchronized wrapper classes as non-public static nested classes so that these types can’t be used directly from outside. Instance confinement makes it easy to construct thread-safe classes, as the only class to be analyzed for thread safety is the owner class. Without this, analyzing thread safety in a big project can become a nightmare.

Instance confinement is largely done using the design techniques of:

  • Delegation
  • Composition
  • Private Monitor Pattern

Instance Confinement – Delegation

The delegation principle promotes the use of existing thread-safe constructs and encourages the main classes to delegate the thread safety work to existing thread-safe classes instead of building their own. The idea is simply to "use what you can (and should) and create what you can't". Since we use the thread safety guarantees offered by existing classes and constructs, we don’t need locking idioms of our own in our code.

Using this principle our EmployeeManager5 would be better off using a ConcurrentHashMap instead of a synchronizedMap and delegating all its "check if it contains and then mutate" operations to atomic methods such as putIfAbsent. Also, a single 'get' call on ConcurrentHashMap covers the "contains and get" sequence atomically, because it simply returns null when the key is absent. So by using a ConcurrentHashMap we can eliminate the need for the 'contains' check in our EmployeeManager5 code.

Using delegation has another advantage. It eliminates the need for locking in our code. Since in the above example we now rely on ConcurrentHashMap, we can eliminate the synchronized keyword and locking idioms from our code and thus make it lock-free with respect to our instance scope. This also eliminates the issue of threads blocking each other on synchronized block locks.

Delegating thread safety works when we don’t have dependent shared mutable state variables to take care of. E.g. ConnectionPoolWatchDog2 (see the part 2 post, Object behaviour heuristics, in the series for the code example) cannot be made thread safe by delegation alone. Because it involves two shared state variables whose invariants depend on each other, some form of client-side locking is essential, even though the two variable types are themselves thread safe.
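A minimal sketch of such delegation, assuming a hypothetical Employee type and a manager class named DelegatingEmployeeManager. Every compound "check then mutate" step is handed to a single atomic ConcurrentMap method, so the class itself contains no synchronized block.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical value type used only for this sketch.
class Employee {
    final String id;
    final String name;

    Employee(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

class DelegatingEmployeeManager {
    private final ConcurrentMap<String, Employee> employees = new ConcurrentHashMap<>();

    // Atomic "add only if absent"; returns true if the employee was added.
    boolean addNewEmployee(Employee e) {
        return employees.putIfAbsent(e.id, e) == null;
    }

    // Atomic "replace only if present"; returns true if an entry was replaced.
    boolean updateEmployee(Employee e) {
        return employees.replace(e.id, e) != null;
    }

    // Returns true if an entry was actually removed.
    boolean deleteEmployee(String id) {
        return employees.remove(id) != null;
    }

    // A single get covers "contains and get": it returns null when the id is unknown.
    Employee getEmployee(String id) {
        return employees.get(id);
    }
}
```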

Instance Confinement – Composition

The delegation principle owes its roots to another prevalent Gang of Four (GoF) design principle named composition. In simple terms, the GoF principles encourage us to favour composition over inheritance when adding functionality and behaviour.

Let’s presume we want to add a new behaviour getIfPresent to a HashMap. There are two ways to do this.

Approach 1 – Inheritance

  • Create a new interface type CustomMap which extends Map and adds the new behaviour getIfPresent.
  • Extend HashMap to make a CustomHashMap which implements all the same interfaces as HashMap, except that our CustomMap takes the place of the Map interface.
  • Add the method implementation for the new behaviour.

Approach 2 – Composition

  • Create a new interface type CustomMap which extends Map and adds the new behaviour getIfPresent.
  • Create a new class CustomHashMapImpl, and add a private instance variable of type HashMap to it instead of extending HashMap. CustomHashMapImpl also implements our CustomMap interface.
  • We implement the new behaviour getIfPresent in our CustomHashMapImpl class.
  • But for all existing Map methods or actions, we just delegate to the private HashMap instance. No overriding, no extending.

As discussed earlier, synchronized collections in the Collections API make use of Approach 2 to make all existing collection types thread safe. The inheritance approach seems good and viable when you have only a few behaviours to add and the new behaviours are not actually an atomic view of existing compound behaviours. E.g. getIfPresent is actually a compound of 'if contains then get'. When we need to add many such new behaviours, it’s better to use composition instead of inheritance.

Sometimes grouping such object types using inheritance may not go well with OOP principles. E.g. a car may need a behaviour replaceIfCustomerLikes for its seats and upholstery. Adding this behaviour to the Automobile super type (or any super type of our car class) using inheritance may not be a good idea, as such behaviour is not a generic one that an automobile exhibits. It’s an invariant dependent on the owner of a car. It is something that OOP principles call polymorphism.
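Approach 2 can be sketched as below. The return type of getIfPresent is not specified in this post, so Optional is an assumption made for the sketch. The sketch shows composition only and adds no synchronization; a thread-safe variant would additionally guard every delegated call with a single lock.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// New interface type that adds one behaviour on top of Map.
interface CustomMap<K, V> extends Map<K, V> {
    Optional<V> getIfPresent(K key);   // assumed signature
}

// Composition: holds a HashMap instead of extending it.
final class CustomHashMapImpl<K, V> implements CustomMap<K, V> {
    private final Map<K, V> delegate = new HashMap<>();

    // The new compound behaviour: "if contains then get".
    @Override
    public Optional<V> getIfPresent(K key) {
        return delegate.containsKey(key)
                ? Optional.ofNullable(delegate.get(key))
                : Optional.empty();
    }

    // Every existing Map operation simply delegates to the private HashMap.
    @Override public int size() { return delegate.size(); }
    @Override public boolean isEmpty() { return delegate.isEmpty(); }
    @Override public boolean containsKey(Object key) { return delegate.containsKey(key); }
    @Override public boolean containsValue(Object value) { return delegate.containsValue(value); }
    @Override public V get(Object key) { return delegate.get(key); }
    @Override public V put(K key, V value) { return delegate.put(key, value); }
    @Override public V remove(Object key) { return delegate.remove(key); }
    @Override public void putAll(Map<? extends K, ? extends V> m) { delegate.putAll(m); }
    @Override public void clear() { delegate.clear(); }
    @Override public Set<K> keySet() { return delegate.keySet(); }
    @Override public Collection<V> values() { return delegate.values(); }
    @Override public Set<Map.Entry<K, V>> entrySet() { return delegate.entrySet(); }
    @Override public boolean equals(Object o) { return delegate.equals(o); }
    @Override public int hashCode() { return delegate.hashCode(); }
}
```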

Thread safety also works on the same lines.

Instance Confinement – Private Monitor Pattern

The private monitor pattern, also called the Java monitor pattern, is a direct derivative of the encapsulation principle. Normally the strict encapsulation principles, when applied to data variables, advise us to hold the state in private instance variables of the class instead of making them public. This allows the class instance to fully encapsulate the data it owns and provide preferred behaviours to mutate it. To allow controlled mutation of data variables and free viewing of their current state, the class can expose public accessor and mutator methods. The mutator methods can implement conditional logic to enforce the desired behaviour during state changes.

Now extend this principle to a java.lang.Object instance stored in the class as an instance variable, which is intended to be used as the exclusive lock provider for all operations on the instance. If we also don’t allow a reference to 'this' to escape during instance construction, we can achieve a perfectly thread-safe class. This is shown in the following code example.

https://gist.github.com/vireshwali/db03cd828ad2e672914d238632afd53f#file-snippet_10-java

snippet_10.java

Since all the method bodies that need exclusive thread access use only the lockMonitor object, no two synchronized blocks can ever race against each other. Also, the monitor is declared final so that the reference lockMonitor cannot accidentally change its target object. As we shall see in the section about the final modifier, such a change can badly break an otherwise perfectly thread-safe class. Furthermore, since the lockMonitor object is not exposed to the outside world, no one can accidentally lock on it or alter it. Clients of EmployeeManager6 don’t need to take any additional synchronization steps as long as they don’t try to derive new operations by combining existing operations of EmployeeManager6. Responsibility for such new operations should be shouldered by the EmployeeManager6 class.

The following code block shows an attempt by a client to create a compound operation out of methods exposed by EmployeeManager6.

https://gist.github.com/vireshwali/8e800f90f055ef372dcc91b90f82e88b#file-snippet_11-java

snippet_11.java

The operation deleteEmployeeRecord is not thread safe, though its two constituent operations themselves are. This is because there is a window for a race when the code flow makes the transition between the calls to getEmployee and deleteEmployee on the EmployeeManager6 instance, as shown by the disassembled code below.

[Image: disassembled code for deleteEmployeeRecord]

But the worst part is that EmployeeManager6Client can’t do anything about it. If it attempted to synchronize the deleteEmployeeRecord method on its own, it would fail equally miserably.

Why?

This is because the constituent operations getEmployee and deleteEmployee are synchronized on the lock provided by the lockMonitor object, and this lockMonitor object is fully encapsulated inside the EmployeeManager6 class instance. Since adding a getter for lockMonitor to the EmployeeManager6 class would be bad practice, the only option left is to let EmployeeManager6 shoulder the responsibility of implementing this operation itself and expose the implementation on its public contract.
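A minimal sketch of the private monitor pattern with the compound operation moved into the owning class. The class name MonitorEmployeeManager and the Employee type are assumptions; this is not the code of EmployeeManager6.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value type used only for this sketch.
class Employee {
    final String id;
    final String name;

    Employee(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

class MonitorEmployeeManager {
    // Private and final: the lock is never exposed and never re-pointed.
    private final Object lockMonitor = new Object();
    private final Map<String, Employee> employees = new HashMap<>();

    void addNewEmployee(Employee e) {
        synchronized (lockMonitor) {
            employees.put(e.id, e);
        }
    }

    Employee getEmployee(String id) {
        synchronized (lockMonitor) {
            return employees.get(id);
        }
    }

    void deleteEmployee(String id) {
        synchronized (lockMonitor) {
            employees.remove(id);
        }
    }

    // The compound "get then delete" runs under one lock acquisition, so no
    // other thread can slip in between the two steps.
    // Returns the removed employee, or null if there was none.
    Employee deleteEmployeeRecord(String id) {
        synchronized (lockMonitor) {
            Employee existing = employees.get(id);
            if (existing != null) {
                employees.remove(id);
            }
            return existing;
        }
    }
}
```

Clients call deleteEmployeeRecord directly instead of combining getEmployee and deleteEmployee themselves.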

Design Principles – Thread Confinement

Sharing mutable data requires correct synchronization mechanisms, failing which the results can be dangerous. One way to avoid this is to not share the mutable data directly. If we can ensure that only one thread ever accesses the mutable state, then we don’t need synchronization at all to manage the data mutation problems. This philosophy is termed thread confinement. There are a couple of ways to do this, as the subsequent sections explain.

Thread Confinement – Implementation Owned

As discussed earlier, proper encapsulation and abstraction ensure that mutable data never escapes from the implementation classes or components. When the implementations of frameworks and APIs fully own the responsibility of keeping data thread safe under all circumstances, it becomes much easier for their clients to work with the code. The implementations should also ensure that serialization does not compromise data integrity. Serialization mechanisms, if used wrongly, can create thread safety hazards in the resulting objects. Since deserialization creates a new copy of our original object, we need to make sure that the new copy is also thread-safe. Inheritance is another scenario that implementations need to take care of. Subclasses can easily override the superclass methods and play havoc with the state data if the state is not correctly encapsulated and hidden. The implementations should ensure that subclasses don’t get direct control to mutate the shared state.

Thread Confinement – Thread Stack Confined

This is a special type of thread confinement where the state can only be reached through the local variables since the local variables that the thread creates are incidentally confined to only that thread’s execution stack. One example of this is method parameters and their copies within a method. Another example is the variables that we define inside blocks of code.

Consider the following code example screenshot.

[Image: screenshot of the updateEmployee code example]

In the above example the local variable tempEmployee is thread safe by design. Since every thread entering the method updateEmployee owns its own copy of tempEmployee, two threads would never cross each other on it. We cannot violate its thread safety even if we want to.

Another thing to note here is that the caller of this method passes the employee parameter by reference, so it still has a mutating handle on the 'employee'. If we don’t create a local variable (a copy of it) out of this employee parameter, there can be a situation where the check on containsKey passes, but at that very moment the caller of this method changes the employeeId on the associated employee object. It's always a good practice to leave nothing to chance when it comes to thread-safe coding. Make the method parameter references final and make a copy of mutable parameters before proceeding to work with them.
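The copy-on-entry idea can be sketched as follows. The class and method shapes are assumptions, and a copy constructor stands in for the clone call mentioned in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable employee with a copy constructor.
class Employee {
    private String employeeId;
    private String name;

    Employee(String employeeId, String name) {
        this.employeeId = employeeId;
        this.name = name;
    }

    Employee(Employee other) {               // copy constructor
        this(other.employeeId, other.name);
    }

    String getEmployeeId() { return employeeId; }
    void setEmployeeId(String employeeId) { this.employeeId = employeeId; }
    String getName() { return name; }
}

class CopyingEmployeeManager {
    private final Map<String, Employee> employees = new HashMap<>();

    void addNewEmployee(final Employee employee) {
        final Employee tempEmployee = new Employee(employee);
        synchronized (employees) {
            employees.put(tempEmployee.getEmployeeId(), tempEmployee);
        }
    }

    boolean updateEmployee(final Employee employee) {
        // tempEmployee is reachable only from this thread's stack, so the
        // caller can no longer change the id between the check and the put.
        final Employee tempEmployee = new Employee(employee);
        synchronized (employees) {
            if (!employees.containsKey(tempEmployee.getEmployeeId())) {
                return false;
            }
            employees.put(tempEmployee.getEmployeeId(), tempEmployee);
            return true;
        }
    }

    // Returns the stored name for the id, or null if the id is unknown.
    String nameOf(String employeeId) {
        synchronized (employees) {
            Employee stored = employees.get(employeeId);
            return stored == null ? null : stored.getName();
        }
    }
}
```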

Thread Confinement – Working with safe copies

Safe copies, also called shared thread-safe copies, extend the thread stack confinement idea to a slightly bigger level. The same principles apply, but at the inter-class or inter-component data sharing level. The previous code example made use of the clone method to create a copy of the employee parameter before looking it up in the employees list. This is essentially the use of the prototype design pattern. Another beautiful use of the same idea is shown in the code example below, where we read the employee from the list and return it to the caller.

https://gist.github.com/vireshwali/ba78616a07dfd823dabb75b357c8859e#file-snippet_12-java

snippet_12.java

The example code here gives an effective example of creating a safe working copy. The method returns a working copy of the employee which it fetches from an internal list. The clients of this method use the working copy of this employee object instead of a direct reference to the employee which is in the list. Even if the clients change the returned employee object, the original employee in the list is not modified and thus the invariant state is not altered. It is important to note that the Employee object returned by the method is not immutable in nature. Clients can still make changes to the returned object. It's just that those changes don’t find their way back into the list unless someone updates the employee again.
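A minimal sketch of returning a safe working copy (not the linked snippet; the SafeCopyEmployeeManager name, the Employee shape and the use of a map keyed by id are assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable employee with a copy constructor.
class Employee {
    private final String id;
    private String name;

    Employee(String id, String name) {
        this.id = id;
        this.name = name;
    }

    Employee(Employee other) {
        this(other.id, other.name);
    }

    String getId() { return id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

class SafeCopyEmployeeManager {
    private final Map<String, Employee> employees = new HashMap<>();

    void addNewEmployee(Employee employee) {
        synchronized (employees) {
            employees.put(employee.getId(), new Employee(employee));
        }
    }

    // Hands out a copy, so the live object never escapes this class.
    // Returns null when the id is unknown.
    Employee getEmployee(String id) {
        synchronized (employees) {
            Employee stored = employees.get(id);
            return stored == null ? null : new Employee(stored);
        }
    }
}
```

Changes that a caller makes to the returned object stay in the caller's copy; the stored employee only changes through the manager's own methods.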

Thread Confinement – Thread Local copies

ThreadLocal is a more Java SDK-oriented way of dealing with thread confinement. The Java SDK offers excellent API semantics to confine variables and state to a thread using the ThreadLocal class. Variables that are declared as ThreadLocal are essentially local to each thread that accesses them. This means that each thread which accesses them has its own copy of the variable to work on, and this copy is private to the thread’s context. So multiple threads don’t share the same ThreadLocal value; each gets its own independently initialized copy. Consequently, threads cannot influence or mutate the ThreadLocal values held by other threads. This makes it a wonderful thread confinement mechanism.

It is important to note that ThreadLocal instances are typically private static fields in classes that wish to associate state with a thread; there is rarely a reason to declare them as instance variables.

ThreadLocal is used extensively in common middleware technologies like JMS, stateful connection pools, message dispatchers, ORM libraries etc. One of the most important and frequent uses of ThreadLocal is random number generation for use in security round robins and database primary key generation. The following code example, inspired by the ThreadLocal JDK API Javadoc, shows a prominent use of ThreadLocal.

https://gist.github.com/vireshwali/110df7aa2f66e37a236fe47299d44f97#file-snippet_13-java

snippet_13.java
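A minimal sketch modelled on the ThreadId example in the ThreadLocal Javadoc (not the linked snippet). Each thread receives its own identifier on first access and keeps seeing the same value afterwards.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadId {
    // Shared counter from which the next unused id is handed out.
    private static final AtomicInteger nextId = new AtomicInteger(0);

    // Per-thread slot; the supplier runs once per thread, on first access.
    private static final ThreadLocal<Integer> threadId =
            ThreadLocal.withInitial(nextId::getAndIncrement);

    // Returns the calling thread's id, assigning one if necessary.
    static int get() {
        return threadId.get();
    }
}
```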

Another common use of ThreadLocal was in HibernateUtils in the days of plain Hibernate. To use Hibernate back then, you were required to create a utility class that would hold the Hibernate session factory and return newly created session objects to its callers.

The use of ThreadLocal has two primary considerations/limitations, though, that Java developers need to be aware of.

State Sharing Only

One common pattern that we can see in classes that make effective use of ThreadLocal is that the classes themselves are not interested in the ThreadLocal data. They are actually interested in sharing that data. E.g. a RandomNumber generator doesn’t consume the number itself. It gives those numbers to its callers to use. The same holds for HibernateUtils. It just creates the session and gives it to its callers. A similar structure is also prevalent in connection pool designs.

Managed Thread Pools

Another issue with ThreadLocal relates to applications relying on heavy thread pooling. ThreadLocal can be very dangerous when it comes to long-running applications and garbage collection, especially in managed server applications.

Reading the ThreadLocal Javadoc, we can conclude that the objects put in a ThreadLocal are associated with a particular thread and become eligible for garbage collection once the owner thread has gone away (unless other references to them exist).

So there is a high risk that, in a wrong design, the object stored in the thread local may never be garbage collected when your app runs on a server like Tomcat or WebSphere with its own pool of long-lived worker threads, even when the class that created this ThreadLocal instance is garbage collected. This can lead to hard-to-notice memory and connection leaks.

This is the exact problem that happens in Tomcat when we use custom connection pools which create their own thread locals but rely on Tomcat for their thread pool management. Tomcat displays the following message (or a similar one) in its logs when such a webapp is undeployed, redeployed or shut down. This particular message is the one reported for the MySQL JDBC driver.
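One common way to reduce this risk is to clear the thread local in a finally block before the pooled thread goes back to its pool. A minimal sketch, assuming a hypothetical RequestContext class:

```java
class RequestContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Returns the user bound to the calling thread, or null if none is bound.
    static String currentUser() {
        return CURRENT_USER.get();
    }

    // Binds the user to the calling thread for the duration of the work and
    // always unbinds it afterwards, even if the work throws.
    static void runAs(String user, Runnable work) {
        CURRENT_USER.set(user);
        try {
            work.run();
        } finally {
            CURRENT_USER.remove();   // nothing is left behind on a pooled thread
        }
    }
}
```

With this shape, the next task that happens to run on the same pooled thread starts with an empty slot instead of inheriting the previous task's value.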

SEVERE: A web application registered the JBDC driver [com.jdbc.mysql.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.        

Design Principles – Immutability and Final

Immutable objects and final references are two of the most debated and disputed aspects of the Java programming model. Immutability essentially means that once an object is created and initialized, its internal state cannot be changed again. Because its internal state cannot change, the object is automatically thread safe once it is created. Such an immutable object can be passed safely to untrusted client code that may try to mutate it, because any mutating operation on an immutable object creates a new object of the same type with the new target state instead of altering the existing one. Thus immutable objects are always thread-safe.

This post doesn’t go into details about creating an immutable object as this is a broad topic in itself. However, a brief guideline is provided about the behaviour of immutable objects. An object is deemed immutable if and only if:

  • It does not allow any public access to mutate its internal state.
  • It does not allow state mutation using cloning and reverse cloning.
  • It does not allow state mutation using the reflection API.
  • All its state data is preferably tagged with the final modifier.
  • It does not allow a reference to ‘this’ to escape during construction

The final modifier on variables plays a fundamental role in ensuring immutable state transition and also in JVM runtime performance optimizations. Please note that this is not the same as the final modifier on classes and method signatures. While objects can be made immutable as outlined earlier, the references to them are not immutable by nature.

Consider the following code example.

https://gist.github.com/vireshwali/7bd57d0ca314cf3b070022fab7944bb8#file-snippet_14-java

snippet_14.java

In the above code, name is a String and, let’s say, constitutes the shared data, so we synchronize the getter and setter on it. The String object itself is immutable, this we know; however, the reference name is not. This means we can construct another String object somewhere in our program and point the reference name to it, and nothing would go wrong, at least programmatically. This is essentially what the setter does here. But once the reference name points to a different object, our synchronized locks fail.

Why?

Because the synchronized(name) code obtains the lock on the String object to which the reference name points, and not a lock on the reference itself. When a thread T1 calls synchronized(name) at line 11, it obtains a lock on the String object to which name currently points. But then T1 changes the object pointed to by name on line 12. From that point on, any other thread T2 or T3 can work through both methods (get and set), since the object on which they would obtain a lock would most probably not be the same as the one T1 locked.

To make Temp6 thread safe we need to declare name as final and provide methods or a factory that create new Temp6 objects whenever the value of name needs to change. That way the reference cannot change again. In essence, we need to make Temp6 immutable and declare the reference name as final.

The EmployeeManager5 class which we discussed earlier also suffers from this problem. Since the synchronized block in the class synchronizes on the reference employees, which is not final, changing the object to which employees points can break the thread safety of our class.

Because final variables cannot be re-pointed to another object, we get what is called initialization safety for such variables and their use across the code. This also allows JVM implementations to apply some performance optimizations of their own. Because of the initialization guarantees for these variables, JVM runtimes are free to cache final variables in the processor-local L1, L2 and L3 caches and defer direct memory access as needed. Since these references cannot be shifted from one object to another, such objects can be safely cached by the memory model implementations for quick access with minimal loss of durability and consistency.
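A minimal sketch of this idea, using a hypothetical class named ImmutableName rather than the Temp6 code itself. The setter is replaced by a method that returns a new instance, so no lock is needed at all.

```java
import java.util.Objects;

final class ImmutableName {
    // final: once constructed, this reference can never point elsewhere.
    private final String name;

    ImmutableName(String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    String getName() {
        return name;
    }

    // Replacement for a setter: returns a new object and leaves this one untouched.
    ImmutableName withName(String newName) {
        return new ImmutableName(newName);
    }
}
```

A holder that must publish the latest ImmutableName to other threads still needs a safe publication mechanism of its own, for example a volatile field or an AtomicReference.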

Design Principles – Guarded By Lock

Most of the design principles outlined so far, work well enough to ensure thread safety of classes and components. However in some situations these alone may not be enough. In such situations it becomes a necessity to include some locking mechanism, either intrinsic or extrinsic in addition to the above techniques.

Though locking provides a foolproof mechanism to ensure thread safety, locking idioms are essentially blocking in nature. By using locking idioms we intentionally enforce sequential behaviour over the code block that the lock guards, even if there are multiple threads waiting for access. This is somewhat analogous to what pipes do in operating system designs.

An immediate impact of this is on the performance of the overall program. By keeping lock scopes as small as possible, performance bottlenecks can be reduced, though never totally eliminated. Most of the other techniques outlined earlier avoid blocking and thus provide much better performance than intentional locking.
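Keeping the lock scope small can be sketched with an explicit lock as below. AuditLog is a hypothetical class; only the statements that touch shared state run while the lock is held.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class AuditLog {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> entries = new ArrayList<>();   // guarded by lock

    void record(String user, String action) {
        // Building the line touches only local state, so it stays outside the lock.
        String line = user + ": " + action;
        lock.lock();
        try {
            entries.add(line);               // the only guarded statement
        } finally {
            lock.unlock();                   // always release, even on failure
        }
    }

    // Returns a copy so that callers never iterate the guarded list directly.
    List<String> snapshot() {
        lock.lock();
        try {
            return new ArrayList<>(entries);
        } finally {
            lock.unlock();
        }
    }
}
```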

Documentation Principles

As much as it is difficult and important to write effective concurrent code, it is equally difficult and far more important to document the behaviour of your concurrent code correctly. Most of the time incorrect multi-threaded code results from the incorrect use of library APIs and frameworks.

How many of us know, for example, whether:

  1. The connection pools that we use in our projects provide a thread-safe connection object?
  2. Our own connection factory for pooling JCR session objects provides thread-safe access to those sessions?
  3. Hibernate provides thread-safe access to the session object?
  4. The reactive libraries that we use in our Java app access and pass state in a thread-safe way?

For instance, I know the answer to point 3, but I am not so sure about points 1 and 2 (depending on the library/framework you pick). That is because the framework owners or creators may not have documented this aspect as effectively and elaborately as the Hibernate creators did. The Hibernate Javadoc clearly spells this out, and any deviation naturally means a bug in the Hibernate code. So what about points 1, 2 and 4? E.g. if the connection pool in my app does not behave as expected, is that a bug in my code or in the framework? Honestly, there is no end to this argument once it starts. The point to understand here is that concurrency policies and behaviours for classes should be documented explicitly and clearly. We need to call out whether the clients of our class need to do things in a specific way and employ synchronization to achieve thread safety, or whether we handle it in our framework code. In general, there is no substitute for good clear documentation.

Conclusion

This story covered important aspects and considerations for designing and writing thread-safe Java applications. The post covered all the main nuts and bolts offered by the JVM memory model for state safety in multi-threaded execution and also highlighted the importance of having clear and precise documentation accompanying it.

Once you understand the low-level vagaries of the JVM memory model and the guarantees that it offers to you as an architect and developer, writing highly concurrent applications in Java that leverage the full power of modern multi-processor machines is so much fun.

To check the previous post in the series please click here. To see the list of posts in this series, please visit here.

Happy Coding!!
