Top 5 Java Performance Problems
Java is a popular programming language that powers many mission-critical applications all over the world. In this post, let’s discuss some of the performance problems Java applications commonly face and potential solutions to them.
1. External Systems Slowdown
This is probably the most common Java performance problem. Modern Java applications talk to various systems of record (SORs), payment gateways, AI platforms… When these external systems slow down or have intermittent hiccups, those problems cascade into our Java application and cause performance bottlenecks in it. This type of problem primarily happens due to the following reasons:
a. Lack of defensive coding
b. Extended (or no) timeout settings in the connections
c. Under-allocation of threads
Here is a case study of a large financial institution’s middleware, which suffered an outage due to a slowdown in one of its primary SORs, an Oracle RAC cluster. In this case study you can learn how the institution went about diagnosing the problem and fixing it.
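The timeout and thread allocation causes listed above can be illustrated with a small wrapper around an external call. This is only a minimal sketch: the class name ‘ExternalCallGuard’, the fallback value and the timeout are hypothetical, and it assumes each external system gets its own bounded thread pool so that one slow backend cannot exhaust threads needed elsewhere.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a call to an external system with a hard time limit.
final class ExternalCallGuard {
    private final ExecutorService pool;   // bounded pool dedicated to one external system
    private final long timeoutMillis;     // maximum time a caller is willing to wait

    ExternalCallGuard(ExecutorService pool, long timeoutMillis) {
        this.pool = pool;
        this.timeoutMillis = timeoutMillis;
    }

    // Returns the remote result, or the fallback if the call is slow or fails.
    <T> T call(Callable<T> remoteCall, T fallback) {
        Future<T> future = pool.submit(remoteCall);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);          // interrupt the worker so it is freed up
            return fallback;
        } catch (ExecutionException e) {
            return fallback;              // the external call itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }
}
```

A caller might create the guard with ‘Executors.newFixedThreadPool(10)’ and a timeout of a few hundred milliseconds, then wrap each request to the external system in ‘call(...)’. Client libraries usually offer their own connect and read timeout settings as well, and those should still be configured.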
2. CPU Spike
Another commonly confronted performance problem is a CPU spike. The most common reason for the CPU to spike in a Java application is a thread that goes into an infinite loop or keeps executing the same lines of code continuously. Suppose a developer writes code like this:
while (condition-check) {
    // do something
}
What will happen if ‘condition-check’ starts to always return ‘true’? The thread executing this code will loop on this ‘while’ clause infinitely, and when a thread loops infinitely, CPU consumption starts to spike.
Since CPU spikes are caused by threads, you need to do thread dump analysis. Here is a tutorial that shows the most lightweight and non-intrusive approach to troubleshooting CPU spikes in Java applications.
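As a complement to thread dump analysis, the JVM can report per-thread CPU time through the standard ‘ThreadMXBean’. The sketch below is not the approach from the tutorial; it is an illustrative, in-process way to list the threads that have consumed the most CPU so far. The class names ‘CpuHogFinder’ and ‘Usage’ are hypothetical, and the code assumes the JVM supports thread CPU time measurement (it returns an empty list otherwise).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: lists the threads that have consumed the most CPU time.
final class CpuHogFinder {
    // Thread name paired with the CPU time it has consumed, in milliseconds.
    static final class Usage {
        final String threadName;
        final long cpuMillis;

        Usage(String threadName, long cpuMillis) {
            this.threadName = threadName;
            this.cpuMillis = cpuMillis;
        }
    }

    static List<Usage> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Usage> usages = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return usages;                             // measurement not available on this JVM
        }
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id);   // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id);    // null if the thread is no longer alive
            if (cpuNanos < 0 || info == null) {
                continue;
            }
            usages.add(new Usage(info.getThreadName(), cpuNanos / 1_000_000));
        }
        // Highest CPU consumers first
        usages.sort(Comparator.comparingLong((Usage u) -> u.cpuMillis).reversed());
        return new ArrayList<>(usages.subList(0, Math.min(limit, usages.size())));
    }
}
```

Calling ‘CpuHogFinder.topThreads(5)’ twice, a few seconds apart, and comparing the numbers shows which threads are burning CPU right now rather than over the whole lifetime of the JVM.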
Here is a case study of a major trading application that was suffering from severe CPU spikes, significantly affecting its performance during critical trading hours. Learn the approach they pursued to troubleshoot this CPU spike problem.
3. Memory Leak
Memory leaks are a common Java performance problem. Due to programming errors, certain objects can keep growing, causing java.lang.OutOfMemoryError. Consider this sample code:
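import java.util.HashMap;

public class MapManager {
    // Static map: it stays reachable, so its entries are never eligible for garbage collection
    private static HashMap<Object, Object> myMap = new HashMap<>();

    public void grow() {
        long counter = 0;
        // Infinite loop: every iteration adds a new entry that is never removed
        while (true) {
            myMap.put("key" + counter, "Large stringgggggggggggggggggggggggg" + counter);
            ++counter;
        }
    }
}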
The above program has a ‘MapManager’ class that internally contains a ‘HashMap’ object assigned to the ‘myMap’ variable. Within the ‘grow()’ method, there is an infinite ‘while (true)’ loop that keeps populating the ‘HashMap’ object. On every iteration, a new key and value (e.g., ‘key0’ and ‘Large stringgggggggggggggggggggggggg0’) are added to the ‘HashMap’. Since it’s an infinite loop, the ‘myMap’ object will be continuously populated until the heap is saturated. Once the heap capacity limit is exceeded, the application will fail with ‘java.lang.OutOfMemoryError: Java heap space’.
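One common way to keep a map like ‘myMap’ from growing without limit is to cap its size. The sketch below is not from the original example; it assumes that evicting the least recently used entry is acceptable for the application, and the class name ‘BoundedCache’ and the size limit are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-capped map: evicts the least recently used entry once the limit is hit.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true);   // 'true' orders entries by access, which gives LRU behaviour
        this.maxEntries = maxEntries;
    }

    // LinkedHashMap calls this after every insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With ‘new BoundedCache<String, String>(100)’ in place of the plain ‘HashMap’, the same insertion loop would settle at 100 entries instead of filling the heap. Note that ‘LinkedHashMap’ is not thread-safe, so shared use would need external synchronization.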
When this kind of memory leak happens, you need to capture a heap dump from the application and analyze it. A heap dump is a snapshot of your application’s memory at a point in time. It contains information such as what objects are in memory, what values they carry, what their size is, and what other objects they reference. Here is a blog post that provides a step-by-step guide to troubleshooting OutOfMemoryError.
4. Garbage Collection Overhead
In earlier days, especially in languages like C and C++, developers had to manually allocate and deallocate memory using the ‘malloc()’ and ‘free()’ APIs to service incoming requests. Business applications are complex; they tend to contain multiple workflows and use cases. If a developer forgets to deallocate memory in even one workflow, that memory starts to build up and eventually exhausts what is available. Thus, out-of-memory failures were quite pervasive in those earlier languages.
When Java was introduced in 1995, it promised automatic garbage collection. It told Java developers that they only need to allocate objects; deallocation is handled by the Java runtime environment. Developers loved this feature because it reduces memory leaks and allows them to focus on business logic. However, automatic garbage collection comes with two primary side effects:
a. Response time degradation due to GC pauses
b. High CPU consumption
Learn more about automatic garbage collection overhead. One has to do proper GC tuning to address these two side effects. Here are 9 tips to reduce long GC pauses in your application.
Here is a case study of a critical service at one of the largest cloud service providers, which was suffering from poor response times due to poor garbage collection performance. You can learn how this cloud provider changed their memory settings to improve their garbage collection performance.
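Before tuning, it helps to know roughly how much time the JVM spends collecting garbage. The sketch below reads the standard ‘GarbageCollectorMXBean’ counters and expresses the accumulated collection time as a percentage of JVM uptime. The class name ‘GcOverhead’ is hypothetical, and the figure is only approximate: for concurrent collectors the reported time is not the same as pause time, so GC logs remain the more precise source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: rough estimate of how much time has gone into garbage collection.
final class GcOverhead {
    // Accumulated collection time across all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();   // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Collection time as a percentage of the time the JVM has been running.
    static double percentOfUptime() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }
}
```

Logging ‘GcOverhead.percentOfUptime()’ periodically gives a simple trend line; a steadily rising value suggests that the collector is taking up a growing share of the application’s time.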
5. Concurrency Issues
Java is a multi-threaded programming language, which means multiple threads can run concurrently. While multi-threaded programming can result in faster performance, it can also result in concurrency problems such as race conditions, threads getting stuck in a prolonged BLOCKED state, deadlocks, circular deadlocks… These problems primarily stem from a lack of thread-safe programming practices in the organization. Consider the below example code:
public synchronized void getData() {
    // Some long-running code...
}
In this code, the getData() method is marked as synchronized, meaning only one thread can enter the method at any given time for that object. Let’s say Thread-1 enters the getData() method first and starts executing. While Thread-1 is still inside the method, let’s say Thread-2 attempts to enter. Since Thread-1 has not yet completed execution, Thread-2 is forced into a BLOCKED state.
If the getData() method takes an unusually long time due to inefficient or buggy code, Thread-2 and all other threads attempting to invoke getData() on that object will stay BLOCKED until it completes, and indefinitely if it never does. This can lead to an application-wide slowdown, significantly degrading performance.
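One way to limit how long callers can be held up is to replace ‘synchronized’ with an explicit lock that is acquired with a timeout. This is a minimal sketch rather than a drop-in fix: the class name ‘DataService’, the parameters of ‘getData’ and the fallback value are all hypothetical, and it assumes callers can tolerate a fallback result when the lock is busy.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical variant of getData(): waits a bounded time for the lock instead of blocking.
final class DataService {
    private final ReentrantLock lock = new ReentrantLock();

    <T> T getData(Supplier<T> longRunningWork, long maxWaitMillis, T fallback) {
        boolean acquired = false;
        try {
            // Give up after maxWaitMillis instead of waiting for as long as the holder takes
            acquired = lock.tryLock(maxWaitMillis, TimeUnit.MILLISECONDS);
            if (!acquired) {
                return fallback;
            }
            return longRunningWork.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            if (acquired) {
                lock.unlock();   // release only if this thread actually holds the lock
            }
        }
    }
}
```

With this shape, a slow holder delays each waiting thread by at most ‘maxWaitMillis’. Shrinking the work done while the lock is held is still the more fundamental fix.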
Below are a couple of case studies of concurrency problems in two major applications:
a. How Random Number Generation derailed application’s availability?
b. Deadlock in Apache Open Source Library
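Deadlocks in particular can be detected from inside a running JVM with the standard ‘ThreadMXBean’. The sketch below is illustrative only and is not taken from the case studies; the class name ‘DeadlockCheck’ is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reports threads that are currently deadlocked, if any.
final class DeadlockCheck {
    static List<String> deadlockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks where supported;
        // both methods return null when no deadlock exists.
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        List<String> report = new ArrayList<>();
        if (ids == null) {
            return report;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {   // null if the thread ended in the meantime
                report.add(info.getThreadName() + " is waiting on " + info.getLockName());
            }
        }
        return report;
    }
}
```

Running this check on a schedule and alerting when the list is non-empty is one option; a thread dump taken at that moment shows the full stack traces of the threads involved.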
How to Troubleshoot these Performance Problems?
Fig: 360-degree data
The best way to troubleshoot this sort of tricky performance problem is to equip yourself with 360-degree troubleshooting artifacts. You can use the open source yCrash data script to capture 360° data from your application stack. This script captures 16 different artifacts from your application stack (GC log, thread dump, heap substitute, netstat, iostat, …) and runs in less than 30 seconds, so it doesn’t add any measurable overhead to your application. You can trigger this script on any platform (all Linux flavors, Windows, …) and in any environment (bare metal, cloud, containers, k8s, …).
Once you have captured the data, you can upload the captured zip file to the yCrash server for analysis. The yCrash server analyzes all the captured data and instantly generates one unified root cause analysis report.
Fig: CPU consumption by threads reported by yCrash
The above excerpt from the yCrash report shows all the CPU-consuming threads, the amount of CPU each thread consumes, and the exact lines of code they are executing. Equipped with this information, you can spot the ‘black sheep’ lines of code that are causing the CPU to spike. Similarly, you can solve most performance problems using the yCrash tool.
Conclusion
We hope this post highlighted common Java performance issues, their root causes and effective solutions to troubleshoot them. Happy Hacking!!