Garbage Collection
The garbage collector is an automatic mechanism that detects dead objects, reclaims their memory, and makes that memory available for later allocations. It is tightly coupled to the physical and virtual memory allocation mechanisms.
Each process can run several threads. When a process is initialized, the CLR reserves a contiguous region of address space for it, called the managed heap, and all threads in the process allocate their objects on that heap. The heap also maintains a pointer to the address where the next object will be allocated.
There are two main reasons why allocating memory from the managed heap is faster than unmanaged memory allocation:
- Allocating a new object only requires adding a value to the pointer, which is almost as fast as allocating memory from the stack.
- Objects that are allocated consecutively are stored contiguously in the managed heap, so the application can access them quickly.
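The pointer-based allocation described above can be pictured with a small toy model. This is only a sketch of the idea, not the CLR's real allocator: the `BumpAllocator` class, its byte-array "region", and the `-1` return value are all assumptions made for illustration.

```csharp
using System;

// Toy model of pointer-bump allocation; NOT the CLR's actual allocator.
public sealed class BumpAllocator
{
    private readonly byte[] _region;   // stands in for the reserved address space
    private int _next;                 // offset where the next object will be placed

    public BumpAllocator(int capacity) => _region = new byte[capacity];

    public int Used => _next;

    // Returns the offset of the new block, or -1 when the region is full
    // (roughly the point at which a real runtime would start a collection).
    public int Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (size > _region.Length - _next) return -1;
        int address = _next;
        _next += size;                 // the whole allocation is one pointer increment
        return address;
    }
}

public static class BumpAllocatorDemo
{
    public static void Main()
    {
        var heap = new BumpAllocator(64);
        Console.WriteLine(heap.Allocate(16)); // 0
        Console.WriteLine(heap.Allocate(24)); // 16
        Console.WriteLine(heap.Allocate(32)); // -1, only 24 bytes remain
    }
}
```

Because consecutive allocations receive consecutive offsets, objects created together also end up next to each other in memory.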
The garbage collector (GC) determines which objects are no longer in use by examining the application's roots: static fields, local variables on a thread's stack, CPU registers, GC handles, and the finalize queue. Starting from these roots, the GC can reach every object that is still in use.
Objects that cannot be reached from the roots are unreachable. The GC reclaims their memory and compacts the managed heap, but only when a collection finds a significant number of unreachable objects.
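To make the idea of reachability concrete, here is a minimal sketch that walks an object graph from a set of roots. It is a simplified model, not the CLR's marking code: objects are represented by hypothetical string names and references by a dictionary.

```csharp
using System;
using System.Collections.Generic;

public static class ReachabilitySketch
{
    // Returns every object that can be reached from the roots by following references.
    public static HashSet<string> Mark(
        IEnumerable<string> roots,
        IReadOnlyDictionary<string, string[]> references)
    {
        var live = new HashSet<string>();
        var pending = new Stack<string>(roots);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!live.Add(current)) continue;   // already visited
            if (references.TryGetValue(current, out var targets))
                foreach (string target in targets) pending.Push(target);
        }
        return live;
    }

    public static void Main()
    {
        var references = new Dictionary<string, string[]>
        {
            ["staticCache"] = new[] { "A" },
            ["A"] = new[] { "B" },
            ["C"] = new[] { "D" },   // C and D reference each other,
            ["D"] = new[] { "C" },   // but no root leads to them
        };
        HashSet<string> live = Mark(new[] { "staticCache", "localVar" }, references);
        foreach (string name in new[] { "A", "B", "C", "D" })
            Console.WriteLine($"{name}: {(live.Contains(name) ? "reachable" : "unreachable")}");
    }
}
```

In this model A and B are reachable through the `staticCache` root, while C and D are unreachable even though they reference each other.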
When will the Garbage Collector be triggered?
Garbage collection occurs when one of the following conditions is true:
- The system is low on physical memory and the OS sends a low-memory notification.
- The memory used by objects on the managed heap exceeds an acceptable threshold.
- The GC.Collect method is called.
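The third trigger is easy to observe from code. The sketch below forces a collection and reads the collection counter before and after; the `ForceCollection` helper is a name invented for this example. Forcing collections like this is meant for experiments, since the GC normally decides on its own when to run.

```csharp
using System;

public static class CollectionTriggerDemo
{
    // Forces a full collection and reports how many generation 0
    // collections were added to the process-wide counter.
    public static int ForceCollection()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();                    // blocking collection of all generations
        GC.WaitForPendingFinalizers();   // let pending finalizers finish
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        Console.WriteLine($"Gen 0 collections so far: {GC.CollectionCount(0)}");
        Console.WriteLine($"Collections added by GC.Collect(): {ForceCollection()}");
    }
}
```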
The managed heap consists of the large object heap (for objects of 85,000 bytes or more) and the small object heap. To improve the performance of the GC, the managed heap is divided into three generations: 0, 1, and 2.
Generation 0 contains the youngest objects. Generation 1 acts as a buffer between short-lived and long-lived objects: it holds the objects that survived a generation 0 collection. If a generation 0 collection does not reclaim enough memory for new objects, the GC collects generation 1 and then generation 2.
Generation 2 contains long-lived objects, such as objects in a server application that live for the duration of the process. Objects that survive a generation 2 collection remain there until a later collection finds them unreachable.
Generations 0 and 1 are also called the ephemeral generations. Their size depends on the type of system (32-bit or 64-bit) and the type of garbage collector (workstation or server GC).
Objects of 85,000 bytes or more are large enough to be treated differently. To improve the performance of the GC, they are kept in a separate heap called the large object heap (LOH). The LOH is sometimes referred to as generation 3, but logically its objects belong to generation 2 and are collected only during a generation 2 collection.
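The generation of an object can be inspected with GC.GetGeneration. The following is a small experiment rather than a guaranteed result: the variable names are made up, and the reported numbers can differ between runtime versions, build configurations, and GC settings.

```csharp
using System;

public static class GenerationDemo
{
    public static void Main()
    {
        var small = new object();
        var large = new byte[100_000];   // above the 85,000-byte threshold

        Console.WriteLine($"Max generation: {GC.MaxGeneration}");
        Console.WriteLine($"small after allocation: gen {GC.GetGeneration(small)}");
        Console.WriteLine($"large after allocation: gen {GC.GetGeneration(large)}");

        GC.Collect();
        Console.WriteLine($"small after one collection: gen {GC.GetGeneration(small)}");
        GC.Collect();
        Console.WriteLine($"small after two collections: gen {GC.GetGeneration(small)}");

        // Keep both objects rooted so they survive the collections above.
        GC.KeepAlive(small);
        GC.KeepAlive(large);
    }
}
```

On a typical run the small object starts in generation 0 and is promoted as it survives collections, while the large array is reported in the highest generation from the start.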
A collection that includes the large object heap is triggered in the following cases:
- When allocations exceed the generation 0 threshold or the large object threshold.
- When the GC.Collect method is called.
- When the system is low on memory and the demand for memory exceeds the free space.
Different phases of the garbage collection:
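A garbage collection has three main phases:
- A marking phase that finds and creates a list of all live objects.
- A relocating phase that updates the references to the objects that will be compacted.
- A compacting phase that reclaims the space occupied by dead objects and compacts the surviving objects.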
Generation 2 can occupy multiple segments. Survivors from generations 1 and 2 can be moved to a different segment because they are promoted to generation 2, which is the generation for long-lived objects.
How does the garbage collector determine that the objects are alive?
- Stack roots: stack variables provided by the JIT compiler and the stack walker.
- Garbage collection handles: handles that point to managed objects and can be allocated by user code or by the CLR.
- Static data: static objects in application domains that may reference other objects. Each application domain keeps track of its static objects.
Before a garbage collection starts, all managed threads are suspended except for the thread that triggered the collection.
Unmanaged resources
Most objects created during execution are handled implicitly by the GC. Unmanaged resources are different: they must be cleaned up explicitly. The most common unmanaged resources are objects that wrap an operating system resource such as a file handle, a window handle, or a network connection.
The GC can track the lifetime of an object that wraps an unmanaged resource, but it has no knowledge of how to clean up that resource. One way to handle this is for the type to expose a public Dispose() method, which consumers call to release the resource explicitly as soon as they have finished with the object.
IDisposable
The IDisposable interface lets you write code that releases the unmanaged resources held by an object.
Every class that implements this interface must provide a Dispose() method. Because a consumer may forget to call Dispose(), or the call may be skipped when an error occurs, it is important to provide another way to release the unmanaged resources, for example by wrapping the resource in a SafeHandle or by overriding Object.Finalize().
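A minimal sketch of this idea is shown below. The `NativeBuffer` class is hypothetical; it wraps a block of unmanaged memory obtained with Marshal.AllocHGlobal, releases it in Dispose(), and keeps a finalizer as a fallback in case Dispose() is never called. Wrapping the resource in a SafeHandle is generally the preferred alternative to writing a finalizer by hand, so treat this as an illustration rather than a recommended template.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _pointer;

    public NativeBuffer(int size) => _pointer = Marshal.AllocHGlobal(size);

    public bool IsDisposed => _pointer == IntPtr.Zero;

    public void Dispose()
    {
        ReleaseMemory();
        GC.SuppressFinalize(this);   // the fallback finalizer is no longer needed
    }

    // Fallback: runs only if Dispose() was never called.
    ~NativeBuffer() => ReleaseMemory();

    private void ReleaseMemory()
    {
        if (_pointer == IntPtr.Zero) return;   // safe to call more than once
        Marshal.FreeHGlobal(_pointer);
        _pointer = IntPtr.Zero;
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        var buffer = new NativeBuffer(256);
        using (buffer)
        {
            Console.WriteLine($"Inside using: disposed = {buffer.IsDisposed}");  // False
        }
        Console.WriteLine($"After using: disposed = {buffer.IsDisposed}");       // True
    }
}
```

The using statement calls Dispose() when the block ends, even if an exception is thrown inside it, which covers the "forgotten call" case for most code.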
Although the GC is self-tuning and adapts to many different scenarios, the CLR lets you choose the type of garbage collection that best fits the characteristics of the workload. Here are the different types of GC:
- Workstation GC is designed for client applications and is the default for standalone apps. For hosted apps, such as those hosted by ASP.NET, the host determines the default. It can be concurrent (background) or non-concurrent. In concurrent mode, managed threads can keep running during a garbage collection.
- Background GC replaced concurrent GC in .NET Framework 4 and later versions.
- Server GC is intended for server-side applications that need high throughput and scalability.
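To see which type a running process actually uses, the GCSettings class can be queried. The `Describe` helper below is a name invented for this sketch. The type itself is chosen through configuration, not through code, for example with the gcServer element in a .NET Framework configuration file or the System.GC.Server setting in runtimeconfig.json.

```csharp
using System;
using System.Runtime;

public static class GcFlavorDemo
{
    // Builds a short description of the GC type used by the current process.
    public static string Describe()
    {
        string flavor = GCSettings.IsServerGC ? "server" : "workstation";
        return $"{flavor} GC, latency mode {GCSettings.LatencyMode}";
    }

    public static void Main() => Console.WriteLine(Describe());
}
```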
Garbage collection and performance:
Several tools help us track memory usage and the performance of the GC: memory performance counters, debugging with SOS, garbage collection ETW events, the Profiling API, and application domain resource monitoring.
The Microsoft documentation also describes common performance issues related to garbage collection, hints for reducing their effect on an application, and guidelines for starting an investigation. For example, we should check whether our program is using the correct type of garbage collector (workstation or server) and decide when to measure the managed heap size.
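As a first, rough way to measure the managed heap size from inside the program, GC.GetTotalMemory can be called before and after a piece of work. The `MeasureGrowth` helper is an assumption made for this sketch, and the result is only an approximation; the tools listed above give far more detail.

```csharp
using System;

public static class HeapSizeDemo
{
    // Returns the approximate growth of the managed heap, in bytes, after
    // allocating `count` arrays of `size` bytes that stay referenced.
    public static long MeasureGrowth(int count, int size)
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);
        var buffers = new byte[count][];
        for (int i = 0; i < count; i++) buffers[i] = new byte[size];
        long after = GC.GetTotalMemory(forceFullCollection: true);
        GC.KeepAlive(buffers);   // keep the arrays rooted until after measuring
        return after - before;
    }

    public static void Main()
    {
        Console.WriteLine($"Heap grew by about {MeasureGrowth(100, 10_000)} bytes");
        Console.WriteLine(
            $"Gen 0/1/2 collections: {GC.CollectionCount(0)}/{GC.CollectionCount(1)}/{GC.CollectionCount(2)}");
    }
}
```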
Note: Everything I have written about garbage collection in this article is my interpretation of the Microsoft documentation and the sub-links provided there.