You Can Cause Memory Leaks in .NET Even Though You Have a Garbage Collector

Any experienced .NET developer knows that even though .NET applications have a garbage collector, memory leaks occur all the time. It’s not that the garbage collector has bugs, it’s just that there are ways we can (easily) cause memory leaks in a managed language.

Memory leaks are sneakily bad creatures. It’s easy to ignore them for a very long time while they slowly destroy the application. With memory leaks, your memory consumption grows, creating GC pressure and performance problems. Eventually, the program will simply crash with an out-of-memory exception.

In this article, we will go over the most common reasons for memory leaks in .NET programs. All examples are in C#, but they are relevant to other .NET languages as well.

Defining Memory Leaks in .NET

In a garbage-collected environment, the term memory leak is a bit counterintuitive. How can my memory leak when there’s a garbage collector (GC) that takes care of collecting everything?

There are two related core causes for this. The first is when you have objects that are still referenced but are effectively unused. Since they are referenced, the GC won’t collect them and they will remain forever, taking up memory. This can happen, for example, when you subscribe to events but never unsubscribe. Let’s call this a managed memory leak.

The second cause is when you somehow allocate unmanaged memory (memory that is not garbage collected) and don’t free it. This is not so hard to do. .NET itself has a lot of classes that allocate unmanaged memory. Almost anything that involves streams, graphics, the file system or network calls does that under the hood. Usually, these classes implement a Dispose method, which frees the memory. You can also easily allocate unmanaged memory yourself with special .NET classes (like Marshal) or with P/Invoke.
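As a minimal sketch of what allocating unmanaged memory yourself looks like, the snippet below uses Marshal.AllocHGlobal and releases the block in a finally clause. The class and method names (UnmanagedBufferDemo, RoundTrip) are made up for illustration only.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical demo class; the name is an assumption for this sketch.
public static class UnmanagedBufferDemo
{
    // Writes a value into unmanaged memory, reads it back, and always frees the block.
    public static int RoundTrip(int value)
    {
        IntPtr buffer = Marshal.AllocHGlobal(sizeof(int)); // not tracked by the GC
        try
        {
            Marshal.WriteInt32(buffer, value);
            return Marshal.ReadInt32(buffer);
        }
        finally
        {
            // Without this call the block would stay allocated for the life of the process.
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

The GC never sees this allocation, so the only thing that releases it is the explicit FreeHGlobal call.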

Many share the opinion that managed memory leaks are not memory leaks at all since they are still referenced and theoretically can be de-allocated. It’s a matter of definition and my point of view is that they are indeed memory leaks. They hold memory that can’t be allocated for another instance and will eventually cause an out-of-memory exception. For this article, I will address both managed memory leaks and unmanaged memory leaks as, well, memory leaks.

Here are the most common offenders.

1. Subscribing to Events

Events in .NET are notorious for causing memory leaks. The reason is simple: once you subscribe to an event, the object that exposes the event holds a reference to your instance. The exception is when you subscribe with an anonymous method that doesn’t capture a class member. Consider this example:

public class MyClass
{
    public MyClass(WiFiManager wiFiManager)
    {
        wiFiManager.WiFiSignalChanged += OnWiFiChanged;
    }
 
    private void OnWiFiChanged(object sender, WifiEventArgs e)
    {
        // do something
    }
}

Assuming wiFiManager outlives MyClass, you have a memory leak on your hands. Every instance of MyClass is referenced by wiFiManager and will never be collected by the garbage collector.

Events are dangerous indeed, so what can you do? There are several good patterns for preventing memory leaks from events. Without going into detail, some of them are:

  1. Unsubscribe from the event.
  2. Use weak-handler patterns.
  3. Subscribe if possible with an anonymous function and without capturing any members.
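The first option, unsubscribing, could look roughly like the sketch below. The WiFiManager here is a hypothetical stand-in written for this example (it uses plain EventArgs rather than WifiEventArgs), and the SubscriberCount and Notifications members exist only so the effect can be observed.

```csharp
using System;

// Hypothetical stand-in for the WiFiManager used in the example above.
public class WiFiManager
{
    public event EventHandler WiFiSignalChanged;

    public void RaiseSignalChanged() => WiFiSignalChanged?.Invoke(this, EventArgs.Empty);

    // Number of handlers currently attached; exposed only for the demo.
    public int SubscriberCount => WiFiSignalChanged?.GetInvocationList().Length ?? 0;
}

public class MyClass : IDisposable
{
    private readonly WiFiManager _wiFiManager;

    public int Notifications { get; private set; }

    public MyClass(WiFiManager wiFiManager)
    {
        _wiFiManager = wiFiManager;
        _wiFiManager.WiFiSignalChanged += OnWiFiChanged;
    }

    private void OnWiFiChanged(object sender, EventArgs e) => Notifications++;

    // Removing the handler drops the manager's reference to this instance.
    public void Dispose() => _wiFiManager.WiFiSignalChanged -= OnWiFiChanged;
}
```

Once Dispose has run, the manager no longer holds the handler, so the MyClass instance becomes eligible for collection as soon as nothing else references it.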

2. Capturing members in anonymous methods

While it might be obvious that an event-handler method means an object is referenced, it’s less obvious that the same applies when a class member is captured in an anonymous method.

Here’s an example:

public class MyClass
{
    private JobQueue _jobQueue;
    private int _id;
 
    public MyClass(JobQueue jobQueue)
    {
        _jobQueue = jobQueue;
    }
 
    public void Foo()
    {
        _jobQueue.EnqueueJob(() =>
        {
            Logger.Log($"Executing job with ID {_id}");
            // do stuff 
        });
    }
}

In this code, the member _id is captured in the anonymous method and as a result the instance is referenced as well. This means that while JobQueue exists and references that job delegate, it will also reference an instance of MyClass.

The solution can be quite simple – assigning a local variable:

public class MyClass
{
    public MyClass(JobQueue jobQueue)
    {
        _jobQueue = jobQueue;
    }
    private JobQueue _jobQueue;
    private int _id;
 
    public void Foo()
    {
        var localId = _id;
        _jobQueue.EnqueueJob(() =>
        {
            Logger.Log($"Executing job with ID {localId}");
            // do stuff 
        });
    }
}

By copying the value to a local variable, the anonymous method captures only that local and not the MyClass instance, so you’ve averted a potential memory leak.
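One way to see the difference is to inspect the Target of the resulting delegate. The sketch below assumes the behavior of current C# compilers, where a lambda that uses only instance members is emitted as an instance method of the class, while a lambda that uses a local gets a separate compiler-generated closure object. This is an implementation detail rather than a language guarantee, and the method names are invented for the demo.

```csharp
using System;

public class MyClass
{
    private int _id = 7;

    // The lambda reads the field, so the delegate has to reference this instance.
    public Func<int> CaptureMember() => () => _id;

    // The lambda reads only a local copy, so the delegate references a small
    // compiler-generated closure object instead of this instance.
    public Func<int> CaptureLocal()
    {
        var localId = _id;
        return () => localId;
    }
}
```

Comparing `CaptureMember().Target` and `CaptureLocal().Target` against the instance with ReferenceEquals shows which delegate keeps the whole object alive.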

3. Static Variables

Some developers I know consider static variables to always be bad practice. While that’s a bit extreme, there’s a certain point to it when talking about memory leaks.

Let’s consider how the garbage collector works. The basic idea is that the GC starts from all GC root objects and marks them as not-to-collect. Then it visits every object they reference and marks those as not-to-collect as well, and so on. Finally, the GC collects everything that is left unmarked.
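The marking idea can be sketched as a toy reachability walk over an object graph. This is only an illustration of the principle, not how the .NET GC is actually implemented; Node and ToyMarker are hypothetical types invented for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy object-graph node; a stand-in for a managed object, not a real GC structure.
public class Node
{
    public string Name { get; }
    public List<Node> References { get; } = new List<Node>();
    public Node(string name) => Name = name;
}

public static class ToyMarker
{
    // Returns every node reachable from the roots; anything else would be collected.
    public static HashSet<Node> Mark(IEnumerable<Node> roots)
    {
        var marked = new HashSet<Node>();
        var pending = new Stack<Node>(roots);
        while (pending.Count > 0)
        {
            var node = pending.Pop();
            if (!marked.Add(node)) continue; // already visited, which also handles cycles
            foreach (var child in node.References) pending.Push(child);
        }
        return marked;
    }
}
```

Note that what matters is reachability from a root: a node that references a live object, but is not itself referenced from any root, stays unmarked.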

So what is considered as a GC Root?

  1. Live Stack of the running threads.
  2. Static variables.
  3. Managed objects that are passed to COM objects by interop (Memory de-allocation will be done by reference count)

This means that static variables and everything they reference will never be garbage collected. Here’s an example:

public class MyClass
{
    static List<MyClass> _instances = new List<MyClass>();
    public MyClass()
    {
        _instances.Add(this);
    }
}

If, for whatever reason, you decide to write the above code, any instance of MyClass will forever stay in memory, causing a memory leak.

4. Threads that Never Terminate

We already talked about how the GC works and about GC roots. I mentioned that the Live Stack is considered as a GC root. The Live Stack includes all local variables and members of the call stacks in the running threads.

If for whatever reason, you were to create an infinitely-running thread that does nothing and has references to objects, that would be a memory leak. One example of how this can easily happen is with a Timer. Consider this code:

public class MyClass
{
    public MyClass()
    {
        Timer timer = new Timer(HandleTick);
        timer.Change(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));
    }
 
    private void HandleTick(object state)
    {
        // do something
    }
}

If you don’t actually stop the timer, its callback will keep firing on a separate thread and the timer will keep referencing the MyClass instance, preventing it from being collected.
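One possible way to stop the timer is to keep it in a field and dispose it when the owner is done. The sketch below assumes System.Threading.Timer; the short interval and the Ticks counter are additions made only so the effect is observable.

```csharp
using System;
using System.Threading;

public class MyClass : IDisposable
{
    private readonly Timer _timer;
    private int _ticks;

    // Number of callbacks observed so far; exists only for the demo.
    public int Ticks => Volatile.Read(ref _ticks);

    public MyClass()
    {
        _timer = new Timer(HandleTick);
        _timer.Change(TimeSpan.FromMilliseconds(20), TimeSpan.FromMilliseconds(20));
    }

    private void HandleTick(object state) => Interlocked.Increment(ref _ticks);

    // Disposing the timer stops future callbacks and releases its reference to this instance.
    public void Dispose() => _timer.Dispose();
}
```

A callback that was already queued may still run shortly after Dispose returns, but no new ones are scheduled.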

5. Adding Dispose without Calling it

Suppose a class exposes a Dispose method to free its unmanaged resources. That’s great, but what happens when whoever uses the class doesn’t call Dispose?

One thing you can do is to use the using statement in C#:

using (var instance = new MyClass())
{
    // ... 
}

This works on IDisposable classes and is translated by the compiler to roughly this:

MyClass instance = new MyClass();
try
{
    // ...
}
finally
{
    if (instance != null)
        ((IDisposable)instance).Dispose();
}

This is very useful because even if an exception was thrown, Dispose will still be called.
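To make that concrete, here is a small sketch that throws inside a using block and then checks whether Dispose ran. TrackedResource and UsingDemo are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical disposable that just records whether Dispose was called.
public class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

public static class UsingDemo
{
    // Throws inside the using block and reports whether Dispose still ran.
    public static bool DisposedDespiteException()
    {
        TrackedResource observed = null;
        try
        {
            using (var resource = new TrackedResource())
            {
                observed = resource;
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed only so the demo can inspect the resource afterwards.
        }
        return observed != null && observed.IsDisposed;
    }
}
```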

Another thing you can do is utilize the Dispose Pattern. Here’s an example of how you would implement it:

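using System;
using System.Runtime.InteropServices;

public class MyClass : IDisposable
{
    private IntPtr _bufferPtr;
    private const int BufferSize = 1024 * 1024; // 1 MB
    private bool _disposed = false;
 
    public MyClass()
    {
        // Unmanaged allocation: the GC will never free this on its own.
        _bufferPtr = Marshal.AllocHGlobal(BufferSize);
    }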
 
    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
            return;
 
        if (disposing)
        {
            // Free any other managed objects here.
        }
 
        // Free any unmanaged objects here.
        Marshal.FreeHGlobal(_bufferPtr);
        _disposed = true;
    }
 
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }
 
    ~MyClass()
    {
        Dispose(false);
    }
}

This pattern makes sure that even if Dispose is never called explicitly, the finalizer will eventually free the unmanaged memory when the instance is garbage collected. If, on the other hand, Dispose was called, then the finalizer is suppressed. Suppressing the finalizer is important because finalizers are expensive and can cause performance issues.

The dispose-pattern is not bulletproof, however. If Dispose was never called and your class wasn’t garbage collected due to a managed memory leak, then the unmanaged resources will not be freed.
