When Memory Runs Dry: Understanding the OOM Killer’s Decision Process

Mohit Mishra

Engineering @ Amadeus || JGEC 2023 || KWoC 2021 || GWOC 2021

Published: October 24, 2024

The Out-of-Memory (OOM) Killer’s decision-making process is a complex and crucial component of Linux memory management. This process determines which process(es) to terminate when the system is under severe memory pressure. Let’s explore the intricacies of this mechanism in depth.

Activation Triggers

Before getting into the decision-making process, it’s important to understand when the OOM Killer is activated. The primary triggers include:

Physical memory exhaustion: When all available RAM is consumed.
Swap space depletion: If present, when swap space is fully utilized.
Memory reclamation failure: When the kernel’s attempts to reclaim memory through other means (e.g., page cache eviction) are insufficient.

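The kernel does not poll for these conditions. The OOM Killer is invoked from the page allocator as a last resort: when an allocation cannot be satisfied and the kernel’s attempts to reclaim memory have failed to make progress, the allocator calls into the OOM Killer’s decision-making process.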

The OOM Score Calculation

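At the heart of the OOM Killer’s decision-making process is the OOM score. This score is calculated for each process in the system and determines the likelihood of a process being terminated. Which factors feed into it depends on the kernel version: since the heuristic was rewritten in Linux 2.6.36, the score is driven by memory footprint and the administrator’s adjustment, while older kernels also weighed several of the other factors described below: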

Memory Consumption

The primary factor in the OOM score calculation is the process’s memory consumption. In current kernels this includes:

Resident Set Size (RSS): The amount of physical memory the process is currently using, including resident shared memory pages.
Swap usage: Pages belonging to the process that have been swapped out.
Page tables: The memory consumed by the process’s own page tables.

Virtual memory size was the basis of the score only in kernels before Linux 2.6.36. The current calculation is linear in pages rather than logarithmic, so, all else being equal, the process with the largest footprint is the most likely victim.
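As a rough illustration of how current kernels turn memory consumption into a baseline score, the sketch below adds resident pages, swapped-out pages, and page-table pages. The struct and function names are hypothetical and chosen for this example; only the three inputs mirror what the kernel’s oom_badness() sums.

```c
#include <stdint.h>

/* Hypothetical snapshot of one process's memory counters.
 * Field names are illustrative, not kernel identifiers. */
struct mem_usage {
    uint64_t rss_pages;      /* resident pages: anonymous, file-backed, shmem */
    uint64_t swap_pages;     /* pages currently swapped out */
    uint64_t pgtable_bytes;  /* memory used by the page tables themselves */
};

/* Baseline badness in pages, before any oom_score_adj is applied. */
static uint64_t base_points(const struct mem_usage *m, uint64_t page_size)
{
    return m->rss_pages + m->swap_pages + m->pgtable_bytes / page_size;
}
```

With 4 KiB pages, a process holding 1000 resident pages, 200 swapped pages, and 32 KiB of page tables would score 1208 pages under this sketch.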

CPU Time

Current kernels do not factor CPU time into the OOM score. The heuristic used before Linux 2.6.36 did: it reduced a process’s score as its accumulated CPU time (user plus system time over the process’s lifetime) grew, on the reasoning that a process that has done a lot of work is more expensive to lose. This was meant to avoid killing long-established, important system processes.

Process Lifetime

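In kernels before Linux 2.6.36, long-running processes were given a slight preference to survive. This was calculated from the process’s start time relative to system uptime, with the run time reducing the score far more gently than CPU time did. Current kernels ignore process age.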

Nice Value

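In kernels before Linux 2.6.36, the process’s nice value, which represents its scheduling priority, was factored into the OOM score: a process with a positive nice value (lower priority) had its score doubled, making it more likely to be terminated. Current kernels ignore the nice value.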

Process Flags

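Certain process attributes can significantly influence the OOM score:

Privileged processes (e.g., those running as root) received a lower score in older kernels; recent kernels no longer apply this discount.
Processes that must not be killed are skipped rather than given a low score: kernel threads, the init process, and any process whose oom_score_adj is set to -1000.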
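The CPU time, lifetime, nice value, and privilege factors belong to the heuristic used before Linux 2.6.36. The sketch below shows how that legacy heuristic combined them, as far as it can be summarized in a few lines: start from virtual memory size, divide by the square root of CPU time and the fourth root of run time, double for niced processes, and quarter for privileged ones. The function names, the pre-scaled time units, and the single privileged flag are simplifying assumptions for this example, not kernel code.

```c
#include <stdbool.h>

/* Integer square root (floor). A simple loop is enough for a sketch;
 * it is not meant for very large inputs. */
static unsigned long isqrt(unsigned long x)
{
    unsigned long r = 0;
    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Legacy-style badness. cpu_time and run_time are assumed to be
 * already scaled to coarse units, as the old heuristic did. */
static unsigned long legacy_badness(unsigned long total_vm_pages,
                                    unsigned long cpu_time,
                                    unsigned long run_time,
                                    int nice, bool privileged)
{
    unsigned long points = total_vm_pages;
    unsigned long s;

    s = isqrt(cpu_time);            /* more CPU time -> lower score */
    if (s)
        points /= s;

    s = isqrt(isqrt(run_time));     /* older process -> slightly lower score */
    if (s)
        points /= s;

    if (nice > 0)                   /* niced processes are preferred victims */
        points *= 2;

    if (privileged)                 /* privileged processes are protected */
        points /= 4;

    return points;
}
```

For example, 10000 pages with a CPU time of 100 units and a run time of 256 units gives 10000 / 10 / 4 = 250 points, which doubles to 500 if the process is niced.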

Process Hierarchy

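Older kernels took the process’s position in the process tree into account in two ways. The heuristic used before Linux 2.6.36 added a share of each child’s memory to the parent’s score, so that a parent forking many children was more likely to be chosen. Older kernels would also, having chosen a victim, prefer to kill one of its children instead, sacrificing the child to spare the parent. Current kernels score each process on its own; killing related processes together is handled through cgroups.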

OOM Score Adjustment

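System administrators can manually adjust a process’s OOM score through the /proc/<pid>/oom_score_adj file, which accepts values from -1000 to 1000. Positive values make the process a more likely victim, negative values protect it, and -1000 exempts it from the OOM Killer entirely. This allows fine-tuning of the OOM Killer's behavior for specific processes.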
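To make the effect of the adjustment concrete, the sketch below applies an oom_score_adj value to a baseline score in the way current kernels do: each unit of adjustment is worth one thousandth of the memory available to the allocation. OOM_SCORE_ADJ_MIN and OOM_SCORE_ADJ_MAX match the kernel’s constants; the function name and signature are assumptions made for this example.

```c
#define OOM_SCORE_ADJ_MIN (-1000)
#define OOM_SCORE_ADJ_MAX 1000

/* Stores the adjusted score (in pages) in *out and returns 0,
 * or returns -1 if the task is exempt or the value is out of range. */
static int adjusted_points(long base_pages, int oom_score_adj,
                           long total_pages, long *out)
{
    if (oom_score_adj < OOM_SCORE_ADJ_MIN || oom_score_adj > OOM_SCORE_ADJ_MAX)
        return -1;
    if (oom_score_adj == OOM_SCORE_ADJ_MIN)
        return -1;                  /* -1000 disables OOM killing for the task */

    /* Each adjustment unit counts as 1/1000 of the allowed memory. */
    *out = base_pages + (long)oom_score_adj * (total_pages / 1000);
    return 0;
}
```

On a system with 1,000,000 pages, a process using 50,000 pages with an adjustment of 500 is scored as if it used 550,000 pages, while an adjustment of -100 brings its score down to -50,000.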

Score Normalization

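Internally the score is a page count, but the value exposed through /proc/<pid>/oom_score is normalized against the total memory available, historically to a scale of 0 to 1000 (the exact mapping has changed between kernel versions). This normalization makes scores comparable across systems with different amounts of memory and puts them on the same scale as oom_score_adj.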
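One plausible way to express a page-based score on a 0 to 1000 scale is shown below. This is a sketch of the idea only: the function name is hypothetical, and the kernel’s actual formula for /proc/<pid>/oom_score differs between versions.

```c
/* Maps a score in pages onto 0..1000 relative to total_pages.
 * Negative scores (from negative adjustments) clamp to 0. */
static long normalized_score(long points, long total_pages)
{
    long score;

    if (total_pages <= 0)
        return 0;

    score = points * 1000 / total_pages;
    if (score < 0)
        score = 0;
    if (score > 1000)
        score = 1000;
    return score;
}
```

Under this mapping a process accounting for a quarter of available memory scores 250 regardless of whether the machine has 4 GiB or 400 GiB.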

The Selection Algorithm

Once the OOM scores are calculated and normalized, the OOM Killer employs a selection algorithm to choose which process(es) to terminate. This algorithm involves several steps:

Threshold Determination

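The kernel does not compute a threshold score. Every eligible process is a candidate regardless of how low its score is, and the scan simply tracks the highest score seen so far. What varies with the situation is the scope of the scan: a system-wide OOM considers all processes, while an OOM confined to a memory cgroup considers only that cgroup’s members.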

Candidate Filtering

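The candidate list is filtered to remove:

Essential system processes: the init process (PID 1) and kernel threads
Processes explicitly marked as unkillable by setting oom_score_adj to -1000
Processes outside the scope of the failing allocation, such as those in a different memory cgroup or restricted to other NUMA nodes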
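The filtering step can be sketched as a simple predicate. The struct below is a hypothetical stand-in for the kernel’s task structure and covers only the unconditional exclusions (init, kernel threads, and an oom_score_adj of -1000); scope checks for cgroups and NUMA nodes are left out.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical, minimal view of a task for filtering purposes. */
struct task_info {
    pid_t pid;
    bool  is_kernel_thread;
    int   oom_score_adj;
};

/* Returns true if the task may be considered as an OOM victim. */
static bool oom_eligible(const struct task_info *t)
{
    if (t->pid == 1)                /* the global init process */
        return false;
    if (t->is_kernel_thread)        /* kernel threads own no user memory */
        return false;
    if (t->oom_score_adj == -1000)  /* explicitly exempted by the administrator */
        return false;
    return true;
}
```

Note that a negative adjustment other than -1000 does not remove a process from the list; it only lowers its score.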

Badness Calculation

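For each candidate process, a “badness” score is calculated by the kernel’s oom_badness() function. In current kernels this is the OOM score itself rather than a separate second pass, and it is based on:

The memory footprint of the process (resident pages, swap, and page tables)
The oom_score_adj adjustment, scaled by the total memory available to the allocation

The footprint also serves as the estimate of how much memory would be freed by killing the process. The process’s current state (sleeping or running) is not part of the score, although the kernel avoids picking a new victim while an earlier one is still exiting.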

Selection

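The process with the highest badness score is selected for termination. Ties are resolved by the order in which the kernel scans the task list rather than by a separate ranking. If the vm.oom_kill_allocating_task sysctl is enabled, the scan is skipped and the task whose allocation triggered the OOM condition is killed instead.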

Below is a basic C implementation that demonstrates the core concepts of the OOM Killer’s decision-making process.
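What follows is a minimal sketch rather than the author’s listing or kernel code. It assumes the scoring used by current kernels (memory footprint plus a scaled oom_score_adj) and uses hypothetical names throughout: proc_info, badness, and select_victim are inventions for this example. Ties go to the first candidate seen, which is an arbitrary choice made here for simplicity.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process record; fields are illustrative. */
struct proc_info {
    int  pid;
    long rss_pages;
    long swap_pages;
    long pgtable_pages;
    int  oom_score_adj;     /* -1000 .. 1000 */
    bool kernel_thread;
};

/* Returns the badness in pages, or LONG_MIN for tasks that must be skipped. */
static long badness(const struct proc_info *p, long total_pages)
{
    if (p->pid == 1 || p->kernel_thread || p->oom_score_adj == -1000)
        return LONG_MIN;

    return p->rss_pages + p->swap_pages + p->pgtable_pages
         + (long)p->oom_score_adj * (total_pages / 1000);
}

/* Scans all processes and returns the one with the highest badness,
 * or NULL if no process is eligible. */
static const struct proc_info *select_victim(const struct proc_info *procs,
                                             size_t n, long total_pages)
{
    const struct proc_info *chosen = NULL;
    long chosen_points = LONG_MIN;

    for (size_t i = 0; i < n; i++) {
        long points = badness(&procs[i], total_pages);

        if (points == LONG_MIN || points <= chosen_points)
            continue;               /* ineligible, or not worse than current pick */

        chosen = &procs[i];
        chosen_points = points;
    }
    return chosen;
}
```

With 10,000 pages in total, a process using 3,010 pages with an adjustment of 500 scores 8,010 and is chosen ahead of an unadjusted process using 5,010 pages, which shows how strongly oom_score_adj can outweigh raw memory usage.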

Termination Process

Once a process is selected, the OOM Killer initiates the termination process:

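A SIGKILL signal is sent to the chosen process. This signal cannot be caught or ignored, although a process blocked in an uninterruptible state may take some time to exit.
The kernel logs the termination event to the kernel log, including the victim’s PID, name, and memory usage.
Memory used by the terminated process is reclaimed. A dedicated kernel thread, the OOM reaper, can free the victim’s anonymous memory without waiting for it to finish exiting.
If sufficient memory is not freed, the OOM Killer may repeat the process to select additional victims.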
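The claim that SIGKILL cannot be caught or ignored can be checked from user space. The sketch below forks a child that tries to ignore both SIGTERM and SIGKILL, then kills it and inspects the exit status. It illustrates the signal semantics only and does not involve the OOM Killer; the function name is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the signal that terminated the child, 0 if it exited
 * normally, or -1 on error. */
static int kill_child_with_sigkill(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        signal(SIGTERM, SIG_IGN);   /* allowed: SIGTERM can be ignored */
        signal(SIGKILL, SIG_IGN);   /* rejected: SIGKILL cannot be ignored */
        for (;;)
            pause();                /* wait until a signal ends the process */
    }

    if (kill(pid, SIGKILL) != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The function is expected to return SIGKILL, confirming that the child’s attempt to ignore the signal had no effect.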

Post-Termination Actions

After a process has been terminated, the following happens:

Memory Reevaluation: The allocation that triggered the OOM Killer is retried. If it still cannot be satisfied, the termination was not sufficient and the OOM Killer is invoked again.
Survivor Notification: User space can learn of the OOM event through the cgroup memory controller, for example from the oom and oom_kill counters in a cgroup v2 memory.events file.
System Stability Check: The kernel performs no separate health check afterwards. If a system-wide OOM finds no killable process, or if the vm.panic_on_oom sysctl is set, the kernel panics rather than continuing.
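A monitoring tool can detect OOM kills by reading the counters in a cgroup v2 memory.events file, which holds one "key value" pair per line. The sketch below parses such text from a buffer; the function name is hypothetical, and reading the file itself (its path depends on the cgroup layout) is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Returns the value for `key` in newline-separated "key value" text,
 * or -1 if the key is absent or its value is malformed. */
static long events_counter(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* Require a space after the key so "oom" does not match "oom_kill". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ' ') {
            long value;
            if (sscanf(line + klen, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;                 /* move to the start of the next line */
    }
    return -1;
}
```

Comparing the oom_kill value between two reads tells the tool how many processes in the cgroup were killed in the interval.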

Feedback Loop

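The OOM Killer keeps no history between invocations and does not learn from earlier kills. The only feedback is the retry loop itself:

Effectiveness Monitoring: After a kill, the allocator retries; whether the allocation now succeeds is the only measure of how effective the termination was.
Score Adjustment: Scores are recomputed from scratch on every invocation from current memory usage and oom_score_adj. The kernel does not adjust them based on past events.
Pattern Recognition: Recognizing which processes tend to cause OOM situations is left to user space, for example by watching kernel logs or cgroup event counters and setting oom_score_adj accordingly.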

Edge Cases and Special Considerations

The OOM Killer’s decision-making process also accounts for several edge cases:

Cgroup-aware Selection

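In systems using cgroups (control groups), a memory cgroup that hits its own limit triggers an OOM kill confined to that cgroup: only its member processes are candidates. With cgroup v2, setting memory.oom.group on a cgroup makes the OOM Killer treat it as a unit and kill all of its processes together.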

NUMA Considerations

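On NUMA (Non-Uniform Memory Access) systems, an allocation may be restricted to particular nodes by a cpuset or memory policy. When such a constrained allocation fails, the OOM Killer only considers processes that are allowed to use memory on those nodes, since killing any other process would not relieve the pressure.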

Virtualization Awareness

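In virtualized environments, each kernel sees only its own view. The OOM Killer inside a guest considers only the guest’s processes and memory, while to the host’s OOM Killer an entire virtual machine typically appears as a single process whose memory usage is scored as a whole.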

Container Environments

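In containerized setups, container memory limits are enforced through memory cgroups, so a container that exceeds its limit triggers an OOM kill inside that container. By default a single process is killed; the entire container is terminated only if that process is the container’s main process or if the runtime has requested that the group be killed together.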

Continuous Improvement

The OOM Killer’s decision-making process is continually refined in new kernel versions. Recent and ongoing improvements include:

More sophisticated memory pressure detection, such as pressure stall information (PSI), which also lets user-space daemons act before the kernel OOM Killer is needed.
Improved handling of memory-hungry but short-lived processes.
Better integration with other memory management subsystems like zswap and zram.
Enhanced logging and debugging capabilities to help system administrators understand and tune OOM Killer behavior.

Ethical Considerations

The design of the OOM Killer’s decision-making process also involves ethical considerations:

Fairness: Ensuring that the selection process doesn’t unfairly target certain types of applications.
Predictability: Balancing the need for deterministic behavior with the ability to make optimal decisions in varied scenarios.
Transparency: Providing clear logs and explanations for why specific processes were chosen.

Performance Implications

The decision-making process itself consumes some system resources. The kernel developers must balance the thoroughness of the selection process with its performance impact, especially considering that it runs when the system is already under memory pressure.

In conclusion, the OOM Killer’s decision-making process is a complex, multi-faceted system that balances numerous factors to make critical decisions about process termination under memory pressure. It represents a crucial last line of defense in maintaining system stability and exemplifies the intricate balance between resource management, system performance, and reliability in modern operating systems.
