Memory Corruptions - Embedded System

Embedded Operating System Invasion

The pervasive presence of smart objects in every corner of our lives makes the security of embedded systems a pressing concern. Memory vulnerabilities in embedded program code, such as buffer overflows, are the entry point for powerful attack paradigms such as Code-Reuse Attacks (CRAs), in which attackers corrupt a system's execution flow and maliciously alter its behavior. Control-Flow Integrity (CFI) is regarded as the most promising approach against such attacks, and the literature proposes a wide range of flow monitors, both hardware-based and software-based. While the former are costly to implement, software solutions are more flexible and can also be ported to existing devices.

Real-Time Operating Systems (RTOS) play a key role in application development for embedded systems, which makes them the main concern when applying CFI solutions. RTOS kernels adopt kernel protection methods (e.g., mandatory access control, kernel address space layout randomization, control flow integrity, and kernel page table isolation) as essential countermeasures to reduce the likelihood of kernel vulnerability attacks. However, kernel memory corruption can still occur because the vulnerable kernel code and the targeted kernel code or kernel data are located in the same kernel address space. [i]

These attacks exploit software vulnerabilities that persist because the latest (and thus most potent) protection and mitigation technologies are unevenly deployed, and because attackers find ways to circumvent them, causing the countermeasures to lose effectiveness. Since the kernel has the highest privilege in most computer systems, its code integrity is critical to the entire system's security. It is therefore essential to prevent an attacker from modifying kernel code pages directly or tricking the kernel into executing instructions stored outside the kernel address area. Existing prevention mechanisms rely on the memory management unit, in which certain memory pages are marked as non-executable in supervisor mode. This protection can itself be bypassed by malicious code that directly manipulates the page table contents.

Silent Data Corruptions

C and C++ lack memory safety features, and a large number of attacks exploit this through control-flow hijacking. Silent Data Corruptions (SDCs) are stealthy faults that corrupt data while remaining undetected by traditional error handling mechanisms. The silent nature of SDCs makes them challenging to trace at the hardware level, as they evade error reporting systems. Their effects manifest at the application level, potentially causing data loss and system-wide issues. Detecting and measuring SDCs presents unique challenges. Their low occurrence rates, dependence on hardware structure and software workloads, and correlation with environmental factors make accurate measurement complex. Addressing SDCs requires proactive measures to prevent data corruption and ensure digital integrity. Software redundancy methods provide a means to tolerate SDCs by introducing duplication or triplication of application resources. However, these methods come with limitations, including increased code size, altered execution patterns, and potential vulnerability to other types of failures. Understanding the nature of SDCs and developing effective mitigation strategies are crucial for maintaining digital integrity in large-scale infrastructure services. Addressing the challenges posed by SDCs is what allows digital systems to be fortified for reliability and integrity.
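
The triplication idea mentioned above can be illustrated with a minimal Python sketch. It is not taken from any cited work: the class name `TriplicatedValue`, the `corrupt` fault-injection helper, and the choice to rewrite all copies after a successful vote are assumptions made for illustration.

```python
from collections import Counter


def vote(replicas):
    """Return the strict-majority value among replicas, or raise if none exists."""
    value, count = Counter(replicas).most_common(1)[0]
    if count * 2 <= len(replicas):
        raise ValueError("no majority: corruption cannot be masked")
    return value


class TriplicatedValue:
    """Keeps three copies of a value and masks a single corrupted copy on read."""

    def __init__(self, value):
        self._copies = [value, value, value]

    def write(self, value):
        self._copies = [value, value, value]

    def read(self):
        value = vote(self._copies)
        # Repair any disagreeing copy so a later fault does not accumulate.
        self._copies = [value, value, value]
        return value

    def corrupt(self, index, value):
        """Hypothetical fault-injection hook, used only to demonstrate voting."""
        self._copies[index] = value
```

A single corrupted copy is outvoted and repaired on the next read, while two differing corruptions leave no majority and are reported instead of being silently accepted. The extra storage and the vote on every read are the code-size and execution-pattern costs noted above.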

Memory Corruption Detection Tools

Memory corruption, reading uninitialized memory and other memory-related errors are some of the most difficult programming bugs to identify and fix. Dedicated memory checking tools are difficult to build, primarily due to three significant challenges: performance, accuracy, and system dependencies.

These tools are not easy to use and are plagued by false positive error reports. Avoiding these false positives requires monitoring not only memory accesses but nearly every single application instruction, further decreasing performance.

Accuracy is a particular issue for leak checking. Most tools perform a garbage-collection-style scan at application exit to identify unreachable heap allocations. Without semantic information, the scan has to make assumptions, which invariably lead to false positives. To detect reads or writes beyond the bounds of allocated heap memory, tools add a red zone around each allocation, which increases the chance that a heap overflow will not be mistaken for an access to an adjacent allocation.
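
The red-zone technique can be sketched with a toy heap model in Python. This is a simplified illustration, not how any particular tool is implemented: the `ToyHeap` class, the 8-byte red-zone size, the `0xFD` fill byte, and the per-byte `addressable` shadow list are all assumptions.

```python
REDZONE = 8           # guard bytes on each side of an allocation (assumed size)
REDZONE_FILL = 0xFD   # fill pattern for guard bytes (assumed value)


class ToyHeap:
    """Bump allocator over a bytearray with one shadow flag per byte."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.addressable = [False] * size   # shadow state: True = valid to access
        self.next_free = 0

    def malloc(self, n):
        start = self.next_free + REDZONE
        end = start + n
        if end + REDZONE > len(self.mem):
            raise MemoryError("toy heap exhausted")
        # Surround the allocation with guard bytes that stay unaddressable.
        self.mem[self.next_free:start] = bytes([REDZONE_FILL]) * REDZONE
        self.mem[end:end + REDZONE] = bytes([REDZONE_FILL]) * REDZONE
        for i in range(start, end):
            self.addressable[i] = True
        self.next_free = end + REDZONE
        return start

    def _check(self, addr):
        if not (0 <= addr < len(self.mem)) or not self.addressable[addr]:
            raise IndexError(f"unaddressable access at offset {addr}")

    def store(self, addr, value):
        self._check(addr)
        self.mem[addr] = value

    def load(self, addr):
        self._check(addr)
        return self.mem[addr]
```

With two 4-byte allocations `a` and `b`, a write to `a + 4` lands in the guard bytes and is reported. Without the red zones, the same write would fall inside `b` and look like a legitimate access. An overflow that skips past the whole red zone would still go unnoticed, which is why the technique only increases the chance of detection.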

A leak is commonly defined as heap memory that no longer has any pointer to it, although in some cases any unfreed memory is considered a leak. The leak scan uses as its starting points the registers of each thread and all non-heap addressable memory, which includes the region below the top of the stack for each thread and the data section of each library. An indirect leak is a heap object that is reachable through a pointer to its start address, where that pointer itself resides in a leaked object.
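
The reachability scan and the direct versus indirect distinction can be sketched as follows. This is a minimal model under stated assumptions: the heap is given as a dictionary from an object's start address to the pointer-sized values stored inside it, `roots` stands for the values found in registers, stacks, and data sections, and the function name `classify_heap` is hypothetical. Only pointers to an object's start address count, matching the definition above.

```python
def classify_heap(heap, roots):
    """Split heap objects into reachable, directly leaked, and indirectly leaked.

    heap:  dict mapping start address -> list of pointer values stored in the object
    roots: iterable of pointer values found outside the heap
    """
    reachable = set()
    worklist = [p for p in roots if p in heap]
    while worklist:
        addr = worklist.pop()
        if addr in reachable:
            continue
        reachable.add(addr)
        # Follow every stored value that happens to be the start of a heap object.
        worklist.extend(p for p in heap[addr] if p in heap)

    leaked = set(heap) - reachable
    # An indirect leak is a leaked object still pointed to by another leaked object.
    pointed_to_by_leaked = {
        p for addr in leaked for p in heap[addr] if p in leaked and p != addr
    }
    indirect = leaked & pointed_to_by_leaked
    direct = leaked - indirect
    return reachable, direct, indirect
```

For a heap where `0x300` points to `0x400` and neither is reachable from the roots, `0x300` is reported as a direct leak and `0x400` as an indirect one. In this simplified model a leaked cycle is classified entirely as indirect, a case that real tools have to handle separately. Treating every root value that matches a start address as a pointer is also the kind of assumption that produces false results when an integer merely looks like an address.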

The most widely used memory checking tool today is Memcheck, built on the Valgrind dynamic instrumentation platform. Memcheck only supports UNIX platforms. It replaces library allocation functions, uses a single threshold for stack swap detection, and does not distinguish false positive possible leaks from common C++ data layouts, making its possible leak reports more difficult to use. Purify was one of the first commercial memory checking tools, and the first tool to combine detection of memory leaks with detection of use-after-free errors. Purify uses link-time instrumentation and reports reads of uninitialized memory immediately, which can result in false positives. Purify's basic leak detection approach is used by Memcheck and Dr. Memory.

Parallel Inspector is a commercial tool built on the Pin dynamic instrumentation platform that combines data race detection with memory checking. Like Purify, it reports reads of uninitialized memory immediately. Insure++ is another commercial memory checking tool. It supports inserting instrumentation at various points, including the source code prior to compile time, at link time, and at runtime, but its more advanced features require source code instrumentation. Instrumentation at runtime is inserted using dynamic binary instrumentation, just like Dr. Memory, Memcheck, and Parallel Inspector, via a tool called Chaperon. Insure++ does support delaying reports of uninitialized memory, but only across copies and not other operations.

Third Degree is a memory checking tool for the Alpha platform. It inserts instrumentation at link time using ATOM. It detects uninitialized reads by filling newly allocated memory with a sentinel or canary value and reporting an error on any read of the canary value. BoundsChecker monitors Windows heap library calls and detects memory leaks and unaddressable accesses. It does not detect uninitialized reads. Some leak detection tools, including LeakTracer and mprof, only report memory that has not been freed at the end of execution. For these tools to be usable, the application must free all of its memory prior to exiting, even when some of its data lives for the whole process lifetime and it would be more efficient to let the operating system reclaim those resources.[ii]
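
The canary-fill approach to uninitialized-read detection described for Third Degree can be modeled in a few lines of Python. This sketch only illustrates the idea and does not reproduce that tool: the `CanaryBuffer` class and the `0xA5` sentinel byte are assumptions.

```python
CANARY = 0xA5  # sentinel byte written into fresh allocations (assumed value)


class CanaryBuffer:
    """Byte buffer whose fresh contents are a sentinel; reading it is an error."""

    def __init__(self, size):
        self._data = bytearray([CANARY]) * size

    def write(self, index, value):
        self._data[index] = value

    def read(self, index):
        value = self._data[index]
        if value == CANARY:
            # The byte still holds the sentinel, so it is reported as uninitialized.
            raise RuntimeError(f"read of uninitialized byte at index {index}")
        return value
```

The sketch also shows the weakness of the approach: a program that legitimately stores the byte `0xA5` and reads it back is reported as reading uninitialized memory, because the check cannot tell a written sentinel value from an untouched one.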

Kernel Register Locking

One option is a register-locking mechanism (referred to here as the lock) that uses a pair of control registers for each contiguous kernel code range. The range register specifies the address range of a kernel code chunk, and the offset register specifies the offset between the virtual and physical page numbers of the corresponding kernel code chunk. The 0th and 1st bits of the range register determine how the lock treats the register pair. The lock bit (0th) is a sticky bit that cannot be cleared once set and indicates whether the kernel has finished initializing the register. The lock refuses any update to the register if the lock bit is set, unless the write request is made from machine mode. The valid bit (1st) determines whether the register specifies valid kernel code pages. If the valid bit is 0, the lock ignores the values in the register when it sanitizes a page table entry. The lock uses the remaining bits to hold the range of the kernel code chunk as a pair of base address and mask. The range register can specify chunks whose size is a power of 2 bytes, ranging from 16 KB to 16 MB, and which are aligned to their size. Ten bits of the range register represent the mask (MASK), and the remaining 20 bits hold the base address (BASE). The lock uses the value stored in BASE as the base address of the kernel code chunk after shifting it to the left by 14 bits. The MASK value is left-shifted by 14 bits and prepended with 1s to compose the mask value. In summary, the lock considers a specified physical address (addr) a valid kernel code address when it falls within the range described by a valid register pair.[iii]
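
The range-register encoding can be made concrete with a small Python model. Several details are assumptions that the text does not state: the field positions (MASK in bits 2 to 11, BASE in bits 12 to 31), a 34-bit physical address width, the helper names `encode` and `is_kernel_code`, and the final comparison `(addr & mask) == base`, which is one natural reading of a base-and-mask range check.

```python
ADDR_BITS = 34            # assumed physical address width (20-bit BASE << 14)
LOCK_BIT = 1 << 0         # bit 0: sticky lock bit
VALID_BIT = 1 << 1        # bit 1: valid bit
MASK_SHIFT = 2            # assumed position of the 10-bit MASK field
BASE_SHIFT = 12           # assumed position of the 20-bit BASE field
GRANULE_SHIFT = 14        # both fields are shifted left by 14 bits (16 KB)


def encode(base_addr, size, valid=True, lock=True):
    """Build a range register for a size-aligned chunk of 16 KB to 16 MB."""
    k = size.bit_length() - 1
    if size != 1 << k or not 14 <= k <= 24:
        raise ValueError("size must be a power of two from 16 KB to 16 MB")
    if base_addr % size or base_addr >> ADDR_BITS:
        raise ValueError("base must be size-aligned and fit the address width")
    mask_field = (0x3FF << (k - GRANULE_SHIFT)) & 0x3FF
    base_field = base_addr >> GRANULE_SHIFT
    reg = (base_field << BASE_SHIFT) | (mask_field << MASK_SHIFT)
    return reg | (VALID_BIT if valid else 0) | (LOCK_BIT if lock else 0)


def is_kernel_code(reg, addr):
    """Check whether addr lies in the chunk described by a valid range register."""
    if not reg & VALID_BIT:
        return False          # invalid registers are ignored
    mask_field = (reg >> MASK_SHIFT) & 0x3FF
    base_field = (reg >> BASE_SHIFT) & 0xFFFFF
    # MASK shifted left by 14 bits, with 1s prepended in all higher address bits.
    high_ones = ((1 << ADDR_BITS) - 1) & ~((1 << 24) - 1)
    mask = high_ones | (mask_field << GRANULE_SHIFT)
    return (addr & mask) == (base_field << GRANULE_SHIFT)
```

Under these assumptions a MASK field of all ones selects the smallest chunk (16 KB), and a MASK field of zero selects the largest (16 MB), which matches the stated size range. For example, `encode(0x80200000, 64 * 1024)` describes a 64 KB chunk, for which `is_kernel_code` accepts `0x8020FFFF` and rejects `0x80210000`.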

Applications in embedded systems have recently become increasingly complex. Unfortunately, because of hardware cost and performance penalties, most embedded processors are not equipped with a Memory Management Unit (MMU), which is what protects memory accesses in general-purpose computer systems. An FPGA-based off-chip detector hooked onto the memory bus to monitor the memory accesses of multitasking Real-Time Operating System (RTOS) applications can be used for real-time memory leak monitoring.

Multiple Kernel Memory (MKM) Mechanism

To gain complete control of a host, adversaries focus on kernel code invocations, such as function pointers that rely on the starting points of the kernel protection methods. To mitigate such subversion attacks, multiple kernel memory (MKM) can be employed for kernel address space separation. The MKM mechanism focuses on the isolation granularity of the kernel address space during each execution of the kernel code. MKM provides two kernel address spaces: i) the trampoline kernel address space, which acts as the gateway between user and kernel modes, and ii) the security kernel address space, which utilizes the localization of the kernel protection methods (i.e., kernel observation). Additionally, MKM encapsulates the vulnerable kernel code to prevent access to the kernel code invocations of the separated kernel address space. The evaluation results demonstrated that MKM can protect the kernel code and kernel data from a proof-of-concept kernel vulnerability that could lead to kernel memory corruption. In addition, the performance results of MKM indicate that the system call overhead latency ranges from 0.020 μs to 0.5445 μs, while the web application benchmark overhead ranges from 196.27 μs to 6,685.73 μs for each download access of 100,000 Hypertext Transfer Protocol sessions. MKM attained a 97.65% system benchmark score and a 99.76% kernel compilation time (https://ieeexplore.ieee.org/document/9502080).

OS for Critical Systems in Space Vehicles

An RTOS stores the data and instructions it needs in embedded memories. Corruption of these memories due to space radiation produces non-deterministic, incorrect behavior. Software or hardware testing mechanisms can detect and sometimes correct such dangerous situations. In either case, the application programmer has to devise special tasks devoted to testing and ensure fully working mechanisms without impacting the RTOS scheduler controlling critical computing hardware on space vehicles.

With the large volume of data acquisition required for satellite missions, downlinking presents an increasingly expensive bottleneck that drastically reduces mission efficiency. Hence, there is a rapidly growing demand to perform edge computation on board the payload system, often via a graphics processing unit (GPU). Multiple edge-computing CubeSat missions set to operate in Low Earth Orbit (LEO) house an Nvidia module to perform onboard computer vision. As is the case for many commercial off-the-shelf (COTS) devices used in CubeSats, the module is not radiation-hardened, and its most vulnerable component is its eMMC disk. Even with the option of radiation shielding, single event effects (SEEs) may still reach the module, which calls for software-level mitigation as a final line of defense. A minimized Yocto-based operating system with built-in redundancy can be designed to handle these environmental pressures. The OS combines patches that enable real-time scheduling in Linux for time-sensitive reliability in flight, operating system minimization, software-based triple modular redundancy in persistent memory with associated bootloader modifications, and a RAM-based file system. Together these allow the device to rely less on its eMMC card and render it less prone to radiation-induced damage. Results from proton SEE tests on the device's chip exhibit lower expected error rates in LEO compared to stock devices. Additionally, the devices tested were less prone to permanent failure under a narrower beam than the one used in previous tests, confirming that peripherals, including flash, are the highest contributors to critical failures on the module.[iv]

Concurrency Vulnerability

Concurrent programs make better use of processor resources but inevitably introduce a new set of problems in terms of reliability and security. Concurrency bugs usually lead to program crashes and unexpected behavior. From a security perspective, concurrency vulnerabilities are those that exhibit harmful behavior exclusively in concurrent executions. They can take place in a diverse range of environments, such as operating system kernels, file system operations, or general-purpose multithreaded programs. A particular characteristic of concurrency is that it not only introduces new problems, but also enables traditional vulnerabilities to be triggered in concurrency-specific ways. Those that lead to dangerous security vulnerabilities usually cause memory corruption, a strong and flexible primitive for exploitation, and are known as concurrency memory corruption vulnerabilities. The detection and exploitation of concurrency memory corruption vulnerabilities in C and C++ programs are a subject of intensive study.

In principle, concurrency vulnerabilities are best found by exploring all possible thread interleavings of a program to identify harmful execution orders of operations. However, the number of possible thread interleavings grows exponentially, leading to interleaving space explosion and making this method infeasible. Further, using data race detectors to find concurrency vulnerabilities can miss potential vulnerabilities and requires additional effort, since usually only a small portion of all reported data races are actually harmful, and some are even introduced purposefully by developers. Recently, some approaches have been proposed that rely on fuzzing due to its demonstrated effectiveness, whereas others borrow and adapt predictive detection techniques from concurrency bug detection approaches. [v]
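
The interleaving space explosion can be quantified with a short calculation. For threads that execute n1, n2, ..., nk atomic operations each, the number of interleavings that preserve every thread's own program order is the multinomial coefficient (n1 + ... + nk)! / (n1! * ... * nk!). The function name below is hypothetical, and treating each operation as atomic is a simplifying assumption.

```python
from math import factorial


def interleavings(ops_per_thread):
    """Count the orderings of all operations that keep each thread's own order."""
    total = factorial(sum(ops_per_thread))
    for n in ops_per_thread:
        total //= factorial(n)
    return total
```

Two threads with two operations each allow only 6 interleavings, but two threads with ten operations each already allow 184,756, and adding a third ten-operation thread raises the count to more than five trillion. Exhaustive exploration therefore stops being practical well before a program reaches realistic size.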

Conclusion

Memory corruption is one of the oldest problems in computer security. Vulnerabilities caused by memory corruption related bugs are a pervasive threat, continually undermining the security of the whole computing environment. The lack of memory safety mechanisms in indispensable systems programming languages like C or C++ leaves plenty of room for programmer-induced errors which often result in catastrophic security breaches. Memory errors are a major source of reliability problems in current computing systems. Undetected errors may result in program termination, or, even worse, silent data corruption. Studies have shown that the frequency of permanent memory errors is an order of magnitude higher than previously assumed and regularly affects everyday operation. Often, neither additional circuitry to support hardware-based error detection nor downtime for performing hardware tests can be afforded. In the case of permanent memory errors, a system faces two challenges: detecting errors as early as possible and handling them while avoiding system downtime. To increase system reliability, online memory testing infrastructure capable of efficiently detecting memory errors and providing graceful degradation by withdrawing affected memory pages from further use is required.[vi]


[i] https://ieeexplore.ieee.org/document/9581003

[ii] https://ieeexplore.ieee.org/document/5764689/

[iii] https://ieeexplore.ieee.org/document/10151854

[iv] https://ieeexplore.ieee.org/document/10115703

[v] https://ieeexplore.ieee.org/document/10114930

[vi] https://ieeexplore.ieee.org/document/6133070

