Unlocking the Power of ARM: Exploring Cortex Processor States

Published on 19/10/2024

Written by: Malek BOUBAHRI & Khaled Dhif

Introduction

ARM processors are widely used in embedded systems, and one of their key features is support for multiple instruction sets. This article is part of our ARM Cortex-M series and builds on our earlier discussions of architecture and hands-on experience. In a previous article, Unlocking the Power of ARM: Operating States, Modes and Access Levels, we explored the Thumb and Debug states. Here we look at the ARM and Thumb instruction sets, the role of the T-bit, and how modern Cortex-M cores such as the Cortex-M7 balance performance and code size.

Let’s explore how these differences shape the flexibility of ARM-based systems.

ARM, Thumb, and ThumbEE States

A processor executing ARM instructions operates in ARM state, while one executing Thumb instructions operates in Thumb state. There is also a ThumbEE state for executing ThumbEE instructions and a Jazelle state for executing Java bytecode.

In each state, a processor cannot execute instructions from another instruction set; for example, a processor in ARM state cannot run Thumb instructions, and vice versa. It’s crucial to ensure the processor receives the correct instruction set for its current state.

Most ARM processors start executing code in ARM state, but some can be configured to start in Thumb state or may only execute Thumb code.

Changing State

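Each instruction set includes instructions that change the processor state. Separately, the assembler must be told which instruction set to generate opcodes for, using the ARM or THUMB directives. These directives only select the encoding that the assembler emits; they do not change the processor state at run time, so the program must still enter each region of code in the matching state.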

Thumb Execution Environment (ThumbEE) and Jazelle State

Thumb Execution Environment (ThumbEE) is a variant of the Thumb instruction set intended for code that is generated dynamically on the device, that is, code compiled shortly before or during execution from bytecode or another intermediate form. ThumbEE is designed to work with various compilation methods, such as Just-In-Time (JIT) compilation, which compiles code as it is needed, and Ahead-Of-Time (AOT) compilation, which compiles code before it runs. This makes it particularly suited to languages that use managed pointers and arrays, where it can improve performance and make efficient use of memory. However, ThumbEE code cannot interwork freely with the regular ARM and Thumb instruction sets. Although ARM has deprecated the use of ThumbEE in recent editions of the architecture manual, it still has relevance in certain applications. The ThumbEE extension is required for ARMv7-A profile implementations and is optional for ARMv7-R profile implementations.

Jazelle state, on the other hand, allows ARM processors to execute Java bytecode directly in hardware. This improves the performance of Java applications by reducing the reliance on a software interpreter or Just-In-Time (JIT) compilation. When the processor is in Jazelle state, the program counter identifies the next JVM bytecode to execute, which allows for smooth execution of Java programs. Jazelle is focused specifically on Java, and native methods can still be implemented with ARM, Thumb, or ThumbEE instructions, so Java applications run efficiently on embedded and mobile devices. Jazelle state is primarily associated with ARMv7-A profile implementations.

In summary, both ThumbEE and Jazelle enhance the capabilities of ARM processors. ThumbEE is designed for a variety of programming languages and execution methods, while Jazelle is specialized for running Java bytecode. Each serves a unique purpose, making them suitable for different types of applications in embedded and mobile environments.

ARM and Thumb Instruction Sets

ARM Instruction Set

The ARM instruction set consists of 32-bit fixed-length instructions. Each instruction is exactly 32 bits long, providing the flexibility to implement a wide variety of operations with simple decoding. ARM instructions are 4-byte aligned and offer high performance due to the large range of instructions and addressing modes.

  • Registers: ARM instructions can use all available registers, providing a richer feature set for complex applications. The architecture provides 16 core registers, R0 to R15, of which R13, R14, and R15 serve as the stack pointer, link register, and program counter. The remaining registers are available for general data manipulation and control.

Thumb Instruction Set

The Thumb instruction set was introduced to improve code density, especially for embedded systems where memory space is at a premium.

  • Instructions: Thumb instructions were originally all 16 bits long; with the introduction of Thumb-2, there are also 32-bit Thumb instructions. The original Thumb instruction set provides a subset of the ARM instructions in a more compact format, and most 16-bit Thumb instructions can access only the low registers (R0 to R7) rather than the full register set available to ARM instructions.

While the original Thumb instruction set had limitations in functionality, particularly with conditional branch instructions, Thumb-2 (introduced with ARMv6T2) significantly expanded its capabilities by adding many 32-bit instructions to improve performance while retaining the code size benefits of 16-bit instructions.

Figure: ARM vs. Thumb programmer’s models


Evolution of Thumb and Thumb-2

Original Thumb (ARMv4T)

The original Thumb instruction set was introduced with ARMv4T and provided 16-bit instructions that were essentially a compressed subset of ARM instructions. This allowed smaller, more efficient code but with some sacrifices in flexibility and performance.

  • Branching Limitations: One notable limitation was in branch instructions, which had a restricted range. A 16-bit conditional branch reaches only about ±256 bytes and a 16-bit unconditional branch about ±2 KB. This meant that a distant branch target could not be reached with a single 16-bit instruction, leading to potential complications in larger applications.
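The reach of a 16-bit Thumb branch follows directly from the width of its immediate field. The Python sketch below is only an illustration (the function and constant names are made up for this article). It assumes the ARMv4T encodings, where a conditional branch carries an 8-bit offset and an unconditional branch an 11-bit offset, both counted in halfwords.

```python
def signed_offset_range(imm_bits):
    """Return the (lowest, highest) byte offset a Thumb branch can encode.

    The immediate counts halfwords, so the hardware appends one zero bit
    and sign-extends the result before adding it to the PC.
    """
    total_bits = imm_bits + 1                   # immediate plus implicit low zero bit
    lowest = -(1 << (total_bits - 1))           # most negative two's-complement value
    highest = (1 << (total_bits - 1)) - 2       # largest positive even value
    return lowest, highest


# Assumed field widths for the 16-bit encodings (illustrative constant names).
COND_BRANCH_IMM_BITS = 8      # B<cond> label
UNCOND_BRANCH_IMM_BITS = 11   # B label

print(signed_offset_range(COND_BRANCH_IMM_BITS))    # (-256, 254)
print(signed_offset_range(UNCOND_BRANCH_IMM_BITS))  # (-2048, 2046)
```

Both ranges are relative to the PC value the branch sees, which in Thumb state is the address of the branch plus 4.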

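In the original Thumb instruction set, only branch instructions can be conditional, so even a short conditional sequence needs a compare followed by a conditional branch around the instructions to be skipped.

In ARM, conditional execution is handled within the instruction itself. The EQ suffix (as in MOVEQ and ADDEQ) lets the MOV and ADD instructions execute only when the Z flag is set, without the need for a branch. This makes the code more compact and efficient, especially for small conditional operations.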
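As a rough illustration of conditional execution in ARM state, the Python sketch below models how the condition field in bits [31:28] of an ARM instruction decides whether the instruction takes effect. The helper names are hypothetical and only three condition codes are modelled. The two instruction words are the standard encodings of MOV r0, #1 and MOVEQ r0, #1.

```python
# Condition codes occupy bits [31:28] of every ARM-state instruction.
EQ, NE, AL = 0b0000, 0b0001, 0b1110


def condition_field(instruction):
    """Extract the 4-bit condition code from a 32-bit ARM instruction word."""
    return (instruction >> 28) & 0xF


def condition_passed(cond, z_flag):
    """Evaluate a condition code against the Z flag (EQ, NE and AL only)."""
    if cond == EQ:
        return z_flag
    if cond == NE:
        return not z_flag
    if cond == AL:
        return True
    raise NotImplementedError("only EQ, NE and AL are modelled in this sketch")


def executes(instruction, z_flag):
    """True if the instruction would take effect for the given Z flag."""
    return condition_passed(condition_field(instruction), z_flag)


MOV_R0_1 = 0xE3A00001    # MOV   r0, #1  (condition AL: always)
MOVEQ_R0_1 = 0x03A00001  # MOVEQ r0, #1  (condition EQ: only if Z is set)

print(executes(MOVEQ_R0_1, z_flag=True))   # True
print(executes(MOVEQ_R0_1, z_flag=False))  # False
print(executes(MOV_R0_1, z_flag=False))    # True
```

The only difference between the two words is the top nibble, which is why ARM code can make almost any instruction conditional without spending an extra instruction on a branch.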


Thumb-2 (ARMv6T2 and Later)

With ARMv6T2, ARM introduced Thumb-2 Technology, which added 32-bit instructions to the Thumb instruction set, allowing for a richer and more powerful set of operations. Thumb-2 maintained backward compatibility with the original 16-bit Thumb instructions while extending the capabilities of the instruction set to nearly match those of the ARM instruction set.

Key features of Thumb-2:

  • 16-bit narrow instructions and 32-bit wide instructions.
  • Instructions that are almost identical in functionality to ARM, but with the added benefit of smaller code size where appropriate.
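Because narrow and wide instructions are freely mixed, a decoder has to tell from the first halfword whether a second one follows. The sketch below shows one way to model that rule in Python. The function names are illustrative, and the rule assumed is the one used by Thumb-2: a halfword whose top five bits are 0b11101, 0b11110, or 0b11111 starts a 32-bit instruction.

```python
WIDE_PREFIXES = (0b11101, 0b11110, 0b11111)


def thumb_instruction_size(first_halfword):
    """Return the instruction size in bytes (2 or 4) from its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in WIDE_PREFIXES else 2


def instruction_sizes(halfwords):
    """Walk a stream of halfwords and list the size of each instruction."""
    sizes = []
    index = 0
    while index < len(halfwords):
        size = thumb_instruction_size(halfwords[index])
        sizes.append(size)
        index += size // 2        # skip the second halfword of a wide instruction
    return sizes


# ADDS r0, r0, r1 (16-bit); ADD.W r0, r0, r1 (32-bit); NOP (16-bit)
stream = [0x1840, 0xEB00, 0x0001, 0xBF00]
print(instruction_sizes(stream))  # [2, 4, 2]
```

A real decoder would also have to handle a stream that ends in the middle of a wide instruction; that case is left out here for brevity.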

Note: While Thumb-2 introduced 32-bit instructions that offer similar functionality and might even look similar in assembly syntax to ARM instructions, they have totally different encodings under the hood. This distinction is crucial: even though the high-level operations might appear identical in both ARM and Thumb-2, the binary representation of the instructions is different, reflecting the unique encoding schemes of each instruction set.
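To make the encoding difference concrete, the Python sketch below assembles a register-to-register ADD by hand in three forms: the ARM encoding, the 16-bit Thumb encoding, and the 32-bit Thumb-2 encoding. The function names are hypothetical, and the sketch covers only the simplest case (low registers, no shift). Note that the 16-bit form is the flag-setting ADDS when used outside an IT block.

```python
def arm_add_reg(rd, rn, rm, cond=0b1110):
    """ARM state: ADD rd, rn, rm (32 bits, condition field in bits [31:28])."""
    return (cond << 28) | (0b0100 << 21) | (rn << 16) | (rd << 12) | rm


def thumb16_adds_reg(rd, rn, rm):
    """Thumb state: ADDS rd, rn, rm (16 bits, low registers r0-r7 only)."""
    for reg in (rd, rn, rm):
        if not 0 <= reg <= 7:
            raise ValueError("the 16-bit form can only name r0-r7")
    return (0b0001100 << 9) | (rm << 6) | (rn << 3) | rd


def thumb32_add_reg(rd, rn, rm):
    """Thumb state: ADD.W rd, rn, rm (two halfwords, no shift, flags untouched)."""
    first = (0b11101011000 << 5) | rn
    second = (rd << 8) | rm
    return first, second


print(f"{arm_add_reg(0, 0, 1):#010x}")       # 0xe0800001
print(f"{thumb16_adds_reg(0, 0, 1):#06x}")   # 0x1840
hw1, hw2 = thumb32_add_reg(0, 0, 1)
print(f"{hw1:#06x} {hw2:#06x}")              # 0xeb00 0x0001
```

The same addition produces three unrelated bit patterns, which is why a core must know its current state before it can decode anything.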

Key Enhancements of Thumb-2:

  • 32-bit instructions added to support exception handling in Thumb state, access to coprocessors, digital signal processing (DSP), and media instructions.
  • Improved performance in cases where a single 16-bit instruction restricts compiler capabilities.
  • Introduction of a 16-bit IT instruction that enables conditional execution for one to four subsequent Thumb instructions, forming an IT block.
  • Addition of a 16-bit Compare with Zero and Branch (CZB) instruction, known in the current architecture as CBZ and CBNZ, to enhance code density by replacing a two-instruction compare-and-branch sequence with a single instruction.
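The IT instruction mentioned in the list above packs a base condition and a then/else pattern into a single halfword. The following Python sketch is an illustrative encoder, not an assembler: the function name and the condition table are assumptions made for this example, and the AL condition is left out.

```python
CONDITIONS = {
    "EQ": 0, "NE": 1, "CS": 2, "CC": 3, "MI": 4, "PL": 5, "VS": 6,
    "VC": 7, "HI": 8, "LS": 9, "GE": 10, "LT": 11, "GT": 12, "LE": 13,
}


def encode_it(pattern, cond):
    """Encode IT{x{y{z}}} <cond> as a 16-bit halfword.

    pattern is the letters after the 'I' of the mnemonic, for example
    'T' for IT, 'TE' for ITE, 'TTEE' for ITTEE. The first letter is always 'T'.
    """
    if not 1 <= len(pattern) <= 4 or pattern[0] != "T" or set(pattern) - {"T", "E"}:
        raise ValueError("pattern must be 'T' followed by up to three T/E letters")
    firstcond = CONDITIONS[cond]
    low_bit = firstcond & 1
    mask = 0
    for slot in pattern[1:]:
        # 'T' repeats the low bit of the condition, 'E' inverts it.
        mask = (mask << 1) | (low_bit if slot == "T" else low_bit ^ 1)
    mask = (mask << 1) | 1          # a single 1 terminates the pattern
    mask <<= 4 - len(pattern)       # pad with zeros up to four bits
    return 0xBF00 | (firstcond << 4) | mask


print(hex(encode_it("T", "EQ")))     # 0xbf08
print(hex(encode_it("TE", "EQ")))    # 0xbf0c
print(hex(encode_it("TE", "NE")))    # 0xbf14
```

The position of the terminating 1 in the mask tells the core how many instructions belong to the block, so an IT block of one to four instructions always costs exactly one extra halfword.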


Switching Between ARM and Thumb State

The T-bit in the Current Program Status Register (CPSR) determines whether the processor is in ARM state or Thumb state:

  • T-bit = 0: The processor is in ARM state, executing 32-bit ARM instructions.
  • T-bit = 1: The processor is in Thumb state, executing a mix of 16-bit and 32-bit Thumb instructions (Thumb-2).
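On ARMv7-A and ARMv7-R cores the T-bit works together with the J-bit to select among all four states mentioned earlier in this article. The Python sketch below decodes the state from a status register value. It assumes the A/R-profile layout, with T at bit 5 and J at bit 24, and the function name is made up for illustration.

```python
CPSR_T_BIT = 5    # Thumb bit position in the A/R-profile CPSR
CPSR_J_BIT = 24   # Jazelle bit position in the A/R-profile CPSR

# (J, T) -> instruction set state
STATES = {
    (0, 0): "ARM",
    (0, 1): "Thumb",
    (1, 0): "Jazelle",
    (1, 1): "ThumbEE",
}


def instruction_set_state(cpsr):
    """Return the instruction set state selected by the J and T bits."""
    j_bit = (cpsr >> CPSR_J_BIT) & 1
    t_bit = (cpsr >> CPSR_T_BIT) & 1
    return STATES[(j_bit, t_bit)]


print(instruction_set_state(0x000001D3))  # ARM
print(instruction_set_state(0x000001F3))  # Thumb
print(instruction_set_state(0x01000020))  # ThumbEE
```

On a core that implements neither Jazelle nor ThumbEE, only the two rows with J = 0 are reachable.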

Switching States with the BX Instruction

The BX (Branch and Exchange) instruction is used to switch between ARM and Thumb states:

  • When branching to an address, if the least significant bit (LSB) of the target address is set to 1, the processor switches to Thumb state (T-bit = 1).
  • If the LSB is 0, the processor switches to ARM state (T-bit = 0).

For example, BX R0 branches to the address held in R0. If bit 0 of that address is 1, the processor switches to Thumb state; if it is 0, the processor switches to ARM state. In both cases bit 0 only selects the state and is not used as part of the fetch address.

On the encoding side, the 32-bit Thumb-2 instructions occupy the opcode space previously used by the two halves of the Thumb BL and BLX instructions.
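The effect of BX on the program counter and the T-bit can be modelled in a few lines. The Python sketch below is a simplified model for a core that supports both ARM and Thumb states; the function name and return format are chosen for this example only.

```python
def bx(target):
    """Model BX <Rm>: return (new_pc, new_state) for a 32-bit target address."""
    if target & 1:
        # Bit 0 set: enter Thumb state and clear bit 0 to form the real address.
        return target & 0xFFFFFFFE, "Thumb"
    if target & 2:
        # ARM instructions are 4-byte aligned; the architecture leaves this
        # case unpredictable, so the model refuses it.
        raise ValueError("ARM-state target must be word aligned")
    return target, "ARM"


print(bx(0x08000101))  # (134217984, 'Thumb'), i.e. PC = 0x08000100
print(bx(0x00008000))  # (32768, 'ARM'), i.e. PC = 0x00008000
```

This is why toolchains set bit 0 in the address of every Thumb function: the pointer carries the target state along with the location.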


ARMv7-M Architecture as an Example

The ARMv7-M architecture is tailored specifically for microcontroller applications, emphasizing efficiency, low power consumption, and a small footprint. Key features include:

  • Thumb-2 Instruction Set: This architecture utilizes only the Thumb-2 instruction set, combining the advantages of both 16-bit and 32-bit instructions. The inclusion of both instruction sizes allows for greater flexibility and performance while conserving memory.
  • Low Power Consumption: Designed for low power, ARMv7-M processors are ideal for battery-powered devices, ensuring efficient operation in energy-sensitive applications.
  • Nested Vectored Interrupt Controller (NVIC): ARMv7-M processors include an integrated NVIC that provides low-latency interrupt handling, essential for real-time applications. This feature allows for fast response to external events, critical in embedded systems.

ARMv7-M (Cortex-M7 Series) and the T-bit

Processors such as the Cortex-M7, which are based on the ARMv7-M architecture, support only the Thumb instruction set. There is no ARM state. On these processors the T-bit must always be 1, and the processor executes only Thumb instructions.

Although the Cortex-M7 retains the T-bit, the bit cannot be used to switch to ARM state because the ARM instruction set is not supported. Attempting to execute an instruction with the T-bit cleared raises a fault instead. This design choice simplifies the architecture and reduces hardware complexity, which is critical for power-sensitive embedded applications.

Keeping the T-bit in the EPSR (Execution Program Status Register) helps maintain consistency with the broader ARM architecture, which is shared across the A (Application), R (Real-Time), and M (Microcontroller) profiles. Even though the Cortex-M7 does not execute ARM instructions, having the T-bit still allows developers familiar with the ARM architecture to transition smoothly to the Cortex-M environment.
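In practice this matters when building vector tables or calling through function pointers on a Cortex-M device. The Python sketch below models the behaviour under two assumptions taken from the ARMv7-M architecture: the T-bit sits at bit 24 of the combined xPSR, and it is loaded from bit 0 of the vector being taken. The function names are illustrative.

```python
XPSR_T_BIT = 24   # position of the T-bit in the ARMv7-M xPSR (EPSR part)


def epsr_t_bit(xpsr):
    """Extract the T-bit from a combined xPSR value."""
    return (xpsr >> XPSR_T_BIT) & 1


def take_vector(vector_entry):
    """Model loading the PC and T-bit from a vector table entry.

    Returns (pc, t_bit, fault). With the T-bit cleared, the core faults
    when it tries to execute the next instruction.
    """
    t_bit = vector_entry & 1
    pc = vector_entry & ~1
    fault = None if t_bit else "UsageFault (INVSTATE)"
    return pc, t_bit, fault


print(epsr_t_bit(0x01000000))    # 1
print(take_vector(0x08000101))   # (134217984, 1, None)
print(take_vector(0x08000100))   # (134217984, 0, 'UsageFault (INVSTATE)')
```

Depending on the fault configuration, the UsageFault may escalate to a HardFault, which is a common symptom of a vector table entry or function pointer with bit 0 left clear.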

Summary

The ARM and Thumb instruction sets are crucial to the ARM architecture, balancing performance and code density. ARM provides high flexibility, while Thumb, especially with Thumb-2, optimizes memory use for embedded systems. Understanding these sets is key for efficient software development on ARM-based platforms.

Stay tuned for our hands-on tutorial, where we’ll show you how to switch between instruction sets and optimize your code. Let us know what you’re looking forward to learning!
