A closer look at the Cortex-M3 architecture and abbreviations

2024-01-29 16:56:38 14

Before studying the Cortex-M3, it helps to know the common abbreviations.
They are listed below:
AMBA: Advanced Microcontroller Bus Architecture
ADK: AMBA Design Kit
AHB: Advanced High-performance Bus
AHB-AP: AHB Access Port
APB: Advanced Peripheral Bus
ARM ARM: ARM Architecture Reference Manual
ASIC: Application-Specific Integrated Circuit
ATB: Advanced Trace Bus
BE8: Byte-invariant big-endian mode
CPI: Cycles per instruction
DAP: Debug Access Port
DSP: Digital Signal Processor
DWT: Data Watchpoint and Trace
ETM: Embedded Trace Macrocell
FPB: Flash Patch and Breakpoint
FSR: Fault Status Register
HTM: CoreSight AHB Trace Macrocell
ICE: In-Circuit Emulator
IDE: Integrated Development Environment
IRQ: Interrupt request (usually an external interrupt request)
ISA: Instruction Set Architecture
ISR: Interrupt Service Routine
ITM: Instrumentation Trace Macrocell
JTAG: Joint Test Action Group (a standard for test and debug interfaces)
LR: Link Register
LSB: Least significant bit
MSB: Most significant bit
LSU: Load/Store Unit
MCU: Microcontroller Unit
MPU: Memory Protection Unit
MMU: Memory Management Unit
MSP: Main Stack Pointer
NMI: Non-maskable interrupt
NVIC: Nested Vectored Interrupt Controller
PC: Program Counter
PPB: Private Peripheral Bus
The following conventions are also used:
Numerical values
1. 4'hC and 0x123 both denote hexadecimal numbers
2. #3 means number 3 (e.g., IRQ #3 refers to interrupt number 3)
3. #immed_12 represents a 12-bit immediate value
4. Register bits: this usually denotes the value of a bit field. For example,
bit[15:12] is the value of the bit field running from bit 15 down to bit 12.
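As a small illustration of the bit[15:12] notation, the sketch below pulls a bit field out of a 32-bit register value in C. The helper name extract_bits is made up for this example and is not part of any ARM library.

```c
#include <assert.h>
#include <stdint.h>

/* Return the value of bit[hi:lo] of a 32-bit register value.
 * extract_bits is an illustrative helper, not an ARM-provided function. */
uint32_t extract_bits(uint32_t reg, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    unsigned width = hi - lo + 1;
    /* A full 32-bit field needs a special case: shifting by 32 is undefined in C. */
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (reg >> lo) & mask;
}
```

For example, with a register value of 0xA123, bit[15:12] selects the top hexadecimal digit of the lower half-word, so extract_bits(0xA123, 15, 12) yields 0xA.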
Register access types
1. R means read-only
2. W means write-only
3. RW means readable and writable (everyone seems to know the first three)
4. R/Wc means readable; a write access clears it to 0
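The four access types can be modeled in a few lines of host-side C. This is only a sketch of the behavior as described in the list above: the type and function names (access_t, sim_reg_t, sim_read, sim_write) are invented for the example, and the R/Wc case simply clears the register on any write, following the wording above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Access types from the list above; all names here are illustrative. */
typedef enum { ACCESS_R, ACCESS_W, ACCESS_RW, ACCESS_R_WC } access_t;

typedef struct {
    access_t access;
    uint32_t value;
} sim_reg_t;

/* Read a simulated register. Returns false if the register is write-only. */
bool sim_read(const sim_reg_t *r, uint32_t *out)
{
    assert(r != NULL && out != NULL);
    if (r->access == ACCESS_W)
        return false;
    *out = r->value;
    return true;
}

/* Write a simulated register. Returns false if the register is read-only. */
bool sim_write(sim_reg_t *r, uint32_t data)
{
    assert(r != NULL);
    switch (r->access) {
    case ACCESS_R:
        return false;            /* write is refused, value unchanged */
    case ACCESS_W:
    case ACCESS_RW:
        r->value = data;
        return true;
    case ACCESS_R_WC:
        r->value = 0u;           /* any write clears the register to 0 */
        return true;
    }
    return false;
}
```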
Introduction to the Cortex-M3 chip
1. The basic structure of the chip is shown below


2. About ARMv7
With this version, the architecture is divided into three profiles for the first time, instead of a single one.
Profile A: An "open application platform" designed for high performance, used in processors that run complex applications and large embedded operating systems. It moves closer to a full computer.
Profile R: For high-end embedded systems, especially those with real-time requirements: both fast and real-time.
Profile M: For deeply embedded, microcontroller-style systems.
3. Where the Cortex-M3 processor is used

High performance, high code density, and a small silicon area make the CM3 an ideal processing platform for a wide range of applications. It is mainly used in the following fields:
(1) Low-cost microcontrollers
(2) Automotive electronics
(3) Data communication
(4) Industrial control
(5) Consumer electronics
4. Cortex-M3 Overview

(1) Introduction
Cortex-M3 is a 32-bit processor core: the internal data path, the registers, and the memory interface are all 32 bits wide. The CM3 uses a Harvard architecture with separate instruction and data buses, so instruction fetches and data accesses can proceed in parallel. Data accesses no longer occupy the instruction bus, which improves performance. To make this possible, the CM3 contains several bus interfaces, each optimized for its own use, and they can work in parallel. On the other hand, the instruction bus and the data bus share the same memory space (a unified memory system).

More complex applications may need more memory-system features, so the CM3 provides an optional MPU and can also use an external cache if needed. In addition, the CM3 supports both little-endian and big-endian modes.
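To make the difference between the two byte orders concrete, here is a minimal host-side sketch that lays out the same 32-bit word in memory both ways. The function names are made up for the example, and the code does not model how a particular chip selects its endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a word with the least significant byte at the lowest address. */
void store_le32(uint8_t out[4], uint32_t w)
{
    assert(out != NULL);
    out[0] = (uint8_t)(w);
    out[1] = (uint8_t)(w >> 8);
    out[2] = (uint8_t)(w >> 16);
    out[3] = (uint8_t)(w >> 24);
}

/* Store a word with the most significant byte at the lowest address. */
void store_be32(uint8_t out[4], uint32_t w)
{
    assert(out != NULL);
    out[0] = (uint8_t)(w >> 24);
    out[1] = (uint8_t)(w >> 16);
    out[2] = (uint8_t)(w >> 8);
    out[3] = (uint8_t)(w);
}

/* Read four bytes back as a little-endian word. */
uint32_t load_le32(const uint8_t in[4])
{
    assert(in != NULL);
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

Storing 0x12345678 in little-endian order gives the bytes 78 56 34 12 from the lowest address upward; in big-endian order it gives 12 34 56 78.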
(2) Simplified diagram of Cortex-M3



(3) Register set
The processor has a register set R0-R15, of which R13 is the stack pointer (SP). There are two SPs, but only one is visible at any time; this is what is called a "banked" register.

a. R0-R12 are 32-bit general-purpose registers used for data operations. Note, however, that most 16-bit Thumb instructions can only access R0-R7, while 32-bit Thumb-2 instructions can access all registers.

b. R13: Cortex-M3 has two stack pointers, but they are banked, so only one of them is in use at any time.

Main Stack Pointer (MSP): the stack pointer used by default after reset; used by the operating system kernel and exception handlers (including interrupt service routines).

Process Stack Pointer (PSP): used by the user's application code.

The lowest two bits of the stack pointer are always 0, which means the stack is always 4-byte aligned.

c. R14: Link register (LR). When a subroutine is called, R14 stores the return address.

d. R15: Program counter (PC). It points to the current program address. Modifying its value changes the program's execution flow (there are many advanced techniques here).

e. Cortex-M3 also has several special-function registers at the core level, including:
Program status registers (PSRs)

Interrupt mask registers (PRIMASK, FAULTMASK, BASEPRI)

Control register (CONTROL)
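The banked stack pointer and its 4-byte alignment can be sketched as a small host-side model. Two details below are not stated in the text above and come from the ARMv7-M architecture: the selection is made by bit 1 of CONTROL (SPSEL), and exception handlers always run on the MSP. The struct and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CONTROL_SPSEL (1u << 1)   /* CONTROL bit 1: 0 = MSP, 1 = PSP (thread mode) */

/* Simplified model of the banked R13; names are illustrative only. */
typedef struct {
    uint32_t msp;           /* main stack pointer */
    uint32_t psp;           /* process stack pointer */
    uint32_t control;       /* only the SPSEL bit is used in this sketch */
    bool     handler_mode;  /* true while an exception handler is running */
} sim_core_t;

/* The two lowest bits of a stack pointer always read as 0. */
uint32_t sp_align(uint32_t value)
{
    return value & ~(uint32_t)3;
}

/* R13 as seen by the running code: only one of the two SPs is visible. */
uint32_t current_sp(const sim_core_t *c)
{
    assert(c != NULL);
    if (c->handler_mode)
        return c->msp;                     /* handlers always use the MSP */
    return (c->control & CONTROL_SPSEL) ? c->psp : c->msp;
}
```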

The Cortex-M3 processor supports two operating modes and two privilege levels.

The two operating modes are handler mode and thread mode. They were introduced to distinguish the code of ordinary applications from the code of exception service routines, including interrupt service routines.

The other aspect is the privilege hierarchy: privileged level and user level. This provides a memory access protection mechanism so that ordinary user code cannot accidentally, or even maliciously, perform critical operations. The two privilege levels also form a basic security model.

When the CM3 is running the main application (thread mode), it can use either privileged or user level; exception service routines, however, must execute at privileged level. After reset, the processor defaults to thread mode with privileged access. At privileged level, a program can access the whole memory range (except for areas restricted by the MPU, if one is present) and can execute all instructions.

Programs at privileged level can do whatever they want, but they can also lock themselves out by switching to user level. Once at user level, getting back requires going through "legal procedures": a user-level program cannot simply rewrite the CONTROL register to return to privileged level. It must first "appeal" by executing a system call instruction (SVC). This triggers an SVC exception, which is handled by an exception service routine (usually part of the operating system). If the request is granted, the exception service routine modifies the CONTROL register so that thread mode runs at privileged level again.

In fact, the only way from user level to privileged level is through an exception: whenever an exception is triggered during program execution, the processor first switches to privileged level, and when the exception service routine finishes and exits, it returns to the previous state.

By introducing privileged and user levels, untrusted or not yet debugged programs can be restricted at the hardware level and prevented from arbitrarily configuring critical registers, which improves system reliability. If an MPU is present, it complements the privilege mechanism by protecting critical memory areas, usually those of the operating system, from corruption.
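The privilege rules described above can be sketched as a tiny state machine in host-side C. One detail is taken from the ARMv7-M architecture rather than from the text: bit 0 of CONTROL selects user level for thread mode when set. All names are hypothetical, and exception nesting is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CONTROL_NPRIV (1u << 0)   /* CONTROL bit 0: 1 = thread mode at user level */

/* Simplified privilege model; names are illustrative only. */
typedef struct {
    uint32_t control;
    bool     in_handler;   /* handler mode is always privileged */
} sim_cpu_t;

bool is_privileged(const sim_cpu_t *c)
{
    assert(c != NULL);
    return c->in_handler || (c->control & CONTROL_NPRIV) == 0;
}

/* A write to CONTROL only takes effect at privileged level.
 * Returns true if the write was accepted. */
bool write_control(sim_cpu_t *c, uint32_t value)
{
    assert(c != NULL);
    if (!is_privileged(c))
        return false;               /* user-level write is ignored */
    c->control = value & 3u;
    return true;
}

/* Exception entry (for example after SVC) and exception return. */
void exception_enter(sim_cpu_t *c)  { assert(c != NULL); c->in_handler = true; }
void exception_return(sim_cpu_t *c) { assert(c != NULL); c->in_handler = false; }
```

Walking through the model: privileged thread code sets the user-level bit and loses its privileges; a later attempt to clear the bit from user level is ignored; only after exception_enter (standing in for the SVC handler) does write_control succeed again, and after exception_return thread mode is privileged once more.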
(4) Built-in nested vectored interrupt controller
Cortex-M3 has an interrupt controller at the core level: the Nested Vectored Interrupt Controller (NVIC). It is tightly coupled with the processor core.
NVIC provides the following functions:

Nestable interrupt support

Vectored interrupt support

Dynamic priority adjustment support

Interrupt latency is greatly shortened

Interrupts can be masked

Nestable interrupt support: Nesting covers a wide scope, including all external interrupts and most system exceptions. In practice this means these exceptions can be assigned different priorities. The number of the exception currently being serviced is held in a dedicated field of the xPSR. When an exception occurs, the hardware automatically compares its priority with the priority of the code currently running. If the new exception has a higher priority, the processor interrupts the current interrupt service routine (or ordinary program) and services the new exception, that is, it preempts immediately.
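The comparison the hardware performs can be written down as a one-line rule. The sketch below relies on facts from the ARMv7-M architecture that the text does not spell out: a numerically lower value means a higher priority, and Reset, NMI, and HardFault have fixed priorities of -3, -2, and -1. Priority grouping (sub-priorities) is deliberately left out, and the names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Fixed priorities defined by the architecture; configurable ones are 0..255.
 * PRIO_THREAD is a made-up value meaning "no exception active". */
enum {
    PRIO_RESET     = -3,
    PRIO_NMI       = -2,
    PRIO_HARDFAULT = -1,
    PRIO_THREAD    = 256
};

/* True if a pending exception may preempt the code that is currently running.
 * Lower number = higher priority; equal priority does not preempt. */
bool preempts(int pending_prio, int current_prio)
{
    assert(pending_prio >= PRIO_RESET && pending_prio <= 255);
    assert(current_prio >= PRIO_RESET && current_prio <= PRIO_THREAD);
    return pending_prio < current_prio;
}
```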
                       
Vectored interrupt support: When it starts to respond to an interrupt, the CM3 automatically locates the vector table, looks up the entry address of the ISR by interrupt number, and jumps to it. Software no longer has to work out which interrupt occurred, as it did on earlier ARM processors, and semiconductor vendors no longer need to provide their own interrupt controllers for this job. As a result, interrupt latency is greatly reduced.
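A rough sketch of the table lookup, in host-side C. It assumes the ARMv7-M layout, which the text does not state: each entry is 4 bytes, and external interrupt #n is exception number 16 + n. The handler irq3_handler and the helper names are hypothetical, and the table here is an ordinary array rather than real flash contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIRST_EXTERNAL_EXCEPTION 16u   /* IRQ #0 is exception number 16 */
#define MAX_EXTERNAL_IRQS        240u

typedef void (*isr_t)(void);

/* Address of the vector table entry for external interrupt #irq. */
uint32_t irq_vector_address(uint32_t table_base, unsigned irq)
{
    assert(irq < MAX_EXTERNAL_IRQS);
    return table_base + 4u * (FIRST_EXTERNAL_EXCEPTION + irq);
}

/* What the hardware does in effect: index the table and jump to the entry. */
void dispatch(const isr_t table[], unsigned exception_number)
{
    assert(table != NULL && table[exception_number] != NULL);
    table[exception_number]();
}

/* Hypothetical handler used only to demonstrate dispatch(). */
int irq3_count;
void irq3_handler(void) { irq3_count++; }
```

With the table at address 0, the entry for IRQ #3 sits at (16 + 3) * 4 = 0x4C.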
(5) Memory mapping
Cortex-M3 supports a 4 GB address space, allocated as follows:


(6) Bus interface
The Cortex-M3 has several bus interfaces so that it can fetch instructions and access data (memory) at the same time. They are:
Code region buses (two)

System bus

Private peripheral bus

The two code region buses access the code region: the I-Code bus and the D-Code bus. The former is used for instruction fetches, the latter for data accesses such as table lookups. They are optimized for the best execution speed.

The system bus is used to access memory and peripherals. It covers SRAM, on-chip peripherals, off-chip RAM, off-chip expansion devices, and part of the system-level memory region.

The private peripheral bus accesses a set of private peripherals, mainly debug components. They also reside in the system-level memory region.
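Which bus serves a given access can be sketched as a simple address decode. The address boundaries below come from the standard ARMv7-M memory map and are not given in the text above: the code region is 0x00000000 to 0x1FFFFFFF and the private peripheral bus covers 0xE0000000 to 0xE00FFFFF. The enum and function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { BUS_ICODE, BUS_DCODE, BUS_SYSTEM, BUS_PPB } bus_t;

/* Pick the bus interface for an access (simplified sketch). */
bus_t bus_for_access(uint32_t addr, bool is_instruction_fetch)
{
    /* System-level space (0xE0000000 and up) is execute-never in the
     * ARMv7-M default map, so a fetch there is treated as a caller error.
     * Other execute-never regions are not checked in this sketch. */
    assert(!(is_instruction_fetch && addr >= 0xE0000000u));

    if (addr <= 0x1FFFFFFFu)                        /* code region */
        return is_instruction_fetch ? BUS_ICODE : BUS_DCODE;
    if (addr >= 0xE0000000u && addr <= 0xE00FFFFFu) /* private peripherals */
        return BUS_PPB;
    return BUS_SYSTEM;                              /* everything else */
}
```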
(7) Memory protection unit (MPU)
Cortex-M3 has an optional memory protection unit. When it is present, different access restrictions can be imposed on privileged and user-level accesses. When a violation is detected, the MPU generates a fault exception, which the fault handler can analyze and, if possible, correct.

The MPU can be used in many ways. Most commonly, the operating system uses it to keep the data of privileged code, including the operating system's own data, from being corrupted by user programs. The MPU manages memory by region. It can mark certain regions as read-only to prevent their contents from being changed accidentally, and it can isolate the data areas of different tasks in a multitasking system. In short, it makes embedded systems more robust and reliable. (Translator's note: many industry standards, especially in aviation, require an MPU for protection.)
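The region-based check can be illustrated with a simplified host-side model. This is not the real MPU register layout: the types and names are invented, permissions are reduced to none, read-only, and read-write, and two behaviors are assumptions of the sketch (on overlap the higher-numbered region wins, and an address outside every region is allowed for privileged code but refused for user code).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { PERM_NONE, PERM_RO, PERM_RW } perm_t;

/* One protection region; a simplified stand-in for real MPU settings. */
typedef struct {
    uint32_t base;   /* first address of the region */
    uint32_t size;   /* region size in bytes */
    perm_t   priv;   /* permission for privileged accesses */
    perm_t   user;   /* permission for user-level accesses */
} mpu_region_t;

/* Returns true if the access is allowed; false stands for a fault. */
bool mpu_access_ok(const mpu_region_t *regions, size_t count,
                   uint32_t addr, bool privileged, bool is_write)
{
    assert(regions != NULL || count == 0);
    /* Search from the highest region number down so it wins on overlap. */
    for (size_t i = count; i-- > 0; ) {
        const mpu_region_t *r = &regions[i];
        if (addr >= r->base && addr - r->base < r->size) {
            perm_t p = privileged ? r->priv : r->user;
            return is_write ? (p == PERM_RW) : (p != PERM_NONE);
        }
    }
    /* No region matched: sketch assumption, see the paragraph above. */
    return privileged;
}
```

A typical setup in this model gives the operating system's data a region that user code cannot touch, gives each task a read-write region of its own, and marks the code area read-only for everyone.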
(8) Brief review of Cortex-M3
1. High performance
Many instructions are single-cycle, including multiply instructions. In terms of overall performance, Cortex-M3 holds its own against most other architectures.

The instruction bus and data bus are separate, so instruction fetches and data accesses can proceed in parallel.
The arrival of Thumb-2 does away with the old state switching. It is no longer necessary to spend time switching between the 32-bit ARM state and the 16-bit Thumb state. This simplifies software development and code maintenance and gets products to market faster.
The Thumb-2 instruction set brings more flexibility to programming. Many data operations can now be done with shorter code, which means Cortex-M3 has higher code density and needs less memory.
Instruction fetches are 32 bits wide. Up to two instructions can be fetched in the same cycle, leaving more bandwidth for data transfers.
The design of Cortex-M3 allows the microcontroller to run at high frequencies (modern semiconductor manufacturing technology can guarantee speeds above 100 MHz). Even at the same clock speed, the CM3 needs fewer cycles per instruction (CPI), so it can do more work at the same MHz; conversely, the same application needs a lower clock speed on the CM3.
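The relationship between CPI, clock frequency, and execution time is simple arithmetic: cycles = instructions x CPI, and time = cycles / frequency. The sketch below works it through in integer math. The function names are made up, CPI is passed in thousandths to avoid floating point, and the numbers in the note after the code are purely illustrative, not measured values.

```c
#include <assert.h>
#include <stdint.h>

/* Total clock cycles for a number of instructions.
 * cpi_milli is the average CPI in thousandths (1200 means CPI = 1.2). */
uint64_t cycles_needed(uint64_t instructions, uint32_t cpi_milli)
{
    return instructions * cpi_milli / 1000u;
}

/* Execution time in microseconds at a given clock frequency in MHz. */
uint64_t exec_time_us(uint64_t cycles, uint32_t clock_mhz)
{
    assert(clock_mhz > 0);
    return cycles / clock_mhz;
}
```

As an illustration with invented numbers: one million instructions at a CPI of 2.0 take 2,000,000 cycles, or 20,000 microseconds at 100 MHz. At a CPI of 1.2 the same work takes 1,200,000 cycles, which is 12,000 microseconds at 100 MHz, or the same 20,000 microseconds at only 60 MHz.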
2. Advanced interrupt handling
The built-in nested vectored interrupt controller supports up to 240 external interrupt inputs. Vectored interrupts greatly reduce interrupt latency because software no longer has to determine the source of the interrupt. Interrupt nesting is also implemented in hardware and needs no software support.
Cortex-M3 automatically pushes R0-R3, R12, LR, PSR and PC when entering an exception service routine and automatically pops them on return. How refreshing! This not only speeds up the response to interrupts but also removes the need for assembly-language entry and exit code in handlers.
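The eight automatically stacked registers can be pictured as a C struct. The order shown (R0 at the lowest address, xPSR at the highest) follows the ARMv7-M exception frame layout rather than the text above, and the sketch ignores the optional extra padding word the core may insert to keep the stack 8-byte aligned. The type and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Exception stack frame as laid out in memory, lowest address first. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;
    uint32_t pc;     /* return address */
    uint32_t xpsr;
} exc_frame_t;

/* Stack pointer value after the hardware has pushed the frame.
 * The stack grows downward, so the SP drops by the frame size. */
uint32_t sp_after_stacking(uint32_t sp)
{
    assert(sp % 4u == 0);
    return sp - (uint32_t)sizeof(exc_frame_t);
}
```

Because these are exactly the registers a C function is allowed to clobber under the ARM calling convention, an interrupt handler can be written as a plain C function.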
NVIC supports a separate priority for each interrupt, which makes interrupt management extremely flexible. Even the most minimal implementation must support at least 8 priority levels, and priorities can be changed dynamically.
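The "at least 8 levels" figure corresponds to 3 implemented priority bits. The sketch below shows how a value written to an 8-bit priority field reads back when only some bits are implemented. It relies on an architectural detail not stated above: the implemented bits are the most significant ones, and the unimplemented low bits read as zero. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct priority levels for a given number of implemented bits. */
unsigned priority_levels(unsigned implemented_bits)
{
    assert(implemented_bits >= 3 && implemented_bits <= 8);
    return 1u << implemented_bits;
}

/* Value that reads back from an 8-bit priority field after a write:
 * only the top implemented_bits bits are kept. */
uint8_t effective_priority(uint8_t written, unsigned implemented_bits)
{
    assert(implemented_bits >= 3 && implemented_bits <= 8);
    uint8_t mask = (uint8_t)(0xFFu << (8u - implemented_bits));
    return (uint8_t)(written & mask);
}
```

With 3 bits implemented there are 8 levels, and writing 0xFF reads back as 0xE0.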
Two mechanisms optimize interrupt response: "tail-chaining" and "late arrival".
Some instructions that take many cycles to execute can be interrupted and resumed as if they were a sequence of instructions. These include load multiple registers (LDM), store multiple registers (STM), and PUSH and POP involving multiple registers.
Unless the system is completely locked up, an NMI (non-maskable interrupt) is serviced as soon as the request is received. NMI is essential for many safety-critical applications (such as an emergency shutdown when a chemical reaction is about to run out of control).
The above covers the basic concepts and structure of the Cortex-M3 and lays a foundation for studying it further.
