Arm® Cortex®-M7 in a nutshell
The 32-bit Arm® Cortex®-M7 processor core is the highest-performance core in the Cortex-M lineup. It features dedicated Digital Signal Processing (DSP) IP blocks, including an optional double-precision Floating-Point Unit (FPU). The high-performance features of the Arm Cortex-M7 core address demanding digital signal control applications, which require efficient, easy-to-use control without the need for complex operating systems. Typical application examples include IoT, motor control, power management, embedded audio including voice recognition, industrial and home automation, healthcare, and wellness applications.
The Cortex-M7 core achieves 2.14 DMIPS/MHz and 5.29 CoreMark/MHz.
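As a rough illustration of how these per-MHz figures scale with clock frequency, the sketch below multiplies them by a clock given in MHz. The helper names are hypothetical, and measured results on a real device depend on the compiler, memory configuration, and cache settings.

```c
#include <assert.h>

/* Per-MHz figures quoted above for the Cortex-M7 core. */
#define M7_DMIPS_PER_MHZ    2.14
#define M7_COREMARK_PER_MHZ 5.29

/* Hypothetical helper: scale a per-MHz benchmark figure to a clock frequency. */
static double scale_score(double score_per_mhz, double clock_mhz)
{
    assert(clock_mhz > 0.0);            /* a clock frequency must be positive */
    return score_per_mhz * clock_mhz;
}

/* Estimated DMIPS at the given clock (in MHz). */
static double m7_dmips(double clock_mhz)
{
    return scale_score(M7_DMIPS_PER_MHZ, clock_mhz);
}

/* Estimated CoreMark score at the given clock (in MHz). */
static double m7_coremark(double clock_mhz)
{
    return scale_score(M7_COREMARK_PER_MHZ, clock_mhz);
}
```

At 600 MHz this gives about 1284 DMIPS and 3174 CoreMark.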
Inside the Arm Cortex-M7 core: key features
- Armv7E-M architecture
- Bus interfaces: 64-bit AMBA4 AXI, 32-bit AHB peripheral port, 32-bit AMBA AHB slave port for external master (such as DMA controller) to access TCMs, AMBA APB interface for CoreSight debug components
- Instruction cache: 0 to 64 Kbytes, 2-way associative with optional ECC
- Data cache: 0 to 64 Kbytes, 4-way associative with optional ECC
- Instruction TCM: 0 to 16 Mbytes with optional ECC interface
- Data TCM: 0 to 16 Mbytes with optional ECC interface
- Thumb/Thumb-2 subset instruction support
- 6-stage superscalar + branch prediction
- DSP extensions: Single cycle 16/32-bit MAC, Single cycle dual 16-bit MAC, 8/16-bit SIMD arithmetic, Hardware Divide
- Optional single and double precision floating point unit (choices of none, single precision only, and single and double precision) compliant with IEEE 754 standard
- Optional 8- or 16-region MPU with sub-regions and background region
- Integrated Bit-Field Processing Instructions
- Non-maskable interrupt and 1 to 240 physical interrupts
- Optional wake-up interrupt controller
- Integrated WFI and WFE Instructions and Sleep-On-Exit capability, Sleep & Deep Sleep Signals, Optional Retention Mode with Arm Power Management Kit
- Optional JTAG and Serial Wire Debug ports. Up to 8 breakpoints and 4 watchpoints
- Optional Instruction Trace (ETM), Data Trace (DWT), and Instrumentation Trace (ITM). Optional full data trace with ETM
- Dual Core Lock-Step (DCLS) support
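To make the cache figures in the list above more concrete, here is a minimal sketch that derives the number of sets from the cache size and associativity, and the set an address would map to. It assumes a 32-byte cache line and a simple modulo indexing model; the function names are hypothetical, and the actual line length should be checked in the core's reference manual.

```c
#include <assert.h>
#include <stdint.h>

#define M7_CACHE_LINE_BYTES 32u  /* assumed line length, not taken from the text above */

/* Number of sets in a set-associative cache: size / (ways * line length). */
static uint32_t cache_sets(uint32_t size_bytes, uint32_t ways)
{
    assert(ways > 0u);
    assert(size_bytes % (ways * M7_CACHE_LINE_BYTES) == 0u);
    return size_bytes / (ways * M7_CACHE_LINE_BYTES);
}

/* Set selected by an address under a simple model: line number modulo set count. */
static uint32_t cache_set_index(uint32_t addr, uint32_t size_bytes, uint32_t ways)
{
    uint32_t sets = cache_sets(size_bytes, ways);
    assert(sets > 0u);               /* a 0-Kbyte cache has no sets to index */
    return (addr / M7_CACHE_LINE_BYTES) % sets;
}
```

Under these assumptions, a 64-Kbyte 4-way data cache has 512 sets and a 64-Kbyte 2-way instruction cache has 1024 sets. Addresses that map to the same set compete for the same few ways, which is one reason hard real-time code is often placed in TCM instead.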
Why choose Arm Cortex-M7 MCUs: key advantages
Armv7E-M architecture
Built on the Armv7E-M architecture from the Cortex-M4 core, the Cortex-M7 architecture offers:
- Higher performance thanks to:
  - A 6-stage superscalar pipeline with branch prediction, combined with instruction and data caches. The caches increase performance not only when executing code or accessing data in internal memory, but also when using external memories connected to the external memory interfaces. Just like with an application processor, developers using MCUs based on a Cortex-M7 core can use larger code and data sets that go beyond the limits of the internal resources to add advanced middleware and services (Artificial Intelligence models, cloud connectivity and services, multiprotocol support). MCU developers can still leverage the software packages and design environments they are used to, and can benefit from a highly integrated MCU that offers more simplicity and reduced costs thanks to an embedded Power Management IC (PMIC), with no DDR memory needed.
  - Higher CPU frequency, made possible by the deeper 6-stage pipeline, a significant improvement over the 3-stage pipeline of the Cortex-M4.
- Instruction and data Tightly Coupled Memories (TCM) allowing 0-wait execution: while the caches increase internal and external memory performance, cache misses introduce latency, which can cause issues in hard real-time applications. Mapping the most critical routines and data in the TCMs will guarantee the 0-wait performance in such applications.
- The 64-bit AMBA4 AXI interface, which connects high-bandwidth peripherals such as external memory controllers, graphics IPs, GPUs, internal memories, and more.
- Additional DSP extensions, like Single Instruction Multiple Data (SIMD) processing, saturation arithmetic instructions, a wide range of single-cycle MAC instructions, and an optional FPU that supports double-precision floating point operations.
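The DSP extensions mentioned in the last bullet can be pictured with plain C models of two typical operations: a dual 16-bit multiply-accumulate and a saturating 32-bit addition. This is only an illustrative sketch with hypothetical function names; on the core, each operation maps to a single instruction (for example SMLAD and QADD), whereas the code below is portable C that runs anywhere.

```c
#include <stdint.h>

/* Pack two signed 16-bit values into one 32-bit word (hi in bits 31:16). */
static uint32_t pack16(int16_t hi, int16_t lo)
{
    return ((uint32_t)(uint16_t)hi << 16) | (uint16_t)lo;
}

/* Model of a dual 16-bit MAC: acc + x.lo * y.lo + x.hi * y.hi.
   The result wraps to 32 bits, like a register result without saturation. */
static int32_t dual_mac16(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t x_lo = (int16_t)(x & 0xFFFFu), x_hi = (int16_t)(x >> 16);
    int16_t y_lo = (int16_t)(y & 0xFFFFu), y_hi = (int16_t)(y >> 16);
    int64_t sum = (int64_t)acc + (int32_t)x_lo * y_lo + (int32_t)x_hi * y_hi;
    return (int32_t)(uint32_t)sum;
}

/* Model of a saturating add: clamp to the int32_t range instead of wrapping. */
static int32_t sat_add32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```

For example, `dual_mac16(pack16(3, 2), pack16(5, 4), 10)` accumulates 2 * 4 and 3 * 5 onto 10 and returns 33, and `sat_add32(INT32_MAX, 1)` stays at `INT32_MAX` rather than wrapping to a negative value.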
The architecture of the Cortex-M7 is perfectly suited for real-time control applications requiring highly deterministic operations with low-cycle count execution, minimum interrupt latency, a short pipeline, and the possibility to perform cache-less operations.
Digital Signal Processing
Microcontrollers based on the Cortex-M7 rely on its built-in, advanced DSP hardware accelerators to process signals using mathematical calculations. The DSP hardware accelerator can process any digitized analog signal, such as the output signal of a microphone, the feedback from a sensor embedded in a motor control system, or outputs from sensor-fusion applications.
Thanks to Digital Signal Processing, fewer cycles are required to run control-loop algorithms, which improves both the performance and the power efficiency of the application. Fixed-point and double-precision floating-point arithmetic are both implemented in hardware on MCUs based on a Cortex-M7. These MCUs typically offer much higher performance than MCUs based on the Cortex-M4, doubling the performance levels on FFT, FIR, IIR and other key algorithms.
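As an illustration of the kind of fixed-point kernel referred to here, the following minimal sketch computes one output sample of an FIR filter in Q15 format. The function names and data layout are hypothetical and the code is plain portable C, not an optimized library routine; on a Cortex-M7, the multiply-accumulate in the loop is the part that the DSP instructions speed up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturate a wide value to the signed 16-bit (Q15) range. */
static int16_t sat_q15(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One output sample of an N-tap FIR filter in Q15 fixed point.
   coeffs[k] multiplies history[k], where history[0] is the newest input. */
static int16_t fir_q15_sample(const int16_t *coeffs, const int16_t *history,
                              size_t taps)
{
    assert(coeffs != NULL && history != NULL && taps > 0);
    int64_t acc = 0;                               /* wide accumulator avoids overflow */
    for (size_t k = 0; k < taps; ++k) {
        acc += (int32_t)coeffs[k] * history[k];    /* Q15 x Q15 gives a Q30 product */
    }
    /* Scale Q30 back to Q15 (arithmetic right shift assumed for negative sums),
       then saturate to 16 bits. */
    return sat_q15(acc >> 15);
}
```

With two taps of 0.5 each (16384 in Q15) and inputs 1000 and 3000, the function returns their average, 2000.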
With increased DSP performance and a higher achievable maximum frequency, the Cortex-M7 matches the requirements of the most demanding signal processing applications, including audio and voice recognition, motor control, digital power, artificial intelligence and sensor fusion.
All STM32 Cortex-M7 MCUs embed the DSP extensions with the optional double-precision floating-point unit.
Scalability and power efficiency
Microcontrollers based on the Arm Cortex-M7 support the Cortex Microcontroller Software Interface Standard (CMSIS), thereby enabling developers to port their code to or from different microcontrollers for future projects. This interface also eases the integration of third-party software, helping to reduce time to market.
The flexibility and scalability of the Cortex-M7 architecture allow designers to run most of the recent Machine Learning algorithms. Extremely power efficient, Cortex-M7 microcontrollers are excellent choices for IoT edge controllers or battery-operated sensor hubs or concentrators, as well as e-bikes.
The Cortex-M7 core is mostly embedded in single-core MCUs. However, a new generation of multi-core microcontrollers pushes back the limits of system integration and performance optimization, implementing two-task partitioning use cases:
- The Cortex-M7 can be used as the main control core, associated with the real-time Cortex-M4 core (communication protocols, sensor acquisition, real-time control)
- Alternatively, a Cortex-M4 core can be used as the real-time, general-purpose companion core to the computing horsepower of the Cortex-M7 core, which can process advanced graphics, complex digital signal processing algorithms, artificial intelligence algorithms and/or communication protocols.
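When work is split between two cores in this way, the cores need some means of passing messages to each other. The text does not say how this is done on any particular device, so the following is only a generic sketch of one common approach: a single-producer, single-consumer mailbox in shared memory, written with C11 atomics. All names and the queue depth are hypothetical. On real hardware the shared buffer would also have to sit in a memory region that both cores see coherently, for example one that is not cached by the Cortex-M7.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAILBOX_SLOTS 8u   /* hypothetical depth; must be a power of two */

typedef struct {
    uint32_t    slots[MAILBOX_SLOTS];
    atomic_uint head;      /* advanced only by the producer core */
    atomic_uint tail;      /* advanced only by the consumer core */
} mailbox_t;

static void mailbox_init(mailbox_t *mb)
{
    atomic_init(&mb->head, 0u);
    atomic_init(&mb->tail, 0u);
}

/* Producer side: returns false when the mailbox is full. */
static bool mailbox_push(mailbox_t *mb, uint32_t msg)
{
    unsigned head = atomic_load_explicit(&mb->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&mb->tail, memory_order_acquire);
    if (head - tail == MAILBOX_SLOTS) return false;
    mb->slots[head % MAILBOX_SLOTS] = msg;
    /* Release: the slot write becomes visible before the new head value. */
    atomic_store_explicit(&mb->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false when the mailbox is empty. */
static bool mailbox_pop(mailbox_t *mb, uint32_t *msg)
{
    unsigned tail = atomic_load_explicit(&mb->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&mb->head, memory_order_acquire);
    if (head == tail) return false;
    *msg = mb->slots[tail % MAILBOX_SLOTS];
    atomic_store_explicit(&mb->tail, tail + 1u, memory_order_release);
    return true;
}
```

In this arrangement one core only calls `mailbox_push` and the other only calls `mailbox_pop`, so no lock is needed. A second mailbox in the opposite direction would carry replies.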
STM32 microcontrollers based on the Arm Cortex-M7
STMicroelectronics combines the Arm Cortex-M7 core with its proprietary low-power silicon technology and its expertise in embedded non-volatile memory technology, hardware accelerators (CORDIC for trigonometric and hyperbolic calculations, FMAC for filtering, crypto and hash engines, Graphics Processing Units, JPEG encoders and decoders), high-performance architectures, and connectivity. The resulting STM32 Arm Cortex-M7 MCUs are a solution to the many technical and commercial challenges engineers need to solve.
STM32 Cortex-M7 MCUs are fully integrated into the STM32Cube development environment and leverage the tools and solutions offered by ST’s extensive network of partners.
Single Core Series | Speed (MHz) | Performance (CoreMark) | Flash (kB) | RAM (kB) | Power Supply (V) | Packages | Connectivity | Analog |
STM32F7 | 216 | 1082 | 64 to 2048 | 256 to 512 | 1.7 to 3.6 | LQFP64/100/144/176/208, UFBGA144/176, TFBGA100/216, WLCSP143/180 | Common connectivity* + CAN, Camera interface, SDIO, dual Quad SPI, FMC, 2D GPU, TFT, MIPI DSI, JPEG codec | Yes |
STM32H7 | 600 | 3174 | 64 to 2048 | 564 to 1400 | 1.62 to 3.6 | VFQFPN68, LQFP64/100/144/176/208, UFBGA144/169/176, TFBGA100/216/225/240, WLCSP100/115/132/156 | Common connectivity* + FDCAN, Camera interface, SDIO, dual Quad and Octo SPI, FMC, 2D GPU, TFT, MIPI DSI, JPEG codec, I3C, UCPD | Advanced analog |
Dual Core Series - High Performance | Cortex M7 speed (MHz) | Processor 2 | Flash (kB) | RAM (kB) | Power Supply (V) | Packages | Connectivity | Analog |
STM32H7 | 480 +L1 | Cortex-M4@240MHz | 128 to 2048 | 1024 | 1.62 to 3.6 | LQFP144/176/208, UFBGA169/176, TFBGA240, WLCSP156 | Common connectivity* + FDCAN, Camera interface, SDIO, dual Quad and Octo SPI, FMC, 2D GPU, TFT, MIPI DSI, JPEG codec | Advanced analog |
Start your development using the Arm Cortex-M7 core with our recommended starter kit
NUCLEO-H7S3L8
STM32 Nucleo-144 development board with STM32H7S3L8H6, supports Arduino, ST Zio and morpho connectivity.