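The CPU test here is mainly Dhrystone. DMIPS stands for Dhrystone Million Instructions executed Per Second. Dhrystone is one of the most common benchmark programs for measuring a processor's compute capability. It is an integer-only test program, so it is typically used to measure integer performance.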
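As a concrete illustration of how a DMIPS figure is derived: by convention the Dhrystones-per-second score is divided by 1757, the score of the VAX 11/780 reference machine, and then by the clock frequency in MHz. The sketch below assumes that convention; the function names are hypothetical and are not taken from the benchmark source.

```c
/* Dhrystones/s of the VAX 11/780, the conventional 1-MIPS reference. */
#define VAX_DHRYSTONES_PER_SEC 1757.0

/* Convert a Dhrystones-per-second score to DMIPS (hypothetical helper). */
double dmips(double dhrystones_per_sec)
{
    return dhrystones_per_sec / VAX_DHRYSTONES_PER_SEC;
}

/* Normalise DMIPS by the core clock in MHz (hypothetical helper).
 * clock_mhz must be non-zero. */
double dmips_per_mhz(double dhrystones_per_sec, double clock_mhz)
{
    return dmips(dhrystones_per_sec) / clock_mhz;
}
```

Fed with the scores printed in the logs further down, `dmips_per_mhz(213333.0, 168.0)` gives about 0.7227 and `dmips_per_mhz(10000.0, 32.0)` gives about 0.1779.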
The STM32F4 family incorporates high-speed embedded memories and an extensive range of enhanced I/Os and peripherals connected to two APB buses, three AHB buses and a 32-bit multi-AHB bus matrix.
- 64 Kbytes of CCM (core coupled memory) data RAM
- LCD parallel interface, 8080/6800 modes
- Timer with quadrature (incremental) encoder input
- 5 V-tolerant I/Os
- Parallel camera interface
- True random number generator
- RTC: subsecond accuracy, hardware calendar
- 96-bit unique ID
Specific part: STM32F407IGT6, 168 MHz core clock, Cortex-M4 core.
CPU core
◦ ARM® Cortex®-M0 core
◦ Maximum frequency: 32 MHz
◦ 24-bit SysTick timer
◦ Supports interrupt vector remapping (through configuring Flash registers)
Specific part: HK32F030MF4P6, TSSOP20 package, 32 MHz core clock, Cortex-M0 core.
Timing uses SysTick with a 1 ms tick.
DWT:
The smaller Cortex-M processors such as Cortex-M0, Cortex-M0+ and Cortex-M23 do not include the DWT capabilities described here, so on the HK32F030M only the SysTick-based timing is available.
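On cores that do have a DWT unit (Cortex-M3/M4/M7), the cycle counter is usually enabled by setting TRCENA in DEMCR and CYCCNTENA in DWT_CTRL. The sketch below shows that sequence together with a wrap-safe elapsed-time helper that works equally for a SysTick millisecond counter. The function names are hypothetical, and the registers are passed as pointers (an assumption made here so the logic can also be checked on a host); this is not the code used for the measurements in this post.

```c
#include <stdint.h>

/* Architectural register addresses on ARMv7-M cores (Cortex-M3/M4/M7). */
#define DEMCR_ADDR      0xE000EDFCu  /* Debug Exception and Monitor Control */
#define DWT_CTRL_ADDR   0xE0001000u  /* DWT control register */
#define DWT_CYCCNT_ADDR 0xE0001004u  /* DWT cycle counter */

#define DEMCR_TRCENA       (1u << 24)  /* enables the DWT and ITM blocks */
#define DWT_CTRL_CYCCNTENA (1u << 0)   /* starts the cycle counter */

/* Enable and reset the cycle counter (hypothetical helper). */
void dwt_cyccnt_enable(volatile uint32_t *demcr,
                       volatile uint32_t *dwt_ctrl,
                       volatile uint32_t *dwt_cyccnt)
{
    *demcr |= DEMCR_TRCENA;
    *dwt_cyccnt = 0u;
    *dwt_ctrl |= DWT_CTRL_CYCCNTENA;
}

/* Elapsed ticks between two reads of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single wrap-around. */
uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert CPU cycles to microseconds; clock_mhz must be non-zero. */
uint32_t cycles_to_us(uint32_t cycles, uint32_t clock_mhz)
{
    return cycles / clock_mhz;
}
```

On the target the call would be `dwt_cyccnt_enable((volatile uint32_t *)DEMCR_ADDR, (volatile uint32_t *)DWT_CTRL_ADDR, (volatile uint32_t *)DWT_CYCCNT_ADDR)`. At 168 MHz the 32-bit counter wraps after roughly 25.6 s, so a run of a few seconds fits within a single wrap.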
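The full version of Dhrystone needs more than 10 KB of RAM, while the HK32F030M has only 2 KB of RAM, so the test was run there with part of the code trimmed.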
Resource usage of dhrystone_test:
Without dhrystone_test:
Program Size: Code=3068 RO-data=264 RW-data=20 ZI-data=132
With dhrystone_test:
Program Size: Code=3600 RO-data=264 RW-data=76 ZI-data=10444
RAM used by dhrystone_test at run time: (76 + 10444) - (20 + 132) = 10368 B = 10368 / 1024 = 10.125 KB
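The RAM figure above follows from the usual reading of a Keil "Program Size" line: RAM is RW-data plus ZI-data, and flash is Code plus RO-data plus the RW-data initial values. A minimal sketch of that arithmetic, with a hypothetical struct and helper names, and ignoring any RW data compression the linker may apply:

```c
#include <stdint.h>

/* One "Program Size" line from the build log (hypothetical type). */
typedef struct {
    uint32_t code;
    uint32_t ro_data;
    uint32_t rw_data;
    uint32_t zi_data;
} program_size_t;

/* RAM footprint: initialised (RW) plus zero-initialised (ZI) data. */
uint32_t ram_bytes(program_size_t s)
{
    return s.rw_data + s.zi_data;
}

/* Flash footprint: code, constants, and the RW initial values. */
uint32_t flash_bytes(program_size_t s)
{
    return s.code + s.ro_data + s.rw_data;
}
```

With the two builds listed above, the RAM difference is 10520 - 152 = 10368 bytes and the flash difference is 3940 - 3352 = 588 bytes.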
Image component sizes

Code | (inc. data) | RO Data | RW Data | ZI Data | Debug | Object Name |
---|---|---|---|---|---|---|
44 | 12 | 0 | 8 | 0 | 2574 | bsp_dwt.o |
400 | 248 | 0 | 48 | 10200 | 3149 | dhry_1.o |
234 | 0 | 0 | 0 | 0 | 1923 | hk32f030m_gpio.o |
38 | 10 | 0 | 4 | 0 | 2774 | hk32f030m_it.o |
364 | 36 | 40 | 0 | 0 | 2851 | hk32f030m_rcc.o |
334 | 12 | 0 | 0 | 0 | 4221 | hk32f030m_usart.o |
28 | 8 | 192 | 0 | 240 | 736 | keil_startup_hk32f030m.o |
700 | 278 | 0 | 0 | 0 | 180215 | main.o |
384 | 40 | 0 | 4 | 0 | 2819 | system_hk32f030m.o |
404 | 36 | 0 | 0 | 0 | 10141 | usart.o |
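The number of runs is chosen mainly by execution time: the test must run for more than 2 s. The run counts used were:
640000 (STM32F407IGT6)
20000 (HK32F030MF4P6)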
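One way to arrive at such a run count is to time a short trial and scale it up so the full run exceeds the 2 s minimum. The helper below is a hypothetical sketch of that idea, not the method used in this post; the trial figures in the usage note are made-up illustration values.

```c
#include <stdint.h>

/* Scale a short trial up to a run count expected to take longer than
 * min_ms. Uses 64-bit math to avoid overflow in the multiplication.
 * Returns 0 if the trial was too short to measure. */
uint32_t runs_for_min_time(uint32_t trial_runs, uint32_t trial_ms,
                           uint32_t min_ms)
{
    if (trial_ms == 0u) {
        return 0u;  /* caller should retry with a larger trial */
    }
    /* Round up: runs needed to reach min_ms at the measured rate. */
    uint64_t need = ((uint64_t)trial_runs * min_ms + trial_ms - 1u) / trial_ms;
    return (uint32_t)(need + 1u);  /* one extra run to go past min_ms */
}
```

For example, if a trial of 1000 runs took 100 ms, `runs_for_min_time(1000, 100, 2000)` returns 20001.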
STM32F407IGT6 result (640000 runs):
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute
Execution starts, 640000 runs through Dhrystone
e_time 3572 1
Execution ends
Final values of the variables used in the benchmark:

Variable | Value | Should be |
---|---|---|
Int_Glob | 5 | 5 |
Bool_Glob | 1 | 1 |
Ch_1_Glob | A | A |
Ch_2_Glob | B | B |
Arr_1_Glob[8] | 7 | 7 |
Arr_2_Glob[8][7] | 640010 | Number_Of_Runs + 10 |
Ptr_Glob->Ptr_Comp | 536883992 | (implementation-dependent) |
Ptr_Glob->Discr | 0 | 0 |
Ptr_Glob->Enum_Comp | 2 | 2 |
Ptr_Glob->Int_Comp | 17 | 17 |
Ptr_Glob->Str_Comp | DHRYSTONE PROGRAM, SOME STRING | DHRYSTONE PROGRAM, SOME STRING |
Next_Ptr_Glob->Ptr_Comp | 536883992 | (implementation-dependent), same as above |
Next_Ptr_Glob->Discr | 0 | 0 |
Next_Ptr_Glob->Enum_Comp | 1 | 1 |
Next_Ptr_Glob->Int_Comp | 18 | 18 |
Next_Ptr_Glob->Str_Comp | DHRYSTONE PROGRAM, SOME STRING | DHRYSTONE PROGRAM, SOME STRING |
Int_1_Loc | 5 | 5 |
Int_2_Loc | 13 | 13 |
Int_3_Loc | 7 | 7 |
Enum_Loc | 1 | 1 |
Str_1_Loc | DHRYSTONE PROGRAM, 1'ST STRING | DHRYSTONE PROGRAM, 1'ST STRING |
Str_2_Loc | DHRYSTONE PROGRAM, 2'ND STRING | DHRYSTONE PROGRAM, 2'ND STRING |

Microseconds for one run through Dhrystone: 4
Dhrystones per Second: 213333
VAX MIPS rating: 121
DMIPS/MHz: 0.7227
HK32F030MF4P6 result (20000 runs, trimmed Dhrystone):
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute
Execution starts, 20000 runs through Dhrystone
e_time 2170 21
dwt_e_time 0 0
Execution ends
Final values of the variables used in the benchmark:

Variable | Value | Should be |
---|---|---|
Int_Glob | 0 | 5 |
Bool_Glob | 1 | 1 |
Ch_1_Glob | A | A |
Ch_2_Glob | B | B |
Arr_1_Glob[8] | 0 | 7 |
Arr_2_Glob[8][7] | (value missing from the captured log) | Number_Of_Runs + 10 |
Ptr_Glob->Ptr_Comp | 536871216 | (implementation-dependent) |
Ptr_Glob->Discr | 0 | 0 |
Ptr_Glob->Enum_Comp | 2 | 2 |
Ptr_Glob->Int_Comp | 12 | 17 |
Ptr_Glob->Str_Comp | DHRYSTONE PROGRAM, SOME STRING | DHRYSTONE PROGRAM, SOME STRING |
Next_Ptr_Glob->Ptr_Comp | 536871216 | (implementation-dependent), same as above |
Next_Ptr_Glob->Discr | 0 | 0 |
Next_Ptr_Glob->Enum_Comp | 1 | 1 |
Next_Ptr_Glob->Int_Comp | 18 | 18 |
Next_Ptr_Glob->Str_Comp | DHRYSTONE PROGRAM, SOME STRING | DHRYSTONE PROGRAM, SOME STRING |
Int_1_Loc | 10 | 5 |
Int_2_Loc | 13 | 13 |
Int_3_Loc | 7 | 7 |
Enum_Loc | 1 | 1 |
Str_1_Loc | DHRYSTONE PROGRAM, 1'ST STRING | DHRYSTONE PROGRAM, 1'ST STRING |
Str_2_Loc | DHRYSTONE PROGRAM, 2'ND STRING | DHRYSTONE PROGRAM, 2'ND STRING |

Microseconds for one run through Dhrystone: 100
Dhrystones per Second: 10000
VAX MIPS rating: 5
DMIPS/MHz: 0.1779
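The printed rates in both logs are consistent with the elapsed time having been truncated to whole seconds before the division (640000 / 3 = 213333 and 20000 / 2 = 10000). As a cross-check, the sketch below recomputes the rate from the millisecond values on the `e_time` lines. It assumes the two numbers on each line are the end and start SysTick counts in ms, which the log itself does not state, and the function names are hypothetical.

```c
#include <stdint.h>

#define VAX_DHRYSTONES_PER_SEC 1757.0

/* Rate as it comes out when elapsed time is truncated to whole seconds. */
uint32_t rate_truncated(uint32_t runs, uint32_t end_ms, uint32_t start_ms)
{
    uint32_t secs = (end_ms - start_ms) / 1000u;  /* integer seconds */
    return secs ? runs / secs : 0u;
}

/* Rate using the full millisecond resolution of the tick counter.
 * end_ms must differ from start_ms. */
double rate_from_ms(uint32_t runs, uint32_t end_ms, uint32_t start_ms)
{
    return (double)runs * 1000.0 / (double)(end_ms - start_ms);
}

/* DMIPS/MHz from a Dhrystones-per-second rate; clock_mhz must be non-zero. */
double dmips_per_mhz_from_rate(double rate, double clock_mhz)
{
    return rate / VAX_DHRYSTONES_PER_SEC / clock_mhz;
}
```

Under that assumption the millisecond-based figures come out at roughly 0.61 DMIPS/MHz for the STM32F407IGT6 (3571 ms at 168 MHz) and roughly 0.17 DMIPS/MHz for the HK32F030MF4P6 (2149 ms at 32 MHz), somewhat below the printed 0.7227 and 0.1779.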
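Cortex-M series

Core | Architecture | bits | DMIPS/MHz | DSP / FPU |
---|---|---|---|---|
Cortex-M0 | ARMv6-M | 32 | 0.9~0.99 | |
Cortex-M3 | ARMv7-M | 32 | 1.25~1.5 | |
Cortex-M4 | ARMv7E-M | 32 | 1.25~1.52 | 8/16-bit SIMD, single-precision FPU |
Cortex-M7 | ARMv7E-M | 32 | 2.14/2.55/3.23 | 8/16-bit SIMD, double-precision FPU |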
ARM processors DMIPS/MHz comparison
Core | Architecture | bits | DMIPS/MHz | DMIPS/MHz* |
---|---|---|---|---|
ARM11 | v6 | 32 | 1.25 | |
Cortex-A7 | v7-A | 32 | 1.9 | 1.9 |
Cortex-A8 | v7-A | 32 | 2.0 | 2.0 |
Cortex-A9 | v7-A | 32 | 2.0 | 2.5 |
Cortex-A15 | v7-A | 32 | 4.0 | 3.4 |
Cortex-A17 | v7-A | 32 | 4.0 | 3.2 |
Cortex-A32 | v8-A | 32 | 2.3 | 2.3 |
Cortex-A35 | v8-A | 32/64 | 2.5 | 2.5 |
Cortex-A53 | v8-A | 32/64 | 2.3 | 2.3 |
Cortex-A55 | v8.2-A | 32/64 | 2.3 | 2.7 |
Cortex-A57 | v8-A | 32/64 | 4.6 | 4.1 |
Cortex-A72 | v8-A | 32/64 | 5.4 | 4.7 |
Cortex-A73 | v8-A | 32/64 | 7.0 | 4.8 |
Cortex-A75 | v8.2-A | 32/64 | 7.0 | 5.2 |
Cortex-A76 | v8.2-A | 32/64 | | |
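ARM Cortex-M0 features

Feature | Description |
---|---|
ISA support | Thumb® / Thumb-2 subset |
Pipeline | 3-stage |
Performance efficiency | 1.99 CoreMarks/MHz; 0.90 to 0.99 DMIPS/MHz |
Interrupts | Non-maskable interrupt (NMI) + 1 to 32 physical interrupts |
Sleep modes | Integrated WFI and WFE instructions and Sleep-on-Exit capability; sleep and deep-sleep signals; optional Retention mode provided with the ARM Power Management Kit |
Bit manipulation | A bit-banding region can be implemented with the Cortex-M System Design Kit |
Enhanced instructions | Hardware single-cycle (32x32) multiply option |
Debug | Optional JTAG and Serial-Wire debug ports; up to 4 breakpoints and 2 watchpoints |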

Core | Architecture | bits | DMIPS/MHz |
---|---|---|---|
M0 | v6-M | 32 | 0.9~0.99 |
M0+ | v6-M | 32 | 1.08 |
M3 | v7-M | 32 | 1.25~1.5 |
M4 | v7E-M | 32 | 1.25~1.52 |
M7 | v7E-M | 32 | 2.14/2.55/3.2 |

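Official figure: the Cortex-M0 is rated at 0.90 to 0.99 DMIPS/MHz, but the HK32F030M measured only 0.1779. That result also comes from a trimmed version of the test, whose log above shows several final values differing from their expected values, so the real figure would be lower still.
Official figure: the Cortex-M4 is rated at 1.25 to 1.52 DMIPS/MHz, but the STM32F407IGT6 measured only 0.7227. This is a product designed by a first-tier international vendor, yet the gap from the advertised figure is very large.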
https://gitee.com/RT-Thread-Mirror/dhrystone
https://blog.stratifylabs.co/device/2019-05-20-Dhrystone-Benchmarking-on-MCUs/
https://www.cnblogs.com/cjchang/p/12187518.html