Edge AI Power Benchmarking — Part 6: Measuring the Power Efficiency of MemryX MX3<br>June 19, 2026 · 17 min · Mario Bergeron
Table of Contents<br>Installing the MemryX SDK<br>Reproducing the MemryX benchmarks<br>Throughput results at 14 TFLOPS<br>Throughput results at 20 TFLOPS<br>Measuring MemryX MX3 Power with mb-powermon.py<br>Measuring MemryX MX3 Power at 14 TFLOPS<br>Measuring MemryX MX3 Power with its on-board telemetry<br>Measuring MemryX MX3 Power at 20 TFLOPS<br>Thermal Considerations<br>Thermal Throttling<br>Idle Power<br>Known Issue with Earlier MX3 modules<br>Conclusion<br>What Next?<br>Vendor Engagement Disclaimer<br>Version History
In Parts 1–3, we established a methodology for independent power measurement on edge AI accelerators. In Parts 4 and 5, we applied it to the Axelera Metis and the DeepX M1.<br>Series: Edge AI Power Benchmarking<br>Part 1: Hailo-8, the Reference Methodology<br>Part 2: Power Insertion with ElmorLabs<br>Part 3: Measuring Edge AI Power with INA228<br>Part 4: Measuring the Power Efficiency of Axelera Metis<br>Part 5: Measuring the Power Efficiency of DeepX M1<br>Part 6: Measuring the Power Efficiency of MemryX MX3 (this post)
Now we apply the same methodology to the MemryX MX3 M.2 acceleration module.<br>Installing the MemryX SDK#<br>MemryX provides excellent instructions on installing their driver, runtime, and tools:<br>MemryX Developer Hub, Get Started:<br>Install runtime<br>Install tools
During installation, I created a Python virtual environment named “venv-mx”. Afterwards, I confirmed the presence of the MemryX MX3 module with the mx_bench utility:<br>(venv-mx) $ mx_bench --hello<br>Hello from MXA!
Device ID | Chip Count | Freq | Volt<br>----------|------------|-------|-----<br>0 | 4 | 600 | 700<br>Reproducing the MemryX benchmarks#<br>Before measuring power, I wanted to reproduce MemryX’s published benchmark. In line with the previous articles, I chose ResNet50: as a classification model, it has the lightest post-processing stage.<br>MemryX Model Zoo, ResNet-50 (MXA-Optimized):<br>14 TFLOPS (600 MHz): 1778 FPS<br>20 TFLOPS (850 MHz): 2317 FPS
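As a quick sanity check on the two published figures, the sketch below compares how much the clock rises against how much the throughput rises. The numbers come straight from the model zoo listing; the `scaling` helper and the "scaling efficiency" ratio are my own illustrative names, not MemryX terminology.

```python
# Published ResNet-50 (MXA-Optimized) figures from the MemryX Model Zoo.
# Keys are clock frequencies in MHz, values are frames per second.
published = {
    600: 1778.0,  # 14 TFLOPS mode (default)
    850: 2317.0,  # 20 TFLOPS mode (over-clocked)
}


def scaling(base_mhz, boost_mhz, table):
    """Return (clock ratio, FPS ratio, FPS ratio / clock ratio).

    The last value is a hypothetical 'scaling efficiency': 1.0 would mean
    throughput grows exactly in proportion to the clock.
    """
    clock_ratio = boost_mhz / base_mhz
    fps_ratio = table[boost_mhz] / table[base_mhz]
    return clock_ratio, fps_ratio, fps_ratio / clock_ratio


clock_ratio, fps_ratio, efficiency = scaling(600, 850, published)
print(f"clock x{clock_ratio:.3f}, FPS x{fps_ratio:.3f}, "
      f"scaling efficiency {efficiency:.1%}")
```

On the published numbers, the clock rises by roughly 42% while throughput rises by roughly 30%, so throughput does not scale linearly with frequency. This says nothing yet about power, which is what the later measurements address.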
MemryX publishes two benchmarks. The first corresponds to the default configuration (600 MHz clock); the second is taken in over-clocked mode (850 MHz clock).<br>Our initial target is to reproduce the 1778 FPS benchmark.<br>I will attempt the same in over-clocked mode, but I am not certain my host will support it.<br>To measure FPS for the ResNet50 model, I downloaded the following files from the MemryX model zoo:<br>ResNet-50 (MXA-Optimized)<br>Throughput results at 14 TFLOPS#<br>MemryX provides a benchmarking utility, mx_bench, that takes a compiled .dfp model and a frame count, and reports average FPS and system latency:<br>(venv-mx) $ mx_bench -v -d ResNet_50_MXA_Optimized_224_224_3_onnx.dfp -f 50000
╔══════════════════════════════════════╗<br>║ Benchmark ║<br>║ Copyright (c) 2019-2026 MemryX Inc. ║<br>╚══════════════════════════════════════╝
Ran 50000 frames<br>Model: 0<br>Average FPS: 1796.36<br>Average System Latency: 3.24 ms
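For anyone scripting repeated runs, the summary lines printed by mx_bench can be turned into numbers with a few regular expressions. This is only a minimal sketch: it assumes the output keeps the exact labels shown above, and `parse_summary`, `SAMPLE` and `PUBLISHED_FPS` are hypothetical names of mine, not part of the MemryX tools.

```python
import re

# Summary text copied from the run above.
SAMPLE = """\
Ran 50000 frames
Model: 0
Average FPS: 1796.36
Average System Latency: 3.24 ms
"""

# MemryX's published ResNet-50 figure at 600 MHz (14 TFLOPS mode).
PUBLISHED_FPS = 1778.0


def parse_summary(text):
    """Extract frame count, FPS and latency from an mx_bench text summary."""
    patterns = {
        "frames": (r"Ran\s+(\d+)\s+frames", int),
        "fps": (r"Average FPS:\s*([\d.]+)", float),
        "latency_ms": (r"Average System Latency:\s*([\d.]+)\s*ms", float),
    }
    result = {}
    for key, (pattern, cast) in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            # Fail loudly if the output format differs from the assumption.
            raise ValueError(f"missing field: {key}")
        result[key] = cast(match.group(1))
    return result


summary = parse_summary(SAMPLE)
margin = summary["fps"] / PUBLISHED_FPS - 1.0
print(summary)
print(f"margin over published figure: {margin:+.2%}")
```

Raising on a missing field is deliberate: if a future SDK release changes the wording of the summary, the script stops instead of silently reporting stale numbers.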
(venv-mx) $ mx_bench -v -d ResNet_50_MXA_Optimized_224_224_3_onnx.dfp -f 50000
╔══════════════════════════════════════╗<br>║ Benchmark ║<br>║ Copyright (c) 2019-2026 MemryX Inc. ║<br>╚══════════════════════════════════════╝
Ran 50000 frames<br>Model: 0<br>Average FPS: 1796.36<br>Average System Latency: 3.31 ms<br>Two back-to-back runs on the same module land at exactly the same throughput of 1796.36 FPS, with latency varying only slightly (3.24 ms vs. 3.31 ms).<br>Not only did I match MemryX’s published 14 TFLOPS (600 MHz) benchmark of 1778 FPS, I exceeded it by ~1%, hitting 1796 FPS.<br>Throughput results at 20 TFLOPS#<br>To access the 20 TFLOPS performance of the MemryX MX3, I need to over-clock it to 850 MHz.<br>This can be done with the mx_set_powermode command:<br>(venv-mx) $ sudo mx_set_powermode<br>Once in the MX3 Power Tweak Utility’s GUI, select:<br>1 - Set Power Mode (4-chip module)<br>9 - 850 MHz<br>OK
3 - Exit
All frequencies above 600 MHz are shown in red in the utility, which probably foreshadowed what was going to happen, but I moved ahead with 850 MHz just the same.<br>I noticed that the frequency change only took effect after a reboot.<br>(venv-mx) $ mx_bench --hello<br>Hello from MXA!
Device ID | Chip Count | Freq | Volt<br>----------|------------|-------|-----<br>0 | 4 | 850 | 780
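Since the new setting only shows up after a reboot, it may be worth checking the reported frequency programmatically before launching a long benchmark. The sketch below parses the table printed by `mx_bench --hello`. It assumes the pipe-separated layout shown above and that the Freq column is in MHz; `parse_hello_table` and `frequency_applied` are hypothetical helper names, not MemryX APIs.

```python
# Table text copied from the `mx_bench --hello` output above.
SAMPLE = """\
Device ID | Chip Count | Freq | Volt
----------|------------|-------|-----
0 | 4 | 850 | 780
"""


def parse_hello_table(text):
    """Turn the pipe-separated device table into a list of dicts."""
    rows = [line for line in text.splitlines() if "|" in line]
    header = [cell.strip() for cell in rows[0].split("|")]
    devices = []
    for line in rows[1:]:
        cells = [cell.strip() for cell in line.split("|")]
        if all(set(cell) <= {"-"} for cell in cells):
            continue  # skip the dashed separator row
        devices.append({name: int(value) for name, value in zip(header, cells)})
    return devices


def frequency_applied(devices, expected_mhz):
    """True only if at least one device is listed and all report expected_mhz."""
    return bool(devices) and all(d["Freq"] == expected_mhz for d in devices)


devices = parse_hello_table(SAMPLE)
print(devices)
print("850 MHz applied:", frequency_applied(devices, 850))
```

In practice the text would come from capturing the command's output, for example with `subprocess.run`, rather than from a pasted string.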
(venv-mx) $ mx_bench -v -d ResNet_50_MXA_Optimized_224_224_3_onnx.dfp -f 50000
╭─────────────────┬─────┬─────┬────────╮<br>│ │ │ │ │<br>│ │ ├──── │<br>│ │ │ ╞══ ══╡ │<br>│ │ │ │ ├──── │<br>│ │ │ │ │ │...