Hello everyone,
I’m preparing a benchmark to compare the computational performance of code generated from a Simulink model on two target boards:
- TMS320F28388D (TI C2000 family)
- AM2634 (TI Sitara family)
I do not have Simulink Hardware Support Packages for both boards available, so my goal is to generate one portable C codebase that can be compiled and executed on both targets for a fair comparison.
Before I post my model or code, I’d like to ask the community for recommendations and best practices on several points:
How to best measure “computational power” / performance
- What metrics should I use for a fair comparison between these two architectures? (e.g., execution time per model_step(), cycles/step, throughput, latency, memory footprint, flash/ROM usage, RAM/stack usage, determinism/jitter)
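To make the timing part of the question concrete, here is a minimal sketch of the kind of harness I have in mind for cycles/step and jitter. It assumes each board exposes some free-running 32-bit cycle counter. `read_cycle_counter()` and the `model_step()` body below are hypothetical host-side stand-ins so the sketch runs anywhere; on the real targets they would be replaced by the board's timer read and the generated step function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a free-running hardware cycle counter.
 * On a real board this would read a CPU timer or cycle-count register. */
static uint32_t fake_counter = 0u;

static uint32_t read_cycle_counter(void)
{
    fake_counter += 7u;   /* pretend each read costs 7 cycles */
    return fake_counter;
}

/* Hypothetical stand-in for the generated step function:
 * pretends one step costs exactly 100 cycles. */
static void model_step(void)
{
    fake_counter += 100u;
}

typedef struct {
    uint32_t min_cycles;    /* best case */
    uint32_t max_cycles;    /* worst case; max - min gives the jitter */
    uint64_t total_cycles;  /* divide by runs for the mean */
    uint32_t runs;
} step_stats_t;

static step_stats_t benchmark_step(void (*step)(void), uint32_t runs)
{
    step_stats_t s = { UINT32_MAX, 0u, 0u, 0u };
    uint32_t t0, t1, overhead, i;

    assert(runs > 0u);

    /* Calibrate the cost of the measurement itself with two
     * back-to-back reads, then subtract it from every sample. */
    t0 = read_cycle_counter();
    t1 = read_cycle_counter();
    overhead = t1 - t0;

    for (i = 0u; i < runs; ++i) {
        uint32_t start, stop, elapsed;

        start = read_cycle_counter();
        step();
        stop = read_cycle_counter();

        /* Unsigned subtraction stays correct across one counter wrap. */
        elapsed = (stop - start) - overhead;

        if (elapsed < s.min_cycles) { s.min_cycles = elapsed; }
        if (elapsed > s.max_cycles) { s.max_cycles = elapsed; }
        s.total_cycles += elapsed;
    }
    s.runs = runs;
    return s;
}
```

Calling `benchmark_step(model_step, N)` on each board with the same `N` and the same inputs would give min, max, and mean cycles per step. Converting cycles to time then needs each board's CPU clock frequency, which is why I am unsure whether cycles/step or wall-clock time per step is the fairer metric.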
ERT vs GRT — which to use for benchmarking?
- Is ert.tlc (Embedded Coder) always preferable for embedded benchmarking vs grt.tlc?
- If I generate with grt.tlc for quick host prototyping, what differences should I expect vs ert.tlc that would meaningfully affect timing/size comparisons?
How to configure the “Hardware Implementation” / ProdHW if I don’t have both HW packages
- If I want a single source tree that compiles on both boards, what are the safest Hardware Implementation settings to choose (native word size, endianness, char/short/int/long sizes, portable word sizes, floating-point settings)?
- Is selecting a generic 32-bit little-endian device and enabling "Portable word sizes" the recommended conservative approach?
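Part of my concern is that the two compilers may not agree on basic type widths. As far as I understand, the TI C28x compiler is documented as using a 16-bit `char` and a 16-bit `int`, which a generic 32-bit profile would not describe, so I would want to confirm the actual values on each target. A minimal sketch of such a check follows; the `hw_impl_t` struct and both function names are hypothetical, and the expected values would come from whatever is entered in the Hardware Implementation pane.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical record of the properties the Hardware Implementation
 * settings describe, expressed in bits. */
typedef struct {
    unsigned char_bits;
    unsigned short_bits;
    unsigned int_bits;
    unsigned long_bits;
    unsigned pointer_bits;
    int      little_endian;  /* 1 = little-endian, 0 = big-endian */
} hw_impl_t;

/* Query the properties of the compiler/target this code is built for. */
static hw_impl_t probe_hw_impl(void)
{
    hw_impl_t hw;
    unsigned long probe = 1ul;

    hw.char_bits    = (unsigned)CHAR_BIT;
    hw.short_bits   = (unsigned)(sizeof(short) * CHAR_BIT);
    hw.int_bits     = (unsigned)(sizeof(int) * CHAR_BIT);
    hw.long_bits    = (unsigned)(sizeof(long) * CHAR_BIT);
    hw.pointer_bits = (unsigned)(sizeof(void *) * CHAR_BIT);

    /* The lowest-addressed unit of 'probe' holds 1 on little-endian targets. */
    hw.little_endian = (*(const unsigned char *)&probe == 1u) ? 1 : 0;
    return hw;
}

/* Return 1 if the probed target matches the settings assumed at
 * code-generation time, 0 otherwise. */
static int matches_assumed(const hw_impl_t *actual, const hw_impl_t *assumed)
{
    return (actual->char_bits     == assumed->char_bits)
        && (actual->short_bits    == assumed->short_bits)
        && (actual->int_bits      == assumed->int_bits)
        && (actual->long_bits     == assumed->long_bits)
        && (actual->pointer_bits  == assumed->pointer_bits)
        && (actual->little_endian == assumed->little_endian);
}
```

Running `probe_hw_impl()` once at startup on each board and comparing the result against the assumed settings with `matches_assumed()` would at least tell me whether a single set of Hardware Implementation settings can be valid for both targets.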
Does using a specific hardware support package / ProdHW device improve generated code optimization?
- If I later install the official TI hardware support packages for each board and regenerate code per-board, how much difference should I expect in performance compared to a single generic build?
- In practice, is the recommended approach to generate a single portable source tree that compiles on both boards, or to generate two board-specific builds (each with its own ProdHW/hardware support package configured) to maximize per-board optimization?
Thanks in advance. Any pointers, links to docs/examples, or short code snippets would be very helpful.
Best regards,
Pasquale