By: Thang Tran, CTO Simplex Micro
We have started a joint development project with CircuitSutra Technologies, headquartered in India, to develop a performance model of our RISC-V scalar/vector processor core. This blog series will showcase the features and benefits of the model as it moves from specification to completion.
The objective is straightforward. We want a model that reveals behavior over time, not just instruction correctness. An ISS can confirm that instructions execute correctly. It cannot show where performance is lost. We need early visibility into stalls, branch penalties, structural conflicts, memory timing, and scalar/vector interaction.
This joint project builds on an open-source ISS and our existing cycle-accurate VPU model. The performance model is not intended to replace the functional execution layer, which is already stable and validated. Instead, it consumes the functional trace and overlays timing behavior to expose performance characteristics before RTL is finalized.
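To make "consume the functional trace and overlay timing" concrete, here is a minimal sketch in Python. It is not the project's code: the record layout (`TraceRecord`), the latency table, and the in-order, single-issue assumption are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical per-class latencies in cycles; illustrative values only.
LATENCY = {"alu": 1, "mul": 3, "load": 4}


@dataclass
class TraceRecord:
    """One instruction as a functional trace might report it (assumed layout)."""
    op: str                      # instruction class, a key of LATENCY
    dest: Optional[str] = None   # destination register, if any
    srcs: Tuple[str, ...] = ()   # source registers


def overlay_timing(trace):
    """Return (issue_cycle, done_cycle) per record for an in-order, single-issue core."""
    ready = {}       # register -> cycle at which its value becomes available
    next_issue = 0   # earliest cycle the next instruction may issue
    timed = []
    for rec in trace:
        # Issue waits for program order and for every source operand.
        issue = max([next_issue] + [ready.get(r, 0) for r in rec.srcs])
        done = issue + LATENCY[rec.op]
        if rec.dest is not None:
            ready[rec.dest] = done
        next_issue = issue + 1
        timed.append((issue, done))
    return timed


trace = [
    TraceRecord("load", "x1"),
    TraceRecord("alu", "x2", ("x1",)),  # depends on the load
    TraceRecord("alu", "x3"),
]
timing = overlay_timing(trace)
```

The functional result of each instruction is never recomputed here. Only the order and the register dependencies from the trace are used, and the second instruction's issue slips to cycle 4 because it waits on the load.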
The work is structured deliberately.
First, we establish a structural timing baseline. This is Phase 1. It includes front-end timing, the time/resource execution model, scalar–vector time alignment, and IPC-over-time output. This stage should take several months if development proceeds in parallel. The full effort, including added realism and extended modeling, is expected to run six to eight months.
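As an illustration of what an IPC-over-time output could look like, the sketch below buckets instruction retire cycles into fixed windows. The function name, the window size, and the input data are assumptions made for this example, not the model's actual output format.

```python
def ipc_over_time(retire_cycles, window):
    """Instructions retired per cycle, averaged over consecutive fixed-size windows."""
    if not retire_cycles:
        return []
    n_windows = max(retire_cycles) // window + 1
    counts = [0] * n_windows
    for cycle in retire_cycles:
        counts[cycle // window] += 1
    return [n / window for n in counts]


# Made-up retire cycles: four instructions retire, then nothing until cycle 9.
retire_cycles = [0, 1, 2, 3, 9, 10, 11]
series = ipc_over_time(retire_cycles, window=4)
```

With these inputs the middle window has an IPC of zero. A single average IPC would hide that gap, which is the reason to plot the value over time.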
The sequencing is intentional. Build the framework first. Add complexity later.
The model itself differs from traditional performance modeling techniques. It follows Simplex Micro’s philosophy of simplicity and orthogonal division of major units.
- The front-end models branch prediction and instruction fetch timing. It determines when instructions become available and applies mispredict penalties.
- The decode/execution model covers decode, issue, execution, and memory timing using a time/resource abstraction based on Simplex Micro’s innovative time-based scheduling technique.
- The existing VPU model is integrated as an extension of the decode/execution model.
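The division between the front-end and the decode/execution model can be sketched roughly as follows. This is a generic reservation-table illustration, not Simplex Micro's time-based scheduling technique, whose details are not described in this post. The penalty value, the unit names, and the `ResourceTimeline` class are hypothetical.

```python
MISPREDICT_PENALTY = 5  # hypothetical penalty in cycles, not a Simplex Micro figure


def fetch_times(mispredicted):
    """Cycle at which each instruction becomes available to decode.

    Assumes one instruction is fetched per cycle; the instruction after a
    mispredicted branch is delayed by MISPREDICT_PENALTY extra cycles.
    """
    cycle = 0
    available = []
    for flag in mispredicted:
        available.append(cycle)
        cycle += 1 + (MISPREDICT_PENALTY if flag else 0)
    return available


class ResourceTimeline:
    """Per functional unit, the set of cycles that are already reserved."""

    def __init__(self):
        self.busy = {}

    def reserve(self, unit, earliest, duration=1):
        """Reserve `duration` consecutive cycles at or after `earliest`; return the start."""
        taken = self.busy.setdefault(unit, set())
        start = earliest
        while any(start + d in taken for d in range(duration)):
            start += 1
        taken.update(range(start, start + duration))
        return start


# Front-end: the second instruction is a mispredicted branch.
available = fetch_times([False, True, False, False])

# Execution: two multiplies contend for one unpipelined 3-cycle unit.
timeline = ResourceTimeline()
first_mul = timeline.reserve("mul", earliest=0, duration=3)
second_mul = timeline.reserve("mul", earliest=1, duration=3)
alu_op = timeline.reserve("alu", earliest=1)
```

The front-end only answers "when is the instruction available", and the timeline only answers "when is the unit free". Keeping the two questions separate is one way to read the orthogonal division described above.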
On the vector side, the immediate task is porting the VPU model from its original Visual C++ environment to Linux/C++. The performance model must run cleanly in a standard Linux environment before it can be packaged for external use.
In parallel, the visualization layer is under development. The goal is a user-friendly view that shows customers how instructions move through the execution pipeline.
The visualization layer can be developed independently, using synthetic data, to reduce integration risk.
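One way to exercise a pipeline view before the timing models exist is to feed it generated records. The sketch below is a toy example of that idea: the four-stage pipeline, the bubble pattern, and the text rendering are all assumptions for illustration and do not reflect the actual visualization design.

```python
STAGES = ("F", "D", "X", "W")  # hypothetical four-stage pipeline


def synthetic_trace(n, bubble_every=3):
    """Stage cycles for n instructions, with a one-cycle bubble before every
    `bubble_every`-th instruction so the view has something to show."""
    rows = []
    start = 0
    for i in range(n):
        if i and i % bubble_every == 0:
            start += 1  # inject a bubble
        rows.append([start + s for s in range(len(STAGES))])
        start += 1
    return rows


def render(rows):
    """Draw one text line per instruction, one column per cycle."""
    width = rows[-1][-1] + 1
    lines = []
    for idx, cycles in enumerate(rows):
        cells = ["."] * width
        for stage, cycle in zip(STAGES, cycles):
            cells[cycle] = stage
        lines.append(f"i{idx}: {''.join(cells)}")
    return "\n".join(lines)


diagram = render(synthetic_trace(4))
print(diagram)
```

Because the generator is deterministic, the same picture can be used as a regression check for the renderer while the real timing models are still in progress.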
The goal of Phase 1 is to provide a first level of accuracy that demonstrates the achievable performance of the Simplex CPU/VPU, so that customers can integrate the performance model into the hardware/software co-design of their SoC. Our ultimate goal is to correlate the performance model with RTL to within 5%, to build the framework for future designs on the Simplex Micro roadmap, and to give CircuitSutra a modeling framework and expertise in processor performance modeling.
Phase 2 adds more realistic CPU/VPU behavior, such as multi-level cache modeling.
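The post does not define how the 5% correlation will be measured. As one plausible reading, the sketch below compares model and RTL cycle counts per benchmark using relative error. The metric, the function names, and the numbers are all assumptions made for illustration.

```python
def correlation_error(model_cycles, rtl_cycles):
    """Relative error of the model's cycle count against RTL."""
    return abs(model_cycles - rtl_cycles) / rtl_cycles


def within_target(results, target=0.05):
    """Map each benchmark name to whether it meets the target."""
    return {
        name: correlation_error(model, rtl) <= target
        for name, (model, rtl) in results.items()
    }


# Made-up (model_cycles, rtl_cycles) pairs, not measured data.
results = {"bench_a": (1030, 1000), "bench_b": (880, 1000)}
report = within_target(results)
```

Here `bench_a` is off by 3% and passes, while `bench_b` is off by 12% and does not.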
This week was about structure and alignment. The next update will show measurable progress in one of the Phase 1 models.
We build the model the same way we build the architecture: establish clean structure first, then add detail.

