By: Thang Tran, CTO Simplex Micro
Earlier this month we introduced the performance modeling effort and the collaboration with CircuitSutra Technologies to develop a timing model for the RISC-V scalar/vector processor core. The model follows the structure described earlier: a front-end timing model, a time-resource execution model, and VPU integration. This separation allows the system to be developed in parallel while keeping the model manageable and easier to debug.
During the past two weeks the initial architecture walkthroughs with the CircuitSutra engineering team were completed. These sessions reviewed the CPU/VPU architecture and the modeling approach that will be used for the project. With that groundwork finished, implementation work has now started across several components of the model.
The performance model is structured so that work on its components can proceed independently:
- On the vector side, the VPU model has been ported from its original Visual C++ environment to Linux. This is necessary because the performance model must run in a standard Linux environment for integration and eventual packaging. With the port complete, the next step is verifying all vector instructions through the VPU model. The team is preparing to run the full set of vector instruction tests once the corresponding RTL test cases are available.
- The front end, which includes the branch prediction unit and the instruction fetch unit, is being developed using the Whisper ISS trace.
- The back end, which includes the instruction decode unit, the execution units, and the load-store unit, also uses the Whisper ISS to generate instructions. This establishes the structural framework needed for the time-resource execution model.
- The visualization tool is also taking shape. The graphical framework can now generate IPC traces across one million cycles using synthetic data and allows navigation from an IPC point to a corresponding pipeline timing view. Work has started on rendering the detailed pipeline graph.
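As a rough illustration of the IPC trace described in the last item, the sketch below generates synthetic per-cycle retire counts, averages them over fixed windows, and maps a selected IPC point back to the cycle range that a pipeline timing view would display. The function names, the window size of 1,000 cycles, and the maximum retire width of 4 are assumptions made for this example and are not taken from the actual tool.

```python
import random


def synthetic_retire_counts(num_cycles, max_width=4, seed=0):
    """Hypothetical stand-in for model output: instructions retired in each cycle."""
    rng = random.Random(seed)
    return [rng.randint(0, max_width) for _ in range(num_cycles)]


def ipc_trace(retired_per_cycle, window=1000):
    """Average instructions per cycle over consecutive windows.

    Returns a list of (start_cycle, ipc) points, one per window.
    """
    points = []
    for start in range(0, len(retired_per_cycle), window):
        chunk = retired_per_cycle[start:start + window]
        points.append((start, sum(chunk) / len(chunk)))
    return points


def cycles_for_point(points, index, window=1000):
    """Map an IPC point back to the half-open cycle range [start, end) it summarizes."""
    start = points[index][0]
    return start, start + window


# One million cycles of synthetic data, as in the status update above.
counts = synthetic_retire_counts(1_000_000)
trace = ipc_trace(counts, window=1000)
print(len(trace), "IPC points; point 3 covers cycles", cycles_for_point(trace, 3))
```

Averaging over windows keeps the number of plotted points small while still letting a viewer jump from any point to the underlying cycles.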
Development of supporting infrastructure is progressing in parallel. Work on register renaming and the register scoreboard is underway as part of the execution timing model. On the memory side, an initial cache structure has been implemented using a pseudo-LRU replacement scheme. The structure is intentionally simple at this stage and provides the basic indexing and data storage required for the timing model.
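To make the cache description concrete, here is a minimal sketch of a set-associative cache with tree-based pseudo-LRU replacement. It tracks only tags, which is enough to decide hit or miss for timing purposes. The class names, the default geometry (64 sets, 4 ways, 64-byte lines), and the choice of the tree variant of pseudo-LRU are assumptions for illustration; the update does not specify which pseudo-LRU variant the model uses.

```python
class PLRUSet:
    """One cache set with tree pseudo-LRU replacement (ways must be a power of two)."""

    def __init__(self, ways):
        assert ways >= 2 and ways & (ways - 1) == 0
        self.ways = ways
        self.tags = [None] * ways          # None marks an invalid way
        self.bits = [0] * (ways - 1)       # per tree node: 0 = victim on left, 1 = on right

    def _touch(self, way):
        """Point every node on the path to `way` away from it."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1        # accessed left half, so victim is on the right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0        # accessed right half, so victim is on the left
                node, lo = 2 * node + 2, mid

    def _victim(self):
        """Follow the tree bits to the pseudo-least-recently-used way."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo

    def access(self, tag):
        """Return True on a hit; on a miss, fill an invalid way or evict the victim."""
        if tag in self.tags:
            way, hit = self.tags.index(tag), True
        else:
            way = self.tags.index(None) if None in self.tags else self._victim()
            self.tags[way] = tag
            hit = False
        self._touch(way)
        return hit


class Cache:
    """Tag-only set-associative cache; the geometry defaults are illustrative."""

    def __init__(self, num_sets=64, ways=4, line_size=64):
        self.num_sets = num_sets
        self.line_size = line_size
        self.sets = [PLRUSet(ways) for _ in range(num_sets)]

    def access(self, addr):
        line = addr // self.line_size
        return self.sets[line % self.num_sets].access(line // self.num_sets)


cache = Cache(num_sets=1, ways=4, line_size=64)
hits = [cache.access(a) for a in (0, 64, 128, 192, 0, 256, 64, 128)]
print(hits)  # [False, False, False, False, True, False, True, False]
```

In this run the access to address 256 evicts the line at address 128, whereas true LRU would have evicted the line at address 64. That difference is the usual trade-off of pseudo-LRU: it needs only one bit per tree node instead of a full ordering of the ways.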
During this process, a few issues were also identified in the instruction decode table; these will be corrected as model development continues.
At this stage the focus remains on establishing the structural elements of the performance model: instruction flow from the ISS, execution timing through the pipeline framework, and integration with the VPU model. As these pieces come together, the next milestone will be enabling instructions to propagate end-to-end through the timing model so that scalar and vector activity can be represented on a unified timeline.
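One way to picture scalar and vector instructions sharing a timeline is a register scoreboard that records the cycle at which each register's value becomes available. The sketch below assumes a single-issue, in-order pipeline without register renaming, and uses made-up latencies; the `Instr` type, the `schedule` function, and the instruction mix are hypothetical and are not the project's actual model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Instr:
    name: str
    dst: Optional[str]        # destination register, or None if nothing is written
    srcs: Tuple[str, ...]     # source registers (scalar "x.." or vector "v..")
    latency: int              # cycles from issue until the result is ready (assumed)


def schedule(instrs):
    """Single-issue, in-order timing: returns (name, issue_cycle, done_cycle) tuples."""
    ready = {}                # scoreboard: register -> cycle its value becomes available
    next_issue = 0            # earliest cycle the next instruction may issue
    timeline = []
    for ins in instrs:
        # Stall until every source operand is ready.
        issue = max([next_issue] + [ready.get(r, 0) for r in ins.srcs])
        done = issue + ins.latency
        if ins.dst is not None:
            ready[ins.dst] = done
        timeline.append((ins.name, issue, done))
        next_issue = issue + 1
    return timeline


program = [
    Instr("lw", "x1", ("x2",), 3),
    Instr("add", "x3", ("x1", "x4"), 1),       # waits for the load result in x1
    Instr("vadd.vv", "v1", ("v2", "v3"), 4),   # vector op on the same timeline
    Instr("sub", "x5", ("x3", "x1"), 1),
]
timeline = schedule(program)
for entry in timeline:
    print(entry)
# ('lw', 0, 3)
# ('add', 3, 4)
# ('vadd.vv', 4, 8)
# ('sub', 5, 6)
```

Because scalar and vector registers live in the same scoreboard dictionary, both kinds of instruction are placed on one cycle axis, which is the property a unified timeline depends on.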

