Processor and memory model
This example models a simple processor-memory system. The processor is a probabilistic instruction trace generator: it generates a stream of instructions, each of which unconditionally performs an instruction fetch and optionally performs a data load, a data store, or both (an atomic load-store). The memory is a byte-addressable array that responds to each request after a fixed latency.
The two modules communicate through a request-response token protocol: the processor sends a request token and blocks until the memory returns a response token. This models blocking memory access — a natural starting point before introducing features such as out-of-order issue or caches.
What this example demonstrates:
- Request-response protocol with two directed nets
- Typed multi-field token payloads with sitar::pack / sitar::unpack
- Blocking communication: wait until (port.peek()) to suspend until a response arrives
- Probabilistic behavior using rand() in code blocks
- Module-level initialization of a data array in init
System overview
flowchart LR
proc["Processor\n(trace generator)"]
mem["Memory\n(1 KB, latency=LATENCY)"]
proc -->|"req token<12>"| mem
mem -->|"resp token<8>"| proc
The two nets have capacity 1, enforcing strict alternation: the processor issues one request and the memory returns one response before the next request can proceed.
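The capacity-1 behavior can be illustrated outside the simulator. The following is a minimal C++ sketch, assuming a hypothetical `SingleSlotNet` class that only mimics the push/peek/pull idea; it is not Sitar's net implementation, and the helper `pushes_accepted_without_pull` exists purely for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a capacity-1 net (not Sitar's actual net class).
template <typename Token>
class SingleSlotNet {
public:
    SingleSlotNet() : full_(false), value_() {}

    // Returns false (backpressure) while the single slot is still occupied.
    bool push(const Token& t) {
        if (full_) return false;
        value_ = t;
        full_ = true;
        return true;
    }

    // True when a token is waiting to be pulled.
    bool peek() const { return full_; }

    // Moves the token into `out` and frees the slot; returns false if empty.
    bool pull(Token& out) {
        if (!full_) return false;
        out = value_;
        full_ = false;
        return true;
    }

private:
    bool full_;
    Token value_;
};

// Counts how many of `attempts` back-to-back pushes succeed when nobody pulls.
inline int pushes_accepted_without_pull(int attempts) {
    assert(attempts >= 0);
    SingleSlotNet<int> net;
    int accepted = 0;
    for (int i = 0; i < attempts; ++i) {
        if (net.push(i)) ++accepted;
    }
    return accepted;
}
```

In this sketch a second push fails until the consumer pulls, which is the property that forces one request and one response to alternate.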
Token formats
Request token<12>: [ type : 4B | addr : 4B | data : 4B ]
Response token<8> : [ error : 4B | data : 4B ]
The type field encodes the operation:
| Value | Name | Description |
|---|---|---|
| 0 | IFETCH | Instruction fetch, always issued for every instruction |
| 1 | LOAD | Data load only |
| 2 | STORE | Data store only |
| 3 | ATOMIC_LS | Combined load and store |
The error field in the response is 1 if the address is out of bounds; 0 otherwise.
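To make the two layouts concrete, here is a minimal C++ sketch that serializes the fields in the order shown above. It uses plain `std::memcpy` in host byte order rather than `sitar::pack`/`sitar::unpack`, whose exact signatures are not shown here; the struct and function names (`Request`, `Response`, `pack_request`, and so on) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Operation codes from the table above.
enum OpType : std::uint32_t { IFETCH = 0, LOAD = 1, STORE = 2, ATOMIC_LS = 3 };

// Hypothetical field containers mirroring the two token formats.
struct Request  { std::uint32_t type; std::uint32_t addr; std::uint32_t data; };
struct Response { std::uint32_t error; std::uint32_t data; };

// Request payload: [ type : 4B | addr : 4B | data : 4B ], host byte order.
inline void pack_request(const Request& r, std::uint8_t (&buf)[12]) {
    std::memcpy(buf + 0, &r.type, 4);
    std::memcpy(buf + 4, &r.addr, 4);
    std::memcpy(buf + 8, &r.data, 4);
}

inline Request unpack_request(const std::uint8_t (&buf)[12]) {
    Request r;
    std::memcpy(&r.type, buf + 0, 4);
    std::memcpy(&r.addr, buf + 4, 4);
    std::memcpy(&r.data, buf + 8, 4);
    return r;
}

// Response payload: [ error : 4B | data : 4B ], host byte order.
inline void pack_response(const Response& s, std::uint8_t (&buf)[8]) {
    std::memcpy(buf + 0, &s.error, 4);
    std::memcpy(buf + 4, &s.data, 4);
}

inline Response unpack_response(const std::uint8_t (&buf)[8]) {
    Response s;
    std::memcpy(&s.error, buf + 0, 4);
    std::memcpy(&s.data, buf + 4, 4);
    return s;
}
```

Because both sides run in the same process, host byte order is sufficient for this sketch; a format shared between machines would need a fixed byte order.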
Top-level structure
The system is parameterized at the System level. The processor parameters LOAD_PCT and STORE_PCT are independent integer percentages (0-100): an instruction has probability LOAD_PCT% of including a load and independently STORE_PCT% of including a store. An instruction with both becomes an ATOMIC_LS.
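The independent load/store decision can be sketched in C++ as follows. This is an illustration under stated assumptions, not the model's actual code: `classify`, `random_data_op`, and the `NO_DATA_OP` sentinel are hypothetical names, and drawing with `std::rand() % 100` is one plausible way to compare against a percentage threshold.

```cpp
#include <cassert>
#include <cstdlib>

// LOAD, STORE and ATOMIC_LS use the type codes from the token format.
// NO_DATA_OP is a hypothetical sentinel for "instruction fetch only".
enum DataOp { NO_DATA_OP = -1, LOAD = 1, STORE = 2, ATOMIC_LS = 3 };

// Pure decision step: each draw is an integer in [0, 100).
inline DataOp classify(int load_draw, int store_draw,
                       int load_pct, int store_pct) {
    assert(load_pct >= 0 && load_pct <= 100);
    assert(store_pct >= 0 && store_pct <= 100);
    const bool do_load  = load_draw  < load_pct;   // true with probability LOAD_PCT%
    const bool do_store = store_draw < store_pct;  // true with probability STORE_PCT%
    if (do_load && do_store) return ATOMIC_LS;
    if (do_load)  return LOAD;
    if (do_store) return STORE;
    return NO_DATA_OP;
}

// Draws two independent values and classifies the instruction.
inline DataOp random_data_op(int load_pct, int store_pct) {
    const int load_draw  = std::rand() % 100;
    const int store_draw = std::rand() % 100;
    return classify(load_draw, store_draw, load_pct, store_pct);
}
```

Because the two draws are independent, LOAD_PCT=40 and STORE_PCT=30 give an expected mix of 12% ATOMIC_LS, 28% LOAD only, 18% STORE only, and 42% instructions with no data access.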
Processor
The processor behavior loops over instructions. For each instruction:
- IFETCH: always send an instruction fetch to the memory at address pc * 4 and wait for the response.
- Probabilistic decision: independently decide whether to load and/or store, using rand() against the percentage thresholds.
- Data access: if a data access is needed, pack the appropriate request token, push it (with retry on backpressure), and wait for the response.
The processor uses member variables (declared in decl) for all state so that values persist between code blocks within the same behavior iteration.
Blocking communication
wait until (resp_port.peek()) suspends the processor in place until the memory places a response token on resp. The processor makes no forward progress while waiting. This models a blocking memory interface: one outstanding request at a time.
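The cost of blocking can be estimated with a small cycle-stepped model. The C++ sketch below is a simplification under stated assumptions: it ignores Sitar's phase structure, treats every access as taking exactly `latency` cycles, and uses a hypothetical function name, `cycles_for_blocking_accesses`.

```cpp
#include <cassert>

// Hypothetical model: the processor issues one request, then makes no
// progress until the memory's latency countdown expires.
// Returns the cycle at which the last response arrives.
inline long cycles_for_blocking_accesses(int num_accesses, int latency) {
    assert(num_accesses >= 0 && latency >= 1);
    long cycle = 0;
    bool waiting = false;  // true while a request is outstanding
    int remaining = 0;     // cycles left until the memory responds
    int completed = 0;
    while (completed < num_accesses) {
        if (!waiting) {    // issue only when nothing is outstanding
            waiting = true;
            remaining = latency;
        }
        ++cycle;           // time advances while the processor is suspended
        if (--remaining == 0) {
            waiting = false;  // response visible; processor resumes
            ++completed;
        }
    }
    return cycle;
}
```

With one outstanding request at a time the total is simply the number of accesses times the latency: two accesses at a latency of 3 finish at cycle 6 in this model.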
Memory
The memory module holds a 1 KB byte array initialized in init with the pattern mem[i] = i % 256. It processes one request at a time: pull the request, wait LATENCY cycles, perform the read or write, and push the response.
For a write (STORE or ATOMIC_LS), the memory copies 4 bytes from the request data field into the array. For a read (IFETCH, LOAD, or ATOMIC_LS), it copies 4 bytes from the array into the response data field. An out-of-bounds address sets error=1 in the response without modifying the array.
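The read/write rules above can be sketched in plain C++ as follows. Several details are assumptions rather than facts from the model: the `Memory` struct and its `access` method are hypothetical, the bounds check requires the whole 4-byte word to fit, ATOMIC_LS reads the old value before writing, and the latency wait is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

const std::uint32_t MEM_SIZE = 1024;  // 1 KB

enum OpType : std::uint32_t { IFETCH = 0, LOAD = 1, STORE = 2, ATOMIC_LS = 3 };

struct Memory {
    std::uint8_t bytes[MEM_SIZE];

    // Same initialization pattern as described for init: mem[i] = i % 256.
    Memory() {
        for (std::uint32_t i = 0; i < MEM_SIZE; ++i) {
            bytes[i] = static_cast<std::uint8_t>(i % 256);
        }
    }

    // Returns the error flag (1 = out of bounds); rdata receives the value
    // read, or 0 when nothing is read.
    std::uint32_t access(std::uint32_t type, std::uint32_t addr,
                         std::uint32_t wdata, std::uint32_t& rdata) {
        rdata = 0;
        // Assumption: the full 4-byte word must lie inside the array.
        if (addr > MEM_SIZE - 4) return 1;
        // Assumption: for ATOMIC_LS the read happens before the write.
        if (type == IFETCH || type == LOAD || type == ATOMIC_LS) {
            std::memcpy(&rdata, bytes + addr, 4);
        }
        if (type == STORE || type == ATOMIC_LS) {
            std::memcpy(bytes + addr, &wdata, 4);
        }
        return 0;
    }
};
```

On a little-endian host, reading address 0 from the freshly initialized array yields bytes 0, 1, 2, 3, that is 0x03020100 = 50462976.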
Expected output
With LOAD_PCT=40, STORE_PCT=30, NUM_INSTR=5, LATENCY=3, a representative run might look like:
(0,1) TOP.sys.proc : IFETCH pc=0 addr=0
(3,0) TOP.sys.mem : MEM READ addr=0 data=50462976
(3,0) TOP.sys.proc : IFETCH OK data=50462976
(3,1) TOP.sys.proc : LOAD addr=64 wdata=0
(6,0) TOP.sys.mem : MEM READ addr=64 data=67438087
(6,0) TOP.sys.proc : LOAD OK rdata=67438087
(6,0) TOP.sys.proc : --- instr 1 complete ---
(6,1) TOP.sys.proc : IFETCH pc=1 addr=4
...
(29,0) TOP.sys.proc : Processor done: 5 instructions executed
Simulation stopped at time (29,0)
The exact load/store pattern varies with the random seed. With srand(42) the output is deterministic across runs.
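The effect of a fixed seed can be checked in isolation. The following C++ sketch assumes a hypothetical helper, `draws_with_seed`, that reseeds the generator and collects percentage draws; where the model itself calls srand is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Reseeds the C library generator and returns n draws in [0, 100).
inline std::vector<int> draws_with_seed(unsigned seed, int n) {
    assert(n >= 0);
    std::srand(seed);  // same seed, same sequence on a given C library
    std::vector<int> out;
    out.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        out.push_back(std::rand() % 100);
    }
    return out;
}
```

Two calls with the same seed return identical sequences on the same platform. The algorithm behind rand() is implementation-defined, so the concrete values, and therefore the load/store pattern, may differ between C libraries.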
Varying the parameters
- Set LOAD_PCT=0 and STORE_PCT=0 for a pure instruction-fetch workload.
- Increase NUM_INSTR to generate longer traces.
- Increase LATENCY to observe the processor spending more cycles waiting for memory responses.
- To model a non-blocking processor, separate request issue and response collection into two concurrent branches using a parallel block or a second module.