Router

This example models a packet router with two input ports and four output ports. Incoming packets are forwarded to an output based on the low-order bits of their address field. When both inputs have packets ready in the same cycle, the router uses round-robin arbitration to decide which to serve first.

What this example demonstrates:

  • Multi-port modules using C++ pointer arrays for indexed port access
  • Address-based routing with a compile-time mask: address & (N - 1)
  • Round-robin arbitration across input ports
  • Per-output buffering via net capacity
  • Two-phase routing: arbitrate in phase 0, forward in phase 1

Token format

All ports carry 8-byte tokens encoding two 32-bit integers: an address and a data payload.

token<8>:   [ address : 4 bytes | data : 4 bytes ]

The destination output port is selected by the low-order bits of the address:

dst = address & (N - 1)    // N=4, so low 2 bits; range 0..3
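
The token layout and the mask can be illustrated outside Sitar with a small standalone C++ sketch. The names `Token8`, `pack_token`, `unpack_token`, and `route` are hypothetical and not part of the Sitar API, and the byte order inside each field is an assumption (host order via `memcpy`); `sitar::pack` may lay the bytes out differently.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for token<8>: 8 raw bytes.
using Token8 = std::array<std::uint8_t, 8>;

// Store the address in bytes 0..3 and the data in bytes 4..7
// (host byte order is assumed here).
inline Token8 pack_token(std::int32_t address, std::int32_t data) {
    Token8 t{};
    std::memcpy(t.data(), &address, 4);
    std::memcpy(t.data() + 4, &data, 4);
    return t;
}

// Inverse of pack_token.
inline void unpack_token(const Token8& t, std::int32_t& address, std::int32_t& data) {
    std::memcpy(&address, t.data(), 4);
    std::memcpy(&data, t.data() + 4, 4);
}

// The mask equals address % N only when N is a power of two.
constexpr bool is_power_of_two(int n) { return n > 0 && (n & (n - 1)) == 0; }

// Select the output port from the low-order bits of the address.
template <int N>
constexpr int route(std::int32_t address) {
    static_assert(is_power_of_two(N), "N must be a power of 2");
    return address & (N - 1);
}
```

For example, `route<4>(6)` yields 2, because only the low two bits of 6 (binary 110) are kept.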

Structure

Two Sources inject packets cycling through all four addresses. The Router forwards each packet to the appropriate output net, which acts as the per-output buffer. Four Sinks drain and log received packets.

module Top
    submodule router : Router<4, 4>   // N=4 outports, DEPTH=4
    submodule src0   : Source<0>
    submodule src1   : Source<1>
    submodule snk0   : Sink<0>
    submodule snk1   : Sink<1>
    submodule snk2   : Sink<2>
    submodule snk3   : Sink<3>

    // Input nets (low capacity — back-pressure reaches sources quickly)
    net i0 : capacity 2 width 8
    net i1 : capacity 2 width 8

    // Output nets (capacity = DEPTH, acting as per-output buffers)
    net o0 : capacity 4 width 8
    net o1 : capacity 4 width 8
    net o2 : capacity 4 width 8
    net o3 : capacity 4 width 8

    src0.outp  => i0    router.inp0 <= i0
    src1.outp  => i1    router.inp1 <= i1
    router.outp0 => o0    snk0.inp <= o0
    router.outp1 => o1    snk1.inp <= o1
    router.outp2 => o2    snk2.inp <= o2
    router.outp3 => o3    snk3.inp <= o3
end module

flowchart LR
    src0["Source 0"]
    src1["Source 1"]
    rtr["Router\n(2-in, 4-out)"]
    snk0["Sink 0"]
    snk1["Sink 1"]
    snk2["Sink 2"]
    snk3["Sink 3"]

    src0 -->|"i0 cap=2"| rtr
    src1 -->|"i1 cap=2"| rtr
    rtr  -->|"o0 cap=4"| snk0
    rtr  -->|"o1 cap=4"| snk1
    rtr  -->|"o2 cap=4"| snk2
    rtr  -->|"o3 cap=4"| snk3

The input nets (i0, i1) have a small capacity of 2, so back-pressure reaches the Sources quickly when the router stalls. The output nets (o0-o3) have capacity 4, acting as per-output buffers.
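
The back-pressure effect of a capacity-limited net can be modeled as a bounded FIFO whose push fails when it is full. The following is a minimal standalone C++ sketch, not Sitar's net implementation: `BoundedNet` and its members are hypothetical names, and any timing rules of real Sitar nets are not modeled.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical model of a net: a FIFO that rejects pushes
// once it holds `capacity` items.
template <typename T>
class BoundedNet {
public:
    explicit BoundedNet(std::size_t capacity) : capacity_(capacity) {}

    // Returns false (back-pressure) when the buffer is full.
    bool push(const T& item) {
        if (buf_.size() >= capacity_) return false;
        buf_.push_back(item);
        return true;
    }

    // Returns false when there is nothing to read.
    bool pull(T& out) {
        if (buf_.empty()) return false;
        out = buf_.front();
        buf_.pop_front();
        return true;
    }

    std::size_t size() const { return buf_.size(); }

private:
    std::size_t capacity_;
    std::deque<T> buf_;
};
```

With a capacity of 2, a third consecutive push fails until the consumer pulls at least one item, which is how a stalled router throttles its Sources in this model.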


Router module

The Router declares its ports individually and creates C++ pointer arrays in decl and init to allow indexed access inside code blocks. This is the standard pattern for multi-port modules in Sitar.

module Router
    parameter int N     = 4   // number of outports (must be power of 2)
    parameter int DEPTH = 4   // buffer depth per outport (informational; enforced by net capacity in Top)

    // Two inports and four outports, each carrying 8-byte tokens
    inport  inp0 : width 8
    inport  inp1 : width 8
    outport outp0 : width 8
    outport outp1 : width 8
    outport outp2 : width 8
    outport outp3 : width 8

    decl $
    // C++ pointer arrays allow indexed access to the named ports in code blocks.
    // This is the standard pattern for multi-port modules in Sitar.
    inport<8>*  ins[2];
    outport<8>* outs[4];

    int      pkt_addr;    // address field of the currently routed packet
    int      pkt_data;    // data field
    token<8> pending;     // token being forwarded this cycle
    int      rr;          // round-robin pointer (0 or 1)
    int      dst;         // resolved output port index
    bool     found;       // whether a token was picked this cycle
    bool     ok;
    $
    init $
    ins[0] = &inp0;  ins[1] = &inp1;
    outs[0] = &outp0; outs[1] = &outp1;
    outs[2] = &outp2; outs[3] = &outp3;
    rr = 0;  found = false;  ok = false;
    $

    behavior
        do
            // ---- Phase 0: arbitrate and pick one token ----
            $found = false;$;
            do
                wait until (this_phase == 0);
                $
                // Check inports in round-robin order starting at rr
                for (int a = 0; a < 2 && !found; a++) {
                    int idx = (rr + a) % 2;
                    if (ins[idx]->pull(pending)) {
                        sitar::unpack(pending, pkt_addr, pkt_data);
                        dst   = pkt_addr & (N - 1);   // low bits select output
                        rr    = (idx + 1) % 2;         // advance round-robin
                        found = true;
                        log << endl
                            << "in[" << idx << "] -> out[" << dst << "]"
                            << "  addr=" << pkt_addr
                            << "  data=" << pkt_data;
                    }
                }
                $;
                if (not found) then wait end if;
            while (not found) end do;

            // ---- Phase 1: forward to destination output ----
            $ok = false;$;
            do
                wait until (this_phase == 1);
                $ok = outs[dst]->push(pending);$;
                if (not ok) then wait end if;
            while (not ok) end do;
        while (1) end do;
    end behavior
end module
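
The pointer-array pattern used in decl and init can be shown in plain C++, independent of Sitar. In this sketch `Port` and `FourPortModule` are hypothetical stand-ins for Sitar's port and module classes; only the idea of declaring ports by name and indexing them through an array of pointers is taken from the Router above.

```cpp
#include <vector>

// Hypothetical stand-in for an outport: records every pushed value.
struct Port {
    std::vector<int> sent;
    bool push(int v) { sent.push_back(v); return true; }
};

struct FourPortModule {
    // Ports are declared individually, as in the Router module...
    Port outp0, outp1, outp2, outp3;
    // ...and a pointer array gives them an index.
    Port* outs[4];

    FourPortModule() {
        outs[0] = &outp0; outs[1] = &outp1;
        outs[2] = &outp2; outs[3] = &outp3;
    }

    // The pointers refer to this object's own members, so copying
    // is disabled to avoid dangling or aliased pointers.
    FourPortModule(const FourPortModule&) = delete;
    FourPortModule& operator=(const FourPortModule&) = delete;

    // Indexed access: the destination is computed at run time.
    bool send(int dst, int value) { return outs[dst]->push(value); }
};
```

Calling `send(2, 99)` on such an object reaches `outp2` without a switch statement over port names.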

The routing loop operates in two phases each cycle:

  1. Phase 0 (arbitrate): Scan the inports starting at the round-robin pointer rr. The first inport with a token wins. Unpack the address, compute the destination, and set rr to the other inport.
  2. Phase 1 (forward): Push the packet to the destination outport. If the output buffer is full, retry in phase 1 of each following cycle until the push succeeds.
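
The arbitration step can be isolated in a few lines of standalone C++. This is a sketch that mirrors the scan in the Router's phase 0 code; the `RoundRobin2` type and the `ready` flags (standing in for "this inport has a token") are hypothetical and exist only for illustration.

```cpp
#include <array>

// Hypothetical two-input round-robin arbiter mirroring the phase 0 scan.
struct RoundRobin2 {
    int rr = 0;  // index of the inport that is checked first

    // ready[i] says whether inport i has a token.
    // Returns the winning inport, or -1 if neither is ready.
    int pick(const std::array<bool, 2>& ready) {
        for (int a = 0; a < 2; ++a) {
            int idx = (rr + a) % 2;
            if (ready[idx]) {
                rr = (idx + 1) % 2;  // the other inport gets priority next time
                return idx;
            }
        }
        return -1;  // rr is left unchanged when nothing was served
    }
};
```

When both inputs stay ready, successive calls to `pick` return 0, 1, 0, and so on. When only one input is ready, it wins every time regardless of rr.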

One packet per cycle

The router handles at most one packet per cycle, because each packet uses phase 0 to be selected and phase 1 to be forwarded. When neither inport has a token, the router stays in the phase 0 loop and checks again in the next cycle.


Source

Each Source generates 12 packets (3 per output port), cycling through addresses 0, 1, 2, 3. The two Sources start at different address offsets (ID % 4) to create interleaving traffic. Once a Source has sent its last packet, it executes stop simulation.

module Source
    parameter int ID = 0   // used for logging and initial address offset

    outport outp : width 8

    decl $
    static const int NUM_PKTS = 12;   // 3 packets per output port
    int      seq;
    int      next_addr;
    token<8> t;
    bool     ok;
    $
    init $seq = 0;  next_addr = ID % 4;$

    behavior
        do
            wait until (this_phase == 1);
            $
            sitar::pack(t, next_addr, seq);
            ok = outp.push(t);
            if (ok) {
                log << endl << "src[" << ID << "] addr=" << next_addr << " data=" << seq;
                next_addr = (next_addr + 1) % 4;
                seq++;
            }
            $;
            if (not ok) then wait end if;
            if (seq >= NUM_PKTS) then
                stop simulation;
            end if;
        while (1) end do;
    end behavior
end module
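
The address sequence produced by a Source can be checked with a short standalone C++ sketch. The function `packets_per_output` is hypothetical. It replays the `next_addr` update from the module above and counts destinations, assuming the Source gets to emit all of its packets.

```cpp
#include <array>

// Count how many of a source's packets target each of the 4 outputs.
// Mirrors the Source module: start at id % 4, then advance by 1 modulo 4.
inline std::array<int, 4> packets_per_output(int id, int num_pkts = 12) {
    std::array<int, 4> counts = {{0, 0, 0, 0}};
    int next_addr = id % 4;
    for (int seq = 0; seq < num_pkts; ++seq) {
        counts[next_addr]++;
        next_addr = (next_addr + 1) % 4;
    }
    return counts;
}
```

For both ID 0 and ID 1 every entry of the result is 3. The starting offset changes the order of destinations, not how many packets each output is sent.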

Sink

Each Sink drains its net (one of o0-o3) in phase 0 of every cycle, pulling tokens until the net is empty, and logs the address and data of each received packet.

module Sink
    parameter int ID = 0

    inport inp : width 8

    decl $
    token<8> t;
    int      addr;
    int      data;
    int      total;
    $
    init $total = 0;$

    behavior
        do
            wait until (this_phase == 0);
            $
            while (inp.pull(t)) {
                sitar::unpack(t, addr, data);
                log << endl << "snk[" << ID << "] addr=" << addr << " data=" << data;
                total++;
            }
            $;
            wait;
        while (1) end do;
    end behavior
end module

Expected output (excerpt)

(0,1) TOP.src0 : src[0] addr=0 data=0
(0,1) TOP.src0 : src[0] addr=1 data=1
(0,1) TOP.src1 : src[1] addr=1 data=0
(1,0) TOP.router : in[0] -> out[0]  addr=0  data=0
(1,0) TOP.router : in[1] -> out[1]  addr=1  data=0
(2,0) TOP.snk0  : snk[0] addr=0 data=0
(2,0) TOP.snk1  : snk[1] addr=1 data=0
...
Simulation stopped at time (...)

The round-robin arbiter alternates between in[0] and in[1] when both have packets waiting, so the two sources are served evenly. Output port utilization depends on the address distribution generated by the Sources. In this example each Source addresses 3 of its 12 packets to each output, so each output is the destination of 6 packets in total (3 from each source).