A student writes:
Dr. Patt,
While you were going over the omega network, you asked us
to design different omega networks for n=64 nodes. I did not
understand what you were asking. Could you please explain it again.
Thank you
<>
Sure. I guess that did get covered pretty fast. Aater has already asked me to write up
the stuff I put on the board covering the cost, contention and latency for all the interconnection
structures we went over. And, since
Aater is the boss... I will hand that out on Monday and answer any questions
before we get to cache coherency.
With respect to omega networks, recall it is the "half-way" house between a bus and a full crossbar.
Recall it is an array of cross-bar switches that allows you to get from any node on the left to any
node on the right by going through logk n columns of k by k crossbar switches. The crux is that with
a k by k switch, you can go to k different places with one switch. If the output of that switch goes
to another k by k switch, you can get to k^2 places after two switches. If the output of the second
switch goes to another k by k switch, you can get to k^3 places after three switches. Each column
of k by k switches provides such paths for every node on the left (at the input).
In class, we showed the case n=8, and k=2. To get to all 8 nodes with 2x2 switches, we need three
columns of switches, since 2^3 = 8.
Suppose n=16, and k=4. Then, to get to all 16 nodes, we only need two columns, since 4^2 = 16.
I have constructed the network below, leaving you to "connect most of the dots" since my ascii terminal
would make a mess trying.
--- ---
---| |----a1--a1---| |----
---|4x4|----a2 b1---| |----
---| |----a3 c1---| |----
---| |----a4 d1---| |----
--- ---
--- ---
---| |----b1 a2---| |----
---| |----b2--b2---| |----
---| |----b3 c2---| |----
---| |----b4 d2---| |----
--- ---
--- ---
---| |----c1 a3---| |----
---| |----c2 b3---| |----
---| |----c3--c3---| |----
---| |----c4 d3---| |----
--- ---
--- ---
---| |----d1 a4---| |----
---| |----d2 b4---| |----
---| |----d3 c4---| |----
---| |----d4--d4---| |----
--- ---
What I asked you to do is compute, for n=64, what the cost and latency would be using 2x2, 4x4,
8x8 and 64x64 crossbars.
Latency is simply the number of columns, since it takes 1 time unit to get through one k by k switch.
Cost is k^2 for each switch times the number of switches.
OK?
Hope you are enjoying the Thanksgiving weekend.
Yale Patt
ps. Anticipating another question, I will also write up the equations and example for S, E, U,
and R that we covered in class on Monday.