Maker Pro

Multi-layer switch network?

Davy

Jan 1, 1970
Hi all,

I am reading an article. It mentions "multi-layer switching networks
including, e.g., omega (perfect shuffle)/delta networks, log shifter
networks, etc.".

But why not use single-layer instead of multi-layer switch network?

I know the simple barrel shifter. What are omega, delta, and log shifter
networks?

Any suggestions will be appreciated!
Best regards,
Davy
 
Kolja Sulimma

Jan 1, 1970
Davy said:
But why not use single-layer instead of multi-layer switch network?

A very abstract answer:
In a single-layer network every input connects to n other pins.
The average wire length for each input is at least proportional to
n*sqrt(n), and the same holds for the capacitance.
Therefore the bandwidth is inversely proportional to n^3, and the
latency is proportional to n^3.

The other extreme is a tree of 2-way switches. Each path has log(n)
switches. Each connection has 2 pins. The area is n^2 (n*log(n) is only
true for an unbounded number of routing layers). The average wire length
is n. The bandwidth is constant, the latency is proportional to log(n).
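An omega network is one common example of such a log(n)-stage network of 2-way switches. The sketch below is only an illustration, assuming the usual textbook construction: every stage applies a perfect shuffle (rotate the port address left by one bit) and then a 2x2 switch picks its output from one bit of the destination address. The names `perfect_shuffle` and `omega_route` are made up for this example.

```python
def perfect_shuffle(pos: int, bits: int) -> int:
    """Perfect shuffle: rotate the bits-wide port address left by one."""
    mask = (1 << bits) - 1
    return ((pos << 1) | (pos >> (bits - 1))) & mask


def omega_route(src: int, dst: int, bits: int) -> list[int]:
    """Return the port position after each of the `bits` stages.

    Assumes n = 2**bits ports and destination-tag routing, with the
    destination address consumed most significant bit first.
    """
    pos = src
    path = []
    for stage in range(bits):
        pos = perfect_shuffle(pos, bits)
        # The 2x2 switch sets the lowest address bit to the destination bit.
        dst_bit = (dst >> (bits - 1 - stage)) & 1
        pos = (pos & ~1) | dst_bit
        path.append(pos)
    return path


if __name__ == "__main__":
    # 8 ports, 3 stages: route input 5 to output 2.
    print(omega_route(5, 2, 3))
```

Every path passes through exactly log2(n) switches, which is where the log(n) latency term comes from. The sketch ignores contention: two paths that need the same switch output in the same stage would block each other.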

Of course the latency of the switch will have a larger constant value
than the wire. But the difference between log(n) and n^3 is extreme, so
the break-even point will occur at a rather small n.
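To get a feel for the break-even argument, here is a rough sketch that compares the two latency models for growing n. The constants `wire_const` and `switch_const` are hypothetical placeholders, not measured values; only the n^3 and log(n) shapes come from the reasoning above.

```python
import math


def single_layer_latency(n: int, wire_const: float = 1.0) -> float:
    """Latency model for the single-layer network: proportional to n^3."""
    return wire_const * n ** 3


def tree_latency(n: int, switch_const: float = 1.0) -> float:
    """Latency model for the tree of 2-way switches: proportional to log(n)."""
    return switch_const * math.log2(n)


def break_even(wire_const: float, switch_const: float, max_n: int = 1 << 20):
    """Smallest n >= 2 at which the tree is faster, or None if none is found."""
    for n in range(2, max_n + 1):
        if tree_latency(n, switch_const) < single_layer_latency(n, wire_const):
            return n
    return None


if __name__ == "__main__":
    # Even with a switch assumed 1000x slower than a unit of wire delay,
    # the tree wins at a small port count.
    print(break_even(wire_const=1.0, switch_const=1000.0))
```

With the made-up 1000:1 ratio above, the crossover lands at n = 16. Real numbers depend on the process and layout, so treat this purely as an illustration of how quickly n^3 overtakes log(n).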

Kolja Sulimma
 
Davy

Jan 1, 1970
Hi Kolja,

Thank you for your help :)
But I find the idea hard to understand.

Could you please point me to a reference paper or article on this
subject?

Best regards,
Davy
 
Iain McClatchie

Jan 1, 1970
Kolja> A very abstract answer:
In a single-layer network every input connects to n other pins.
The average wire length for each input is at least proportional to
n*sqrt(n), and the same holds for the capacitance.
Therefore the bandwidth is inversely proportional to n^3, and the
latency is proportional to n^3.

The other extreme is a tree of 2-way switches. Each path has log(n)
switches. Each connection has 2 pins. The area is n^2 (n*log(n) is only
true for an unbounded number of routing layers). The average wire length
is n. The bandwidth is constant, the latency is proportional to log(n).

Of course the latency of the switch will have a larger constant value
than the wire. But the difference between log(n) and n^3 is extreme, so
the break-even point will occur at a rather small n.

Kolja Sulimma

I see some more issues.

Some switches are entirely on a single chip. I've heard of folks
implementing these as multi-layer switches. That may make sense if
something specific about the switching pattern is known (e.g. it's a
64-bit shifter, implemented as 3 layers of 4-way muxes), or perhaps
when latency does not matter. But when latency matters, a full
crossbar implemented on a single chip seems quite reasonable to me,
even for e.g. 64 16-bit ports. My reasoning is that it's not the
64 64-way 16-bit muxes that are going to chew up the area, it's the
buffering and scheduling. And a full crossbar should have fewer
scheduling issues than a multi-layer switch.
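For the shifter example, here is a minimal sketch of how 3 layers of 4-way muxes can cover all 64 shift amounts. It models a rotator rather than a plain shifter (an assumption made to keep the example short), and `rotl` and `log_rotate` are hypothetical names. Layer i uses two bits of the shift amount to select a rotation of 0, 1, 2 or 3 times 4**i.

```python
WIDTH = 64
MASK = (1 << WIDTH) - 1


def rotl(x: int, k: int) -> int:
    """Reference rotate-left of a 64-bit value by k positions."""
    k %= WIDTH
    return ((x << k) | (x >> (WIDTH - k))) & MASK


def log_rotate(x: int, amount: int) -> int:
    """Rotate left by `amount` (0..63) using 3 layers of 4-way choices."""
    for layer in range(3):
        # Two select bits per layer drive a 4-way mux on every output bit.
        sel = (amount >> (2 * layer)) & 0b11
        x = rotl(x, sel * 4 ** layer)
    return x


if __name__ == "__main__":
    print(hex(log_rotate(0x1, 5)))
```

Each output bit sees only three 4-way muxes in series instead of one 64-way mux, which is the log shifter idea. This works because the control pattern is known in advance: all bits move by the same amount.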

Once your switch is distributed across more than one chip, you
have a very different problem. The wires between chips cost so
much more than the wires on chip ($0.02 each versus $0.00001 each)
that you can't afford to stall a board wire due to contention for a
chip wire.

I'm currently quite enamoured with the load-balanced switch idea.
(Previously I was enamoured with the Tiny Tera design; both have
come from Nick McKeown's group at Stanford.) The nice thing about
a load-balanced switch is that the switch fabric itself can be a
shifter, or pair of shifters, which is a *lot* easier to implement.
I think a load-balanced switch implemented on a single chip is an
interesting idea that may have already been implemented as part of
a shared memory switch.
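A rough sketch of why a shifter can be enough as the fabric, assuming the fabric simply steps through cyclic shifts on a fixed schedule regardless of traffic (my reading of the load-balanced idea, not a description of any particular implementation). The names `shifter_pattern` and `spread` are made up for illustration.

```python
def shifter_pattern(n: int, t: int) -> list[int]:
    """Connections of a cyclic shifter at time slot t: port i -> (i + t) % n."""
    return [(i + t) % n for i in range(n)]


def spread(n: int, slots: int) -> list[list[int]]:
    """Count how often each input is connected to each intermediate port."""
    counts = [[0] * n for _ in range(n)]
    for t in range(slots):
        for i, j in enumerate(shifter_pattern(n, t)):
            counts[i][j] += 1
    return counts


if __name__ == "__main__":
    # Over n slots every input visits every intermediate port exactly once.
    for row in spread(4, 4):
        print(row)
```

Because the schedule is fixed, no per-slot matching has to be computed. In this model the first shifter spreads each input's traffic evenly over the intermediate ports, and a second shifter of the same kind would deliver it from there to the outputs.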
 