2011-05-16
How our network is implemented (as of May 2011)
In an earlier entry I gave a nice simple description of how our network looks at the logical level of subnets. Today, it's time for a look at how it's actually implemented in terms of switches and the links between them. As before, this picture is somewhat simplified.
The core of our physical network is a string of core switches. It's a string because each switch has only two 10G ports, so we have to chain them together to get all of them connected. We have space in three different buildings, and across those three buildings we have five core switches (two of them in our main machine room, one each in the core wiring area of the other two buildings, and the fifth in another machine room in the same building as our main machine room). This core backbone transports research group sandboxes, and in our main machine room it also transports our primary public network across the room.
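As a purely illustrative picture of that chain, here is a minimal Python sketch; the switch names, the building labels, and the particular order of the chain are all invented, and only the counts and placement come from the description above.

    # Five core switches across three buildings (hypothetical names and
    # ordering; the real details aren't given here).
    core_switches = [
        ("core-1", "main machine room"),
        ("core-2", "main machine room"),
        ("core-3", "second machine room, same building"),
        ("core-4", "second building, core wiring area"),
        ("core-5", "third building, core wiring area"),
    ]

    # With only two 10G ports per switch, the backbone has to be a simple
    # chain: each switch links only to its immediate neighbours.
    backbone_links = [
        (core_switches[i][0], core_switches[i + 1][0])
        for i in range(len(core_switches) - 1)
    ]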
There are two approaches to connecting machines to this sort of core network: you can hang multi-VLAN switches off the core switches over tagged links and then configure individual ports on those switches for the appropriate sandbox network, or you can hang single-network switches off untagged single-network ports on the core switches. We've opted for the second approach; it requires more configuration of the core switches but keeps all of the edge switches simple (in fact, identical). Thus there are a bunch of single-network switches hanging off the core switches in the places where the various networks are needed.
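To make the tradeoff concrete, here is a small sketch (again in Python and purely for illustration; all of the port and network names are made up) of where the configuration lives under each approach.

    # Approach one: a tagged (trunk) link from the core switch to a
    # multi-VLAN edge switch.  The per-port network assignments live on
    # the edge switch, so every edge switch ends up configured differently.
    multi_vlan_edge_switch = {
        "uplink": {"mode": "tagged", "vlans": ["sandbox-a", "sandbox-b"]},
        "port-1": {"mode": "untagged", "vlan": "sandbox-a"},
        "port-2": {"mode": "untagged", "vlan": "sandbox-b"},
    }

    # Approach two (ours): untagged single-network ports on the core
    # switch.  The core switch port determines the network, and the edge
    # switch hanging off it carries no VLAN configuration at all, so every
    # edge switch is identical.
    core_switch_edge_ports = {
        "port-10": {"mode": "untagged", "vlan": "sandbox-a"},  # edge switch 1
        "port-11": {"mode": "untagged", "vlan": "sandbox-b"},  # edge switch 2
    }
    single_network_edge_switch = {}  # nothing network-specific to configure

The cost of the second approach is that the per-port network configuration piles up on the core switches, which is the extra configuration work mentioned above.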
These core switches do not transport our port-isolated networks. Those are carried by a separate network of port-isolated switches (with a separate set of links between our various buildings). This network is connected to one core switch so that it can be joined together with the other sandboxes and transported to the NAT gateway.
(A few sandboxes also have dedicated links between buildings that run between their own single-network switches. Mostly this is for historical reasons; there's a lot of history in action around here.)
How our public networks are handled is a little peculiar. Only one public network (what is now our primary one) is carried on the core switches; all other public networks simply live on single-network switches in the machine room. Our public networks interconnect through our core router, which has an untagged port for each network that connects to the top-level switch for that network (including the touchdown network that is our connection to the campus backbone). The top-level switch for our primary public network also connects to an untagged port on one of the core switches, thereby connecting everything up.
(The core router also carries static routes for the sandboxes that point to their NAT gateway.)
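Putting the last two paragraphs together, the core router's view of the world amounts to roughly the following sketch (Python again, purely illustrative; the interface names, subnets, and gateway address are all hypothetical).

    # One untagged interface per public network, each cabled to that
    # network's top-level switch.
    router_interfaces = {
        "eth0": "campus touchdown network",
        "eth1": "primary public network",
        "eth2": "another public network",
    }

    # Static routes for the sandbox subnets, all pointing at the NAT
    # gateway (the subnets and the gateway's address are made up here).
    NAT_GATEWAY = "10.0.0.1"
    sandbox_static_routes = {
        "172.16.1.0/24": NAT_GATEWAY,  # sandbox A
        "172.16.2.0/24": NAT_GATEWAY,  # sandbox B
    }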
All of our fileservers are on the primary public network; to maximize aggregate bandwidth to them, each fileserver gets a direct connection to one of the core switches. A few machines that do a lot of NFS activity, such as our IMAP server and our Samba server, also get direct connections. We have also split our login and compute servers across several switches, each of which is directly connected to the core switch that the fileservers are on.
There is also a completely separate management network, which has its own switches and its own links between buildings. For reasons beyond the scope of this entry (they involve switch bugs), the main things on it in the other buildings are serial-port-to-Ethernet bridges that give us remote serial console access to various switches, most crucially the core switches. In our main machine room it also has various other things, such as web-enabled PDUs.