Our network layout (as of May 2011)

May 13, 2011

Today, I feel like talking about how our networks are organized at the logical level of subnets and routing and interconnections (instead of the physical level of switches and so on; that's another entry). This will be simplified somewhat because going into full detail would make it feel too much like work.

We have a single connection to the campus backbone; these days it uses a touchdown network that connects to our core router (although it didn't use to, which caused some issues). Hanging off the core router are all of our public subnets.

Now, we have nowhere near enough public IP addresses to go around, and we especially don't have enough subnets to isolate research groups from each other. So basically the only things on our public subnets are centrally managed servers and infrastructure; actual people and research groups use subnets in private IP address spaces that we call 'sandboxes'. Most sandboxes are routed sandboxes; they sit behind routing NAT gateways (although sandbox machines are not NAT'd when they talk to internal machines), are routed internally, and can reach both internal machines and the outside world (and with the right configuration, the outside world can reach them if desired). Generally each research group gets its own sandbox (which they have a lot of control over), and we have a few generic sandboxes as well.
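(As a rough illustration of the 'NAT only for the outside world' rule, here is a minimal Python sketch. All of the address ranges and the function names are made up for the example; they are not our real subnets or our gateways' actual configuration.)

```python
import ipaddress

# Hypothetical addressing, invented purely for illustration.
PUBLIC_SUBNETS = [ipaddress.ip_network("192.0.2.0/24")]
ROUTED_SANDBOXES = [
    ipaddress.ip_network("10.10.0.0/16"),
    ipaddress.ip_network("10.20.0.0/16"),
]


def in_any(addr, networks):
    """True if addr falls inside any of the given networks."""
    return any(addr in net for net in networks)


def is_internal(addr):
    """Internal means one of our public subnets or a routed sandbox."""
    return in_any(addr, PUBLIC_SUBNETS) or in_any(addr, ROUTED_SANDBOXES)


def gateway_action(src, dst):
    """Decide what a routing NAT gateway does with a sandbox packet.

    Returns "route" when the destination is internal (the source
    address is left alone) and "nat" when it is the outside world.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    if not in_any(src_ip, ROUTED_SANDBOXES):
        raise ValueError(f"{src} is not in a routed sandbox")
    return "route" if is_internal(dst_ip) else "nat"
```

(Under these assumptions, a sandbox machine talking to a server on the public subnet or to another sandbox is simply routed, and only traffic to everything else gets NAT'd.)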

(We provide a few services for research group sandboxes; we host the DNS data, provide DHCP if they want it, and of course manage the NAT gateways. DNS-wise, sandbox machines have names that look like sadat.core.sandbox; we have a complex split horizon DNS setup, and of course .sandbox exists only in the internal view.)
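(The split horizon idea can be sketched in a few lines of Python. This is a toy model and not how our DNS servers are actually configured; the records, the addresses, and the list of internal networks are all hypothetical, with only the name sadat.core.sandbox taken from above.)

```python
import ipaddress

# Hypothetical networks that count as "internal" clients.
INTERNAL_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("10.0.0.0/8"),
]

# Hypothetical records. The internal view is a superset of the external
# one, and names under .sandbox appear only in the internal view.
EXTERNAL_VIEW = {"www.example.org": "192.0.2.80"}
INTERNAL_VIEW = {**EXTERNAL_VIEW, "sadat.core.sandbox": "10.10.1.5"}


def resolve(name, client_ip):
    """Answer a lookup from the view that matches the client's address.

    Returns the address as a string, or None if the name does not
    exist in the view that this client is allowed to see.
    """
    client = ipaddress.ip_address(client_ip)
    is_internal = any(client in net for net in INTERNAL_NETS)
    view = INTERNAL_VIEW if is_internal else EXTERNAL_VIEW
    return view.get(name.rstrip(".").lower())
```

(In this model an internal client asking for sadat.core.sandbox gets an answer, while an outside client asking for the same name gets nothing at all.)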

The most important generic sandbox is what we sometimes call our 'laptop network', which is for general user-managed machines such as Windows machines. Unlike regular sandboxes, this sandbox is port isolated so that user machines can't infect each other. Because people sometimes want to run servers on their own machines and have them reachable by other people on the laptop network, we have a second port isolated subnet for 'servers' (broadly defined). We also have a port isolated wireless network that sits behind a separate NAT gateway. Unlike conventional sandboxes, the wireless network is NAT'd even for internal machines.
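(For people who haven't run into port isolation before, here is a generic Python model of the forwarding rule. How our switches actually implement it isn't covered here, so treat this as an assumption-laden sketch: the port names are invented, and it assumes the gateway sits on a non-isolated uplink port.)

```python
def may_forward(src_port, dst_port, isolated_ports):
    """Generic port isolation rule for a single switch.

    Two isolated ports may never exchange traffic directly; an
    isolated port can only talk to non-isolated ports (assumed here
    to be the uplink towards the gateway).
    """
    if src_port == dst_port:
        return False
    return not (src_port in isolated_ports and dst_port in isolated_ports)


# Hypothetical layout: two user machines on isolated ports, one uplink.
ISOLATED = {"port1", "port2"}
```

(With this rule, the machines on port1 and port2 can both reach the uplink but not each other, which is the property that stops user machines from infecting one another directly.)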

(There is also a PPTP VPN server, which needs yet another chunk of private address space for the tunnels it creates with clients.)

These NAT gateways sit on our normal public subnets, or I should actually say 'subnet'; we have been slowly relocating our servers so that almost everything lives on a single subnet. Among other advantages, this means that we avoid round trips through our core router when servers talk to each other or to sandbox machines. Since some research groups have compute clusters in their sandbox that NFS mount filesystems from our fileservers and care about their performance, we like avoiding extra hops when possible. Our core router has static routes configured for all of the routable sandbox subnets, and various important servers on the subnet also have them for efficiency.

(Well, okay, basically all of the servers on the subnet have the static routes because we automated almost all of the route setup stuff, and once you do that it's easy enough to make it happen everywhere.)
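(To give a flavour of what automating the route setup can look like, here is a minimal Python sketch that turns a table of sandboxes and their NAT gateways into static route commands. It is not our actual tooling; the table contents are invented, and the Linux-style 'ip route add' output format is an assumption made for the example.)

```python
import ipaddress

# Hypothetical mapping of routed sandbox subnet -> NAT gateway address
# on the single server subnet.
SANDBOX_GATEWAYS = {
    "10.10.0.0/16": "192.0.2.11",
    "10.20.0.0/16": "192.0.2.12",
}


def static_route_commands(sandbox_gateways, server_subnet):
    """Build one static route command per routed sandbox.

    Each gateway must live on the server subnet, since otherwise a
    server could not use it as a direct next hop.
    """
    subnet = ipaddress.ip_network(server_subnet)
    commands = []
    for net, gw in sorted(sandbox_gateways.items()):
        network = ipaddress.ip_network(net)
        gateway = ipaddress.ip_address(gw)
        if gateway not in subnet:
            raise ValueError(f"gateway {gateway} is not on {subnet}")
        commands.append(f"ip route add {network} via {gateway}")
    return commands
```

(The point of having direct routes like these on the servers is that traffic to a sandbox goes straight to that sandbox's gateway instead of taking an extra hop through the core router.)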

There is a general campus-wide requirement that networks at least try to block anonymous access. For the wireless network, we deal with this by requiring authentication on the wireless gateway (or that you use our VPN, which does its own authentication); for the laptop network, we have a couple of self-serve DHCP registration systems with a tangled history. For research group sandboxes, we leave it up to the research group; in any case, most research group sandboxes can only be used from physical areas with good access control, like the group's lab space.

We (the central people) use a number of sandboxes ourselves for various purposes. Some of them are routed and some are deliberately not (for example, we consider it a feature that the iSCSI interconnect subnets are not reachable even by internal machines). There are a few sandboxes that we don't route for especially interesting reasons, but that's another entry.

Sidebar: on some names

Our laptop network is conventionally called the red network or just 'the red', and the collective public subnets that we have our servers on are called the blue network. These names come from the colours of network cables used for each subnet, especially in areas that normal people see. Being strict on cable colour allows us to tell people things like 'only plug your laptop into a red cable, but you can plug it into any red cable you find and it should work'.

(Partly because there aren't enough commonly available cable colours to go around, sandbox network cables are generally blue and blue has thus become our generic cable colour in public areas. A few groups have different, specific cable colours for their sandboxes for historical reasons. The rules are different and much more complicated in our machine room and in wiring closets.)

Written on 13 May 2011.

Last modified: Fri May 13 02:11:23 2011