Wednesday, February 27, 2008

Night Shift

Last night was a rare night off (ish) for me, so I decided to put in a few hours with the cluster. We just received our SAN and FC switch this past week, along with a two-post open telco relay rack (generously 'donated' by the computing center). A network drop to the room is in progress, and our ethernet switch should be ordered sometime in the near future. We have also rolled the four-post cabinet housing our retired Alpha server cluster into 61A.

So, the next steps are really rather manual. The Alpha rack needs to be carefully stripped of all the hardware we no longer need in it. This includes the Alphaservers themselves (8x DS10L) along with a 16-port Myrinet 2000 switch and a smattering of other hardware. We're going to be keeping all of it, but we really need the rack to house the SAN and a few other servers that will eventually be moved to 61A.

So, now to provide you a little context. 61A is located in the basement of the Mathematics, Computing, and Engineering Center at Trinity College (we call it the MCEC for short). This building was constructed in the early 90s and designed (or so I'm told) to resemble a computer chip. The Comp-Sci, Math, and Engineering departments are housed here, and it is the former home of Academic Computing.

The basement hallway is very long. Most of the rooms down here are engineering labs and machine shops. When the building was built, Academic Computing's main machine room was located on the left at the end, in a nice purpose-built room with redundant Liebert units and a very convenient raised floor. After the library was renovated in the mid-90s, Academic Computing vacated the space and turned a lot of it over to the engineers, who then had it subdivided into four rooms. 83, the one at the very end, remained an Academic Computing machine room. 63, the middle section of the former machine room, is now an engineering lab. 61, which is at the far end of the space, is also (according to its sign) an engineering lab, but it appears to be really more storage than anything else at this point.

61A is actually a telecom closet that the computing center retained control over. It's only accessible through 61.

This is a picture of the Liebert System 3 in 61. It's actually one of two redundant units, both of which push cold air into the floor. Originally, there were no walls to the left or right, as this was all one really long room. Now, however, thinking about how air will return to this thing keeps me up at night. Hopefully it will be able to provide us with enough cooling (it almost certainly will); the return path for the hot air, however, is a bit of an issue. The computing center has granted us use of their space in 61A, but truth be told it would be a lot nicer to use 61.

As you can see here, they're doing some work on the unit, presumably to get it operational again. I don't know much about HVAC, but I'm assuming that the vacuum pump is a precursor to adding fresh refrigerant to the system (or possibly a way of checking for leaks). Either way, I hope there's nothing actually wrong with the unit, as without some serious A/C, this is not going to work. I am, however, optimistic that it will all get sorted out in time.

But, enough about the space; it's time to cover the equipment and the effort. First off, as I said, we received our SAN this past Tuesday, along with an SFP FC switch (with 10 SFPs). The SAN is a StoreVault S500 with five 250GB drives (four active, one hot spare). We also got the FC kit, so it comes with a Fibre Channel card pre-installed. It's a pretty nice unit that fits our needs almost perfectly, while still being affordable and leaving room to grow.

The Fibre Channel switch, a QLogic SANbox, has 10 SFP ports and comes with 10 SFPs (not pictured). We got ours from the vendor who sold us the SAN, though a quick search of CDW shows they sell pretty much the exact same kit retail. All things considered, this is again a great deal. The only real drawback is that the switch itself cannot be rack-mounted without an additional rackmount kit, which, as you may or may not have guessed, is not an impulse buy. We'll probably figure out some other solution for keeping it up in the air.



This next picture is kind of a "well duh" shot. Up until this point, I haven't actually discussed the cluster itself at all. This machine was in use at a major financial institution until they decided to move to different hardware and donated it to Trinity. The system consists of three Egenera BladeFrames with a mix of blades ranging from Pentium IIIs to four-way 2.6GHz Xeon systems. The system has about 160 processors total, and while not necessarily the most bleeding edge of systems, it provides us with a real opportunity to do some serious research and science. This is really a huge step up for us, since our previous cluster consisted of eight DEC/Compaq Alphaservers. Each of the BladeFrames has two Control Blades, or cBlades, which, aside from a variety of other tasks, provide the external I/O and network interfaces for the system. Each cBlade has two dual-port gigabit SX fiber interfaces and two dual-port Fibre Channel interfaces. According to the technical specifications, at least one of the ethernet ports and one of the Fibre Channel ports must be populated on each cBlade. Across the three frames, that totals six FC ports and six 1000BASE-SX ethernet ports for network and I/O. Expensive accessories.

This really provides a good contrast and scale comparison with our old cluster, an 8-node (9 if you count the controller) DEC/Compaq Alphaserver system with a Myrinet 2000 interconnect. It really was not a bad system for its time, and the Myrinet hardware is still a very nice interconnect. For comparison, Egenera says the interconnect of the BladeFrames reaches 'gigabit' speeds; the Myrinet system can sustain nearly twice that bandwidth, and at far lower latency than is possible with traditional ethernet networks. Many commodity clusters are interconnected with standard gig-e today, but if you can afford an option like Myrinet or InfiniBand, the differences are huge, particularly because network communication time is usually the limiting factor in all but 'embarrassingly parallel' HPC applications (see the little benchmark sketch below).

I spent most of my time last night removing cables from this rack. Whoever originally cabled it did a really good job, and except for a nasty bundle of power cables near the bottom, the process was relatively painless. For those of you out there who are counting servers, there is actually one missing: a power supply in one of the systems failed over the summer, and it was a very expensive part to try to replace, so that server is still absent. The large one in the middle is the control server, which is a slightly different model. At the bottom, you can see our 16-port Myrinet 2000 switchbox, and if you look carefully, you can see the Myrinet cards in each of the servers.
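To put that interconnect comparison in slightly more concrete terms, the usual way to measure this sort of thing is a ping-pong test between two nodes. Below is a minimal sketch of one in C with MPI. To be clear, this assumes an MPI environment (mpicc/mpirun from something like MPICH or Open MPI) on the nodes, and the file name, iteration count, and message sizes are just for illustration, not numbers from anything we've actually run on this hardware.

    /* pingpong.c -- a minimal MPI ping-pong sketch (illustrative only).
     * Ranks 0 and 1 bounce a message back and forth; a tiny message
     * approximates the one-way latency, a large one the sustained bandwidth.
     * Build/run (assuming an MPI install): mpicc pingpong.c -o pingpong
     *                                      mpirun -np 2 ./pingpong
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int iters = 1000;
        const int sizes[] = { 1, 1 << 20 };   /* 1 byte and 1 MB messages */
        char *buf = malloc(1 << 20);
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (int s = 0; s < 2; s++) {
            int bytes = sizes[s];
            MPI_Barrier(MPI_COMM_WORLD);
            double start = MPI_Wtime();

            for (int i = 0; i < iters; i++) {
                if (rank == 0) {
                    /* rank 0 sends first, then waits for the echo */
                    MPI_Send(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, bytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                } else if (rank == 1) {
                    /* rank 1 echoes everything straight back */
                    MPI_Recv(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                             MPI_STATUS_IGNORE);
                    MPI_Send(buf, bytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }

            /* each iteration is a round trip, so halve it for one-way time */
            double one_way = (MPI_Wtime() - start) / (2.0 * iters);
            if (rank == 0)
                printf("%8d bytes: %8.1f us one-way, %8.1f MB/s\n",
                       bytes, one_way * 1e6, bytes / one_way / 1e6);
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }

Roughly speaking, plain gig-e tends to land in the tens of microseconds on the small-message case, while interconnects like Myrinet 2000 are down in the single digits, which is exactly why this still matters for tightly coupled jobs.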

I stopped at the point of being able to remove the Myrinet box from the rack. The current plan is to install the SAN right about where that box currently sits (this avoids the need to remove the Alphaservers until we can find a real home for them, or at least a good plan). The switchbox will probably just be moved down a few U. I don't want to mount the SAN too close to the bottom, because some really monstrously large servers might end up in this rack, and I want to leave enough room that nothing will have to be moved once it's mounted. I didn't dare move the switchbox on my own, as it has no rails, and dropping it would probably be more expensive than running your car into a tree at 15 mph. I also have some additional pictures of the space and some of the equipment in it, but I really don't have the time to tell that part of the story right this minute. Hopefully, I'll be able to put some additional time in on the cluster this Sunday, so look back for an update then.
