Friday, February 29, 2008

CampTrin/Trincoll.Info

During my sophomore year at Trinity, I decided to start a specialized wiki that only members of the college community could edit. For a while, maintaining it and encouraging others to contribute was an interesting and meaningful experience. Unfortunately, my junior year was extremely busy with a number of more important projects, and despite my pestering several of my friends, no one was really willing to step up and take over the project. Amazingly, the site still gets a fair number of pageviews. Not a million by any stretch, but enough to know that it's out there and noticed, as well as indexed by several search engines. The real question is, what do I do with it now that I'm graduating? It's one that I'm going to have to seriously ponder.

-AJB

A glimpse of what's to come

I just received word that the final components for the new cluster have been ordered by the computing center. Hopefully we will receive our new ethernet switch (1000BASE-SX, kind of neat) and the necessary patch cables (for both the ethernet and the Fibre Channel) sometime next week. There is still a lot to do before then, so don't forget to check back this Sunday for an update!

Also, as I promised earlier, I'm thinking about how to write up a post on the "Tech" of the Quadcast. I can't promise something polished right now, but essentially we have a 'retired' Sun Ultra 5 workstation (the CS department is trying to get rid of them; they were formerly arranged as 'cluster 2,' which has been out of service for several years and is now profoundly obsolete) configured with cron and a stream ripper (a tool actually called 'streamripper'). It's running Debian Etch and lives on campus here at Trinity. Files are captured automatically by this system, quickly edited with Audacity, and then thrown up on some webspace allocated to students on Shakti (an ancient unix server). We'll probably have to switch to another arrangement soon, as these large MP3s are rather conspicuous.
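To make that a bit more concrete, here is a rough sketch of what the capture job looks like. I'm sketching it in Python for readability; the real setup is just cron plus streamripper, and every URL, path, schedule, and hostname below is a made-up placeholder rather than our actual configuration.

#!/usr/bin/env python
"""Hypothetical capture wrapper for the Quadcast rig -- a sketch, not our exact script.

Cron might invoke it with a line like (illustrative only):
    0 21 * * 5  /home/quadcast/capture.py
"""
import subprocess
import time

STREAM_URL = "http://example.edu:8000/wrtc"        # placeholder, not the real stream URL
CAPTURE_DIR = "/home/quadcast/rips"                # placeholder rip directory
SHOW_LENGTH = 60 * 60                              # rip one hour of audio
REMOTE = "user@shakti.example.edu:public_html/quadcast/"  # placeholder student webspace

# 1. Capture the live stream to a single MP3 with streamripper.
outfile = time.strftime("quadcast-%Y-%m-%d")
subprocess.check_call([
    "streamripper", STREAM_URL,
    "-d", CAPTURE_DIR,        # destination directory
    "-a", outfile,            # rip to a single file instead of per-track files
    "-l", str(SHOW_LENGTH),   # stop after SHOW_LENGTH seconds
])

# 2. After a manual editing pass in Audacity, the file gets pushed to Shakti.
#    (In this sketch we just copy the raw rip; exact output naming is glossed over.)
subprocess.check_call(["scp", "%s/%s.mp3" % (CAPTURE_DIR, outfile), REMOTE])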

-AJB.

P.S. - A brief interlude
The actual technical specifications of Shakti are a mystery to me. But it is worth pointing out that, at the very least, it is impressively old. This suggests that the system has been online at least since 1997. Its uname -a output seems to support this dating as well:

SunOS shakti x.x Generic_xxxxx-xx sun4u sparc SUNW,Ultra-2 Solaris.

Am I to believe that Shakti is a sun4u-class Ultra 2 machine? So far as I know, there was only one variety of these ever made -- the A14 "pulsar." According to this nifty website, such a machine would be approximately equivalent to, if not slightly less powerful than, the Ultra 5 machines that the CS department is now cannibalizing for our various side projects (such as powering the Quadcast).
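For anyone who hasn't spent much time around Solaris machines, here is roughly how those uname -a fields break down; a quick sketch, with the redacted values left as-is:

# Rough breakdown of the (partially redacted) uname -a output above.
output = "SunOS shakti x.x Generic_xxxxx-xx sun4u sparc SUNW,Ultra-2"
labels = ["OS name", "hostname", "OS release", "kernel patch level",
          "machine class", "processor type", "hardware platform"]
for label, field in zip(labels, output.split()):
    print("%-18s %s" % (label, field))
# The last field, SUNW,Ultra-2, is what gives away the Ultra 2 hardware.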

Wednesday, February 27, 2008

Night Shift

Last night was a rare night off (ish) for me, so I decided to put in a few hours with the cluster. We just received our SAN and FC switch this past week, along with a two-post open telco relay rack (generously 'donated' by the computing center). A network drop to the room is in progress, and our ethernet switch should be ordered sometime in the near future. We have also rolled the four-post cabinet housing our retired Alpha server cluster into 61A.

So, the next steps are really rather manual. The Alpha rack needs to be stripped, carefully, of all the hardware that we no longer need in it. This includes the AlphaServers themselves (8x DS10L) along with a 16-port Myrinet 2000 switch and a smattering of other hardware. We're going to be keeping all of it, but we really need the rack to house the SAN and a few other servers that will eventually be moved to 61A.

So, now to provide you a little context. 61A is located in the basement of the Mathematics, Computing, and Engineering Center at Trinity College (we call it the MCEC for short). This building was constructed in the early 90s and designed (or so I'm told) to resemble a computer chip. The Computer Science, Math, and Engineering departments are housed here, and it is the former home of Academic Computing.

The basement hallway is very long. Most of the rooms down here are engineering labs and machine shops. When the building was built, Academic Computing's main machine room was located on the left at the end, in a nice purpose-built room with redundant Liebert units and a very convenient raised floor. After the library was renovated in the mid-90s, Academic Computing vacated the space and turned a lot of it over to the engineers, who then had it subdivided into four rooms. 83, the one at the very end, remained an Academic Computing machine room. 63, the middle section of the former machine room, is now an engineering lab. 61, which is at the far end of the space, is also (according to its sign) an engineering lab, but at this point it appears to be used more for storage than anything else.

61A is actually a telecomm closet which the computing center retained control over. It's only accessible through 61.

This is a picture of the Liebert System 3 in 61. It's actually one of two redundant units, both of which push cold air into the floor. Originally, there were no walls to the left or right, as this was all one really long room. Now, however, thinking about how air will return to this thing keeps me up at night. Hopefully it will be able to provide us with enough cooling (it almost certainly will); the return of the hot air, however, is a bit of an issue. The computing center has granted us use of their space in 61A, but truth be told it would be a lot nicer to use 61.

As you can see here, they're doing some work on the unit, presumably to get it operational again. I don't know much about HVAC, but I'm assuming that the vacuum pump is a precursor to adding fresh refrigerant to the system (or possibly a method of checking for leaks). Either way, I hope there's nothing actually wrong with the unit, as without some serious A/C, this is not going to work. I am, however, optimistic that it will all get sorted out in time.

But, enough about the space; it's time to cover the equipment and the effort. First off, as I said, we received our SAN this past Tuesday, along with an SFP FC switch (with 10 SFPs). The SAN is a StoreVault S500 with five 250GB drives (four active, one hot spare). We also got the FC kit, so it comes with a Fibre Channel card pre-installed. It's a pretty nice unit that fits our needs almost perfectly, while still being affordable and leaving room to grow.
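For a rough sense of usable space (and I'm guessing at the parity scheme here, since the S500's internal RAID layout is NetApp's business and not something I've verified), the arithmetic works out to something like this:

# Back-of-the-envelope capacity for the S500 -- assumptions, not vendor specs.
drive_gb = 250
drives = 5
hot_spares = 1
active = drives - hot_spares                  # 4 drives actually holding data/parity

raw_gb = active * drive_gb                    # 1000 GB raw across the active drives
single_parity_gb = (active - 1) * drive_gb    # ~750 GB if one drive's worth of parity
double_parity_gb = (active - 2) * drive_gb    # ~500 GB if two (RAID-DP-style), before filesystem overhead
print(raw_gb, single_parity_gb, double_parity_gb)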

The Fibre Channel switch, a QLogic SANbox, has 10 SFP ports and comes with 10 SFPs (not pictured). We got ours from the vendor who sold us the SAN. However, a quick search of CDW shows they also sell pretty much the exact same kit retail. All things considered, this is again a great deal. The only real drawback is that the switch itself cannot be rackmounted without an additional rackmount kit, which, as you may or may not have guessed, is not an impulse buy. We'll probably figure out some other solution for keeping it up in the air.



This next picture is kind of a "well duh" shot. Up until this point I have not actually discussed the cluster itself at all. This machine was in use at a major financial institution until they decided to move to different hardware and donated it to Trinity. The system consists of three Egenera BladeFrames with a mix of blades ranging from Pentium IIIs to four-way 2.6GHz Xeon systems. The system has about 160 processors total, and while not necessarily the most bleeding-edge of systems, it provides us with a real opportunity to do some serious research and science. This is really a huge step up for us, since our previous cluster consisted of eight DEC/Compaq AlphaServers.

Each of the BladeFrames has two Control Blades, or cBlades, which, aside from a variety of other tasks, provide the external I/O and network interfaces for the system. Each cBlade has two dual-port gigabit SX fiber interfaces and two dual-port Fibre Channel interfaces. According to the technical specifications, at least one of the ethernet ports and one of the Fibre Channel ports must be populated on each cBlade. That totals 6 FC ports and 6 1000BASE-SX ethernet ports for network and I/O. Expensive accessories.
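Just to sanity-check that port count, the arithmetic on the minimum population is trivial (this is nothing more than the spec above written out):

# Minimum external connectivity for the three-frame system, per the spec above.
frames = 3
cblades_per_frame = 2
min_eth_per_cblade = 1   # at least one 1000BASE-SX port populated per cBlade
min_fc_per_cblade = 1    # at least one Fibre Channel port populated per cBlade

eth_ports = frames * cblades_per_frame * min_eth_per_cblade
fc_ports = frames * cblades_per_frame * min_fc_per_cblade
print(eth_ports, fc_ports)   # -> 6 6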

This really provides a good contrast and scale comparison with our old cluster, an 8-node (9 if you count the controller) DEC/Compaq AlphaServer system with a Myrinet 2000 interconnect. This really was not a bad system for its time, and the Myrinet hardware is still a very nice interconnect. For comparison, Egenera says the interconnect of the BladeFrames reaches 'gigabit' speeds; this Myrinet system can sustain nearly twice that bandwidth, and at far lower latency than is possible with traditional ethernet networks. Many commodity clusters are interconnected using standard gig-e today; however, if you can afford an option like Myrinet or InfiniBand, the differences are huge, particularly because network communication time is usually the limiting factor in all but 'embarrassingly parallel' HPC applications (the quick back-of-the-envelope model below puts some numbers on this).

I spent most of my time last night removing cables from this rack. Whoever originally cabled it did a really good job, and except for a nasty bundle of power cables near the bottom, the process was relatively painless. For those of you out there who are counting servers, there is actually one missing. A power supply in one of the systems failed over the summer, and it was a very expensive part to try to replace. That server is still absent. The large one in the middle is the control server, which is of a slightly different model. At the bottom, you can see our 16-port Myrinet 2000 switchbox, and if you look carefully, you can see the Myrinet cards in each of the servers.
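Here's that back-of-the-envelope model. The latency and bandwidth figures are ballpark numbers for hardware of this era that I've pulled from memory, not measurements of our systems, so treat the output as illustrative only:

# Toy model: time per step = compute time + messages * (latency + bytes / bandwidth).
# The figures below are rough, era-typical guesses, not benchmarks of our cluster.
def iteration_time(compute_s, msgs, msg_bytes, latency_s, bandwidth_bps):
    return compute_s + msgs * (latency_s + msg_bytes / float(bandwidth_bps))

compute_s, msgs, msg_bytes = 0.010, 100, 8000   # 10 ms of math, 100 small messages per step

gige = iteration_time(compute_s, msgs, msg_bytes, 50e-6, 125e6)     # ~50 us latency, ~1 Gb/s
myrinet = iteration_time(compute_s, msgs, msg_bytes, 7e-6, 250e6)   # ~7 us latency, ~2 Gb/s

print("gig-e:   %.1f ms per step" % (gige * 1e3))     # ~21.4 ms, dominated by communication
print("myrinet: %.1f ms per step" % (myrinet * 1e3))  # ~13.9 ms

The exact numbers don't matter; the point is that when a code exchanges lots of small messages, per-message latency swamps raw bandwidth, which is exactly where Myrinet and InfiniBand earn their price tags.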

I stopped at the point of being able to remove the Myrinet box from the rack. The current plan is to install the SAN right about where that box currently sits (this avoids the need to remove the AlphaServers until we can find a real home for them, or at least a good plan). The switchbox will actually probably just be moved down a few U. I don't want to mount the SAN too close to the bottom, because there are some really monstrously large servers which might get put into this rack, and I want to leave them with enough room that nothing will have to be moved once it's mounted. I didn't dare move the switchbox on my own, as it has no rails, and dropping it would probably be more expensive than running your car into a tree at 15mph. I also have some additional pictures of the space and some of the equipment in it, but I really don't have the time to tell that part of the story right this minute. Hopefully, I'll be able to put some additional time in on the cluster this Sunday, so look back for an update about it then.

Next Cluster

This blog actually started as a project to document the installation, configuration, and management of a massive ~160 processor HPC cluster that was just donated to Trinity. Needless to say, writing my thesis, managing my business, and actually assembling the cluster have sapped all the time that I planned to use to write about the process. Thus, the blog has been renamed and retooled to be just yet another blog about technology, albeit from an atypical perspective. Stay tuned for information about the cluster, among other things, as soon as I manage to find the time to post it.

In the meantime, you might want to check out 'The QuadCast!' Part radio show, part podcast, part senioritis, part content syndication experiment, and all kinds of tech: it's exactly the kind of project that I participate in when I should really be sleeping more. Aside from the radio show produced by my three roommates (thus, the title), we are also syndicating the content from two other shows broadcast on WRTC Hartford. Never have a few disused and obsolete Sun workstations and a limited quantity of student webspace on an ancient unix machine been so aptly used. More on the tech that powers "The QuadCast," as well as an inside look at my ambitions in content publishing, should be coming soon. Again, time permitting.

What's Up?

My name is Andrew Budd. I am a double major in Computer Science and Economics at Trinity College. I am a student and an entrepreneur dedicated to technology and innovation. My biggest issue is that there simply is not enough time in the day to do everything that I want to do. There never has been, and in my experience so far, that issue only becomes more severe as life goes on. So herein lies my motivation for starting a blog: I can write about what it is that I do and what it is that I see. While I might not be able to create more time, I can at least keep a better record of my endeavors, in the hopes that it inspires someone to do something that I never found the time to do.