To the left, a traffic graph from one of our routers serving two network segments over two ISPs. The segments are aggregated through a common network interface (which can also fail over), so this shows the traffic from both. The maximum transfer rate of the two in aggregate is a little better than 45 Mbit/s or so (heh, spare capacity is sometimes a good thing, and we have spare capacity). Anyway, I thought that this was kind of a funny looking "Data Mesa" where one of our customers transferred a ton of data over a day or two (apparently limited by the speed of their network link to about 3 Mbit/s).
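As a rough sanity check on the size of that mesa, the sketch below converts a sustained link rate into a transfer volume. The 3 Mbit/s figure and the "day or two" come from the graph; the function name, and the assumption that the link stayed saturated for the whole period, are mine.

```python
def data_transferred_gb(rate_mbit_per_s: float, hours: float) -> float:
    """Return decimal gigabytes moved at a sustained rate over a period."""
    # Mbit/s -> bits over the whole period
    bits = rate_mbit_per_s * 1_000_000 * hours * 3600
    # bits -> bytes -> decimal gigabytes
    return bits / 8 / 1_000_000_000


if __name__ == "__main__":
    # Assumed durations: "a day or two" taken literally as 24 and 48 hours.
    for hours in (24, 48):
        gb = data_transferred_gb(3, hours)
        print(f"{hours} h at 3 Mbit/s: about {gb:.1f} GB")
```

So a saturated 3 Mbit/s link moves a few tens of gigabytes per day, which fits the "ton of data" description for a small customer link.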
The image to the right is of our (my family's) two dogs. Apparently, it's cold in the house back home and my sister is trying to send a subtle message (to turn up the heat, which the rest of us refuse to do). Of course, I say that from the comfort of my MIT graduate housing apartment, where I can set the thermostat to whatever I want without consequence.
Now that the "business as usual" is done with, I'll cover a little bit about when and how I'll be updating this blog (yes, backwards, I know). Anyway, I maintained this thing fairly well during the latter half of my senior year at Trinity. The summer was extremely busy, and Sloan is almost as busy as the summer was. There has not been much time for blogging, and the reader base of this thing is [fortunately?] quite small anyway. Updates going forward won't be frequent, but they won't be as scarce as they have been for the past few months.
Saturday, December 6, 2008
Data Mesa?
Monday, April 14, 2008
C.O.D.
I'm slowly sliding into the last few weeks at my current institution, and am very much looking forward to the summer, and the resumption of several other projects which I have simply not had the time to properly attend to as of late. I pointed out in my last post that my senior thesis in Economics was finally complete, and I'm also pleased to announce that my research and writeup efforts earned me co-authorship on a paper which will be presented at a conference in London in several months. Assuming the article makes it to some online-accessible medium, I'll post a link to it here.
I haven't really spent any blog time explaining what exactly my thesis is focused on, so I'll provide a brief introduction now. Economic models can be very difficult to validate. One interesting way to experiment with a model is to build a computer simulation of it. However, as soon as you attempt to do so, especially from a comp-sci/programming perspective, you realize just how 'top-down' a lot of the logic inherent in these models really can be. For my thesis research, we started out with a canonical Neo-Kaleckian growth model (a form of demand-led post-Keynesian system) and constructed a computational agent-based simulation of it. The results actually proved to be very interesting, and with any luck, our modifications as well as some of our results and conclusions should open up a new line of investigation surrounding this particular model. As this is a technology blog, I won't get into the actual details of the economics. However, I will add that we used the Repast toolkit to implement it. Repast was a fun ride, but in the end it suffered from the same frustrating trait that all frameworks ultimately fall victim to: they speed things up right up to the point where you need to do something the framework wasn't designed to allow, at which point they slow you down a lot. In the end, I cannot blame the toolkit for this, as all good software comes down to good engineering, which inevitably reverts to good planning, leading to the issue of knowing at the start exactly what it is that you want to do with something. Frameworks can be misleading (in letting you think that you should just follow some simple rules, and extending things later will always be easy) but they are never a substitute for careful specifications. Of course, when creating a research tool, you end up with the unpleasant situation of the software changing its own requirements.
I've also just received word today that another paper I helped to write this semester on methods of software engineering education (as applied to a method we tried here at Trinity) was accepted at a conference focusing on... software engineering education. Last semester I TA'd a course in software engineering which included a practical component focusing on a system that I designed last year with the help of another CS major to log how computer-science students spent their time (in terms of a breakdown between the various critical steps of a process). This simple web application proved to not only be a useful tool for logging time, but also for teaching teams how to carefully plan and extend an existing system without perfect documentation on a timetable while assigned to teams defined by their characteristic asymmetric skill sets. It was a lot of fun to structure, and we ended up getting enough novel and interesting things out of the experience to take a stab at a paper, which then turned out pretty well.
But, as I said, the year is winding down and I am looking more and more to my next two years as an MBA grad student at MIT Sloan, with just a few things standing between me and that inevitable and very thrilling future. Of course, the biggest (in terms of sheer weight) is the new Trinity cluster. As I said in my last post, we received much of the last of the hardware that we need to make everything run. As it stands now, we actually have all the critical hardware components, and are in the process of settling some software issues.
As I said earlier, we received our new ethernet switch as well as the fiber optic patch cables (LC/LC duplex multimode) necessary to connect the bladeframes to the network, and to connect the FibreChannel switch to the SAN and the bladeframes. The switch is a Cisco Catalyst 3750 with six 1000BASE-SX SFPs and one gigabit copper SFP. The gaggle of day-glo orange is, of course, the fiber. I wish it looked 'neater' but considering that it's six feet off the ground, I'm not too worried about something snagging it. As you can see in this picture, we also have our QLogic SANBox 1400 (FibreChannel switch) located in just about the same place in the rack. An old Netgear 10/100 rackmount 12 port hub is serving as a makeshift shelf of sorts. The other box sitting on top of it is a 16 port 10/100 switch that is giving us a few more ethernet ports. We're actually going to be receiving another ethernet switch (copper only this time) to replace this Netgear hub as well as the little 16 port sitting on top of it. The new switch will plug into the 3750 and allow us to save a port of our drop and presumably a port on the big switch that serves the MCEC.
To keep everything safe and cozy, we're using RFC1918 subnets for both management and the blades themselves. These networks will only be accessible from specific other hosts and networks on campus, alleviating a lot of the security pressure on our end. Both networks will actually be on the same VLAN (for various reasons) but each will be on a different logical layer 3 network. These networks are actually both already configured, but as one of them is not plugged into the larger network (namely, the management network which is running over the Netgear and its little Trendnet friend) I prefer to consider it the future.
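For anyone planning a similar layout, here is a minimal sketch of how you might check that two subnets are both RFC1918 space and do not overlap, using Python's standard ipaddress module. The two example ranges are hypothetical (I'm not listing the actual ones here), and the helper names are made up for illustration.

```python
import ipaddress

# Hypothetical example subnets; these are NOT the ranges actually in use.
MANAGEMENT_NET = ipaddress.ip_network("10.61.0.0/24")
BLADE_NET = ipaddress.ip_network("10.61.1.0/24")

# The three private address blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network(block)
    for block in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(net: ipaddress.IPv4Network) -> bool:
    """True if the whole subnet sits inside one of the RFC 1918 blocks."""
    return any(net.subnet_of(block) for block in RFC1918_BLOCKS)


def check_plan(a: ipaddress.IPv4Network, b: ipaddress.IPv4Network) -> bool:
    """True if both subnets are private and are distinct layer 3 networks."""
    return is_rfc1918(a) and is_rfc1918(b) and not a.overlaps(b)


if __name__ == "__main__":
    print("plan ok:", check_plan(MANAGEMENT_NET, BLADE_NET))
```

Sharing a VLAN while keeping two layer 3 networks only works cleanly if the ranges are disjoint, which is what the overlaps check guards against.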
As I mentioned earlier, our AC unit has finally been repaired as well. Apparently the issue was some kind of component failure in one of the two compressors. It's been fixed, and is now pushing cold air into the floor (it's set to 68). However, I am still very nervous about the return air supply. Also, it keeps complaining about 'low humidity' (which makes sense, as computers like a relative humidity level between 40% and 60% to keep static from building up) but I actually see this as a symptom of poor air flow. A lot of the air put out by this unit is going into 61A but at times (if the door to 61A is closed) has serious trouble returning to the Liebert unit. A lot of air is getting sucked in through the cracks around the door to the hallway, meaning that if you leave the door to 61A closed, the 'low humidity' alarm will go off as the air the unit is getting back is coming around the long way, and is already very dry (or, rather, not humidified). These units were designed to cool one really big room redundantly. Now, the unit that we're using is kind of awkwardly boxed up in a rather small room, and is being asked to cool a bunch of computer equipment in the room next to it. This is less than ideal, but this is also a very big air conditioner. I'm confident that it will be able to fill our needs nicely.
Ultimately though, the system is up and running, the SAN is working, the ethernet blinks, and the pressure is now on getting the software configured. There are a couple of wrinkles involved in this step, which I hope to get into in a future (hopefully the near future) post. Stay tuned!
Sunday, April 13, 2008
Back in action
It's been quite a while since my last post, mostly on account of my senior thesis in economics, which is finally complete and turned in. In the meantime, a lot has happened on several fronts. A messed up data stream at WRTC has pretty much killed the 'online' aspect of The QuadCast (short of plugging a radio in to a computer, there is no easy way to capture the stream, at least, not one as effortless as a cron job and a streamripper). The cluster is progressing quite well, at least in terms of hardware. We finally have had our air conditioner repaired (evidently a component of one of the two compressors had failed) and it's producing cold air (but I am still nervous about how well the hot air will return to it from 61A). We have received our new Cisco fiber ethernet switch, and both the fibre channel and ethernet multimode patch cables have been installed. We have had our network appropriately subnetted and configured logically (more about that later) and the system is actually turned on and running. All in all, we're way behind schedule but looking pretty good going forward. With any luck, it will actually turn out something during my tenure! Oh, as one last aside, we managed to find a buyer for our older Alphacluster, which is kind of cool. It's not often that you get to earn a profit for your department.
I'll fill in all of the gaps of this soon, along with new pictures and details as soon as I find the time to really buckle down. I'll tackle it in a few posts over the course of the next few days.
Thursday, March 6, 2008
DNS destination
So, things have slowed down a bit in terms of side projects over the last few days. We are still waiting on the A/C, the new ethernet switch, and for the two post rack to be mounted on the cluster front. I have a bit of work that I need to get done on my thesis (which I will actually be posting about later today hopefully) and in general, everything else is running smoothly. Sadly, that doesn't generate much material for posts, but there is perhaps something of interest in all of this.
We recently decided to bring a few sites that we maintain which are hosted at shared hosting providers 'in house.' This is a tedious process of freezing, backing up, copying, deploying, reconfiguring, testing, and then updating DNS data so that the website hits the new server (at least eventually). At one point, we used to do our own DNS hosting, but while convenient for some things, this actually proved to be a very inconvenient strategy at some point. Instead, we now use Nettica, which has proven to be very reliable.
Anyway, check back later for hopefully some more interesting content.
Tuesday, March 4, 2008
Pausing to reconsider
Several days ago, I posted that I was looking to displace my linux based router firewalls with 'enterprise' appliance like solutions. Let me start again. A lot of this discontent was fostered by an unstable box in a very critical position, that had a habit of going down when I needed it to stay up. Since I spend most of my days these days 132 miles away from that box, I was somewhat forced to stick to the plan of rebooting it (remotely) when trouble cropped up. This experience made me very bitter, because every time this machine went down I lost the confidence of those who were relying on services that were dependent on it. Eventually, I began to wonder if I was doing right by my customers by using a more versatile and less expensive solution that seemed to be less reliable.
I have come to my senses. Business owners often need to come to the realization at some point that spending money does not increase customer satisfaction. Just because there is a more expensive option that is better marketed, does not mean that you should question the validity of your original strategy. I have to remind myself of this sometimes as well, as I had to in this case. A linux machine with a custom 2.6 kernel, coupled with systems like dhcp3-server, bind, openvpn, ntp (server), of course iptables, built in VLAN support, and any expansion card that can fit in a standard expansion slot, blows almost anything else out of the water in terms of features, and certainly in terms of price. Many of these features are essential to providing a high quality and reliable service. The hardware is really no different than that which runs in any of the leading 'appliance' solutions either. It's all about the software, and with Linux most of the time, that comes down to your ability to intelligently configure it.
It somehow seems appropriate that the pizazz of good marketing is very compelling until you try to justify your persuasion with numbers and common sense.
-AJB.
P.S. Pictures make stories better
While I am often reminded that slides for talks are best without any text on them (a theory that I debate to this day) I do recognize that Blogs are better with images. I think part of it comes from my own desire to look at random images of cool high-tech equipment (try google image searching for things like 'core switch' or 'fiber' one of these afternoons) and share my own pictures with others. Part of it also comes, I think, from a desire to share which somehow always seems more genuine when it involves images.
Pictured at the top to the left is the front of a Cisco PIX-501. It has been sitting in a box for the better part of two years, and I only recently broke it out when I was considering replacing one of our linux routers. I had to go through the process of flashing it to wipe out the enable password (which I could not remember for the life of me) but from there on out it was smooth sailing. I even drew out a nice diagram of how it would work in my revised network layout at that site. I have some other neat stuff coming in which I will photograph at my earliest convenience, as well as a few other images yet to post 'when time permits.'
Monday, March 3, 2008
Phases of understanding
I am not an electrical engineer, nor am I an electrician. By all rights, I am far from qualified to comment on polyphase power, but I have to throw in a few words. Based on my limited understanding of three-phase power, I can surmise that you are provided with three hot leads, each of which carries alternating current at the same frequency, but shifted in phase. Typically, for three-phase power (or, perhaps always) this means a shift of 120 degrees between each of the three phases. Now, if you have two 110 volt hot leads which are 180 degrees apart in waveform, then you have the traditional US residential arrangement of 110 or 220 volts (depending on whether you measure from one hot to neutral or across the two hots). However, with three-phase power, you actually have 120 degrees of separation between the waveforms, not 180, meaning that you have sqrt(3)*(potential of hot), not double, if you're using two hot poles. With 120 volt hots, that works out to roughly 208 volts. (image is from Wikipedia)
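To check the arithmetic, here is a small sketch that treats each hot lead as a phasor and measures the potential between two of them. The function name is my own; the 120 volt nominal figure is an assumption (the usual modern US nominal), and the 110 volt case simply reuses the number from the paragraph above.

```python
import cmath
import math


def line_to_line_voltage(v_hot: float, phase_shift_deg: float) -> float:
    """Magnitude of the potential between two hot leads of equal voltage.

    Each lead is modelled as a phasor with magnitude v_hot; the second lead
    is rotated by phase_shift_deg relative to the first.
    """
    first = cmath.rect(v_hot, 0.0)
    second = cmath.rect(v_hot, math.radians(phase_shift_deg))
    return abs(first - second)


if __name__ == "__main__":
    # Split-phase residential: two hots 180 degrees apart give double.
    print(f"110 V hots, 180 degrees apart: {line_to_line_voltage(110, 180):.1f} V")
    # Three-phase: two hots 120 degrees apart give sqrt(3) times one hot.
    print(f"120 V hots, 120 degrees apart: {line_to_line_voltage(120, 120):.1f} V")
```

The 120 degree case comes out at sqrt(3) times the single-hot voltage, which is where the familiar 208 volt figure comes from.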
So, what got me started on this? At one point last week, I had the opportunity to ask if we had 'three phase power feed, and thus 208v of potential' for the power feeds to the bladeframes. The response that I got was to the effect of 'we don't have three phase power, it's 208v.' Since the Liebert unit (and its monster compressors) invariably runs on three-phase power, and since that is really the only normal explanation for having 208v of potential, I had to run out and ensure that my limited EE knowledge had not failed me. I can now rest assured that it hasn't.
-AJB.
P.S. More pictures
I've also, as you might have guessed, taken far more pictures than I've posted to this blog (maybe you wouldn't have guessed that). Our upload bandwidth here at Trinity is rather limited, and I don't have time (even batch processing time) to reduce the resolution of the images that I've taken, so what I'm saying is that they take forever to upload. But, since I have a few minutes now, I'll throw up a few more.
This first image is utterly insignificant except that it's a Halon 1211 fire extinguisher. These relics should not be used without a breathing apparatus, but it sure beats a bunch of wet computer hardware. The next image goes well with the theme of water on electrical equipment: it's our EPO button. Something about seeing them just forces me to remind myself "look with your eyes Andrew, not your hands." I have similar thoughts when getting on ski lifts. It's so unfair that those damn lift operators get to hit those huge buttons and we don't! I might have watched a bit too much MacGyver when I was little. The third image is of a note attached to the upper compressor in our Liebert unit. I don't really understand anything about the way these things work, but I am unsettled by a note saying that something was almost stripped off in 1995. This thing needs to work for at least as long as our new cluster is useful.
The first picture here on the left is a reminder that 61A was a telecom closet (in many ways, it may very well still be). This wall clearly had a lot of stuff mounted on it at some point. The blue wire coming down from the ceiling is a 50? conductor cable that is going to be terminated in the 2 post open telco rack to provide our drop. That black wire coming out of the wall is actually a 12 fiber bundle, which at the very least is disused, but at least two of the fibers appear to be totaled (their ends are cut off). The picture on the right is of a bunch of APC units that we removed from the back of the bladeframes. My understanding is that these things are used for redundant power legs, or possibly redundant UPS units (is there a serious effective difference?) We don't have that many circuits, or that much money (how much would a UPS like that cost anyway?) so these are pending removal off to some other location, like the computing center, or ebay. Oh! I almost forgot. You can just barely see the top of a digital vt320 dumb terminal in the left image. Dunno what's going to happen to that yet.
P.P.S. My post seems to have changed to arial without asking me. I must be hitting tab too much.
Sunday, March 2, 2008
As promised
I managed to get a few more hours in with the cluster this afternoon. The Myrinet box has been safely removed and stored accordingly, and the SAN has been mounted.
Contrary to what I had said earlier, we actually have seven 250 GB drives. There isn't really a point in trying to configure the system yet, as we don't have the necessary patch cables to connect the FC switch to the bladeframes and SAN, but they should arrive sometime this week. I also want to add that this was by far one of the best designed rackmount kits I've ever worked with. What's more, it came with all of the right hardware so everything just worked right out of the box.
-AJB.
Everything that's old is new again
So, I just spent the morning replacing a linux computer that functions as a router with another linux computer that will function as a router. Sure, it appears on the surface that you're saving a lot of money by doing this. It's almost impossible to compare features between a linux system and a commercial hardware router. Then again, after doing this for a few years, and putting a few of these in mission critical applications, you stop looking at them as routers, and start thinking of them as 'the most important server pending the most inconvenient reboot or panic.' Needless to say, I have deployed many a high end hardware router in many an application, but the few linux 'routers' that are left are probably going to be getting the boot sometime soon, as I simply can't have downtime as a result of cost savings. I'll try to fully justify this using numbers shortly.
-AJB
Saturday, March 1, 2008
Thoughts on Mainframes
A few days ago the NYTimes ran an interesting article on IBM's new mainframes and their apparently increasing sales. The article and the discussion of mainframes is cast pretty strictly in terms of virtualization. The real problem is power.
Anyone who has worked in a modern datacenter can tell you that typically half of your electrical consumption (possibly a little less) ends up being consumed by the AC units required to cool the other half of your electrical consumption. Rackspace is very expensive nowadays (depending on where you look) simply because everything else has suddenly become so cheap. I can get a brand new 2U server installed in a rack for only a few k, at the most. At that point, it's usually only a few months before the combined forces of depreciation and the high costs of power, bandwidth, and rackspace cause the cost of provisioning and maintaining a home for that server to far exceed the cost of procuring it. This is all obvious, and would not be at all surprising to anyone who has ever colocated anything. But that's really my point here. Virtualization isn't just a neat technology, it's a method to save money.
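To make that argument a little more concrete, here is a back-of-the-envelope sketch of how quickly hosting costs can overtake the purchase price. Every number in the example call is a made-up placeholder (power draw, electricity price, rack and bandwidth fee, purchase price), and depreciation is left out entirely; only the "one watt of cooling per watt of server" rule of thumb comes from the paragraph above.

```python
import math

HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_hosting_cost(server_watts, price_per_kwh, rack_and_bandwidth,
                         cooling_overhead=1.0):
    """Monthly cost of keeping one server racked, powered, and cooled.

    cooling_overhead=1.0 encodes the rule of thumb that the AC draws about
    as much power as the equipment it cools.
    """
    total_kw = server_watts / 1000 * (1 + cooling_overhead)
    return total_kw * HOURS_PER_MONTH * price_per_kwh + rack_and_bandwidth


def months_to_exceed_purchase(purchase_price, monthly_cost):
    """Whole months until cumulative hosting cost passes the purchase price."""
    return math.floor(purchase_price / monthly_cost) + 1


if __name__ == "__main__":
    # Placeholder inputs, not real quotes: 400 W server, $0.10/kWh,
    # $200/month for rack space and bandwidth, $3000 purchase price.
    cost = monthly_hosting_cost(400, 0.10, 200.0)
    months = months_to_exceed_purchase(3000.0, cost)
    print(f"hosting: ${cost:.2f}/month, passes purchase price in month {months}")
```

Plug in real colocation quotes and the crossover point moves around, but the shape of the argument stays the same: the recurring costs dominate quickly, so utilization per rack unit is what matters.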
The simple costs of shoving a bunch of computers into racks that will have anything less than a high level of utilization (I really like to think of this as throughput, but that's just due to my background) is a huge and unjustifiable waste when solutions like VMWare are available. Ballmer, I think, misses the point when he tries to shrug off VMWare as premature and hard to use. He makes the same essential mistake with linux. What's worse is that he yells to no end about minimizing the total cost of ownership with regard to both of these products, which is again, exactly the point. IT people are not ignorant of costs. If anything, IT departments are yelled at more than others to pull their weight in terms of expenditures. Innumerable metrics exist to judge a company's performance based on the size of its IT overhead coupled with its customer satisfaction. The use of technologies like VMWare (and it's actually impressive that 5% of the servers out there are virtualized on this platform) represents the most efficient cost savings measure made available to small and medium sized businesses in at least the last four years.
The fact that some technicians and managers might have to put in a few late nights to get everything working exactly right is by no means a deal breaker. It's an opportunity for people to feel like they're pulling their weight and really adding to the bottom line.
And so, in some kind of crazy and roundabout way, we have come all the way back to mainframes. If you can afford at least a few of these things, and you're already looking to spend oodles of money on new VMWare deployments, then it certainly appears to make sense on the surface, even just going on the limited information provided by our articles here. When you factor in that some of these machines are some of the most reliable that you can buy nowadays, that's even better. Finally, it helps to know that when you buy a piece of hardware like this, you have a lot more than a 'certified partner' at your disposal. IBM doesn't like to lose customers. Microsoft doesn't seem to mind sometimes.
-AJB.
Friday, February 29, 2008
CampTrin/Trincoll.Info
During my sophomore year at Trinity, I decided to start a specialized wiki that only members of the college community could edit. For a while, maintaining it and encouraging others to contribute was an interesting and meaningful experience. Unfortunately, my Junior year was extremely busy with a number of more important projects, and despite my pestering several of my friends, no one was really willing to step up and take over the project. Amazingly, the site still gets a fair amount of pageviews. Not a million, by any stretch, but enough to know that it's out there and noticed, as well as indexed with several search engines. The real question is, what do I do with it now that I'm graduating? It's one that I'm going to have to seriously ponder.
-AJB
A glimpse of what's to come
I just received word that the final components for the new cluster have been ordered by the computing center. Hopefully we will receive our new ethernet switch (1000BASE-SX, kind of neat) and necessary patch cables (for both the ethernet and the fiber channel) sometime next week. There is still a lot to do before then, so don't forget to check back this Sunday for an update!
Also, as I promised earlier, I'm thinking about how to write up a post on the "Tech" of the Quadcast. I can't promise something polished right now, but essentially we have a 'retired' Sun Ultra 5 workstation (the CS department is trying to get rid of them; they were formerly arranged as 'cluster 2,' which has been out of service for several years and is now profoundly obsolete) configured with cron and a streamripper (actually called 'streamripper'). It's running Debian Etch, and lives on campus here at Trinity. Files are captured automatically by this system, quickly edited with Audacity, and then thrown up on some webspace allocated to students on Shakti (an ancient unix server). We'll probably have to switch to another system soon, as these large MP3s are probably rather conspicuous.
-AJB.
P.S. - A brief interlude
The actual technical specifications of Shakti are a mystery to me. But, it is worth pointing out that at the very least it is impressively old. This suggests that the system has been online at least since 1997. Its uname -a output seems to support this dating as well:
SunOS shakti x.x Generic_xxxxx-xx sun4u sparc SUNW,Ultra-2 Solaris.
Am I to believe that Shakti is a sun4u series Ultra 2 machine? So far as I know, there was only one variety of these ever made -- the A14 "pulsar." According to this nifty website, such a machine would be of approximately equivalent if not slightly lower power to the Ultra 5 machines that the CS department is now cannibalizing for our various side projects (such as powering the Quadcast).
Wednesday, February 27, 2008
Night Shift
Last night was a rare night off (ish) for me, so I decided to put in a few hours with the cluster. We just received our SAN and FC switch this past week, along with a 2 post open telco relay rack (generously 'donated' by the computing center). A network drop to the room is in progress, and our ethernet switch should be ordered sometime in the near future. We have also rolled the four post cabinet housing our retired Alpha server cluster into 61A.
So, the next steps are really rather manual. The Alpha rack needs to be stripped, carefully, of all the hardware that we no longer need in it. This includes the alphaservers themselves (8xDS10Ls) along with a 16 port Myrinet 2000 switch, and a smattering of other hardware. We're going to be keeping all of it, but we really need the rack to house the SAN and a few other servers that will eventually be moved to 61A.
So, now to provide you a little context. 61A is located in the basement of the Mathematics, Computing, and Engineering Center at Trinity College (we call it the MCEC for short). This building was constructed in the early 90s, and designed (or so I'm told) to resemble a computer chip. The Comp-Sci, Math, and Engineering departments are homed here, and it is the former home of academic computing.
The basement hallway is very long. Most of the rooms down here are engineering labs and machine shops. When the building was built, Academic Computing's main machine room was located on the left at the end in a nice purpose built room with redundant Liebert units and a very convenient raised floor. After the library was renovated in the mid 90s, Academic Computing vacated the space and turned a lot of it over to the engineers, who then had it subdivided into four rooms. 83, the one at the very end, remained an Academic Computing machine room. 63, the middle section of the former machine room, is now an engineering lab. 61, which is at the far end of the space, is also (according to its sign) an engineering lab, but it appears to be really more storage than anything else at this point.
61A is actually a telecomm closet which the computing center retained control over. It's only accessible through 61.
This is a picture of the Liebert System 3 in 61. It's actually one of two redundant units, both of which push cold air into the floor. Originally, there were no walls to the left or right, as this was all one really long room. Now, however, thinking about how air will return to this thing keeps me up at night. Hopefully it will be able to provide us with enough cooling (it almost certainly will); however, the return of the hot air is a bit of an issue. The computing center has granted us use of their space in 61A, but truth be told it would be a lot nicer to use 61.
As you can see here, they're doing some work on the unit presumably to get it operational again. I don't know much about HVAC, but I'm assuming that the vacuum pump is a precursor to adding fresh refrigerants to the system (or possibly a method of checking for leaks). Either way, I hope there's nothing actually wrong with the unit, as without some serious A/C, this is not going to work. I am, however, optimistic that it will all get sorted out in time.
But, enough about the space, it's time to cover the equipment and the effort. First off, as I said, we received our SAN this past Tuesday, along with a SFP FC switch (with 10 SFPs). The SAN is a StoreVault S500 with five 250GB drives (four active, one hot spare). We also got the FC kit, so it comes with a fibre channel card pre-installed. It's a pretty nice unit that fits our needs almost perfectly, while still being affordable and leaving room to grow.
The fibre channel switch, a QLogic SanBox, has 10 SFP ports and comes with 10 SFPs (not pictured). We got ours from the vendor who sold us the SAN. However, a quick search of CDW shows they also sell pretty much the exact same kit retail. All things considered, this is again a great deal. The only real drawback is that the switch itself cannot be rackmounted without an additional rackmount kit, which, as you may or may not have guessed, is not an impulse buy. We'll probably figure out some other solution for keeping it up in the air.
This next picture is kind of a "well duh" shot. Up until this point I have not actually discussed the cluster itself at all. This machine was in use at a major financial institution until they decided to move to different hardware and donated it to Trinity. The system consists of three Egenera BladeFrames with a mix of blades ranging from Pentium IIIs to four way Xeon 2.6 GHz systems. The system has about 160 processors total, and while not necessarily the most bleeding edge of systems, it provides us with a real opportunity to do some serious research and science. This is really a huge step up for us since our previous cluster consisted of eight DEC/Compaq Alphaservers. Each of the bladeframes has two Control Blades, or cBlades, which aside from a variety of other tasks, provide the external I/O and Network interfaces for the system. Each cBlade has two dual port gigabit SX fiber interfaces, and two dual port Fibre Channel interfaces. According to the technical specifications, at least one of the ethernet ports and one of the fibre channel ports must be populated on each cBlade. With three bladeframes and two cBlades in each, that totals 6 FC ports and 6 1000BASE-SX ethernet ports for network and I/O. Expensive accessories.
This really provides a good contrast and scale comparison with our old cluster, an 8 node (9 if you count the controller) DEC/Compaq Alphaserver system with a Myrinet 2000 interconnect. This really was not a bad system for its time, and the Myrinet hardware is still a very nice interconnect. For comparison, Egenera says the interconnect of the BladeFrames reaches 'gigabit' speeds. This Myrinet system can sustain bandwidths of nearly twice that, and at far lower latency than is possible with traditional ethernet networks. Many commodity clusters are interconnected using standard gig-e today. However, if you can afford an option like Myrinet or Infiniband, the differences are huge, particularly because network communication time is usually the limiting factor in all but 'embarrassingly parallel' HPC applications.
I spent most of my time last night removing cables from this rack. Whoever originally cabled it did a really good job, and except for a nasty bundle of power cables near the bottom, the process was relatively painless. For those of you out there who are counting servers, there is actually one missing. A power supply in one of the systems failed over the summer, and it was a very expensive part to try to replace, so that server is still absent. The large one in the middle is the control server, which is a slightly different model. At the bottom, you can see our 16 port Myrinet 2000 switchbox, and if you look carefully, you can see the Myrinet cards in each of the servers.
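To put the interconnect comparison in rough numbers, here is a small sketch of the usual "latency plus size over bandwidth" estimate for sending one message. The bandwidths follow the description above (gigabit, and roughly twice that for Myrinet). The latency figures are placeholders chosen purely for illustration, not measurements from either system.

```python
# Simple per-message cost model: time = latency + bits / bandwidth.
# Latency values below are ASSUMED placeholders, not measured figures.

GIGE_BANDWIDTH = 1e9        # bits per second ("gigabit" speeds)
MYRINET_BANDWIDTH = 2e9     # bits per second (roughly twice gigabit)
GIGE_LATENCY = 50e-6        # seconds, hypothetical
MYRINET_LATENCY = 10e-6     # seconds, hypothetical


def transfer_time(message_bytes, latency_s, bandwidth_bits_per_s):
    """Estimate the seconds needed to deliver one message of the given size."""
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s


if __name__ == "__main__":
    for size in (64, 64 * 1024, 1024 * 1024):
        gige = transfer_time(size, GIGE_LATENCY, GIGE_BANDWIDTH)
        myri = transfer_time(size, MYRINET_LATENCY, MYRINET_BANDWIDTH)
        print(f"{size:>8} bytes: gig-e {gige:.6f}s, Myrinet {myri:.6f}s")
```

With any numbers of this shape, small messages are dominated by latency and large ones by bandwidth, which is why a tightly coupled job that trades many small messages feels the difference most.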
I stopped at the point of being able to remove the Myrinet box from the rack. The current plan is to install the SAN right about where that box currently sits (this avoids the need to remove the Alphaservers until we can find a real home for them to stay in, or at least a good plan). The switchbox will probably just be moved down a few U. I don't want to mount the SAN too close to the bottom, because there are some really monstrously large servers which might get put into this rack, and I want to leave them enough room that nothing will have to get moved once it's mounted. I didn't dare move the switchbox on my own, as it has no rails, and dropping it would probably be more expensive than running your car into a tree at 15mph. I also have some additional pictures of the space and some of the equipment in it, but really don't have the time to tell that part of the story right this minute. Hopefully, I'll be able to put some additional time in on the cluster this Sunday, so look back for an update about it then.
Next Cluster
This blog actually started as a project to document the installation, configuration, and management of a massive ~160 processor HPC cluster that was just donated to Trinity. Needless to say, the time required to write my thesis, manage my business, and actually assemble the cluster has sapped all the time that I planned to use to write about the process. Thus, the blog has been renamed, and retooled to just be yet another blog about technology, albeit from an atypical perspective. Stay tuned for information about the cluster, among other things, as soon as I manage to find the time to post them.
In the meantime, you might want to check out 'The QuadCast!' Part radio show, part podcast, part senioritis, part content syndication experiment, and all kinds of tech: it's exactly the kind of project that I participate in when I should really be sleeping more. Aside from the radio show produced by my three roommates (thus, the title), we are also syndicating the content from two other shows broadcast on WRTC Hartford. Never have a few disused and obsolete Sun workstations and a limited quantity of student webspace on an ancient Unix machine been so aptly used. More on the tech that powers "The QuadCast" as well as an inside look at my ambitions on content publishing should be coming soon. Again, time permitting.
What's Up?
Enter a bio (make it good!):
My name is Andrew Budd. I am a double major in Computer Science and Economics at Trinity College. I am a student and an entrepreneur dedicated to technology and innovation. My biggest issue is that there simply is not enough time in the day to do everything that I want to do. There never has been, and in my experience so far, that issue only becomes more severe as life goes on. So herein lies my motivation for starting a blog: I can write about what it is that I do, and what it is that I see. While I might not be able to create more time, I can at least keep a better record of my endeavors in the hopes that it inspires someone to do something that I never found the time to do.