Thursday, February 26, 2009

You're Invited

Introducing a new product to the world is always a special pleasure for me. This past weekend I assembled an ad-hoc rapid development team with the objective of building a task management system that could handle a large number of items spawned from numerous sources (click the link above to read a description of the principle behind the system). I am pleased to announce that we have completed our initial alpha prototype, and would like to invite anyone out there who thinks they have a crazy todo list to join up and give it a spin! Keep in mind this is an ALPHA system whose CONCEPT is not even a week old! That being said, there are a few things to know.

First up, we have a bugzilla install at http://bugzilla.scramdac.com, and would really value feedback and bug reports.

Second, you'll need a few tips to get started:

Queues
This system is designed to track your tasks based on their real world sources, and the relative priorities of those real world sources. For example, a weekend project or personal project would be of relatively lower priority than, say, your work tasks or coursework. Therefore, you need to create at least one queue and give it a relative priority value. Relative priorities are the reverse of what you might expect: a priority 0 queue is the HIGHEST priority, not the lowest. I would suggest creating a queue with a priority of 10 first; then you can make your additional queues higher or lower priority than that queue (higher being a lower number, and vice versa). It is easy to adjust these priorities later.

Tasks
The bread and butter of the system is the set of tasks that you enter into it. Each task is associated with a given queue and assigned an "importance" level. Importances are single letters from A to Z, with A items being the highest importance and Z items the lowest. You can also enter a description for each task, and an estimated time to complete it (this does nothing right now, but will soon be leveraged for some upcoming features).

ToDo List
The tasks are aggregated into a ToDo list and sorted by a mixture of their importance and the relative priority of the queue that they are associated with. That is, an A importance item from a low priority queue will appear lower down on the list than an A importance item from a high priority queue. I could explain the algorithm that is used to generate the list, but it is not easy to do so without a graphic, which currently only exists on notebook paper. I will post one in the near future, along with a more thorough explanation of how the algorithm actually works. In the meantime, just experiment until you end up with a balance that makes sense!
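
To make the idea of mixing importance and queue priority more concrete, here is a minimal sketch of one plausible way such a combined sort key could work. This is not the actual algorithm the system uses (that write-up is still on notebook paper); the names and weighting below are purely illustrative.

```python
# Illustrative only: one way to blend queue priority (0 = highest) with
# task importance (A = highest, Z = lowest) into a single sort key.
from dataclasses import dataclass

@dataclass
class Task:
    description: str
    queue_priority: int   # 0 is the highest-priority queue
    importance: str       # 'A' (highest) through 'Z' (lowest)

def sort_key(task: Task) -> tuple:
    importance_rank = ord(task.importance.upper()) - ord('A')   # A=0 ... Z=25
    # Lower combined score floats to the top of the ToDo list.
    return (task.queue_priority + importance_rank, importance_rank)

tasks = [
    Task("Finish problem set", queue_priority=5, importance="A"),
    Task("Weekend project hack", queue_priority=15, importance="A"),
    Task("File expense report", queue_priority=5, importance="C"),
]

for t in sorted(tasks, key=sort_key):
    print(t.importance, t.queue_priority, t.description)
```

With this particular weighting, an A item from a queue at priority 5 sorts above an A item from a queue at priority 15, which matches the behavior described above.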

So please, go ahead and give it a try! We would love to hear your feedback on this. Keep in mind that our objective was to create a useful task management tool. Anything beyond that would be pure coincidence (see the blog link above where the product is described for more information about the motivations).

Take care,

Friday, February 20, 2009

Thought Experiment: Using MySQL for working memory

Several friends of mine have recently expressed interest in creating a new MUD project. MUDs are kind of neat, especially for developing and testing game design and balancing principles, but they are generally not commercially useful or very good at retaining value in the long run. However, they do sometimes give rise to a few interesting programming / implementation questions.

Recently I've been wondering if MySQL might be useful for creating a virtualized shared working memory subsystem to replace RAM and flat files in the context of a MUD. Let's consider a typical data structure.

MUDs consist of worlds made up of MOBiles, Players, Objects, and Rooms. Let's consider a Room. Rooms can contain Players and Objects (on the floor) as well as Mobs and room programs. These associations are typically stored in memory as null-terminated doubly linked lists of pointers. If the MUD crashes, all of the data that has not yet been written to one of the flat files storing the players, rooms, etc., is lost when the memory is purged. Further, building any kind of redundant or load-balanced system is a challenge without rebuilding the entire architecture.
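
For concreteness, here is a rough sketch of that kind of in-memory linkage, written in Python rather than the C that most MUD codebases actually use. The class and field names are made up for illustration and aren't taken from any particular codebase.

```python
class Node:
    """One link in a doubly linked list of things present in a room."""
    def __init__(self, payload):
        self.payload = payload
        self.prev = None
        self.next = None

class Room:
    def __init__(self, vnum, name):
        self.vnum = vnum            # the room's id / virtual number
        self.name = name
        self.people_head = None     # players and mobs standing here
        self.contents_head = None   # objects lying on the floor

    def add_person(self, who):
        # Splice a new node onto the head of the people list.
        node = Node(who)
        node.next = self.people_head
        if self.people_head is not None:
            self.people_head.prev = node
        self.people_head = node

# All of this lives only in the process's RAM: if the MUD crashes before the
# state is flushed to the flat files, the links (and the state) are gone.
```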

If, however, a MySQL database were used to store all of this working memory, crashes would become less painful, parallelization would become easier, tool development would become easier (interface with the database rather than the MUD application), and so on. What would be the drawbacks? Obviously performance, loss of control over validation (it's never good when validation code is replicated in more than one part of the system), the potential for invalid states to arise in an effectively persistent memory (the database) leading to a greater chance of data corruption, and the lack of a sane method of event-based updates to the data. That is, an event recorded on one node in a parallelized environment may update data that all of the other nodes refer to, but without triggering any of the associated updates that need to be sent to connected players.
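
To give a flavor of what "working memory as tables" might look like, here is a minimal sketch of one possible schema. It uses Python's built-in sqlite3 module purely so the snippet runs anywhere without a server; the idea transfers directly to MySQL. The table and column names are hypothetical, not drawn from any real MUD.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rooms (
    room_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);

-- who is standing in which room right now (players and mobs)
CREATE TABLE room_occupants (
    occupant_id INTEGER PRIMARY KEY,
    room_id     INTEGER NOT NULL REFERENCES rooms(room_id),
    kind        TEXT NOT NULL CHECK (kind IN ('player', 'mob')),
    name        TEXT NOT NULL
);

-- objects lying on the floor of a room
CREATE TABLE room_contents (
    object_id INTEGER PRIMARY KEY,
    room_id   INTEGER NOT NULL REFERENCES rooms(room_id),
    name      TEXT NOT NULL
);
""")

# Moving a player between rooms becomes a single UPDATE instead of two
# linked-list splices, and the state survives a crash of the game process.
conn.execute("INSERT INTO rooms (room_id, name) VALUES (1, 'Temple'), (2, 'Square')")
conn.execute("INSERT INTO room_occupants VALUES (1, 1, 'player', 'Somebody')")
conn.execute("UPDATE room_occupants SET room_id = 2 WHERE occupant_id = 1")
```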

This then regresses into considering an event system built on a shared memory environment (again, a database could be adapted to this purpose), but that raises the question of which node should be the one to write to the memory, which forces a fallback onto some kind of elected master/slave failover scheme. At that point, any complexity savings that seemed apparent in using the MySQL database for parallelization in the first place have potentially been wiped out.

So what value is there in the conclusion of this thought experiment? For one, it doesn't seem to make sense to hammer the square MySQL block into this round parallelization-shaped hole. I for one still wonder whether it would be possible to design a subsystem that achieves some of the desirable features of such a shared memory subsystem using MySQL as the backend datastore. However, the point is ultimately moot, as the performance of such a system would be so miserable as to be impractical. There already exists a plenitude of solutions to this problem, all of which are probably superior to using a shared relational database as virtualized shared memory.



Saturday, December 6, 2008

Data Mesa?


To the left, a traffic graph from one of our routers serving two network segments over two ISPs. The segments are aggregated through a common network interface (which can also fail over), so this shows the traffic from both. The maximum transfer limit of the two in aggregate is a little better than 45mbit or so (heh, spare capacity is sometimes a good thing, and we have spare capacity). Anyway, I thought that this was kind of a funny looking "Data Mesa" where one of our customers transferred a ton of data over a day or two (apparently limited by the speed of their network link to about 3mbit).

The image to the right is of our (my family's) two dogs. Apparently, it's cold in the house back home and my sister is trying to send a subtle message (to turn up the heat, which the rest of us refuse to do). Of course, I say that from the comfort of my MIT graduate housing apartment, where I can set the thermostat to whatever I want without consequence.

Now that the "business as usual" is done with, I'll cover a little bit about when and how I'll be updating this blog (yes, backwards, I know). Anyway, I maintained this thing fairly well during the latter half of my senior year at Trinity. The summer was extremely busy, and Sloan is almost as busy as the summer was. There has not been much time for blogging, and the reader base of this thing is [fortunately?] quite small anyway. Updates going forward won't be frequent, but they won't be as scarce as they have been for the past few months.

Monday, April 14, 2008

C.O.D.

I'm slowly sliding into the last few weeks at my current institution, and am very much looking forward to the summer, and the resumption of several other projects which I have simply not had the time to properly attend to as of late. I pointed out in my last post that my senior thesis in Economics was finally complete, and I'm also pleased to announce that my research and writeup efforts earned me co-authorship on a paper which will be presented at a conference in London in several months. Assuming the article makes it to some online-accessible medium, I'll post a link to it here.

I haven't really spent any blog time explaining what exactly my thesis is focused on, so I'll provide a brief introduction now. Economic models can be very, very difficult to validate. One interesting way to experiment with a model, however, is to build a computer simulation of it. As soon as you attempt to do so, especially from a comp-sci/programming perspective, you realize just how 'top-down' a lot of the logic inherent in these models really is. For my thesis research, we started out with a canonical Neo-Kaleckian growth model (a form of demand-led post-Keynesian system) and constructed a computational agent-based simulation of it. The results actually proved to be very interesting, and with any luck, our modifications as well as some of our results and conclusions should open up a new line of investigation surrounding this particular model. As this is a technology blog, I won't get into the actual details of the economics. However, I will add that we used the Repast toolkit to implement it. Repast was a fun ride, but in the end it suffered from the same frustrating traits that all frameworks ultimately fall victim to: they speed things up right up to the point where you need to do something the framework wasn't designed to allow, at which point they slow you down a lot. In the end, I cannot blame the toolkit for this, as good software always comes down to good engineering, which inevitably comes back to good planning, which means knowing at the start exactly what it is that you want to build. Frameworks can be misleading (in letting you think that you can just follow some simple rules and that extending things later will always be easy), but they are never a substitute for careful specifications. Of course, when creating a research tool, you end up with the unpleasant situation of the software changing its own requirements.
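
For readers who haven't run into agent-based simulation before, here is a bare-bones sketch of what the skeleton of such a simulation looks like, written in Python rather than the Java/Repast we actually used. The "firm" behavior below is a made-up placeholder, not the Neo-Kaleckian model from the thesis; the point is only that each agent updates its own state from local information, and the aggregate behavior emerges rather than being imposed top-down.

```python
import random

class Firm:
    """A toy agent: accumulates capital in response to a demand signal."""
    def __init__(self, capital):
        self.capital = capital

    def step(self, demand_signal):
        # Each agent applies its own simple rule every tick.
        investment = 0.1 * self.capital * demand_signal
        self.capital += investment

firms = [Firm(capital=random.uniform(50, 150)) for _ in range(100)]

for tick in range(20):
    demand_signal = random.uniform(0.5, 1.5)   # stand-in for an aggregate demand shock
    for firm in firms:
        firm.step(demand_signal)

# The aggregate outcome is observed at the end, not specified in advance.
print("mean capital after 20 ticks:", sum(f.capital for f in firms) / len(firms))
```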

I've also just received word today that another paper I helped to write this semester on methods of software engineering education (as applied to a method we tried here at Trinity) was accepted at a conference focusing on... software engineering education. Last semester I TA'd a course in software engineering which included a practical component built around a system that I designed last year, with the help of another CS major, to log how computer science students spend their time (in terms of a breakdown between the various critical steps of a process). This simple web application proved to be not only a useful tool for logging time, but also a useful way to teach teams how to carefully plan and extend an existing system without perfect documentation, on a timetable, while working in teams defined by their characteristically asymmetric skill sets. It was a lot of fun to structure, and we ended up getting enough novel and interesting things out of the experience to take a stab at a paper, which then turned out pretty well.

But, as I said, the year is winding down and I am looking more and more to my next two years as an MBA grad student at MIT Sloan, with just a few things standing between me and that inevitable and very thrilling future. Of course, the biggest (in terms of sheer weight) is the new Trinity cluster. As I said in my last post, we received much of the last of the hardware that we need to make everything run. As it stands now, we actually have all the critical hardware components, and are in the process of settling some software issues.

As I said earlier, we received our new ethernet switch along with the fiber optic patch cables (LC/LC duplex multimode) necessary to connect the bladeframes to the network, and to connect the FibreChannel switch to the SAN and the bladeframes. The switch is a cisco catalyst 3750 with 6 1000BASE-SX SFPs and 1 gigabit copper SFP. The gaggle of day-glo orange is, of course, the fiber. I wish it looked 'neater' but considering that it's six feet off the ground, I'm not too worried about something snagging it. As you can see in this picture, we also have our qLogic SANBox 1400 (FibreChannel switch) located in just about the same place in the rack. An old netgear 10/100 rackmount 12 port hub is serving as a makeshift shelf of sorts. The other box sitting on top of it is a 16 port 10/100 switch that is giving us a few more ethernet ports. We're actually going to be receiving another ethernet switch (copper only this time) to replace this netgear hub as well as the little 16 port sitting on top of it. The new switch will plug into the 3750 and allow us to save a port on our drop and presumably a port on the big switch that serves the MCEC.

To keep everything safe and cozy, we're using RFC1918 subnets for both management and the blades themselves. These networks will only be accessible from specific other hosts and networks on campus, which alleviates a lot of the security pressure on our end. Both networks will actually be on the same vlan (for various reasons), but each will be on a different logical layer 3 network. These networks are both already configured, but as one of them is not yet plugged into the larger network (namely, the management network, which is running over the netgear and its little trendnet friend), I prefer to consider it the future.
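
As a small illustration of that layout (two RFC1918 subnets sharing one vlan but kept on separate logical layer 3 networks), here is a quick sketch using Python's ipaddress module. The prefixes are made up for the example and are not our actual addressing plan.

```python
import ipaddress

# Hypothetical prefixes: one for management interfaces, one for the blades.
management_net = ipaddress.ip_network("192.168.10.0/24")
blade_net      = ipaddress.ip_network("192.168.20.0/24")

for net in (management_net, blade_net):
    print(net, "private:", net.is_private, "usable hosts:", net.num_addresses - 2)

# Same physical vlan, but no overlap at layer 3:
print("overlap:", management_net.overlaps(blade_net))   # False
```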

As I mentioned earlier, our AC unit has finally been repaired as well. Apparently the issue was some kind of component failure in one of the two compressors. It's been fixed, and is now pushing cold air into the floor (it's set to 68); however, I am still very nervous about the return air supply. It also keeps complaining about 'low humidity' (which makes sense, as computers like a relative humidity level between 40% and 60% to keep static from building up), but I actually see this as a symptom of poor air flow. A lot of the air put out by this unit goes into 61A, but at times (if the door to 61A is closed) it has serious trouble returning to the liebert unit. A lot of air gets sucked in through the cracks around the door to the hallway, meaning that if you leave the door to 61A closed, the 'low humidity' alarm will go off, as the air the unit gets back has come around the long way and is already very dry (or, rather, not humidified). These units were designed to cool one really big room redundantly. Now, the unit that we're using is kind of awkwardly boxed up in a rather small room, and is being asked to cool a bunch of computer equipment in the room next to it. This is less than ideal, but this is also a very big air conditioner. I'm confident that it will be able to fill our needs nicely.

Ultimately though, the system is up and running, the SAN is working, the ethernet blinks, and the pressure is now on getting the software configured. There are a couple of wrinkles involved in this step, which I hope to get into in a future (hopefully near-future) post. Stay tuned!

Sunday, April 13, 2008

Back in action

It's been quite a while since my last post, mostly on account of my senior thesis in economics, which is finally complete and turned in. In the meantime, a lot has happened on several fronts. A messed up data stream at WRTC has pretty much killed the 'online' aspect of The QuadCast (short of plugging a radio into a computer, there is no easy way to capture the stream; at least, not one as effortless as a cron job and a streamripper). The cluster is progressing quite well, at least in terms of hardware. We have finally had our air conditioner repaired (evidently a component of one of the two compressors had failed) and it's producing cold air (but I am still nervous about how well the hot air will return to it from 61A). We have received our new cisco fiber ethernet switch, and both the fibre channel and ethernet multimode patch cables have all been installed. We have had our network appropriately subnetted and configured logically (more about that later), and the system is actually turned on and running. All in all, we're way behind schedule but looking pretty good going forward. With any luck, it will actually turn out something during my tenure! Oh, as one last aside, we managed to find a buyer for our older Alphacluster, which is kind of cool. It's not often that you get to earn a profit for your department.

I'll fill in all of the gaps, along with new pictures and details, as soon as I find the time to really buckle down. I'll tackle it in a few posts over the course of the next few days.

Thursday, March 6, 2008

DNS destination

So, things have slowed down a bit in terms of side projects over the last few days. We are still waiting on the A/C, the new ethernet switch, and for the two-post rack to be mounted on the cluster front. I have a bit of work that I need to get done on my thesis (which I will hopefully be posting about later today), and in general, everything else is running smoothly. Sadly, that doesn't generate much material for posts, but there is perhaps something of interest in all of this.

We recently decided to bring a few sites that we maintain, which are hosted at shared hosting providers, 'in house.' This is a tedious process of freezing, backing up, copying, deploying, reconfiguring, testing, and then updating DNS data so that the website hits the new server (at least eventually). At one point we did our own DNS hosting, which was convenient for some things but eventually proved to be more trouble than it was worth. Instead, we now use Nettica, which has proven to be very reliable.
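
As a small example of the last step of that process, here is a sketch of the kind of "has the DNS change taken effect yet?" check you end up running after an update. The hostname and address below are placeholders, not one of our actual sites.

```python
import socket

HOSTNAME = "www.example.com"       # the site being moved (placeholder)
NEW_SERVER_IP = "203.0.113.10"     # the new in-house server (placeholder)

resolved = socket.gethostbyname(HOSTNAME)
if resolved == NEW_SERVER_IP:
    print(f"{HOSTNAME} now points at the new server ({resolved})")
else:
    print(f"{HOSTNAME} still resolves to {resolved}; the DNS change hasn't propagated yet")
```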

Anyway, check back later for hopefully some more interesting content.

Tuesday, March 4, 2008

Pausing to reconsider

Several days ago, I posted that I was looking to displace my linux based router firewalls with 'enterprise' appliance-like solutions. Let me start again. A lot of this discontent was fostered by an unstable box in a very critical position that had a habit of going down when I needed it to stay up. Since I spend most of my days 132 miles away from that box, I was somewhat forced to stick to the plan of rebooting it (remotely) whenever trouble cropped up. This experience made me very bitter, because every time this machine went down I lost the confidence of those who were relying on services that depended on it. Eventually, I began to wonder if I was doing right by my customers by using a more versatile and less expensive solution that seemed to be less reliable.

I have come to my senses. Business owners often need to come to the realization at some point that spending more money does not automatically increase customer satisfaction. Just because there is a more expensive option that is better marketed does not mean that you should question the validity of your original strategy. I have to remind myself of this sometimes as well, as I had to in this case. A linux machine with a custom 2.6 kernel, coupled with systems like dhcp3-server, bind, openvpn, ntp (as a server), iptables of course, built-in VLAN support, and any expansion card that can fit in a standard expansion slot, blows almost anything else out of the water in terms of features, and certainly in terms of price. Many of these features are essential to providing a high quality and reliable service. The hardware is really no different than that which runs in any of the leading 'appliance' solutions either. It's all about the software, and with Linux, most of the time that comes down to your ability to intelligently configure it.

It somehow seems appropriate that the pizazz of good marketing is very compelling right up until you try to justify it with numbers and common sense.

-AJB.

P.S. Pictures make stories better
While I am often reminded that slides for talks are best without any text on them (a theory that I debate to this day), I do recognize that blogs are better with images. I think part of it comes from my own desire to look at random images of cool high-tech equipment (try google image searching for things like 'core switch' or 'fiber' one of these afternoons) and to share my own pictures with others. Part of it also comes, I think, from a desire to share, which somehow always seems more genuine when it involves images.

Pictured at the top to the left is the front of a Cisco PIX-501. It has been sitting in a box for the better part of two years, and I only recently broke it out when I was considering replacing one of our linux routers. I had to go through the process of flashing it to wipe out the enable password (which I could not remember for the life of me) but from there on out it was smooth sailing. I even drew out a nice diagram of how it would work in my revised network layout at that site. I have some other neat stuff coming in which I will photograph at my earliest convenience, as well as a few other images yet to post 'when time permits.'