One of the phenomena of software is that it grows and grows and GROWS. The more popular a tool or framework, the larger it becomes. There has been a mild push towards small tools over the last couple of years, but it has been felt entirely within the open source domain. Mainstream software vendors continue to believe that bigger is better when it comes to software tools. This is a very sad situation since the best software is usually the simplest. It's as though there is some kind of reverse-Darwinian selection taking place: "the survival of the least fit"; almost like we've been selecting race cars on the basis of how many power adjustments the seats have.
What really surprises me about this particular issue is how frequently people simply don't see why smaller and simpler is better in software unless they are the ones writing it. If they were bridge builders, would they buy the universal expands-all bridge that can be placed over any span of water? Sure, it costs $2.4 billion regardless of what you are using it for, but why waste time with anything else when it can be any bridge?
Somehow people think that the rule "if you don't put it in, it can't break" doesn't apply to software written by "other people." Not only can other people's software break, it will. The bigger it is, the more certain you can be that it will break, and there are other hidden costs as well.
We had a piece of software that needed a simple LRU (least recently used) cache. The person writing that code naturally used JCS (Java Caching System) since we had used it before and it is a very powerful cache. It's also the Swiss-army-knife of caches. In this case the cache was getting thousands of hits per second; under those loads the massive overhead imposed by JCS was creating a performance bottleneck. We replaced it with LinkedHashMap, which ships with Java, and the code sped up by a factor of twenty. If we had said "We're not going to need it" (which we knew to be true) and used the smaller tool, we could have saved ourselves a performance-debugging headache.
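For anyone who hasn't seen the trick, here is a minimal sketch of an LRU cache built on LinkedHashMap. It is not our production code; the class name LruCache and the capacity of 2 in the demo are made up for illustration. It relies on two things LinkedHashMap really does provide: the access-order constructor and the removeEldestEntry hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name; a small LRU cache on top of java.util.LinkedHashMap.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // touch "a", so "b" is now the least recently used
        cache.put("c", 3);   // evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

One caveat: LinkedHashMap is not thread-safe, and in access-order mode even get() modifies the map. If several threads hit the cache, wrapping it with Collections.synchronizedMap is the simplest option I know of, though whether that is fast enough depends on your load.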
Rather than going on about why writing big bloated tools is bad (which I've already done a bit in another blog), what I'd like to do is talk about some of the reasons why we all tend to fall victim to the tool-of-ever-increasing-size problem and how we can avoid this fate.
Indecision matrix
The bane of all reasoned decision making is the decision matrix or feature chart. While there are a number of reasonable methods to produce a good decision matrix, what is usually done instead is that a feature list is used to stand in where a proper analysis is lacking. "Look," says Charlie Manager, "the bloatfish security toolkit comes with added spaghetti. RSB's toolkit doesn't have any spaghetti; we'd better use the bloatfish toolkit instead."
After a while it's not surprising that the makers of the simple and elegant system start to notice that despite being better, faster, and easier to use everyone seems to be selecting the system with extra bells and whistles. In a market economy the customer makes the rules, so why not throw a little spaghetti in your product to make a buck if that's what the consumer wants?
Articles and blogs that compare very different tools (say Hibernate and Rails, or Pico and HiveMind) and then feel forced to "decide" that A is "better" than B make the problem worse. The component writers may ignore the punditry and realize that the tools are different and serve different needs, but then again they may be tempted to buy in and competitively add features so that their tool will win the next time we whip out our yardsticks and compare apples and oranges with them. Even when the articles are fair and talk about when to use A and when B makes more sense, the pressure can still be there. It's hard for writers to avoid this, especially when it's a surefire way to get a lot of attention (read: controversy).
Even open source and free tools fall victim to competitive featuring. MySQL is a good example. They had a perfectly good non-transactional database system. It was really fast and worked pretty well. So what if it wasn't Oracle? It was simple and solved problems within a certain domain. While they've continued to focus on simplicity (to a degree), they also added a bunch of things that are relevant in much larger database systems.
Despite its success, people dismissed MySQL (and still do) as not ready for prime time because it didn't have "proper" transactions. Instead of giving in, I'd have preferred to see them stick to their guns and make the best darned simple non-transactional database ever. If you are saying, "yes, but look at all the new features it's got now" then consider how much speed, simplicity, and stability they very likely could have had with the same amount of development. They made a choice and if you didn't need transactions it wasn't a choice that benefited you at all. You'd be surprised how many applications don't need transactions. To me it feels like they gave in to feature envy and made a great product in a smaller space into an okay product in a larger space.
The problem here, though, is primarily with the decision makers. Google will not tell you which tool is best. You cannot escape the requirement to actually try stuff out. If you need to use a decision matrix to pick your tool, at least take the time to do it right by FIRST determining what you need, and then weighting only those factors associated with actual need.
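To make "needs first" concrete, here is a minimal sketch of a scoring function that only counts features you have already said you need. The class name NeedsFirstScore, the feature names, and the weights are all invented for illustration; the point is simply that spaghetti scores zero if it isn't on your list.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical helper: scores a tool against a list of needs decided up front.
class NeedsFirstScore {
    // Sums the weights of the needs the tool satisfies.
    // Features that are not in `needs` contribute nothing to the score.
    static int score(Map<String, Integer> needs, Set<String> toolFeatures) {
        int total = 0;
        for (Map.Entry<String, Integer> need : needs.entrySet()) {
            if (toolFeatures.contains(need.getKey())) {
                total += need.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Step one: write down what you need and how much each need matters (made-up values).
        Map<String, Integer> needs = Map.of("tls", 3, "key-rotation", 2);

        // Step two: list what each tool offers (made-up feature sets).
        Set<String> bloatfish = Set.of("tls", "spaghetti", "xml", "ldap");
        Set<String> rsb = Set.of("tls", "key-rotation");

        System.out.println("bloatfish: " + score(needs, bloatfish)); // prints bloatfish: 3
        System.out.println("RSB: " + score(needs, rsb));             // prints RSB: 5
    }
}
```

A raw feature count would have picked bloatfish four to two. Scoring against needs flips the result, and it still doesn't excuse you from actually trying the tools.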
Software never cures
As you may or may not know, cement is never 100% set (or cured). Apparently it continues to cure throughout its lifetime. Similarly software is never "done." There is always one more bug or (sadly) one more feature. This very true statement has led to some odd behaviors.
The first odd behavior associated with the permanent impermanence of software is the COBOL/SNA lock-in rule. Once your customers are dependent on your product (which will always have bugs), you can keep adding features, periodically end-of-life old versions, and force your customers to keep upgrading. To keep margins up you can outsource the coding to Elbonia while you are at it. It really doesn't matter that not a single one of your customers actually needs XML support; they'll sure need it when the old version is no longer supported.
The open source behaviors driven by slow-curing software are a bit different and much less evil. If you've got a cool open source project that people really like, it's just hard to stop tinkering. Simply fixing bugs is boring. Adding double-overhead-cams, now that's cool. The secret here is to just let it go. Even though it's popular, you can move on and maybe build an even more popular tool. If you can keep fixing bugs then you are a saint; if not, at least you won't create any new ones, and other people can fix the bugs if they need to.
Substandards
Standards have been the basis of open source success for years, but they are much hated (secretly) by large software companies. The result is that almost no standard has made any substantial progress after its second major version. Backward compatibility plays a part in this, but political sabotage is probably the larger barrier.
Having said all that, what do standards have to do with bloated software? Two things: first, standards themselves tend to bloat up and become useless; software that chases the standard does likewise. Second, standards are typically big things. As a result small simple components face the whip of not being "standards compliant."
Let's say you want to build a small simple persistent event dispatcher (persistent publish and subscribe) in Java. Easy enough, but after you do people say, "Hey, J2EE defines JMS and you're not compliant." All of a sudden you need search ability and XA transaction support and a host of other features. You are bloated.
My advice to open source developers (and users) is to think about the use first and the standards second (or possibly not at all). If you think your simple solution will meet your need then the odds are it will meet someone else's too. Leave world hunger to committees and feed your neighbor.
Making friends
The last common reason for software bloat in tools is general friendliness. The software company gets a polite request from a customer who uses the tool that goes something like this: "I've been typing my name into the @#% publisher name field every time I have to publish an object. Can you let it default that value from the database on Tuesdays and the file system on Wednesdays? I can't see how you view this as reasonable software if it doesn't default properly!" It seems simple enough, so out of politeness you comply. After all, the customer is always right!
The open source version goes like this: "Hey, I've noticed that your code uses a properties file. I've coded up a totally-fly XML version that uses XSLT and a SAX parser to replace it. Can you include that in the build?"
"Hmm," you say to yourself, "there are only two properties, so we really don't need to use XML, but he's gone to all that work to code up this thing so I may as well include it." While that's an awfully congenial approach, it also means that everyone who uses your software will now need to include that SAX library.
Ultimately, a little benevolent dictatorship is sometimes a good thing. Most of the best projects have a touch of it, and while it pisses some people off, it keeps the code a whole heck of a lot cleaner. Be nice, be polite, but don't let the place get cluttered up just because someone wants to leave their stuff in your apartment.
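For the two-property case in the open source example, here is roughly all the code the plain properties file needs. This is a sketch under my own assumptions: the class name TwoProperties and the keys publisher.name and retry.count are invented. The only library involved is java.util.Properties, which comes with Java, so nobody downstream has to add a jar.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: parses key=value text with nothing beyond the JDK.
class TwoProperties {
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            // Properties.load(Reader) handles the standard key=value format.
            props.load(new StringReader(text));
        } catch (IOException e) {
            // Not expected for an in-memory reader; rethrown unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Made-up keys standing in for "the two properties".
        Properties props = parse("publisher.name=Alice\nretry.count=3\n");
        System.out.println(props.getProperty("publisher.name")); // prints Alice
        System.out.println(props.getProperty("retry.count"));    // prints 3
    }
}
```

In a real project you would read from a file or the classpath rather than a string, but the dependency story is the same.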
Parting words
If you need a 10mm wrench, you'd never buy a 15mm instead just because it's more popular. Feature lists, size, and multi-letter compliance aren't what make good software. The ability to perform well, be easily understood, require little maintenance or configuration, and run without errors are what count.









