Welcome to

The Conxsys Blog

Thoughts on Conxsys, Notes/Domino, LotusScript, & Technology

Email Database Size Management – Additional Disk Drives

April 24th, 2007 by Corey Davis

Invariably, when an organization begins to run low on available drive space on any of its Domino servers, one of the very first suggestions to be uttered is “add more disk”. At first blush it is the quickest, easiest solution. Drop some new disks into the box to double or triple drive capacity and problem solved, right? No need to send out a corporate-wide distribution pleading with the user community to purge old mail. No need to find the environment’s largest disk offenders and send them a special nastygram. And you don’t need to hear the taunts of the user community about how they added a terabyte of storage to their multimedia server at home for a few hundred bucks, so why is the company so cheap and IT so inept? But is it really that easy?

Not usually. Let’s first consider the notion that adding new drives is inexpensive, because I have heard this one at almost every company I have been to, no matter the company’s size or market. It is true that drives are cheap today and that some of the bleeding-edge users may have over a terabyte of storage in their home media server that was added for three hundred dollars or less. Case in point: the Seagate Barracuda 500GB drives. At only $140 each from newegg.com, you can add two of those into an array and have an impressive 1TB of storage for less than three hundred dollars. But what your user community does not understand is that these drives, while impressively large, are not server-class products. They achieve such high capacity by adding more platters and increasing the areal density of the disks. Adding more platters means more heat. Increasing areal density means more chance for data loss. As much as the user community does not get this, you cannot simply toss these into your box and expect to maintain a reasonable amount of uptime or data integrity.

Realistically speaking, you are probably looking more along the lines of a 10K or 15K RPM drive that is made for enterprise-level computing, and most of these come in 36GB, 73GB, or 146GB flavors. According to RAIDcalc, with 146GB drives on RAID 5 (which appears to be the norm out there) you would need 9 drives to achieve approximately 1TB of storage space. At $1,100 a pop, you are looking at just shy of $10,000 for the disk drives alone.
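For anyone who wants to check that arithmetic, here is a minimal sketch in Python. It is not how RAIDcalc works internally; it only assumes the usual RAID 5 rule that one drive's worth of capacity goes to parity. It also assumes drive sizes are quoted in decimal gigabytes while the 1TB target is counted in binary units, which is one plausible reading of why the answer comes out at 9 drives rather than 8. The function names are made up for illustration.

```python
# Rough RAID 5 sizing sketch. Assumptions (not from RAIDcalc itself):
#   - usable capacity = (number of drives - 1) * drive size
#   - drive sizes are marketed in decimal GB (10**9 bytes)
#   - the capacity target is expressed in binary TiB (2**40 bytes)

def raid5_usable_gb(drives: int, drive_gb: int) -> int:
    """Usable capacity in decimal GB; one drive's worth is lost to parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least 3 drives")
    return (drives - 1) * drive_gb

def min_drives_for_tib(target_tib: int, drive_gb: int) -> int:
    """Smallest RAID 5 drive count whose usable bytes reach target_tib."""
    target_bytes = target_tib * 2**40
    drives = 3
    while raid5_usable_gb(drives, drive_gb) * 10**9 < target_bytes:
        drives += 1
    return drives

def array_cost(drives: int, price_per_drive: int) -> int:
    """Cost of the drives alone, ignoring everything else."""
    return drives * price_per_drive

if __name__ == "__main__":
    n = min_drives_for_tib(1, 146)
    print("drives needed:", n)
    print("usable GB:", raid5_usable_gb(n, 146))
    print("drive cost: $", array_cost(n, 1100))
```

Under those assumptions, eight 146GB drives fall a little short of a binary terabyte and the ninth pushes the array over it, which lines up with the figures above.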

Maybe price is not an issue for you. Maybe $10K on drives is chump change. If it is, there are other drawbacks to take into consideration when it comes to adding more drive space. First, while you may have a whopping amount of space now, your users will eat it up. Give them space and they will use it. By simply caving in to their appetite for more capacity you will be creating a beast that is difficult to control. Once they realize that they can simply save everything, no matter how mundane or irrelevant to their job, and IT will just add more and more drive space, you will find yourself in a never-ending cycle of adding to your disk capacity.

There are other monetary factors to take into consideration besides the cost of the drives. There is rack space, electricity consumption, the time you pay your server engineers or a consultant to install the drives, the loss of employee productivity from the server downtime that is almost assuredly going to take place for installation (yes, it will likely happen overnight or on a weekend, but employees work odd hours today thanks to prevalent VPN and web-based email access), and shipping costs. That once attractive pitch for adding new drives as a cheap solution has become a capital expense sizable enough to require sign-off by executive-level VPs. To be blunt: if you think adding new drives is cheap, you are wrong.

Unless you’re Google – and even if you are – this cycle will be unsustainable over the long run. You will eventually hit a limit in the OS on how large a single volume can be, or you will hit the limit on how large a single file can be. If you are adding more drive space in an attempt to lessen the impact on your users, you must realize that when you do hit that single-file size limit – and you will eventually reach it – it will impact the end user. If you take the Wimpy approach of paying later for disk space today, you will find that your users have an even more difficult transition to slimming down their email databases than if you had forced them to do it before they had 2TB of email in a single NSF.

Speaking of the user, what about their perspective on all this? Undoubtedly at first they will be thrilled that the company “understands” their needs and simply adds more space rather than limiting them. But what they fail to grasp is that by doing this, by giving them what would seem to be limitless space, the company is leading them down a path to lost productivity. Eventually they will have so many years of mail that they may not be able to find anything at all. But even if the user is strict in their email filing, they will run into latency issues with view index rebuilds. What happens when they go looking for that message they sent their boss two weeks ago and navigate over to the Sent view only to be faced with a lightning bolt for ten minutes? Why would that happen, you ask? It has been so long since they accessed their Sent view that the view index has been discarded. Now Domino must re-index the view – a view that contains pre-turn-of-the-century mail (you know, email from the 1990s). But the server is already so bogged down rebuilding indices for other users’ views containing tens or hundreds of thousands of documents that everyone is becoming good friends with the lightning bolt. Even if the monetary cost of adding more drives is insignificant to your organization, you will end up paying for it by other means, such as slow response time, longer maintenance windows (fixup, updall, compact), exceedingly long backups, more backup tape/disk expenditures, and a very likely increase in overall network bandwidth usage, possibly leading to latency across the board for all of the organization’s network-based computing.

Finally, if you allow your users to retain any amount of email they desire, how will you go about mining all those messages if you should ever become the target of an investigation? Your company does not have to be involved in an Enron-style investigation for this to happen. The news has been filled with stories of employees suing their employer and in turn asking for evidence in the form of old email messages. Legal ramifications of allowing all that old email to sit on the corporate email server aside, just how will you go about finding these messages? Obviously there is software available that will dig through the haystack for you (unabashedly blatant self-promoting link), but if you have a controlled environment with small email databases, not only will you be able to produce the results faster (keep in mind that the judicial process usually gives organizations only a set amount of time to produce these emails), but a smaller set of data is also less likely to contain a large number of false positives.

In the end, there may be times when adding more drive space is the most favorable option due to time constraints, but it should not be the end of the discussion. The problem will come back and bring along many friends that you and your co-workers will likely have no extra time to manage. If you are forced into adding new drives, by all means do so, but also take some time after the situation has settled down to continue the discussion by looking at alternatives for avoiding this problem in the future.

Implementation Cost (Admin)

  • Monetary Cost (low is good): High
  • Implementation Complexity (low is good): Low
  • Configuration Complexity (low is good): Moderate (requires engineer with knowledge of your environment’s disk array solution)
  • Outage Required (no is good): Yes (depends on disk subsystem)

Overall Admin Cost (low is good): Moderate

Implementation Cost (End User)

  • Implementation Time (low is good): Low

Long-term Sustainability (Admin)

  • Additional Workload to Maintain Solution (low is good): Moderate
  • Scalability (high is good): Low
  • On-going Monetary Cost of Solution (low is good): High

Overall Sustainability (high is good): Moderate

Long-term Sustainability (End User)

  • Additional Workload (low is good): Moderate (as the user accumulates more email, their mail file becomes more time-consuming to search and takes longer to open)

Overall Cost of Additional Disk Drives: Moderate



Copyright © 2006-2009 by Conxsys