Over on Edbrill.com there is a post and customer question regarding out-of-control mailbox sizes. The administrator asks Ed for advice on how to convince his company that it needs quotas. He cannot get the budget for an automated archiving system, nor can he keep up with restoring accidentally deleted email every time he asks his users to clean up and remove unneeded messages.
Rather than answer the question outright, Ed has thrown it out to the audience to answer. Even the most junior of Notes admins knows that the obvious, simple answer is to add drive space. But when people ask these questions they are not looking for band-aid solutions; they are usually looking for advice on how to change their corporate mindset. Anyone who has administered Notes/Domino long enough understands that the admin time a database requires grows with its size. This is not to say that Domino cannot handle large databases (it can and does handle them quite well). But larger databases usually require more time to run fixup, more time to run updall, more time to run compact, more time to consistency check, more time to back up, etc.
There is another administrative time burglar that is harder to quantify with large databases, and that is end-user support time. The larger the database, the more corporate knowledge stored within it (theoretically). The more corporate knowledge stored within one location, the more valuable that database becomes and the more time is spent handling every aspect of it with kid gloves.
If you truly want to make a change and get out of this hand-holding mode, you need to take into consideration all the options as well as how sustainable those options are over the long run (such as in administrative overhead), their return on investment, and their limitations or the limitations that they impose. You do not want to implement a solution that solves the users' problem only to leave the admin spending as much time managing the solution as he/she did the problem. There are no blanket answers because each environment is different, and what is a limitation of a proposed solution in one environment can be a boon in another. Still, there are generalizations that can be made about the different options in hopes of helping you find the one that fits.
Before addressing the different options, let me first go back and explain my “theoretically” qualification of the amount of corporate knowledge contained in a large database. If the database were a teamroom, library, or other such database, then it is fairly accurate to relate the size to the amount of knowledge (though how one goes about devising a formula to convert megabytes to knowledge is beyond me). The issue with the question raised on Ed’s blog, however, is that the databases in question are mail databases, and we all know that the size of a mail database and the number of documents it contains do not in the least relate to the database’s value to the corporate knowledge base. With spam (whether true spam or email blasts that the user signed up for), personal email, e-newsletters, the “agreed” and “me too” replies, and the ever-annoying “please take me off this distribution” replies that are cc’d to the entire distribution list, the amount of corporate knowledge contained within mail databases appears to be much less than the amount of “noise” contained in these same databases (I have no hard numbers to back this up).
Here are some options that should be considered when running into disk space issues on mail servers:
Add more drive space.
Pros: This is a great short-term solution because you throw some hard drives at your box and you instantly solve your problem. This is also an opportunity to get faster drives if you did not purchase the fast ones previously.
Cons: No matter how much drive space you add, you still have a hard limit, even if you add 1TB of new storage. Available space is also not the only limitation you face. Most operating systems limit the size of a single volume or partition, and once you reach that limit you will have to look for a new solution. Many operating systems have a similar limit on the size of individual files, which again finds you looking for a new solution when that one VIP’s database hits the limit. You will also be explaining why the company spent X dollars on drive space upgrades only to hit yet another wall.
Implementation Cost (low is good): Medium. Server-grade hard drives can’t just be purchased at the nearest CompUSA or Best Buy, so implementation cost can be higher than expected depending on the quality of the drives and the total storage space needed.
Long-term Sustainability (high is good): Low. As budgets decrease and storage needs increase with larger and larger file attachments becoming the norm, it is unreasonable to assume that IT can sustain the addition of new drives to the Domino servers on an on-going basis.
Archive data to another server.
Pros: Controls disk space usage on the user’s home server. Depending on how you configure your archive server, you can archive data “live” so that you cover yourself for Sarbanes-Oxley related concerns. Backups can be run off of the archive server, giving your mail servers one less task to manage.
Cons: Instantly doubles the number of databases the Domino team manages. If you are stretched thin managing 25,000 databases, good luck getting a weekend off managing 50,000. If you are having trouble justifying the cost of an archiving system, it is unlikely that you will justify the cost of the extra server and potentially the extra FTE you will need to manage the increased workload.
Implementation Cost (low is good): High. You have to buy new hardware and software.
Long-term Sustainability (high is good): Medium. With new hardware to manage as well as a complex archival system, can your current staff sustain these new challenges, along with projected growth in the email space, on a long-term basis? And let’s not forget that the archive server will also run out of disk space one day.
Implement quotas.
Pros: Built-in functionality of Notes/Domino. Flexible enough to be “kind” by just annoying the user with dialog box messages or to be “hardcore” and sever the user’s ability to function.
Cons: Puts most of the onus on the users to “fix” the problem. Also, most organizations cannot operate with a single quota standard. The CxOs usually will not find the strict quotas assigned to the worker bees workable for their day-to-day needs. As soon as you find yourself creating two types of quotas, middle management will state that they, too, need a separate quota. And then what happens when someone is inevitably working on a hot project and needs their quota removed for a few weeks because they will be getting daily presentations of 100MB apiece? Quotas require a very well thought-out procedure to handle many different use cases.
Implementation Cost (low is good): Medium. Though the actual quota mechanism is free with Domino, the cost in man-hours consumed to plan out the different quota levels and use cases can be high.
Long-term Sustainability (high is good): Low.
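To make the quota planning burden a little more concrete, here is a minimal sketch in Python of the kind of decision table you end up maintaining. It is not Domino code and does not use any Domino API. The role names, the megabyte figures, the 90% warning level, and the exempt_until field are all assumptions I made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical quota tiers in megabytes; the figures are illustrative only.
QUOTA_MB = {"staff": 250, "manager": 500, "executive": 2000}

# Assumed warning level: start nagging the user at 90% of the quota.
WARN_FRACTION = 0.9


@dataclass
class Mailbox:
    owner: str
    role: str
    size_mb: float
    # Temporary "hot project" exemption; None means no exemption.
    exempt_until: Optional[date] = None


def quota_status(box: Mailbox, today: date) -> str:
    """Return 'exempt', 'ok', 'warn', or 'over' for one mail file."""
    # An unexpired exemption overrides every tier.
    if box.exempt_until is not None and today <= box.exempt_until:
        return "exempt"
    limit = QUOTA_MB[box.role]
    if box.size_mb >= limit:
        return "over"   # the "hardcore" case
    if box.size_mb >= limit * WARN_FRACTION:
        return "warn"   # the "kind" case: just annoy the user
    return "ok"
```

Even this toy version needs a tier per role plus an expiry date for every exception, and somebody has to keep both up to date. That upkeep is where the man-hours go.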
Adding more drive space and archiving share a common issue: they are temporary solutions for disk space management issues. By implementing these options you have not actually addressed the real issue, which is how to control the unrestricted growth of mail database files over the years. Both options have a built-in hard limit in the amount of disk space that is available. At some point you will have to face the issue once and for all. This can include the unenviable option of bringing up new servers and migrating users, or it could mean re-evaluating and choosing a different data management solution.
All three options mentioned above also have a common issue: high administrative overhead. By adding more disk space you encourage users to continue growing their mail files, thus increasing the time to run maintenance on the databases. By archiving data to another server you double the number of databases you are responsible for maintaining. And by implementing quotas you find yourself dealing with the “emergencies” of temporary quota removal, and in many cases you still have enormous databases to maintain because of unrealistically “loose” quotas for a large percentage of the user population.
A fourth option that was only mentioned in passing is:
Data retention policy.
Pros: Defines a corporate-wide standard of how much drive space is allowed per user or user role (such as Engineer, Level 1 Manager, Executive). Can define hard standards with little to no “wiggle” room, allowing the end user to clearly understand the policy and allowing IT to easily implement it (e.g. every Saturday, purge ALL documents over 30 days old, excluding Calendar and To Do entries).
Cons: Requires an extra software component to purchase and configure.
Implementation Cost (low is good): Low-Medium. Configuring the data retention tool is usually quite simple, but the planning phase does take some thought, though it is not nearly as much effort as planning quotas.
Long-term Sustainability (high is good): High.
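The example rule from the Pros above (every Saturday, purge everything over 30 days old except Calendar and To Do entries) is simple enough to sketch in a few lines. This is a minimal illustration in Python, not how any particular retention product or Domino itself implements it. The Doc record, the type names, and the docs_to_purge function are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Assumed policy values, taken from the example rule in the text.
EXCLUDED_TYPES = {"calendar", "todo"}  # never purged
MAX_AGE_DAYS = 30
PURGE_WEEKDAY = 5  # date.weekday(): Monday is 0, Saturday is 5


@dataclass
class Doc:
    doc_id: str
    doc_type: str  # hypothetical labels such as "memo", "calendar", "todo"
    created: date


def docs_to_purge(docs: List[Doc], today: date) -> List[Doc]:
    """Select documents the policy would purge if run on the given day."""
    # The policy only runs on Saturdays.
    if today.weekday() != PURGE_WEEKDAY:
        return []
    # Anything created before the cutoff is more than 30 days old.
    cutoff = today - timedelta(days=MAX_AGE_DAYS)
    return [
        d for d in docs
        if d.doc_type not in EXCLUDED_TYPES and d.created < cutoff
    ]
```

The point of the sketch is that the rule has no per-user exceptions and no judgment calls, which is why it is easy for users to understand and for IT to run unattended.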
(Full disclosure: if this is your first time reading my blog, you should make note that Conxsys is the creator of Logsitic for Lotus Domino, a tool that assists in data retention management.)
While it may seem obvious that I would be biased toward data retention as the means to achieve smaller databases, that is not quite accurate. I do believe that data retention holds a significant key to smaller databases, less overhead, and long-term sustainability, but realistically most organizations find themselves needing more than one of the options mentioned above. For example, organizations struggling with the requirements of Sarbanes-Oxley or similar regulatory controls find value in both data retention and archiving. Organizations that do not have these types of strict requirements will usually find happiness with quotas and data retention. And companies that are simply in a disk space crunch will likely find that adding new drives is the best immediate solution, with data retention or quotas being the long-term solution once past the drive space crisis. However, I would be remiss to leave out that one pass of a data purge tool (yes, even our competitors’ solutions) can in many cases yield hundreds of gigabytes or even terabytes of free drive space overnight. If the drives and the tool were ordered on the same day, it is unlikely that the new disk drives would even be on-site before the data purge tool finished its first pass, in many cases rendering the new drives unnecessary.
In the end, there is no silver bullet for the issue of disk space control. Each organization has differing needs, with differing levels of knowledge in Domino and time to explore and learn all the facets of each option. What works for Company A may not, and probably will not, work for Company B. But what you need to keep in mind when considering any of these options is that if the solution is put back on the end user (that is, you ask the user to delete the data), it will be perceived by the user as one of their lowest-priority tasks. For true success, the task must be automated in some form or fashion by the server and should most likely be transparent to the end user. It should do so without giving users the impression that they are allowed to eat as much disk space as they want and that the company will keep giving them more. That is not only impossible to sustain long-term, it is the complete opposite of the attitude you want your users to take.