Suggested DBA Work Plan
One of the appendices in my upcoming book covers the daily, weekly and monthly tasks that a Production DBA should perform. Presented here are those tasks. Note that this is a suggested list only, and will obviously vary depending on the enviornment. Further, a lot of these tasks can obviously be automated using a variety of different tools, so this list is essentialy about the things that should occur, not necessarily how they should occur.
Daily Tasks
- Check for successful backup completion, including tape archives if using the disk then tape backup methodology,
- Confirm SQL Server Agent jobs completed successfully,
- Check free disk space on all disks including system drives and SQL Server data, log, tempdb and backup disks,
- Check free space of each database's data and transaction log files and expand if necessary to avoid autogrow operations,
- Check SQL Server errors logs and Windows event logs,
- Confirm DBCC checks executed without error by opening and inspecting the appropriate log files. Depending on the DBCC check frequency, this may be a weekly task,
- Check site specific tasks as appropriate, such as the success of archive batch job(s),
- Check technology specific tasks as appropriate, such as log shipping or database mirroring status/health,
- Throughout the day, monitor long running queries and general performance of critical servers using a dashboard of key performance monitor counters,
- Stay up to date with SQL Server via magazine/website articles, blogs (using a good RSS reader) and other general research. Good managers understand the importance of allocating time to this task
Weekly Tasks
- Collate and update performance monitor log information for baseline analysis purposes looking for emerging trends amongst counter values. This information will feed into the monthly capacity planning task,
- Review recent wait statistics and include alongside the performance monitor counter information in the performance baseline,
- Execute index and statistics maintenance jobs (during periods of low activity) as appropriate. For systems with large enough maintenance windows, this may be a simple rebuild of all indexes (whilst being aware of the impact on log shipping/database mirroring), or a more targeted approach which selects the appropriate technique (reorganize/rebuild) based on the fragmentation level,
- Record disk usage trends and note for the monthly capacity planning exercise,
- Review server configuration for unauthorized configuration changes. Policy based management is purpose built for this task
Monthly Tasks
-
Review and update documentation and scripts as appropriate, ensuring their accuracy and suitability for use in disaster recovery situations,
- Review and plan the implementation of any service packs and/or hot fixes as appropriate,
- Conduct capacity planning using the inputs from the weekly baseline analysis and disk usage tasks. In addition to using this as a chance to identify upcoming budgetary requirements, this may also serve as an opportunity to consolidate databases and/or instances for a better performance/resource usage balance,
- Conduct "fire drills" to practice recovering from various failure conditions. Ideally these drills are random and unannounced, and involve simulated corruption (preferably not on production databases!) to ensure all staff are capable of recovering from a wide variety of possible failure conditions. The more these events are simulated and practiced, the quicker the recovery in the event of a real disaster
Cheers