I got an email from a user who shall remain nameless to protect the innocent:
“Tomorrow is my first day at work as a DBA. Can you please give me some suggestions? I will really appreciate it.”
Well, I don’t change jobs too often, but the last two times I did, here’s what my first week consisted of:
- Take an inventory of every SQL Server that I’m responsible for. It should be a simple inventory of server names at first – later, we’ll flesh that out with what applications live on what servers.
- Get my manager to sign off on that list – anything off the list, I’m not responsible for (yet).
- Check (NOT FIX, just check) the backups on all of those servers. You need daily full backups at the bare minimum, and preferably transaction log backups too, until you know for sure what the recovery needs are for each application.
- Check the error logs on all of those servers. Look for critical errors like drive failures, SQL Server application errors, security problems, etc.
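The backup check can be scripted so that it stays a read-only check. Here is a minimal sketch in Python, assuming you have already run something like `LAST_BACKUP_QUERY` against each instance with whatever client you use and have the rows in hand. The function name, the row shape, and the one-day threshold are my own assumptions for illustration. `msdb.dbo.backupset` only reflects the backup history that server has kept, so treat a gap as a prompt to investigate, not as proof.

```python
from datetime import datetime, timedelta

# Read-only query, shown for reference; it is not executed by this sketch.
# In msdb.dbo.backupset, type 'D' is a full backup, 'I' a differential, 'L' a log backup.
LAST_BACKUP_QUERY = """
SELECT d.name, b.type, MAX(b.backup_finish_date) AS last_finish
FROM sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b ON b.database_name = d.name
WHERE d.name <> 'tempdb'
GROUP BY d.name, b.type;
"""


def find_backup_gaps(rows, now, max_full_age=timedelta(days=1)):
    """Return {database: reason} for databases lacking a recent full backup.

    rows: iterable of (database_name, backup_type, last_finish) tuples,
          where backup_type and last_finish are None if no backup is on record.
    """
    last_full = {}
    for name, backup_type, finished in rows:
        # Register every database, even ones with no backup rows at all.
        last_full.setdefault(name, None)
        if backup_type == "D" and finished is not None:
            if last_full[name] is None or finished > last_full[name]:
                last_full[name] = finished

    gaps = {}
    for name, finished in last_full.items():
        if finished is None:
            gaps[name] = "no full backup on record"
        elif now - finished > max_full_age:
            gaps[name] = f"last full backup {finished:%Y-%m-%d %H:%M}"
    return gaps
```

Feed it the rows from every server on your signed-off list, and the result is the "before" picture of which databases are not getting daily fulls.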
After that initial round was done, I sat down with my manager and said, “Here’s the ‘before’ picture. It’s going to take me X days to get automated backups set up across all of these servers. Until I make sure the data is backed up for everything, is it okay if I don’t do anything else?”
Of course they’ll agree, because the blood will be draining out of their faces when they realize their data isn’t completely backed up. And yes, no matter how good a shop you walk into, if you’re building your own to-do checklist, that means the other DBAs don’t have their act together enough to hand you one, and the databases probably aren’t all backed up. This goes for large companies too: I went to work for one of the (few remaining) major financial institutions, and roughly 10% of my 200+ instances were not being backed up.
Once I was absolutely sure every server was being backed up, and I’d tested some (if not all) of the restores, my next steps were:
- Build a list of applications (not just database names) on every server.
- Build a list of contact names for each application. Sometimes you can reverse-engineer this by looking at the security information on the database, like who the sysadmins are.
- Build a list of sysadmins on each server.
- Set up a brief meeting or conference call with every application owner and get them up to speed.
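The sysadmin list is easy to collect in bulk. The sketch below assumes you have run something like `SYSADMIN_QUERY` on each instance and tagged each resulting login with the server name yourself; the function name and row shape are hypothetical. Keep in mind that a Windows group shows up as a single member, so the real list of people can be longer than the query suggests.

```python
from collections import defaultdict

# Read-only query, shown for reference; it is not executed by this sketch.
SYSADMIN_QUERY = """
SELECT m.name AS login_name, m.type_desc
FROM sys.server_role_members AS rm
JOIN sys.server_principals AS r ON r.principal_id = rm.role_principal_id
JOIN sys.server_principals AS m ON m.principal_id = rm.member_principal_id
WHERE r.name = 'sysadmin';
"""


def sysadmins_by_server(rows):
    """Group (server_name, login_name) pairs into {server: sorted logins}."""
    report = defaultdict(set)
    for server, login in rows:
        report[server].add(login)  # a set drops duplicate rows
    return {server: sorted(logins) for server, logins in sorted(report.items())}


def print_report(report):
    """Print one block per server, ready to paste into a meeting invite."""
    for server, logins in report.items():
        print(f"{server}: {len(logins)} sysadmin(s)")
        for login in logins:
            print(f"  - {login}")
```

The same logins are also a decent first guess at who to invite to the application owner meetings.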
The meeting should cover:
- Ask them how critical the database is, meaning how much data they can afford to lose. Be prepared with a few backup schedules so they can see how much it will cost (in terms of maintenance windows and backup tapes) to hit their backup goal.
- Ask them who needs to be called if the server fails. (You’re going to implement monitoring sooner or later.)
- Ask them if they plan to upgrade to a newer version of SQL Server, and whether their application supports newer versions. You’re not planning on taking action yet, just trying to get a picture of the environment.
- Ask them if they have any pain points related to the database.
- Show them the list of sysadmins on their server, and explain that these users can stop the server, truncate tables, delete backups, even delete whole databases without any warning. Ask them if that’s okay, or if they would like to shrink that list down.
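For the "how much data can you afford to lose" question, it helps to walk in with the arithmetic already done. This is a rough sketch, not a sizing tool: the three schedules are made-up examples, and it assumes the worst-case loss is simply the gap between the most frequent backups, ignoring how long each backup takes to run.

```python
from datetime import timedelta

# Hypothetical candidate schedules: (label, full backup interval, log backup interval or None)
SCHEDULES = [
    ("Daily full only", timedelta(days=1), None),
    ("Daily full + hourly logs", timedelta(days=1), timedelta(hours=1)),
    ("Daily full + 15-minute logs", timedelta(days=1), timedelta(minutes=15)),
]


def worst_case_loss(full_every, log_every=None):
    """Longest stretch of work that could be lost between two backups."""
    if log_every is None:
        return full_every
    return min(full_every, log_every)


def backups_per_day(full_every, log_every=None):
    """Rough cost measure: how many backup jobs run in 24 hours."""
    day = timedelta(days=1)
    total = day / full_every
    if log_every is not None:
        total += day / log_every
    return total


def schedules_meeting(target_loss, schedules=SCHEDULES):
    """Labels of the schedules whose worst-case loss fits the owner's target."""
    return [
        label
        for label, full_every, log_every in schedules
        if worst_case_loss(full_every, log_every) <= target_loss
    ]
```

If an owner says they can lose at most an hour of data, `schedules_meeting(timedelta(hours=1))` narrows the conversation to the two schedules with log backups, and `backups_per_day` gives a first feel for what each one costs.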
After these meetings, build a server/database/application inventory for your manager. It shows your progress so far, and it gives you a list of everything that everybody wants in terms of database management.
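The inventory does not need to be fancy; a spreadsheet is fine. As one possible shape, here is a small sketch that turns what you collected into a CSV your manager can open. The field names are my own guess at useful columns, drawn from the meeting questions above, not a standard.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class InventoryEntry:
    # Hypothetical columns; adjust to whatever your meetings turned up.
    server: str
    database: str
    application: str
    owner: str
    on_call: str
    max_data_loss: str


def inventory_to_csv(entries):
    """Render entries as CSV text, sorted by server and then database."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(InventoryEntry)])
    writer.writeheader()
    for entry in sorted(entries, key=lambda e: (e.server, e.database)):
        writer.writerow(asdict(entry))
    return buffer.getvalue()
```

Write the returned text to a file and you have the first version of the inventory, which you can keep extending as you learn more about each application.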
That will take you quite a while, but you want to get a picture of the environment before you go making any wild plans about how you want to change the world.