If you start at a new place as a SQL Server Database Administrator (DBA), what is one of the first things you should do? In my opinion, after figuring out the key servers and instances you need to support, it is setting up alerts.
By setting up alerts you can start to get an idea of what is not working and focus first on the things that are failing. All the while you can still check on backups and get everything else set up and working, but if you don’t have alerts, well, you are blind.
Alerts should tell you:
1) When a physical server is down (network)
2) When backups fail
3) When jobs fail
4) When logins fail
5) I/O issues
6) The “critical” error severity levels: 14, 15, 16, 17
7) Crazy CPU and memory issues
8) services going up and down
9) Whether your SAN is up or down
10) When a hard drive is getting close to 100% full
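As a starting point for item 6, here is a minimal T-SQL sketch that creates one SQL Server Agent alert per severity level and e-mails an operator. It assumes Database Mail and SQL Server Agent are already configured for e-mail; the operator name `DBA Team` and the address `dba-team@example.com` are placeholders, not anything from a real environment.

```sql
-- Sketch only: the operator name and e-mail address below are placeholders.
USE msdb;
GO

-- Hypothetical operator who will receive the notifications.
EXEC dbo.sp_add_operator
    @name = N'DBA Team',
    @enabled = 1,
    @email_address = N'dba-team@example.com';
GO

-- Create one alert per severity level listed in item 6 (14 through 17).
DECLARE @severity int = 14;
DECLARE @alert_name sysname;

WHILE @severity <= 17
BEGIN
    SET @alert_name = N'Severity ' + CAST(@severity AS nvarchar(2));

    EXEC dbo.sp_add_alert
        @name = @alert_name,
        @message_id = 0,                    -- 0 = alert on severity, not on a specific error number
        @severity = @severity,
        @enabled = 1,
        @delay_between_responses = 60,      -- seconds to wait before notifying again
        @include_event_description_in = 1;  -- 1 = include the error text in the e-mail

    -- Send the alert to the placeholder operator by e-mail.
    EXEC dbo.sp_add_notification
        @alert_name = @alert_name,
        @operator_name = N'DBA Team',
        @notification_method = 1;           -- 1 = e-mail

    SET @severity += 1;
END;
GO
```

Extending the loop's upper bound is an easy way to cover higher severity levels as well. Keep in mind that SQL Server Agent alerts fire on errors written to the Windows application log, so some lower-severity errors may never trigger them.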
And that is just the beginning. What other alerts should DBAs set up *right away* to make sure they are on top of things?